ghsa-4x9f-4x9p-28w2

Vulnerability from github

Published

2025-07-28 12:30

Modified

2025-07-28 12:30

Details

In the Linux kernel, the following vulnerability has been resolved:

netfs: Fix race between cache write completion and ALL_QUEUED being set

When netfslib is issuing subrequests, the subrequests start processing immediately and may complete before we reach the end of the issuing function. At the end of the issuing function we set NETFS_RREQ_ALL_QUEUED to indicate to the collector that we aren't going to issue any more subreqs and that it can do the final notifications and cleanup.

Now, this isn't a problem if the request is synchronous (NETFS_RREQ_OFFLOAD_COLLECTION is unset) as the result collection will be done in-thread and we're guaranteed an opportunity to run the collector.

However, if the request is asynchronous, collection is primarily triggered by the termination of subrequests queuing it on a workqueue. Now, a race can occur here if the app thread sets ALL_QUEUED after the last subrequest terminates.

This can happen most easily with the copy2cache code (as used by Ceph) where, in the collection routine of a read request, an asynchronous write request is spawned to copy data to the cache. Folios are added to the write request as they're unlocked, but there may be a delay before ALL_QUEUED is set as the write subrequests may complete before we get there.

If all the write subreqs have finished by the ALL_QUEUED point, no further events happen and the collection never happens, leaving the request hanging.

Fix this by queuing the collector after setting ALL_QUEUED. This is a bit heavy-handed and it may be sufficient to do it only if there are no extant subreqs.
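The race and the fix described above can be illustrated with a small, single-threaded model that replays the worst-case ordering. This is only a sketch under stated assumptions: `struct model_req` and every `model_*` function are hypothetical names invented for illustration, not the kernel's netfs structures or API, and the "workqueue" is just a flag that is drained in-line.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a netfs request (not kernel code). */
struct model_req {
    int  outstanding;        /* subrequests issued but not yet terminated */
    bool all_queued;         /* models NETFS_RREQ_ALL_QUEUED */
    bool collector_pending;  /* models the collector being queued on a workqueue */
    bool completed;          /* final notifications and cleanup have run */
};

/* Collector: may only finish once ALL_QUEUED is set and nothing is outstanding. */
static void model_collect(struct model_req *r)
{
    r->collector_pending = false;
    if (r->all_queued && r->outstanding == 0)
        r->completed = true;
}

/* In the asynchronous case, a subrequest terminating is what queues the collector. */
static void model_subreq_terminated(struct model_req *r)
{
    r->outstanding--;
    r->collector_pending = true;
}

/* Drain the modelled workqueue. */
static void model_run_workqueue(struct model_req *r)
{
    while (r->collector_pending)
        model_collect(r);
}

/*
 * Replay the problematic ordering: every subrequest terminates, and the
 * collector runs, before the issuer sets ALL_QUEUED. Returns whether the
 * request completed. With requeue_after_all_queued set, the issuer queues
 * the collector once more after setting the flag, as the fix describes.
 */
bool model_issue(int nr_subreqs, bool requeue_after_all_queued)
{
    struct model_req r = {0};

    assert(nr_subreqs >= 0);
    r.outstanding = nr_subreqs;

    for (int i = 0; i < nr_subreqs; i++)
        model_subreq_terminated(&r);
    model_run_workqueue(&r);      /* ALL_QUEUED still unset: cannot finish */

    r.all_queued = true;          /* issuer reaches the end of its function */
    if (requeue_after_all_queued)
        r.collector_pending = true;

    model_run_workqueue(&r);      /* without the requeue, nothing is pending */
    return r.completed;
}
```

In this model, `model_issue(3, false)` leaves the request incomplete because no further event queues the collector, while `model_issue(3, true)` completes it. The unconditional requeue mirrors the "heavy-handed" choice in the description; requeuing only when no subrequests remain would be the narrower variant it mentions.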

Also add a tracepoint to cross-reference both requests in a copy-to-request operation and add a trace to the netfs_rreq tracepoint to indicate the setting of ALL_QUEUED.

JSON

{
  "affected": [],
  "aliases": [
    "CVE-2025-38492"
  ],
  "database_specific": {
    "cwe_ids": [],
    "github_reviewed": false,
    "github_reviewed_at": null,
    "nvd_published_at": "2025-07-28T12:15:31Z",
    "severity": null
  },
  "details": "In the Linux kernel, the following vulnerability has been resolved:\n\nnetfs: Fix race between cache write completion and ALL_QUEUED being set\n\nWhen netfslib is issuing subrequests, the subrequests start processing\nimmediately and may complete before we reach the end of the issuing\nfunction.  At the end of the issuing function we set NETFS_RREQ_ALL_QUEUED\nto indicate to the collector that we aren\u0027t going to issue any more subreqs\nand that it can do the final notifications and cleanup.\n\nNow, this isn\u0027t a problem if the request is synchronous\n(NETFS_RREQ_OFFLOAD_COLLECTION is unset) as the result collection will be\ndone in-thread and we\u0027re guaranteed an opportunity to run the collector.\n\nHowever, if the request is asynchronous, collection is primarily triggered\nby the termination of subrequests queuing it on a workqueue.  Now, a race\ncan occur here if the app thread sets ALL_QUEUED after the last subrequest\nterminates.\n\nThis can happen most easily with the copy2cache code (as used by Ceph)\nwhere, in the collection routine of a read request, an asynchronous write\nrequest is spawned to copy data to the cache.  Folios are added to the\nwrite request as they\u0027re unlocked, but there may be a delay before\nALL_QUEUED is set as the write subrequests may complete before we get\nthere.\n\nIf all the write subreqs have finished by the ALL_QUEUED point, no further\nevents happen and the collection never happens, leaving the request\nhanging.\n\nFix this by queuing the collector after setting ALL_QUEUED.  This is a bit\nheavy-handed and it may be sufficient to do it only if there are no extant\nsubreqs.\n\nAlso add a tracepoint to cross-reference both requests in a copy-to-request\noperation and add a trace to the netfs_rreq tracepoint to indicate the\nsetting of ALL_QUEUED.",
  "id": "GHSA-4x9f-4x9p-28w2",
  "modified": "2025-07-28T12:30:36Z",
  "published": "2025-07-28T12:30:35Z",
  "references": [
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2025-38492"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/110188a13c4853bd4c342e600ced4dfd26c3feb5"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/89635eae076cd8eaa5cb752f66538c9dc6c9fdc3"
    }
  ],
  "schema_version": "1.4.0",
  "severity": []
}

CVE-2025-38492 (GCVE-0-2025-38492)

Vulnerability from cvelistv5

Published

2025-07-28 11:22

Modified

2025-07-28 11:22

Summary

In the Linux kernel, the following vulnerability has been resolved:

netfs: Fix race between cache write completion and ALL_QUEUED being set

When netfslib is issuing subrequests, the subrequests start processing immediately and may complete before we reach the end of the issuing function. At the end of the issuing function we set NETFS_RREQ_ALL_QUEUED to indicate to the collector that we aren't going to issue any more subreqs and that it can do the final notifications and cleanup.

Now, this isn't a problem if the request is synchronous (NETFS_RREQ_OFFLOAD_COLLECTION is unset) as the result collection will be done in-thread and we're guaranteed an opportunity to run the collector.

However, if the request is asynchronous, collection is primarily triggered by the termination of subrequests queuing it on a workqueue. Now, a race can occur here if the app thread sets ALL_QUEUED after the last subrequest terminates.

This can happen most easily with the copy2cache code (as used by Ceph) where, in the collection routine of a read request, an asynchronous write request is spawned to copy data to the cache. Folios are added to the write request as they're unlocked, but there may be a delay before ALL_QUEUED is set as the write subrequests may complete before we get there.

If all the write subreqs have finished by the ALL_QUEUED point, no further events happen and the collection never happens, leaving the request hanging.

Fix this by queuing the collector after setting ALL_QUEUED. This is a bit heavy-handed and it may be sufficient to do it only if there are no extant subreqs.

Also add a tracepoint to cross-reference both requests in a copy-to-request operation and add a trace to the netfs_rreq tracepoint to indicate the setting of ALL_QUEUED.

References

https://git.kernel.org/stable/c/110188a13c4853bd4c342e600ced4dfd26c3feb5
https://git.kernel.org/stable/c/89635eae076cd8eaa5cb752f66538c9dc6c9fdc3

Impacted products

Vendor

Product

Version

Linux

Version: e2d46f2ec332533816417b60933954173f602121
Version: e2d46f2ec332533816417b60933954173f602121

Linux

Version: 6.14
