ghsa-pgc5-4958-pw57
Vulnerability from github
Published
2025-05-20 18:30
Modified
2025-05-20 18:30
Details

In the Linux kernel, the following vulnerability has been resolved:

fix a couple of races in MNT_TREE_BENEATH handling by do_move_mount()

Normally do_lock_mount(path, _) is locking a mountpoint pinned by *path and at the time when matching unlock_mount() unlocks that location it is still pinned by the same thing.

Unfortunately, for 'beneath' case it's no longer that simple - the object being locked is not the one *path points to. It's the mountpoint of path->mnt. The thing is, without sufficient locking ->mnt_parent may change under us and none of the locks are held at that point. The rules are * mount_lock stabilizes m->mnt_parent for any mount m. * namespace_sem stabilizes m->mnt_parent, provided that m is mounted. * if either of the above holds and refcount of m is positive, we are guaranteed the same for refcount of m->mnt_parent.

namespace_sem nests inside inode_lock(), so do_lock_mount() has to take inode_lock() before grabbing namespace_sem. It does recheck that path->mnt is still mounted in the same place after getting namespace_sem, and it does take care to pin the dentry. It is needed, since otherwise we might end up with racing mount --move (or umount) happening while we were getting locks; in that case dentry would no longer be a mountpoint and could've been evicted on memory pressure along with its inode - not something you want when grabbing lock on that inode.

However, pinning a dentry is not enough - the matching mount is also pinned only by the fact that path->mnt is mounted on top it and at that point we are not holding any locks whatsoever, so the same kind of races could end up with all references to that mount gone just as we are about to enter inode_lock(). If that happens, we are left with filesystem being shut down while we are holding a dentry reference on it; results are not pretty.

What we need to do is grab both dentry and mount at the same time; that makes inode_lock() safe and avoids the problem with fs getting shut down under us. After taking namespace_sem we verify that path->mnt is still mounted (which stabilizes its ->mnt_parent) and check that it's still mounted at the same place. From that point on to the matching namespace_unlock() we are guaranteed that mount/dentry pair we'd grabbed are also pinned by being the mountpoint of path->mnt, so we can quietly drop both the dentry reference (as the current code does) and mnt one - it's OK to do under namespace_sem, since we are not dropping the final refs.

That solves the problem on do_lock_mount() side; unlock_mount() also has one, since dentry is guaranteed to stay pinned only until the namespace_unlock(). That's easy to fix - just have inode_unlock() done earlier, while it's still pinned by mp->m_dentry.

Show details on source website


{
  "affected": [],
  "aliases": [
    "CVE-2025-37988"
  ],
  "database_specific": {
    "cwe_ids": [],
    "github_reviewed": false,
    "github_reviewed_at": null,
    "nvd_published_at": "2025-05-20T18:15:45Z",
    "severity": null
  },
  "details": "In the Linux kernel, the following vulnerability has been resolved:\n\nfix a couple of races in MNT_TREE_BENEATH handling by do_move_mount()\n\nNormally do_lock_mount(path, _) is locking a mountpoint pinned by\n*path and at the time when matching unlock_mount() unlocks that\nlocation it is still pinned by the same thing.\n\nUnfortunately, for \u0027beneath\u0027 case it\u0027s no longer that simple -\nthe object being locked is not the one *path points to.  It\u0027s the\nmountpoint of path-\u003emnt.  The thing is, without sufficient locking\n-\u003emnt_parent may change under us and none of the locks are held\nat that point.  The rules are\n\t* mount_lock stabilizes m-\u003emnt_parent for any mount m.\n\t* namespace_sem stabilizes m-\u003emnt_parent, provided that\nm is mounted.\n\t* if either of the above holds and refcount of m is positive,\nwe are guaranteed the same for refcount of m-\u003emnt_parent.\n\nnamespace_sem nests inside inode_lock(), so do_lock_mount() has\nto take inode_lock() before grabbing namespace_sem.  It does\nrecheck that path-\u003emnt is still mounted in the same place after\ngetting namespace_sem, and it does take care to pin the dentry.\nIt is needed, since otherwise we might end up with racing mount --move\n(or umount) happening while we were getting locks; in that case\ndentry would no longer be a mountpoint and could\u0027ve been evicted\non memory pressure along with its inode - not something you want\nwhen grabbing lock on that inode.\n\nHowever, pinning a dentry is not enough - the matching mount is\nalso pinned only by the fact that path-\u003emnt is mounted on top it\nand at that point we are not holding any locks whatsoever, so\nthe same kind of races could end up with all references to\nthat mount gone just as we are about to enter inode_lock().\nIf that happens, we are left with filesystem being shut down while\nwe are holding a dentry reference on it; results are not pretty.\n\nWhat we need to do is grab both dentry and mount at the same time;\nthat makes inode_lock() safe *and* avoids the problem with fs getting\nshut down under us.  After taking namespace_sem we verify that\npath-\u003emnt is still mounted (which stabilizes its -\u003emnt_parent) and\ncheck that it\u0027s still mounted at the same place.  From that point\non to the matching namespace_unlock() we are guaranteed that\nmount/dentry pair we\u0027d grabbed are also pinned by being the mountpoint\nof path-\u003emnt, so we can quietly drop both the dentry reference (as\nthe current code does) and mnt one - it\u0027s OK to do under namespace_sem,\nsince we are not dropping the final refs.\n\nThat solves the problem on do_lock_mount() side; unlock_mount()\nalso has one, since dentry is guaranteed to stay pinned only until\nthe namespace_unlock().  That\u0027s easy to fix - just have inode_unlock()\ndone earlier, while it\u0027s still pinned by mp-\u003em_dentry.",
  "id": "GHSA-pgc5-4958-pw57",
  "modified": "2025-05-20T18:30:58Z",
  "published": "2025-05-20T18:30:58Z",
  "references": [
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2025-37988"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/0d039eac6e5950f9d1ecc9e410c2fd1feaeab3b6"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/4f435c1f4c48ff84968e2d9159f6fa41f46cf998"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/a61afd54826ac24c2c93845c4f441dbc344875b1"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/d4b21e8cd3d7efa2deb9cff534f0133e84f35086"
    }
  ],
  "schema_version": "1.4.0",
  "severity": []
}


Log in or create an account to share your comment.




Tags
Taxonomy of the tags.


Loading…

Loading…

Loading…

Sightings

Author Source Type Date

Nomenclature

  • Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
  • Confirmed: The vulnerability is confirmed from an analyst perspective.
  • Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
  • Patched: This vulnerability was successfully patched by the user reporting the sighting.
  • Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
  • Not confirmed: The user expresses doubt about the veracity of the vulnerability.
  • Not patched: This vulnerability was not successfully patched by the user reporting the sighting.


Loading…

Loading…