ghsa-jvv7-jhh3-m96v
Vulnerability from github
In the Linux kernel, the following vulnerability has been resolved:
mm: memcontrol: slab: fix obtain a reference to a freeing memcg
Patch series "Use obj_cgroup APIs to charge kmem pages", v5.
Since Roman's series "The new cgroup slab memory controller" applied. All slab objects are charged with the new APIs of obj_cgroup. The new APIs introduce a struct obj_cgroup to charge slab objects. It prevents long-living objects from pinning the original memory cgroup in the memory. But there are still some corner objects (e.g. allocations larger than order-1 page on SLUB) which are not charged with the new APIs. Those objects (include the pages which are allocated from buddy allocator directly) are charged as kmem pages which still hold a reference to the memory cgroup.
E.g. We know that the kernel stack is charged as kmem pages because the size of the kernel stack can be greater than 2 pages (e.g. 16KB on x86_64 or arm64). If we create a thread (suppose the thread stack is charged to memory cgroup A) and then move it from memory cgroup A to memory cgroup B. Because the kernel stack of the thread hold a reference to the memory cgroup A. The thread can pin the memory cgroup A in the memory even if we remove the cgroup A. If we want to see this scenario by using the following script. We can see that the system has added 500 dying cgroups (This is not a real world issue, just a script to show that the large kmallocs are charged as kmem pages which can pin the memory cgroup in the memory).
#!/bin/bash
cat /proc/cgroups | grep memory
cd /sys/fs/cgroup/memory
echo 1 > memory.move_charge_at_immigrate
for i in range{1..500}
do
mkdir kmem_test
echo $$ > kmem_test/cgroup.procs
sleep 3600 &
echo $$ > cgroup.procs
echo `cat kmem_test/cgroup.procs` > cgroup.procs
rmdir kmem_test
done
cat /proc/cgroups | grep memory
This patchset aims to make those kmem pages to drop the reference to memory cgroup by using the APIs of obj_cgroup. Finally, we can see that the number of the dying cgroups will not increase if we run the above test script.
This patch (of 7):
The rcu_read_lock/unlock only can guarantee that the memcg will not be freed, but it cannot guarantee the success of css_get (which is in the refill_stock when cached memcg changed) to memcg.
rcu_read_lock() memcg = obj_cgroup_memcg(old) __memcg_kmem_uncharge(memcg) refill_stock(memcg) if (stock->cached != memcg) // css_get can change the ref counter from 0 back to 1. css_get(&memcg->css) rcu_read_unlock()
This fix is very like the commit:
eefbfa7fd678 ("mm: memcg/slab: fix use after free in obj_cgroup_charge")
Fix this by holding a reference to the memcg which is passed to the __memcg_kmem_uncharge() before calling __memcg_kmem_uncharge().
{ "affected": [], "aliases": [ "CVE-2021-47011" ], "database_specific": { "cwe_ids": [], "github_reviewed": false, "github_reviewed_at": null, "nvd_published_at": "2024-02-28T09:15:38Z", "severity": "MODERATE" }, "details": "In the Linux kernel, the following vulnerability has been resolved:\n\nmm: memcontrol: slab: fix obtain a reference to a freeing memcg\n\nPatch series \"Use obj_cgroup APIs to charge kmem pages\", v5.\n\nSince Roman\u0027s series \"The new cgroup slab memory controller\" applied.\nAll slab objects are charged with the new APIs of obj_cgroup. The new\nAPIs introduce a struct obj_cgroup to charge slab objects. It prevents\nlong-living objects from pinning the original memory cgroup in the\nmemory. But there are still some corner objects (e.g. allocations\nlarger than order-1 page on SLUB) which are not charged with the new\nAPIs. Those objects (include the pages which are allocated from buddy\nallocator directly) are charged as kmem pages which still hold a\nreference to the memory cgroup.\n\nE.g. We know that the kernel stack is charged as kmem pages because the\nsize of the kernel stack can be greater than 2 pages (e.g. 16KB on\nx86_64 or arm64). If we create a thread (suppose the thread stack is\ncharged to memory cgroup A) and then move it from memory cgroup A to\nmemory cgroup B. Because the kernel stack of the thread hold a\nreference to the memory cgroup A. The thread can pin the memory cgroup\nA in the memory even if we remove the cgroup A. If we want to see this\nscenario by using the following script. We can see that the system has\nadded 500 dying cgroups (This is not a real world issue, just a script\nto show that the large kmallocs are charged as kmem pages which can pin\nthe memory cgroup in the memory).\n\n\t#!/bin/bash\n\n\tcat /proc/cgroups | grep memory\n\n\tcd /sys/fs/cgroup/memory\n\techo 1 \u003e memory.move_charge_at_immigrate\n\n\tfor i in range{1..500}\n\tdo\n\t\tmkdir kmem_test\n\t\techo $$ \u003e kmem_test/cgroup.procs\n\t\tsleep 3600 \u0026\n\t\techo $$ \u003e cgroup.procs\n\t\techo `cat kmem_test/cgroup.procs` \u003e cgroup.procs\n\t\trmdir kmem_test\n\tdone\n\n\tcat /proc/cgroups | grep memory\n\nThis patchset aims to make those kmem pages to drop the reference to\nmemory cgroup by using the APIs of obj_cgroup. Finally, we can see that\nthe number of the dying cgroups will not increase if we run the above test\nscript.\n\nThis patch (of 7):\n\nThe rcu_read_lock/unlock only can guarantee that the memcg will not be\nfreed, but it cannot guarantee the success of css_get (which is in the\nrefill_stock when cached memcg changed) to memcg.\n\n rcu_read_lock()\n memcg = obj_cgroup_memcg(old)\n __memcg_kmem_uncharge(memcg)\n refill_stock(memcg)\n if (stock-\u003ecached != memcg)\n // css_get can change the ref counter from 0 back to 1.\n css_get(\u0026memcg-\u003ecss)\n rcu_read_unlock()\n\nThis fix is very like the commit:\n\n eefbfa7fd678 (\"mm: memcg/slab: fix use after free in obj_cgroup_charge\")\n\nFix this by holding a reference to the memcg which is passed to the\n__memcg_kmem_uncharge() before calling __memcg_kmem_uncharge().", "id": "GHSA-jvv7-jhh3-m96v", "modified": "2025-01-08T18:30:41Z", "published": "2024-02-28T09:30:37Z", "references": [ { "type": "ADVISORY", "url": "https://nvd.nist.gov/vuln/detail/CVE-2021-47011" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/31df8bc4d3feca9f9c6b2cd06fd64a111ae1a0e6" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/89b1ed358e01e1b0417f5d3b0082359a23355552" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/9f38f03ae8d5f57371b71aa6b4275765b65454fd" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/c3ae6a3f3ca4f02f6ccddf213c027302586580d0" } ], "schema_version": "1.4.0", "severity": [ { "score": "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "type": "CVSS_V3" } ] }
Sightings
Author | Source | Type | Date |
---|
Nomenclature
- Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
- Confirmed: The vulnerability is confirmed from an analyst perspective.
- Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
- Patched: This vulnerability was successfully patched by the user reporting the sighting.
- Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
- Not confirmed: The user expresses doubt about the veracity of the vulnerability.
- Not patched: This vulnerability was not successfully patched by the user reporting the sighting.