GSD-2021-47209
Vulnerability from GSD
Modified: 2024-04-11 05:05
Details
In the Linux kernel, the following vulnerability has been resolved:
sched/fair: Prevent dead task groups from regaining cfs_rq's
Kevin reported crashes pointing to a use-after-free of a cfs_rq in
update_blocked_averages(). Initial debugging revealed that we have
live cfs_rq's (on_list=1) in an about-to-be-kfree()'d task group in
free_fair_sched_group(). However, it was unclear how that can happen.

His kernel config happened to produce a layout of struct sched_entity
that put the 'my_q' member directly into the middle of the object,
making it incidentally overlap with SLUB's freelist pointer. That, in
combination with SLAB_FREELIST_HARDENED's freelist pointer mangling,
leads to a reliable access violation in the form of a #GP, which made
the UAF fail fast.
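To make the overlap mechanics concrete, here is a minimal userspace sketch, not kernel code: the struct layout, the offset, and the XOR secret are all invented for illustration. It mimics SLUB storing a SLAB_FREELIST_HARDENED-mangled freelist pointer inside a free object at the offset 'my_q' happens to occupy, so a stale read sees a wild pointer:

    /*
     * Userspace illustration only -- struct layout, offsets, and the
     * XOR constant are made up. SLUB stores its freelist pointer inside
     * free objects, and SLAB_FREELIST_HARDENED XORs it with a per-cache
     * secret, so a stale read through an overlapping member yields a
     * garbage pointer that faults immediately instead of silently working.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct sched_entity_like {
            uint64_t load;      /* leading members */
            void *my_q;         /* happens to overlap the freelist slot */
            uint64_t more[4];
    };

    int main(void)
    {
            struct sched_entity_like *se = malloc(sizeof(*se));
            uintptr_t secret = 0xdeadbeefcafef00dULL; /* "per-cache" random */
            uintptr_t next_free = (uintptr_t)se + 64; /* "next free object" */

            se->my_q = se;  /* valid while the object is alive */

            /* Simulate the allocator freeing the object and storing the
             * mangled freelist pointer at the offset 'my_q' occupies. */
            *(uintptr_t *)&se->my_q = next_free ^ secret;

            /* A use-after-free read of 'my_q' now sees a mangled value;
             * on x86-64 it is almost certainly non-canonical, so a
             * dereference raises #GP -- the "fail fast" behaviour above.
             * We only print it here, we do not dereference it. */
            printf("stale my_q = %p\n", se->my_q);

            free(se);
            return 0;
    }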
Michal seems to have run into the same issue[1]. He already correctly
diagnosed that commit a7b359fc6a37 ("sched/fair: Correctly insert
cfs_rq's to list on unthrottle") creates the preconditions for the UAF
by re-adding cfs_rq's even to task groups that have no more running
tasks, i.e. also to dead ones. His analysis, however, misses the real
root cause, which cannot be seen from the crash backtrace alone: the
real offender is tg_unthrottle_up() getting called from
sched_cfs_period_timer() via the timer interrupt at an inconvenient
time.
When unregister_fair_sched_group() unlinks all cfs_rq's from the dying
task group, it doesn't protect itself from being interrupted. If the
timer interrupt triggers while we iterate over all CPUs, or after
unregister_fair_sched_group() has finished but prior to unlinking the
task group, sched_cfs_period_timer() will execute and walk the list of
task groups, trying to unthrottle cfs_rq's, i.e. re-add them to the
dying task group. These will later -- in free_fair_sched_group() -- be
kfree()'d while still being linked, leading to the fireworks Kevin
and Michal are seeing.
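The buggy ordering can be replayed deterministically in userspace. The sketch below is hypothetical and single-threaded (all names are stand-ins, and the "timer" step is called directly instead of firing from an IRQ, so the race is serialized for clarity): teardown unlinks the cfs_rq, the timer re-adds it, and the group is freed while still on the list.

    #include <stdio.h>
    #include <stdlib.h>

    struct list_node { struct list_node *prev, *next; };

    struct fake_cfs_rq {
            struct list_node leaf;  /* link into the leaf cfs_rq list */
            int on_list;
    };

    static struct list_node leaf_list = { &leaf_list, &leaf_list };

    static void list_add(struct list_node *n, struct list_node *head)
    {
            n->next = head->next; n->prev = head;
            head->next->prev = n; head->next = n;
    }

    static void list_del(struct list_node *n)
    {
            n->prev->next = n->next; n->next->prev = n->prev;
    }

    /* stand-in for unregister_fair_sched_group(), one CPU only */
    static void unregister_group(struct fake_cfs_rq *rq)
    {
            if (rq->on_list) { list_del(&rq->leaf); rq->on_list = 0; }
    }

    /* stand-in for tg_unthrottle_up() running from the period timer */
    static void timer_unthrottle(struct fake_cfs_rq *rq)
    {
            if (!rq->on_list) { list_add(&rq->leaf, &leaf_list); rq->on_list = 1; }
    }

    int main(void)
    {
            struct fake_cfs_rq *rq = calloc(1, sizeof(*rq));

            timer_unthrottle(rq);   /* group alive, cfs_rq on the list */
            unregister_group(rq);   /* teardown unlinks it ...         */
            timer_unthrottle(rq);   /* ... but the "timer" re-adds it  */
            free(rq);               /* free_fair_sched_group(): freed
                                       while still linked              */

            /* Only print the dangling head pointer; actually walking the
             * list, as update_blocked_averages() does, would be the UAF. */
            printf("leaf_list.next = %p (dangling)\n", (void *)leaf_list.next);
            return 0;
    }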
To fix this race, ensure the dying task group gets unlinked first.
However, simply switching the order of unregistering and unlinking the
task group isn't sufficient, as concurrent RCU walkers might still see
it, as can be seen below:
CPU1:                                          CPU2:
  :                                            timer IRQ:
  :                                              do_sched_cfs_period_timer():
  :                                                :
  :                                                distribute_cfs_runtime():
  :                                                  rcu_read_lock();
  :                                                  :
  :                                                  unthrottle_cfs_rq():
sched_offline_group():                                 :
  :                                                    walk_tg_tree_from(…,tg_unthrottle_up,…):
  list_del_rcu(&tg->list);                               :
(1)  :                                                   list_for_each_entry_rcu(child, &parent->children, siblings)
  :                                                        :
(2)  list_del_rcu(&tg->siblings);                          :
  :                                                        tg_unthrottle_up():
  unregister_fair_sched_group():                             struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
    :                                                          :
    list_del_leaf_cfs_rq(tg->cfs_rq[cpu]);                     :
      :                                                        :
      :                                                        if (!cfs_rq_is_decayed(cfs_rq) || cfs_rq->nr_running)
      :                                                 (3)      list_add_leaf_cfs_rq(cfs_rq);
      :                                                        :
      :
---truncated---
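The description is truncated before the fix details; per the stable commits referenced below, the dying group must be unlinked first and its memory freed only after concurrent RCU walkers are done. What follows is a minimal kernel-style sketch of that unlink-then-defer pattern, not the actual patch: my_group, my_group_destroy, and my_group_free_rcu are hypothetical names.

    #include <linux/rculist.h>
    #include <linux/rcupdate.h>
    #include <linux/slab.h>

    struct my_group {
            struct list_head siblings;   /* linked into parent's children */
            struct rcu_head rcu;
    };

    static void my_group_free_rcu(struct rcu_head *head)
    {
            /* Runs only after a grace period: no RCU reader sees us now. */
            kfree(container_of(head, struct my_group, rcu));
    }

    static void my_group_destroy(struct my_group *tg)
    {
            list_del_rcu(&tg->siblings);            /* unlink first ...      */
            call_rcu(&tg->rcu, my_group_free_rcu);  /* ... defer the kfree() */
    }

The ordering matters: a walker that entered rcu_read_lock() before the unlink can still be traversing the group, so the kfree() must not happen until its grace period has elapsed.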
Aliases
- CVE-2021-47209

Affected versions (cve.org)
- Linux kernel: affected from 5.13 (introduced by commit a7b359fc6a37) up to but not including 5.15.5; fixed by commits 512e21c150c1 (5.15.5) and b027789e5e50 (5.16).

References
- https://git.kernel.org/stable/c/512e21c150c1c3ee298852660f3a796e267e62ec
- https://git.kernel.org/stable/c/b027789e5e50494c2325cc70c8642e7fd6059479

Record status
- GSD: exploit code unknown, remediation unknown, report confidence confirmed (GSD-2021-47209, modified 2024-04-11, schema 1.4.0)
- NVD: published 2024-04-10, last modified 2024-04-10, vulnerability status "Awaiting Analysis"
Sightings
Author | Source | Type | Date
---|---|---|---
Nomenclature
- Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
- Confirmed: The vulnerability is confirmed from an analyst perspective.
- Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
- Patched: This vulnerability was successfully patched by the user reporting the sighting.
- Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
- Not confirmed: The user expresses doubt about the veracity of the vulnerability.
- Not patched: This vulnerability was not successfully patched by the user reporting the sighting.