GHSA-rqc3-hghm-q589
Vulnerability from GitHub
In the Linux kernel, the following vulnerability has been resolved:
rcu/nocb: Fix missed RCU barrier on deoffloading
Currently, running the rcutorture test with torture_type=rcu fwd_progress=8 n_barrier_cbs=8 nocbs_nthreads=8 nocbs_toggle=100 onoff_interval=60 test_boost=2 triggers the following warning:
WARNING: CPU: 19 PID: 100 at kernel/rcu/tree_nocb.h:1061 rcu_nocb_rdp_deoffload+0x292/0x2a0
RIP: 0010:rcu_nocb_rdp_deoffload+0x292/0x2a0
Call Trace:
<TASK>
? __warn+0x7e/0x120
? rcu_nocb_rdp_deoffload+0x292/0x2a0
? report_bug+0x18e/0x1a0
? handle_bug+0x3d/0x70
? exc_invalid_op+0x18/0x70
? asm_exc_invalid_op+0x1a/0x20
? rcu_nocb_rdp_deoffload+0x292/0x2a0
rcu_nocb_cpu_deoffload+0x70/0xa0
rcu_nocb_toggle+0x136/0x1c0
? __pfx_rcu_nocb_toggle+0x10/0x10
kthread+0xd1/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2f/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
CPU0                            CPU2                            CPU3
//rcu_nocb_toggle               //nocb_cb_wait                  //rcutorture

// deoffload CPU1               // process CPU1's rdp
rcu_barrier()
    rcu_segcblist_entrain()
        rcu_segcblist_add_len(1);
        // len == 2
    // enqueue barrier
    // callback to CPU1's
    // rdp->cblist
                                rcu_do_batch()
                                    // invoke CPU1's rdp->cblist
                                    // callback
                                    rcu_barrier_callback()
                                                                rcu_barrier()
                                                                    mutex_lock(&rcu_state.barrier_mutex);
                                                                    // still sees len == 2
                                                                    // enqueue barrier callback
                                                                    // to CPU1's rdp->cblist
                                                                    rcu_segcblist_entrain()
                                                                        rcu_segcblist_add_len(1);
                                                                        // len == 3
                                    // decrement len
                                    rcu_segcblist_add_len(-2);
                                kthread_parkme()

// CPU1's rdp->cblist len == 1
// Warn because there is
// still a pending barrier
// trigger warning
WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist));
cpus_read_unlock();

                                                                    // wait for CPU1 to come online and
                                                                    // invoke barrier callback on
                                                                    // CPU1's rdp->cblist
                                                                    wait_for_completion(&rcu_state.barrier_completion);
// deoffload CPU4
cpus_read_lock()
    rcu_barrier()
        mutex_lock(&rcu_state.barrier_mutex);
        // block on barrier_mutex
        // wait for rcu_barrier() on
        // CPU3 to unlock barrier_mutex
        // but for CPU3 to unlock barrier_mutex,
        // CPU1 needs to come online
        // CPU1 going online will block on cpus_write_lock
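The counter arithmetic in the diagram can be replayed in a few lines of userspace C. This is only a sketch of the interleaving, not kernel code: `struct toy_cblist` and the `toy_*` helpers are hypothetical names invented here, and the starting value of 1 is inferred from the diagram's "len == 2" comment after the first increment.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the length counter of CPU1's rdp->cblist (hypothetical type). */
struct toy_cblist {
    long len;
};

/* Plays the role of rcu_segcblist_add_len() in the diagram: adjust the counter. */
static void toy_add_len(struct toy_cblist *cl, long delta)
{
    cl->len += delta;
    assert(cl->len >= 0); /* a callback count can never go negative */
}

/*
 * Replay the interleaving from the diagram in the order the events occur and
 * return the count that the deoffloading side sees at WARN_ON_ONCE().
 */
static long toy_replay_race(void)
{
    struct toy_cblist cl = { .len = 1 }; /* assumed: one callback already queued */

    toy_add_len(&cl, 1);  /* CPU0: rcu_barrier() entrains a callback, len == 2 */
    /* CPU2: rcu_do_batch() invokes both callbacks but has not decremented yet */
    toy_add_len(&cl, 1);  /* CPU3: second rcu_barrier() entrains, len == 3 */
    toy_add_len(&cl, -2); /* CPU2: accounts for the two callbacks it invoked */
    /* CPU2 parks here with one barrier callback still queued */
    return cl.len;
}

/* True when the modelled WARN_ON_ONCE(rcu_segcblist_n_cbs(...)) would fire. */
static bool toy_warn_fires(void)
{
    return toy_replay_race() != 0;
}
```

The replay ends with a count of 1, which is the leftover barrier callback from CPU3 that the warning reports.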
The above scenario will not only trigger a WARN_ON_ONCE(), but also trigger a deadlock.
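The deadlock is a circular wait between three parties. A minimal sketch of that dependency cycle follows, modelled as a wait-for graph in userspace C; the enum labels and `toy_*` functions are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/* Parties in the scenario (labels chosen for this sketch). */
enum toy_party {
    TOY_DEOFFLOAD_TASK, /* CPU0: holds cpus_read_lock, wants barrier_mutex */
    TOY_BARRIER_TASK,   /* CPU3: holds barrier_mutex, waits for barrier_completion */
    TOY_CPU1_ONLINE,    /* CPU1 hotplug: needs cpus_write_lock to come online */
    TOY_NR_PARTIES
};

/* waits_on[p] is the party that p is blocked on, or -1 if p can make progress. */
static bool toy_has_deadlock(const int waits_on[TOY_NR_PARTIES])
{
    for (int start = 0; start < TOY_NR_PARTIES; start++) {
        int cur = start;

        /* Follow at most TOY_NR_PARTIES edges; returning to start is a cycle. */
        for (int step = 0; step < TOY_NR_PARTIES && cur >= 0; step++) {
            cur = waits_on[cur];
            if (cur == start)
                return true;
        }
    }
    return false;
}

/* The situation described above: every party waits on the next one. */
static bool toy_scenario_deadlocks(void)
{
    const int waits_on[TOY_NR_PARTIES] = {
        [TOY_DEOFFLOAD_TASK] = TOY_BARRIER_TASK,   /* blocked on barrier_mutex */
        [TOY_BARRIER_TASK]   = TOY_CPU1_ONLINE,    /* callback needs CPU1 online */
        [TOY_CPU1_ONLINE]    = TOY_DEOFFLOAD_TASK, /* write lock vs. held read lock */
    };
    return toy_has_deadlock(waits_on);
}

/* Without a stray barrier callback, the barrier task waits on nobody. */
static bool toy_scenario_without_stray_callback_deadlocks(void)
{
    const int waits_on[TOY_NR_PARTIES] = {
        [TOY_DEOFFLOAD_TASK] = TOY_BARRIER_TASK,
        [TOY_BARRIER_TASK]   = -1,
        [TOY_CPU1_ONLINE]    = TOY_DEOFFLOAD_TASK,
    };
    return toy_has_deadlock(waits_on);
}
```

Removing any one edge breaks the cycle; the second scenario removes the edge created by the barrier callback stranded on the offline CPU.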
Thanks to nocb locking, a second racing rcu_barrier() on an offline CPU will either observe the callback counter already decremented to 0 and skip the callback enqueue, or the rcuo kthread will observe the new callback and keep rdp->nocb_cb_sleep set to false.
Therefore, check rdp->nocb_cb_sleep before parking to make sure that no further rcu_barrier() is waiting on the rdp.
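The shape of that check can be sketched in userspace C. This is a toy model under stated assumptions, not the upstream patch: `struct toy_rdp`, `toy_finish_batch()` and `toy_may_park()` are hypothetical, and the nocb locking that makes the flag reliable in the kernel is not modelled.

```c
#include <stdbool.h>

/* Toy stand-in for the two pieces of rdp state the text mentions (hypothetical). */
struct toy_rdp {
    long n_cbs;         /* callbacks still queued on the cblist */
    bool nocb_cb_sleep; /* true when the callback kthread found nothing left to do */
};

/*
 * Callback kthread side: account for the callbacks just invoked and recompute
 * the sleep flag. In the kernel this would happen under the nocb lock.
 */
static void toy_finish_batch(struct toy_rdp *rdp, long invoked)
{
    rdp->n_cbs -= invoked;
    rdp->nocb_cb_sleep = (rdp->n_cbs == 0);
}

/*
 * Park only when parking was requested AND the kthread would otherwise sleep,
 * i.e. no later rcu_barrier() managed to queue another callback.
 */
static bool toy_may_park(const struct toy_rdp *rdp, bool should_park)
{
    return should_park && rdp->nocb_cb_sleep;
}
```

With three callbacks queued and only two invoked, `toy_may_park()` refuses to park, so the kthread goes around again and invokes the late barrier callback before it stops.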
{
  "affected": [],
  "aliases": [
    "CVE-2024-56547"
  ],
  "database_specific": {
    "cwe_ids": [],
    "github_reviewed": false,
    "github_reviewed_at": null,
    "nvd_published_at": "2024-12-27T14:15:34Z",
    "severity": null
  },
  "details": "In the Linux kernel, the following vulnerability has been resolved:\n\nrcu/nocb: Fix missed RCU barrier on deoffloading\n\nCurrently, running rcutorture test with torture_type=rcu fwd_progress=8\nn_barrier_cbs=8 nocbs_nthreads=8 nocbs_toggle=100 onoff_interval=60\ntest_boost=2, will trigger the following warning:\n\n\tWARNING: CPU: 19 PID: 100 at kernel/rcu/tree_nocb.h:1061 rcu_nocb_rdp_deoffload+0x292/0x2a0\n\tRIP: 0010:rcu_nocb_rdp_deoffload+0x292/0x2a0\n\t Call Trace:\n\t \u003cTASK\u003e\n\t ? __warn+0x7e/0x120\n\t ? rcu_nocb_rdp_deoffload+0x292/0x2a0\n\t ? report_bug+0x18e/0x1a0\n\t ? handle_bug+0x3d/0x70\n\t ? exc_invalid_op+0x18/0x70\n\t ? asm_exc_invalid_op+0x1a/0x20\n\t ? rcu_nocb_rdp_deoffload+0x292/0x2a0\n\t rcu_nocb_cpu_deoffload+0x70/0xa0\n\t rcu_nocb_toggle+0x136/0x1c0\n\t ? __pfx_rcu_nocb_toggle+0x10/0x10\n\t kthread+0xd1/0x100\n\t ? __pfx_kthread+0x10/0x10\n\t ret_from_fork+0x2f/0x50\n\t ? __pfx_kthread+0x10/0x10\n\t ret_from_fork_asm+0x1a/0x30\n\t \u003c/TASK\u003e\n\nCPU0 CPU2 CPU3\n//rcu_nocb_toggle //nocb_cb_wait //rcutorture\n\n// deoffload CPU1 // process CPU1\u0027s rdp\nrcu_barrier()\n rcu_segcblist_entrain()\n rcu_segcblist_add_len(1);\n // len == 2\n // enqueue barrier\n // callback to CPU1\u0027s\n // rdp-\u003ecblist\n rcu_do_batch()\n // invoke CPU1\u0027s rdp-\u003ecblist\n // callback\n rcu_barrier_callback()\n rcu_barrier()\n mutex_lock(\u0026rcu_state.barrier_mutex);\n // still see len == 2\n // enqueue barrier callback\n // to CPU1\u0027s rdp-\u003ecblist\n rcu_segcblist_entrain()\n rcu_segcblist_add_len(1);\n // len == 3\n // decrement len\n rcu_segcblist_add_len(-2);\n kthread_parkme()\n\n// CPU1\u0027s rdp-\u003ecblist len == 1\n// Warn because there is\n// still a pending barrier\n// trigger warning\nWARN_ON_ONCE(rcu_segcblist_n_cbs(\u0026rdp-\u003ecblist));\ncpus_read_unlock();\n\n // wait CPU1 to comes online and\n // invoke barrier callback on\n // CPU1 rdp\u0027s-\u003ecblist\n wait_for_completion(\u0026rcu_state.barrier_completion);\n// deoffload CPU4\ncpus_read_lock()\n rcu_barrier()\n mutex_lock(\u0026rcu_state.barrier_mutex);\n // block on barrier_mutex\n // wait rcu_barrier() on\n // CPU3 to unlock barrier_mutex\n // but CPU3 unlock barrier_mutex\n // need to wait CPU1 comes online\n // when CPU1 going online will block on cpus_write_lock\n\nThe above scenario will not only trigger a WARN_ON_ONCE(), but also\ntrigger a deadlock.\n\nThanks to nocb locking, a second racing rcu_barrier() on an offline CPU\nwill either observe the decremented callback counter down to 0 and spare\nthe callback enqueue, or rcuo will observe the new callback and keep\nrdp-\u003enocb_cb_sleep to false.\n\nTherefore check rdp-\u003enocb_cb_sleep before parking to make sure no\nfurther rcu_barrier() is waiting on the rdp.",
  "id": "GHSA-rqc3-hghm-q589",
  "modified": "2024-12-27T15:31:53Z",
  "published": "2024-12-27T15:31:53Z",
  "references": [
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2024-56547"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/224b62028959858294789772d372dcb36cf5f820"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/2996980e20b7a54a1869df15b3445374b850b155"
    }
  ],
  "schema_version": "1.4.0",
  "severity": []
}