fkie_cve-2025-38104
Vulnerability from fkie_nvd
Published
2025-04-18 07:15
Modified
2025-07-17 17:15
Severity ?
Summary
In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: Replace Mutex with Spinlock for RLCG register access to avoid Priority Inversion in SRIOV
RLCG Register Access is a way for virtual functions to safely access GPU
registers in a virtualized environment., including TLB flushes and
register reads. When multiple threads or VFs try to access the same
registers simultaneously, it can lead to race conditions. By using the
RLCG interface, the driver can serialize access to the registers. This
means that only one thread can access the registers at a time,
preventing conflicts and ensuring that operations are performed
correctly. Additionally, when a low-priority task holds a mutex that a
high-priority task needs, ie., If a thread holding a spinlock tries to
acquire a mutex, it can lead to priority inversion. register access in
amdgpu_virt_rlcg_reg_rw especially in a fast code path is critical.
The call stack shows that the function amdgpu_virt_rlcg_reg_rw is being
called, which attempts to acquire the mutex. This function is invoked
from amdgpu_sriov_wreg, which in turn is called from
gmc_v11_0_flush_gpu_tlb.
The [ BUG: Invalid wait context ] indicates that a thread is trying to
acquire a mutex while it is in a context that does not allow it to sleep
(like holding a spinlock).
Fixes the below:
[ 253.013423] =============================
[ 253.013434] [ BUG: Invalid wait context ]
[ 253.013446] 6.12.0-amdstaging-drm-next-lol-050225 #14 Tainted: G U OE
[ 253.013464] -----------------------------
[ 253.013475] kworker/0:1/10 is trying to lock:
[ 253.013487] ffff9f30542e3cf8 (&adev->virt.rlcg_reg_lock){+.+.}-{3:3}, at: amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]
[ 253.013815] other info that might help us debug this:
[ 253.013827] context-{4:4}
[ 253.013835] 3 locks held by kworker/0:1/10:
[ 253.013847] #0: ffff9f3040050f58 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x3f5/0x680
[ 253.013877] #1: ffffb789c008be40 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_one_work+0x1d6/0x680
[ 253.013905] #2: ffff9f3054281838 (&adev->gmc.invalidate_lock){+.+.}-{2:2}, at: gmc_v11_0_flush_gpu_tlb+0x198/0x4f0 [amdgpu]
[ 253.014154] stack backtrace:
[ 253.014164] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Tainted: G U OE 6.12.0-amdstaging-drm-next-lol-050225 #14
[ 253.014189] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 253.014203] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/18/2024
[ 253.014224] Workqueue: events work_for_cpu_fn
[ 253.014241] Call Trace:
[ 253.014250] <TASK>
[ 253.014260] dump_stack_lvl+0x9b/0xf0
[ 253.014275] dump_stack+0x10/0x20
[ 253.014287] __lock_acquire+0xa47/0x2810
[ 253.014303] ? srso_alias_return_thunk+0x5/0xfbef5
[ 253.014321] lock_acquire+0xd1/0x300
[ 253.014333] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]
[ 253.014562] ? __lock_acquire+0xa6b/0x2810
[ 253.014578] __mutex_lock+0x85/0xe20
[ 253.014591] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]
[ 253.014782] ? sched_clock_noinstr+0x9/0x10
[ 253.014795] ? srso_alias_return_thunk+0x5/0xfbef5
[ 253.014808] ? local_clock_noinstr+0xe/0xc0
[ 253.014822] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]
[ 253.015012] ? srso_alias_return_thunk+0x5/0xfbef5
[ 253.015029] mutex_lock_nested+0x1b/0x30
[ 253.015044] ? mutex_lock_nested+0x1b/0x30
[ 253.015057] amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]
[ 253.015249] amdgpu_sriov_wreg+0xc5/0xd0 [amdgpu]
[ 253.015435] gmc_v11_0_flush_gpu_tlb+0x44b/0x4f0 [amdgpu]
[ 253.015667] gfx_v11_0_hw_init+0x499/0x29c0 [amdgpu]
[ 253.015901] ? __pfx_smu_v13_0_update_pcie_parameters+0x10/0x10 [amdgpu]
[ 253.016159] ? srso_alias_return_thunk+0x5/0xfbef5
[ 253.016173] ? smu_hw_init+0x18d/0x300 [amdgpu]
[ 253.016403] amdgpu_device_init+0x29ad/0x36a0 [amdgpu]
[ 253.016614] amdgpu_driver_load_kms+0x1a/0xc0 [amdgpu]
[ 253.0170
---truncated---
References
Impacted products
Vendor | Product | Version |
---|
{ "cveTags": [], "descriptions": [ { "lang": "en", "value": "In the Linux kernel, the following vulnerability has been resolved:\n\ndrm/amdgpu: Replace Mutex with Spinlock for RLCG register access to avoid Priority Inversion in SRIOV\n\nRLCG Register Access is a way for virtual functions to safely access GPU\nregisters in a virtualized environment., including TLB flushes and\nregister reads. When multiple threads or VFs try to access the same\nregisters simultaneously, it can lead to race conditions. By using the\nRLCG interface, the driver can serialize access to the registers. This\nmeans that only one thread can access the registers at a time,\npreventing conflicts and ensuring that operations are performed\ncorrectly. Additionally, when a low-priority task holds a mutex that a\nhigh-priority task needs, ie., If a thread holding a spinlock tries to\nacquire a mutex, it can lead to priority inversion. register access in\namdgpu_virt_rlcg_reg_rw especially in a fast code path is critical.\n\nThe call stack shows that the function amdgpu_virt_rlcg_reg_rw is being\ncalled, which attempts to acquire the mutex. This function is invoked\nfrom amdgpu_sriov_wreg, which in turn is called from\ngmc_v11_0_flush_gpu_tlb.\n\nThe [ BUG: Invalid wait context ] indicates that a thread is trying to\nacquire a mutex while it is in a context that does not allow it to sleep\n(like holding a spinlock).\n\nFixes the below:\n\n[ 253.013423] =============================\n[ 253.013434] [ BUG: Invalid wait context ]\n[ 253.013446] 6.12.0-amdstaging-drm-next-lol-050225 #14 Tainted: G U OE\n[ 253.013464] -----------------------------\n[ 253.013475] kworker/0:1/10 is trying to lock:\n[ 253.013487] ffff9f30542e3cf8 (\u0026adev-\u003evirt.rlcg_reg_lock){+.+.}-{3:3}, at: amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]\n[ 253.013815] other info that might help us debug this:\n[ 253.013827] context-{4:4}\n[ 253.013835] 3 locks held by kworker/0:1/10:\n[ 253.013847] #0: ffff9f3040050f58 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x3f5/0x680\n[ 253.013877] #1: ffffb789c008be40 ((work_completion)(\u0026wfc.work)){+.+.}-{0:0}, at: process_one_work+0x1d6/0x680\n[ 253.013905] #2: ffff9f3054281838 (\u0026adev-\u003egmc.invalidate_lock){+.+.}-{2:2}, at: gmc_v11_0_flush_gpu_tlb+0x198/0x4f0 [amdgpu]\n[ 253.014154] stack backtrace:\n[ 253.014164] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Tainted: G U OE 6.12.0-amdstaging-drm-next-lol-050225 #14\n[ 253.014189] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE\n[ 253.014203] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/18/2024\n[ 253.014224] Workqueue: events work_for_cpu_fn\n[ 253.014241] Call Trace:\n[ 253.014250] \u003cTASK\u003e\n[ 253.014260] dump_stack_lvl+0x9b/0xf0\n[ 253.014275] dump_stack+0x10/0x20\n[ 253.014287] __lock_acquire+0xa47/0x2810\n[ 253.014303] ? srso_alias_return_thunk+0x5/0xfbef5\n[ 253.014321] lock_acquire+0xd1/0x300\n[ 253.014333] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]\n[ 253.014562] ? __lock_acquire+0xa6b/0x2810\n[ 253.014578] __mutex_lock+0x85/0xe20\n[ 253.014591] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]\n[ 253.014782] ? sched_clock_noinstr+0x9/0x10\n[ 253.014795] ? srso_alias_return_thunk+0x5/0xfbef5\n[ 253.014808] ? local_clock_noinstr+0xe/0xc0\n[ 253.014822] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]\n[ 253.015012] ? srso_alias_return_thunk+0x5/0xfbef5\n[ 253.015029] mutex_lock_nested+0x1b/0x30\n[ 253.015044] ? mutex_lock_nested+0x1b/0x30\n[ 253.015057] amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]\n[ 253.015249] amdgpu_sriov_wreg+0xc5/0xd0 [amdgpu]\n[ 253.015435] gmc_v11_0_flush_gpu_tlb+0x44b/0x4f0 [amdgpu]\n[ 253.015667] gfx_v11_0_hw_init+0x499/0x29c0 [amdgpu]\n[ 253.015901] ? __pfx_smu_v13_0_update_pcie_parameters+0x10/0x10 [amdgpu]\n[ 253.016159] ? srso_alias_return_thunk+0x5/0xfbef5\n[ 253.016173] ? smu_hw_init+0x18d/0x300 [amdgpu]\n[ 253.016403] amdgpu_device_init+0x29ad/0x36a0 [amdgpu]\n[ 253.016614] amdgpu_driver_load_kms+0x1a/0xc0 [amdgpu]\n[ 253.0170\n---truncated---" }, { "lang": "es", "value": "En el kernel de Linux, se ha resuelto la siguiente vulnerabilidad: drm/amdgpu: Reemplazar Mutex con Spinlock para el acceso a registros RLCG para evitar la inversi\u00f3n de prioridad en SRIOV. El acceso a registros RLCG es una forma de que las funciones virtuales accedan de forma segura a los registros de la GPU en un entorno virtualizado, incluyendo vaciados de TLB y lecturas de registros. Cuando varios subprocesos o VF intentan acceder a los mismos registros simult\u00e1neamente, puede provocar condiciones de ejecuci\u00f3n. Al usar la interfaz RLCG, el controlador puede serializar el acceso a los registros. Esto significa que solo un subproceso puede acceder a los registros a la vez, lo que evita conflictos y garantiza que las operaciones se realicen correctamente. Adem\u00e1s, cuando una tarea de baja prioridad contiene un mutex que necesita una tarea de alta prioridad (es decir, si un subproceso con un spinlock intenta adquirir un mutex), puede provocar una inversi\u00f3n de prioridad. El acceso a registros en amdgpu_virt_rlcg_reg_rw, especialmente en una ruta de c\u00f3digo r\u00e1pida, es cr\u00edtico. La pila de llamadas muestra que se est\u00e1 llamando a la funci\u00f3n amdgpu_virt_rlcg_reg_rw, que intenta adquirir el mutex. Esta funci\u00f3n se invoca desde amdgpu_sriov_wreg, que a su vez se llama desde gmc_v11_0_flush_gpu_tlb. El error [ERROR: Contexto de espera no v\u00e1lido] indica que un subproceso intenta adquirir un mutex mientras se encuentra en un contexto que no le permite dormir (como mantener un spinlock). Corrige lo siguiente: [253.013423] ============================== [253.013434] [ERROR: Contexto de espera no v\u00e1lido] [253.013446] 6.12.0-amdstaging-drm-next-lol-050225 #14 Contaminado: G U OE [ 253.013464] ----------------------------- [ 253.013475] kworker/0:1/10 is trying to lock: [ 253.013487] ffff9f30542e3cf8 (\u0026amp;adev-\u0026gt;virt.rlcg_reg_lock){+.+.}-{3:3}, at: amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.013815] other info that might help us debug this: [ 253.013827] context-{4:4} [ 253.013835] 3 locks held by kworker/0:1/10: [ 253.013847] #0: ffff9f3040050f58 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x3f5/0x680 [ 253.013877] #1: ffffb789c008be40 ((work_completion)(\u0026amp;wfc.work)){+.+.}-{0:0}, at: process_one_work+0x1d6/0x680 [ 253.013905] #2: ffff9f3054281838 (\u0026amp;adev-\u0026gt;gmc.invalidate_lock){+.+.}-{2:2}, at: gmc_v11_0_flush_gpu_tlb+0x198/0x4f0 [amdgpu] [ 253.014154] stack backtrace: [ 253.014164] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Tainted: G U OE 6.12.0-amdstaging-drm-next-lol-050225 #14 [ 253.014189] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 253.014203] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/18/2024 [ 253.014224] Workqueue: events work_for_cpu_fn [ 253.014241] Call Trace: [ 253.014250] [ 253.014260] dump_stack_lvl+0x9b/0xf0 [ 253.014275] dump_stack+0x10/0x20 [ 253.014287] __lock_acquire+0xa47/0x2810 [ 253.014303] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.014321] lock_acquire+0xd1/0x300 [ 253.014333] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.014562] ? __lock_acquire+0xa6b/0x2810 [ 253.014578] __mutex_lock+0x85/0xe20 [ 253.014591] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.014782] ? sched_clock_noinstr+0x9/0x10 [ 253.014795] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.014808] ? local_clock_noinstr+0xe/0xc0 [ 253.014822] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.015012] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.015029] mutex_lock_nested+0x1b/0x30 [ 253.015044] ? mutex_lock_nested+0x1b/0x30 [ 253.015057] amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.015249] amdgpu_sriov_wreg+0xc5/0xd0 [amdgpu] [ 253.015435] gmc_v11_0_flush_gpu_tlb+0x44b/0x4f0 [amdgpu] [ 253.015667] gfx_v11_0_hw_init+0x499/0x29c0 [amdgpu] [ 253.015901] ? __pfx_smu_v13_0_update_pcie_parameters+0x10/0x10 [amdgpu] [ 253.016159] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.016173] ? smu_hw_init+0x18d/0x300 [amdgpu] [ 253.016403] amdgpu_device_init+0x29ad/0x36a0 [amdgpu] [ 253.016614] amdgpu_driver_load_kms+0x1a/0xc0 [amdgpu] ---truncado---" } ], "id": "CVE-2025-38104", "lastModified": "2025-07-17T17:15:36.830", "metrics": {}, "published": "2025-04-18T07:15:43.290", "references": [ { "source": "416baaa9-dc9f-4396-8d5f-8c081fb06d67", "url": "https://git.kernel.org/stable/c/07ed75bfa7ede8bfcfa303fd6efc85db1c8684c7" }, { "source": "416baaa9-dc9f-4396-8d5f-8c081fb06d67", "url": "https://git.kernel.org/stable/c/1c0378830e42c98acd69e0289882c8637d92f285" }, { "source": "416baaa9-dc9f-4396-8d5f-8c081fb06d67", "url": "https://git.kernel.org/stable/c/5c1741a0c176ae11675a64cb7f2dd21d72db6b91" }, { "source": "416baaa9-dc9f-4396-8d5f-8c081fb06d67", "url": "https://git.kernel.org/stable/c/dc0297f3198bd60108ccbd167ee5d9fa4af31ed0" } ], "sourceIdentifier": "416baaa9-dc9f-4396-8d5f-8c081fb06d67", "vulnStatus": "Awaiting Analysis" }
Loading…
Loading…
Sightings
Author | Source | Type | Date |
---|
Nomenclature
- Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
- Confirmed: The vulnerability is confirmed from an analyst perspective.
- Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
- Patched: This vulnerability was successfully patched by the user reporting the sighting.
- Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
- Not confirmed: The user expresses doubt about the veracity of the vulnerability.
- Not patched: This vulnerability was not successfully patched by the user reporting the sighting.
Loading…
Loading…