CVE-2025-21892 (GCVE-0-2025-21892)
Vulnerability from cvelistv5
Published: 2025-03-27 14:57
Modified: 2025-05-04 13:06
Summary
In the Linux kernel, the following vulnerability has been resolved:

RDMA/mlx5: Fix the recovery flow of the UMR QP

This patch addresses an issue in the recovery flow of the UMR QP, ensuring tasks do not get stuck, as highlighted by the call trace [1].

During recovery, before transitioning the QP to the RESET state, the software must wait for all outstanding WRs to complete.

Failing to do so can cause the firmware to skip sending some flushed CQEs with errors and simply discard them upon the RESET, as per the IB specification.

This race condition can result in lost CQEs and tasks becoming stuck.

To resolve this, the patch sends a final WR which serves only as a barrier before moving the QP state to RESET.

Once a CQE is received for that final WR, it guarantees that no outstanding WRs remain, making it safe to transition the QP to RESET and subsequently back to RTS, restoring proper functionality.

Note: For the barrier WR, we simply reuse the failed and ready WR. Since the QP is in an error state, it will only receive IB_WC_WR_FLUSH_ERR. However, as it serves only as a barrier we don't care about its status.

[1]
INFO: task rdma_resource_l:1922 blocked for more than 120 seconds.
Tainted: G W 6.12.0-rc7+ #1626
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:rdma_resource_l state:D stack:0 pid:1922 tgid:1922 ppid:1369 flags:0x00004004
Call Trace:
<TASK>
__schedule+0x420/0xd30
schedule+0x47/0x130
schedule_timeout+0x280/0x300
? mark_held_locks+0x48/0x80
? lockdep_hardirqs_on_prepare+0xe5/0x1a0
wait_for_completion+0x75/0x130
mlx5r_umr_post_send_wait+0x3c2/0x5b0 [mlx5_ib]
? __pfx_mlx5r_umr_done+0x10/0x10 [mlx5_ib]
mlx5r_umr_revoke_mr+0x93/0xc0 [mlx5_ib]
__mlx5_ib_dereg_mr+0x299/0x520 [mlx5_ib]
? _raw_spin_unlock_irq+0x24/0x40
? wait_for_completion+0xfe/0x130
? rdma_restrack_put+0x63/0xe0 [ib_core]
ib_dereg_mr_user+0x5f/0x120 [ib_core]
? lock_release+0xc6/0x280
destroy_hw_idr_uobject+0x1d/0x60 [ib_uverbs]
uverbs_destroy_uobject+0x58/0x1d0 [ib_uverbs]
uobj_destroy+0x3f/0x70 [ib_uverbs]
ib_uverbs_cmd_verbs+0x3e4/0xbb0 [ib_uverbs]
? __pfx_uverbs_destroy_def_handler+0x10/0x10 [ib_uverbs]
? __lock_acquire+0x64e/0x2080
? mark_held_locks+0x48/0x80
? find_held_lock+0x2d/0xa0
? lock_acquire+0xc1/0x2f0
? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs]
? __fget_files+0xc3/0x1b0
ib_uverbs_ioctl+0xe7/0x170 [ib_uverbs]
? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs]
__x64_sys_ioctl+0x1b0/0xa70
do_syscall_64+0x6b/0x140
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f99c918b17b
RSP: 002b:00007ffc766d0468 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffc766d0578 RCX: 00007f99c918b17b
RDX: 00007ffc766d0560 RSI: 00000000c0181b01 RDI: 0000000000000003
RBP: 00007ffc766d0540 R08: 00007f99c8f99010 R09: 000000000000bd7e
R10: 00007f99c94c1c70 R11: 0000000000000246 R12: 00007ffc766d0530
R13: 000000000000001c R14: 0000000040246a80 R15: 0000000000000000
</TASK>
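The ordering argument in the summary can be illustrated with a small toy model. This is only a sketch and not the mlx5 driver code: `ToyQP`, `flush_one`, `recover_without_barrier`, and `recover_with_barrier` are hypothetical names, and the model assumes that flushed completions are delivered in posting order, which is what lets the last WR act as a barrier.

```python
from collections import deque


class ToyQP:
    """Toy queue pair. Models only the ordering rule, not real hardware."""

    def __init__(self):
        self.state = "RTS"
        self.pending = deque()  # WRs posted but not yet completed
        self.cq = []            # completions delivered to software

    def post(self, wr_id):
        self.pending.append(wr_id)

    def move_to_error(self):
        self.state = "ERR"

    def flush_one(self):
        """Deliver one flushed completion, oldest WR first."""
        if self.pending:
            self.cq.append((self.pending.popleft(), "IB_WC_WR_FLUSH_ERR"))

    def reset(self):
        """RESET discards every WR that has not completed yet."""
        lost = list(self.pending)
        self.pending.clear()
        self.state = "RESET"
        return lost


def recover_without_barrier(qp):
    """Racy flow: reset immediately, so pending completions are lost."""
    lost = qp.reset()
    qp.state = "RTS"
    return lost


def recover_with_barrier(qp, barrier_id="barrier"):
    """Barrier flow: post a final WR and wait for its completion first."""
    qp.post(barrier_id)
    # The barrier completes last, so seeing it means nothing is outstanding.
    # Its status is ignored; in the error state it is a flush error anyway.
    while not any(wr == barrier_id for wr, _ in qp.cq):
        qp.flush_one()
    lost = qp.reset()
    qp.state = "RTS"
    return lost
```

In this model a waiter for a lost WR would block forever, which corresponds to the hung task in the call trace; with the barrier, `recover_with_barrier` returns an empty list because every earlier WR has already produced a completion.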
Impacted products
Vendor  Product  Version
Linux   Linux    158e71bb69e368b8b33e8b7c4ac8c111da0c1ae2
Linux   Linux    158e71bb69e368b8b33e8b7c4ac8c111da0c1ae2
Linux   Linux    158e71bb69e368b8b33e8b7c4ac8c111da0c1ae2
Linux   Linux    d8f7bff9a42627d37f4ecffeb01e44db42167175


{
  "containers": {
    "cna": {
      "affected": [
        {
          "defaultStatus": "unaffected",
          "product": "Linux",
          "programFiles": [
            "drivers/infiniband/hw/mlx5/umr.c"
          ],
          "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git",
          "vendor": "Linux",
          "versions": [
            {
              "lessThan": "3e3bf255992cc02404e9d209b127c1c9944239cf",
              "status": "affected",
              "version": "158e71bb69e368b8b33e8b7c4ac8c111da0c1ae2",
              "versionType": "git"
            },
            {
              "lessThan": "1d2b84d8d054313deed2b2fcafe1168bbcb9e99f",
              "status": "affected",
              "version": "158e71bb69e368b8b33e8b7c4ac8c111da0c1ae2",
              "versionType": "git"
            },
            {
              "lessThan": "d97505baea64d93538b16baf14ce7b8c1fbad746",
              "status": "affected",
              "version": "158e71bb69e368b8b33e8b7c4ac8c111da0c1ae2",
              "versionType": "git"
            },
            {
              "status": "affected",
              "version": "d8f7bff9a42627d37f4ecffeb01e44db42167175",
              "versionType": "git"
            }
          ]
        },
        {
          "defaultStatus": "affected",
          "product": "Linux",
          "programFiles": [
            "drivers/infiniband/hw/mlx5/umr.c"
          ],
          "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git",
          "vendor": "Linux",
          "versions": [
            {
              "status": "affected",
              "version": "6.0"
            },
            {
              "lessThan": "6.0",
              "status": "unaffected",
              "version": "0",
              "versionType": "semver"
            },
            {
              "lessThanOrEqual": "6.12.*",
              "status": "unaffected",
              "version": "6.12.18",
              "versionType": "semver"
            },
            {
              "lessThanOrEqual": "6.13.*",
              "status": "unaffected",
              "version": "6.13.6",
              "versionType": "semver"
            },
            {
              "lessThanOrEqual": "*",
              "status": "unaffected",
              "version": "6.14",
              "versionType": "original_commit_for_fix"
            }
          ]
        }
      ],
      "cpeApplicability": [
        {
          "nodes": [
            {
              "cpeMatch": [
                {
                  "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
                  "versionEndExcluding": "6.12.18",
                  "versionStartIncluding": "6.0",
                  "vulnerable": true
                },
                {
                  "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
                  "versionEndExcluding": "6.13.6",
                  "versionStartIncluding": "6.0",
                  "vulnerable": true
                },
                {
                  "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
                  "versionEndExcluding": "6.14",
                  "versionStartIncluding": "6.0",
                  "vulnerable": true
                },
                {
                  "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
                  "versionStartIncluding": "5.19.10",
                  "vulnerable": true
                }
              ],
              "negate": false,
              "operator": "OR"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "In the Linux kernel, the following vulnerability has been resolved:\n\nRDMA/mlx5: Fix the recovery flow of the UMR QP\n\nThis patch addresses an issue in the recovery flow of the UMR QP,\nensuring tasks do not get stuck, as highlighted by the call trace [1].\n\nDuring recovery, before transitioning the QP to the RESET state, the\nsoftware must wait for all outstanding WRs to complete.\n\nFailing to do so can cause the firmware to skip sending some flushed\nCQEs with errors and simply discard them upon the RESET, as per the IB\nspecification.\n\nThis race condition can result in lost CQEs and tasks becoming stuck.\n\nTo resolve this, the patch sends a final WR which serves only as a\nbarrier before moving the QP state to RESET.\n\nOnce a CQE is received for that final WR, it guarantees that no\noutstanding WRs remain, making it safe to transition the QP to RESET and\nsubsequently back to RTS, restoring proper functionality.\n\nNote:\nFor the barrier WR, we simply reuse the failed and ready WR.\nSince the QP is in an error state, it will only receive\nIB_WC_WR_FLUSH_ERR. However, as it serves only as a barrier we don\u0027t\ncare about its status.\n\n[1]\nINFO: task rdma_resource_l:1922 blocked for more than 120 seconds.\nTainted: G        W          6.12.0-rc7+ #1626\n\"echo 0 \u003e /proc/sys/kernel/hung_task_timeout_secs\" disables this message.\ntask:rdma_resource_l state:D stack:0  pid:1922 tgid:1922  ppid:1369\n     flags:0x00004004\nCall Trace:\n\u003cTASK\u003e\n__schedule+0x420/0xd30\nschedule+0x47/0x130\nschedule_timeout+0x280/0x300\n? mark_held_locks+0x48/0x80\n? lockdep_hardirqs_on_prepare+0xe5/0x1a0\nwait_for_completion+0x75/0x130\nmlx5r_umr_post_send_wait+0x3c2/0x5b0 [mlx5_ib]\n? __pfx_mlx5r_umr_done+0x10/0x10 [mlx5_ib]\nmlx5r_umr_revoke_mr+0x93/0xc0 [mlx5_ib]\n__mlx5_ib_dereg_mr+0x299/0x520 [mlx5_ib]\n? _raw_spin_unlock_irq+0x24/0x40\n? wait_for_completion+0xfe/0x130\n? 
rdma_restrack_put+0x63/0xe0 [ib_core]\nib_dereg_mr_user+0x5f/0x120 [ib_core]\n? lock_release+0xc6/0x280\ndestroy_hw_idr_uobject+0x1d/0x60 [ib_uverbs]\nuverbs_destroy_uobject+0x58/0x1d0 [ib_uverbs]\nuobj_destroy+0x3f/0x70 [ib_uverbs]\nib_uverbs_cmd_verbs+0x3e4/0xbb0 [ib_uverbs]\n? __pfx_uverbs_destroy_def_handler+0x10/0x10 [ib_uverbs]\n? __lock_acquire+0x64e/0x2080\n? mark_held_locks+0x48/0x80\n? find_held_lock+0x2d/0xa0\n? lock_acquire+0xc1/0x2f0\n? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs]\n? __fget_files+0xc3/0x1b0\nib_uverbs_ioctl+0xe7/0x170 [ib_uverbs]\n? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs]\n__x64_sys_ioctl+0x1b0/0xa70\ndo_syscall_64+0x6b/0x140\nentry_SYSCALL_64_after_hwframe+0x76/0x7e\nRIP: 0033:0x7f99c918b17b\nRSP: 002b:00007ffc766d0468 EFLAGS: 00000246 ORIG_RAX:\n     0000000000000010\nRAX: ffffffffffffffda RBX: 00007ffc766d0578 RCX:\n     00007f99c918b17b\nRDX: 00007ffc766d0560 RSI: 00000000c0181b01 RDI:\n     0000000000000003\nRBP: 00007ffc766d0540 R08: 00007f99c8f99010 R09:\n     000000000000bd7e\nR10: 00007f99c94c1c70 R11: 0000000000000246 R12:\n     00007ffc766d0530\nR13: 000000000000001c R14: 0000000040246a80 R15:\n     0000000000000000\n\u003c/TASK\u003e"
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2025-05-04T13:06:41.507Z",
        "orgId": "416baaa9-dc9f-4396-8d5f-8c081fb06d67",
        "shortName": "Linux"
      },
      "references": [
        {
          "url": "https://git.kernel.org/stable/c/3e3bf255992cc02404e9d209b127c1c9944239cf"
        },
        {
          "url": "https://git.kernel.org/stable/c/1d2b84d8d054313deed2b2fcafe1168bbcb9e99f"
        },
        {
          "url": "https://git.kernel.org/stable/c/d97505baea64d93538b16baf14ce7b8c1fbad746"
        }
      ],
      "title": "RDMA/mlx5: Fix the recovery flow of the UMR QP",
      "x_generator": {
        "engine": "bippy-1.2.0"
      }
    }
  },
  "cveMetadata": {
    "assignerOrgId": "416baaa9-dc9f-4396-8d5f-8c081fb06d67",
    "assignerShortName": "Linux",
    "cveId": "CVE-2025-21892",
    "datePublished": "2025-03-27T14:57:17.885Z",
    "dateReserved": "2024-12-29T08:45:45.783Z",
    "dateUpdated": "2025-05-04T13:06:41.507Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.1",
  "vulnerability-lookup:meta": {
    "nvd": "{\"cve\":{\"id\":\"CVE-2025-21892\",\"sourceIdentifier\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\",\"published\":\"2025-03-27T15:15:57.143\",\"lastModified\":\"2025-03-27T16:45:12.210\",\"vulnStatus\":\"Awaiting Analysis\",\"cveTags\":[],\"descriptions\":[{\"lang\":\"en\",\"value\":\"In the Linux kernel, the following vulnerability has been resolved:\\n\\nRDMA/mlx5: Fix the recovery flow of the UMR QP\\n\\nThis patch addresses an issue in the recovery flow of the UMR QP,\\nensuring tasks do not get stuck, as highlighted by the call trace [1].\\n\\nDuring recovery, before transitioning the QP to the RESET state, the\\nsoftware must wait for all outstanding WRs to complete.\\n\\nFailing to do so can cause the firmware to skip sending some flushed\\nCQEs with errors and simply discard them upon the RESET, as per the IB\\nspecification.\\n\\nThis race condition can result in lost CQEs and tasks becoming stuck.\\n\\nTo resolve this, the patch sends a final WR which serves only as a\\nbarrier before moving the QP state to RESET.\\n\\nOnce a CQE is received for that final WR, it guarantees that no\\noutstanding WRs remain, making it safe to transition the QP to RESET and\\nsubsequently back to RTS, restoring proper functionality.\\n\\nNote:\\nFor the barrier WR, we simply reuse the failed and ready WR.\\nSince the QP is in an error state, it will only receive\\nIB_WC_WR_FLUSH_ERR. However, as it serves only as a barrier we don\u0027t\\ncare about its status.\\n\\n[1]\\nINFO: task rdma_resource_l:1922 blocked for more than 120 seconds.\\nTainted: G        W          6.12.0-rc7+ #1626\\n\\\"echo 0 \u003e /proc/sys/kernel/hung_task_timeout_secs\\\" disables this message.\\ntask:rdma_resource_l state:D stack:0  pid:1922 tgid:1922  ppid:1369\\n     flags:0x00004004\\nCall Trace:\\n\u003cTASK\u003e\\n__schedule+0x420/0xd30\\nschedule+0x47/0x130\\nschedule_timeout+0x280/0x300\\n? mark_held_locks+0x48/0x80\\n? 
lockdep_hardirqs_on_prepare+0xe5/0x1a0\\nwait_for_completion+0x75/0x130\\nmlx5r_umr_post_send_wait+0x3c2/0x5b0 [mlx5_ib]\\n? __pfx_mlx5r_umr_done+0x10/0x10 [mlx5_ib]\\nmlx5r_umr_revoke_mr+0x93/0xc0 [mlx5_ib]\\n__mlx5_ib_dereg_mr+0x299/0x520 [mlx5_ib]\\n? _raw_spin_unlock_irq+0x24/0x40\\n? wait_for_completion+0xfe/0x130\\n? rdma_restrack_put+0x63/0xe0 [ib_core]\\nib_dereg_mr_user+0x5f/0x120 [ib_core]\\n? lock_release+0xc6/0x280\\ndestroy_hw_idr_uobject+0x1d/0x60 [ib_uverbs]\\nuverbs_destroy_uobject+0x58/0x1d0 [ib_uverbs]\\nuobj_destroy+0x3f/0x70 [ib_uverbs]\\nib_uverbs_cmd_verbs+0x3e4/0xbb0 [ib_uverbs]\\n? __pfx_uverbs_destroy_def_handler+0x10/0x10 [ib_uverbs]\\n? __lock_acquire+0x64e/0x2080\\n? mark_held_locks+0x48/0x80\\n? find_held_lock+0x2d/0xa0\\n? lock_acquire+0xc1/0x2f0\\n? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs]\\n? __fget_files+0xc3/0x1b0\\nib_uverbs_ioctl+0xe7/0x170 [ib_uverbs]\\n? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs]\\n__x64_sys_ioctl+0x1b0/0xa70\\ndo_syscall_64+0x6b/0x140\\nentry_SYSCALL_64_after_hwframe+0x76/0x7e\\nRIP: 0033:0x7f99c918b17b\\nRSP: 002b:00007ffc766d0468 EFLAGS: 00000246 ORIG_RAX:\\n     0000000000000010\\nRAX: ffffffffffffffda RBX: 00007ffc766d0578 RCX:\\n     00007f99c918b17b\\nRDX: 00007ffc766d0560 RSI: 00000000c0181b01 RDI:\\n     0000000000000003\\nRBP: 00007ffc766d0540 R08: 00007f99c8f99010 R09:\\n     000000000000bd7e\\nR10: 00007f99c94c1c70 R11: 0000000000000246 R12:\\n     00007ffc766d0530\\nR13: 000000000000001c R14: 0000000040246a80 R15:\\n     0000000000000000\\n\u003c/TASK\u003e\"},{\"lang\":\"es\",\"value\":\"En el kernel de Linux, se ha resuelto la siguiente vulnerabilidad: RDMA/mlx5: Arregla el flujo de recuperaci\u00f3n del QP UMR. Este parche soluciona un problema en el flujo de recuperaci\u00f3n del QP UMR, asegurando que las tareas no se atasquen, como se destaca en el seguimiento de llamadas [1]. 
Durante la recuperaci\u00f3n, antes de la transici\u00f3n del QP al estado RESET, el software debe esperar a que se completen todos los WR pendientes. De lo contrario, el firmware puede omitir el env\u00edo de algunos CQE vaciados con errores y simplemente descartarlos al RESET, seg\u00fan la especificaci\u00f3n IB. Esta condici\u00f3n de ejecuci\u00f3n puede resultar en la p\u00e9rdida de CQE y el atasque de las tareas. Para resolver esto, el parche env\u00eda un WR final que sirve solo como barrera antes de mover el estado del QP a RESET. Una vez que se recibe un CQE para ese WR final, se garantiza que no queden WR pendientes, lo que hace que sea seguro transicionar el QP a RESET y, posteriormente, volver a RTS, restaurando la funcionalidad adecuada. Nota: Para el WR de barrera, simplemente reutilizamos el WR fallido y listo. Dado que el QP se encuentra en estado de error, solo recibir\u00e1 IB_WC_WR_FLUSH_ERR. Sin embargo, dado que solo funciona como barrera, su estado no nos importa. [1] INFORMACI\u00d3N: La tarea rdma_resource_l:1922 se bloque\u00f3 durante m\u00e1s de 120 segundos. Contaminado: GW 6.12.0-rc7+ #1626 \\\"echo 0 \u0026gt; /proc/sys/kernel/hung_task_timeout_secs\\\" deshabilita este mensaje. tarea:rdma_resource_l estado:D pila:0 pid:1922 tgid:1922 ppid:1369 indicadores:0x00004004 Rastreo de llamadas: __schedule+0x420/0xd30 schedule+0x47/0x130 schedule_timeout+0x280/0x300 ? mark_held_locks+0x48/0x80 ? lockdep_hardirqs_on_prepare+0xe5/0x1a0 wait_for_completion+0x75/0x130 mlx5r_umr_post_send_wait+0x3c2/0x5b0 [mlx5_ib] ? __pfx_mlx5r_umr_done+0x10/0x10 [mlx5_ib] mlx5r_umr_revoke_mr+0x93/0xc0 [mlx5_ib] __mlx5_ib_dereg_mr+0x299/0x520 [mlx5_ib] ? _raw_spin_unlock_irq+0x24/0x40 ? wait_for_completion+0xfe/0x130 ? rdma_restrack_put+0x63/0xe0 [ib_core] ib_dereg_mr_user+0x5f/0x120 [ib_core] ? 
lock_release+0xc6/0x280 destroy_hw_idr_uobject+0x1d/0x60 [ib_uverbs] uverbs_destroy_uobject+0x58/0x1d0 [ib_uverbs] uobj_destroy+0x3f/0x70 [ib_uverbs] ib_uverbs_cmd_verbs+0x3e4/0xbb0 [ib_uverbs] ? __pfx_uverbs_destroy_def_handler+0x10/0x10 [ib_uverbs] ? __lock_acquire+0x64e/0x2080 ? mark_held_locks+0x48/0x80 ? find_held_lock+0x2d/0xa0 ? lock_acquire+0xc1/0x2f0 ? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs] ? __fget_files+0xc3/0x1b0 ib_uverbs_ioctl+0xe7/0x170 [ib_uverbs] ? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs] __x64_sys_ioctl+0x1b0/0xa70 do_syscall_64+0x6b/0x140 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f99c918b17b RSP: 002b:00007ffc766d0468 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffc766d0578 RCX: 00007f99c918b17b RDX: 00007ffc766d0560 RSI: 00000000c0181b01 RDI: 0000000000000003 RBP: 00007ffc766d0540 R08: 00007f99c8f99010 R09: 000000000000bd7e R10: 00007f99c94c1c70 R11: 0000000000000246 R12: 00007ffc766d0530 R13: 000000000000001c R14: 0000000040246a80 R15: 0000000000000000  \"}],\"metrics\":{},\"references\":[{\"url\":\"https://git.kernel.org/stable/c/1d2b84d8d054313deed2b2fcafe1168bbcb9e99f\",\"source\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\"},{\"url\":\"https://git.kernel.org/stable/c/3e3bf255992cc02404e9d209b127c1c9944239cf\",\"source\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\"},{\"url\":\"https://git.kernel.org/stable/c/d97505baea64d93538b16baf14ce7b8c1fbad746\",\"source\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\"}]}}"
  }
}
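The semver ranges in the second `affected` entry of the record can be read as a simple version check. The following is a minimal sketch under stated assumptions: the constants are copied by hand from that entry (introduced in 6.0, fixed in 6.12.18, 6.13.6, and 6.14), the helper names `parse_release` and `is_affected` are hypothetical, pre-release suffixes such as `-rc7+` are ignored, and the separate git-hash entry for commit d8f7bff9a426 is not modeled. Distribution kernels that backport fixes will not be classified correctly by version number alone.

```python
# Values transcribed from the semver ranges in the CVE record.
INTRODUCED = (6, 0)                          # first affected release series
FIXED_IN_BRANCH = {(6, 12): 18, (6, 13): 6}  # stable series -> first fixed patch level
FIXED_MAINLINE = (6, 14)                     # fixed from this series onward


def parse_release(release):
    """Turn '6.12.17-generic' into (6, 12, 17); missing parts become 0."""
    core = release.split("-")[0]
    parts = [int(p) for p in core.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def is_affected(release):
    """Return True if the release falls in an affected range of the record."""
    major, minor, patch = parse_release(release)
    series = (major, minor)
    if series < INTRODUCED or series >= FIXED_MAINLINE:
        return False
    fixed_patch = FIXED_IN_BRANCH.get(series)
    if fixed_patch is not None:
        return patch < fixed_patch
    # Series between 6.0 and 6.14 with no listed fix stay affected.
    return True
```

For example, `is_affected("6.12.17")` is true while `is_affected("6.12.18")` is false, matching the 6.12 range in the record.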

