ghsa-6fj2-q3x5-whq9
Vulnerability from github
Published: 2025-03-27 15:31
Modified: 2025-03-27 15:31
Details

In the Linux kernel, the following vulnerability has been resolved:

drm/xe/userptr: fix EFAULT handling

Currently we treat EFAULT from hmm_range_fault() as a non-fatal error when called from xe_vm_userptr_pin() with the idea that we want to avoid killing the entire vm and chucking an error, under the assumption that the user just did an unmap or something, and has no intention of actually touching that memory from the GPU. At this point we have already zapped the PTEs so any access should generate a page fault, and if the pin fails there also it will then become fatal.

However it looks like it's possible for the userptr vma to still be on the rebind list in preempt_rebind_work_func(), if we had to retry the pin again due to something happening in the caller before we did the rebind step, but in the meantime needing to re-validate the userptr and this time hitting the EFAULT.

This explains an internal user report of hitting:

[  191.738349] WARNING: CPU: 1 PID: 157 at drivers/gpu/drm/xe/xe_res_cursor.h:158 xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[  191.738551] Workqueue: xe-ordered-wq preempt_rebind_work_func [xe]
[  191.738616] RIP: 0010:xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[  191.738690] Call Trace:
[  191.738692]  <TASK>
[  191.738694]  ? show_regs+0x69/0x80
[  191.738698]  ? __warn+0x93/0x1a0
[  191.738703]  ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[  191.738759]  ? report_bug+0x18f/0x1a0
[  191.738764]  ? handle_bug+0x63/0xa0
[  191.738767]  ? exc_invalid_op+0x19/0x70
[  191.738770]  ? asm_exc_invalid_op+0x1b/0x20
[  191.738777]  ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[  191.738834]  ? ret_from_fork_asm+0x1a/0x30
[  191.738849]  bind_op_prepare+0x105/0x7b0 [xe]
[  191.738906]  ? dma_resv_reserve_fences+0x301/0x380
[  191.738912]  xe_pt_update_ops_prepare+0x28c/0x4b0 [xe]
[  191.738966]  ? kmemleak_alloc+0x4b/0x80
[  191.738973]  ops_execute+0x188/0x9d0 [xe]
[  191.739036]  xe_vm_rebind+0x4ce/0x5a0 [xe]
[  191.739098]  ? trace_hardirqs_on+0x4d/0x60
[  191.739112]  preempt_rebind_work_func+0x76f/0xd00 [xe]

This is followed by a NULL pointer dereference (NPD) when running some workload, since the sg was never actually populated but the vma is still marked for rebind when it should be skipped for this special EFAULT case. This patch is confirmed to fix the user report.
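
The failure sequence described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel patch: every name in it (toy_vma, toy_pin, toy_repin_all, toy_count_bad_rebinds, toy_scenario) is hypothetical, and the boolean fields stand in for the real driver state (the populated sg and membership of the rebind list). It shows a vma that is pinned once, queued for rebind, then unmapped by the user before the rebind happens, so that the retried pin returns -EFAULT. Without dropping the vma from the rebind list, the rebind step would see a vma with no populated sg.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a userptr vma; all names here are hypothetical. */
struct toy_vma {
    bool pages_present;  /* false models the user having unmapped the range */
    bool sg_populated;   /* true only after a successful pin */
    bool on_rebind_list; /* true if the rebind step will process this vma */
};

/* Models the pin step: fails with -EFAULT when the backing pages are gone. */
static int toy_pin(struct toy_vma *vma)
{
    if (!vma->pages_present) {
        vma->sg_populated = false;
        return -EFAULT;
    }
    vma->sg_populated = true;
    return 0;
}

/*
 * Models one repin pass. -EFAULT is treated as non-fatal in both variants;
 * with apply_fix set, the vma is also dropped from the rebind list.
 */
static int toy_repin_all(struct toy_vma *vmas, size_t n, bool apply_fix)
{
    for (size_t i = 0; i < n; i++) {
        int err = toy_pin(&vmas[i]);

        if (err == -EFAULT) {
            if (apply_fix)
                vmas[i].on_rebind_list = false;
            continue; /* do not fail the whole vm */
        }
        if (err)
            return err;
        vmas[i].on_rebind_list = true;
    }
    return 0;
}

/* Counts vmas that the rebind step would process without a populated sg. */
static size_t toy_count_bad_rebinds(const struct toy_vma *vmas, size_t n)
{
    size_t bad = 0;

    for (size_t i = 0; i < n; i++)
        if (vmas[i].on_rebind_list && !vmas[i].sg_populated)
            bad++;
    return bad;
}

/* Pin succeeds, the caller retries, the user unmaps, the re-pin hits -EFAULT. */
size_t toy_scenario(bool apply_fix)
{
    struct toy_vma vma = { .pages_present = true };

    (void)toy_repin_all(&vma, 1, apply_fix); /* first pin: queued for rebind */
    vma.pages_present = false;               /* unmap before the rebind step */
    (void)toy_repin_all(&vma, 1, apply_fix); /* retry: pin returns -EFAULT */
    return toy_count_bad_rebinds(&vma, 1);
}
```

In this model, toy_scenario(false) leaves one stale entry for the rebind step, while toy_scenario(true) leaves none, which mirrors the "should be skipped for this special EFAULT case" wording above.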

v2 (MattB):
 - Move earlier.
v3 (MattB):
 - Update the commit message to make it clear that this indeed fixes the issue.

(cherry picked from commit 6b93cb98910c826c2e2004942f8b060311e43618)



{
  "affected": [],
  "aliases": [
    "CVE-2025-21880"
  ],
  "database_specific": {
    "cwe_ids": [],
    "github_reviewed": false,
    "github_reviewed_at": null,
    "nvd_published_at": "2025-03-27T15:15:55Z",
    "severity": null
  },
  "details": "In the Linux kernel, the following vulnerability has been resolved:\n\ndrm/xe/userptr: fix EFAULT handling\n\nCurrently we treat EFAULT from hmm_range_fault() as a non-fatal error\nwhen called from xe_vm_userptr_pin() with the idea that we want to avoid\nkilling the entire vm and chucking an error, under the assumption that\nthe user just did an unmap or something, and has no intention of\nactually touching that memory from the GPU.  At this point we have\nalready zapped the PTEs so any access should generate a page fault, and\nif the pin fails there also it will then become fatal.\n\nHowever it looks like it\u0027s possible for the userptr vma to still be on\nthe rebind list in preempt_rebind_work_func(), if we had to retry the\npin again due to something happening in the caller before we did the\nrebind step, but in the meantime needing to re-validate the userptr and\nthis time hitting the EFAULT.\n\nThis explains an internal user report of hitting:\n\n[  191.738349] WARNING: CPU: 1 PID: 157 at drivers/gpu/drm/xe/xe_res_cursor.h:158 xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]\n[  191.738551] Workqueue: xe-ordered-wq preempt_rebind_work_func [xe]\n[  191.738616] RIP: 0010:xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]\n[  191.738690] Call Trace:\n[  191.738692]  \u003cTASK\u003e\n[  191.738694]  ? show_regs+0x69/0x80\n[  191.738698]  ? __warn+0x93/0x1a0\n[  191.738703]  ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]\n[  191.738759]  ? report_bug+0x18f/0x1a0\n[  191.738764]  ? handle_bug+0x63/0xa0\n[  191.738767]  ? exc_invalid_op+0x19/0x70\n[  191.738770]  ? asm_exc_invalid_op+0x1b/0x20\n[  191.738777]  ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]\n[  191.738834]  ? ret_from_fork_asm+0x1a/0x30\n[  191.738849]  bind_op_prepare+0x105/0x7b0 [xe]\n[  191.738906]  ? dma_resv_reserve_fences+0x301/0x380\n[  191.738912]  xe_pt_update_ops_prepare+0x28c/0x4b0 [xe]\n[  191.738966]  ? kmemleak_alloc+0x4b/0x80\n[  191.738973]  ops_execute+0x188/0x9d0 [xe]\n[  191.739036]  xe_vm_rebind+0x4ce/0x5a0 [xe]\n[  191.739098]  ? trace_hardirqs_on+0x4d/0x60\n[  191.739112]  preempt_rebind_work_func+0x76f/0xd00 [xe]\n\nFollowed by NPD, when running some workload, since the sg was never\nactually populated but the vma is still marked for rebind when it should\nbe skipped for this special EFAULT case. This is confirmed to fix the\nuser report.\n\nv2 (MattB):\n - Move earlier.\nv3 (MattB):\n - Update the commit message to make it clear that this indeed fixes the\n   issue.\n\n(cherry picked from commit 6b93cb98910c826c2e2004942f8b060311e43618)",
  "id": "GHSA-6fj2-q3x5-whq9",
  "modified": "2025-03-27T15:31:11Z",
  "published": "2025-03-27T15:31:11Z",
  "references": [
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2025-21880"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/51cc278f8ffacd5f9dc7d13191b81b912829db59"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/a9f4fa3a7efa65615ff7db13023ac84516e99e21"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/daad16d0a538fa938e344fd83927bbcfcd8a66ec"
    }
  ],
  "schema_version": "1.4.0",
  "severity": []
}

