CVE-2023-52934 (GCVE-0-2023-52934)
Vulnerability from cvelistv5
Published: 2025-03-27 16:37
Modified: 2025-05-04 07:46
Summary
In the Linux kernel, the following vulnerability has been resolved:
mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups
In commit 34488399fa08 ("mm/madvise: add file and shmem support to
MADV_COLLAPSE") we make the following change to find_pmd_or_thp_or_none():
	- if (!pmd_present(pmde))
	-	return SCAN_PMD_NULL;
	+ if (pmd_none(pmde))
	+	return SCAN_PMD_NONE;
This was for use by the MADV_COLLAPSE file/shmem codepaths, where
MADV_COLLAPSE might identify a pte-mapped hugepage, only to have
khugepaged race in, free the pte table, and clear the pmd. Such codepaths
include:
A) If we find a suitably-aligned compound page of order HPAGE_PMD_ORDER
already in the pagecache.
B) In retract_page_tables(), if we fail to grab mmap_lock for the target
mm/address.
In these cases, collapse_pte_mapped_thp() really does expect a none (not
just !present) pmd, and we want to identify that case separately from the
case where no pmd is found or the pmd is bad. (Of course, many things
could happen once we drop mmap_lock, and the pmd could plausibly undergo
multiple transitions due to an intervening fault, split, etc.)
Regardless, the code is prepared to install a huge-pmd only when the
existing pmd entry is either a genuine pte-table-mapping-pmd or the
none-pmd.
However, the commit introduces a logical hole: we've allowed
!none && !huge && !bad pmds to be classified as genuine
pte-table-mapping-pmds. One example that could leak through is a swap
entry. The pmd values aren't checked again before use in
pte_offset_map_lock(), which expects nothing less than a genuine
pte-table-mapping-pmd.
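The hole described above can be illustrated with a small userspace model. This is only a sketch: the toy_ names and the enum of pmd kinds are hypothetical stand-ins rather than kernel code, the result values are arbitrary, and the pmd_bad() check is left out on the assumption (taken from the text) that a swap entry does not trip it.

```c
#include <stdbool.h>

/* Toy stand-in for a pmd entry; the kernel's pmd_t is an arch-specific word. */
enum toy_pmd_kind {
	TOY_PMD_NONE,		/* empty entry */
	TOY_PMD_PTE_TABLE,	/* points at a pte table */
	TOY_PMD_HUGE,		/* maps a transparent huge page */
	TOY_PMD_SWAP,		/* non-present entry, e.g. a swap entry */
};

/* Result names mirror those in the text; the values are arbitrary here. */
enum toy_scan_result {
	SCAN_SUCCEED,
	SCAN_PMD_NULL,
	SCAN_PMD_NONE,
	SCAN_PMD_MAPPED,
};

static bool toy_pmd_none(enum toy_pmd_kind k)
{
	return k == TOY_PMD_NONE;
}

static bool toy_pmd_present(enum toy_pmd_kind k)
{
	return k == TOY_PMD_PTE_TABLE || k == TOY_PMD_HUGE;
}

static bool toy_pmd_trans_huge(enum toy_pmd_kind k)
{
	return k == TOY_PMD_HUGE;
}

/* Check order without a pmd_present() test: the hole. */
static enum toy_scan_result classify_without_present_check(enum toy_pmd_kind k)
{
	if (toy_pmd_none(k))
		return SCAN_PMD_NONE;
	if (toy_pmd_trans_huge(k))
		return SCAN_PMD_MAPPED;
	return SCAN_SUCCEED;	/* caller goes on to treat this as a pte table */
}

/* Check order with !pmd_present() restored below the pmd_none() test. */
static enum toy_scan_result classify_with_present_check(enum toy_pmd_kind k)
{
	if (toy_pmd_none(k))
		return SCAN_PMD_NONE;
	if (!toy_pmd_present(k))
		return SCAN_PMD_NULL;
	if (toy_pmd_trans_huge(k))
		return SCAN_PMD_MAPPED;
	return SCAN_SUCCEED;
}
```

In this model, classify_without_present_check(TOY_PMD_SWAP) yields SCAN_SUCCEED, which is the misclassification, while classify_with_present_check(TOY_PMD_SWAP) yields SCAN_PMD_NULL. The none, huge, and pte-table kinds classify identically under both orderings.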
We want to put back the !pmd_present() check (below the pmd_none() check),
but need to be careful to deal with subtleties in pmd transitions and
their treatment by the various architectures.
The issue is that __split_huge_pmd_locked() temporarily clears the present
bit (or otherwise marks the entry as invalid), but pmd_present() and
pmd_trans_huge() still need to return true while the pmd is in this
transitory state. For example, x86's pmd_present() also checks the
_PAGE_PSE bit, riscv's version also checks the _PAGE_LEAF bit, and arm64's
also checks a PMD_PRESENT_INVALID bit.
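To make the transitory state concrete, here is a minimal userspace sketch of an x86-style presence test that also accepts the PSE bit. The TOY_ names and bit positions are assumptions chosen for illustration, and toy_mark_invalid() is a hypothetical helper that only models clearing the present bit; none of this is the kernel's own code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; names follow the text, positions are assumptions. */
#define TOY_PAGE_PRESENT	(UINT64_C(1) << 0)
#define TOY_PAGE_PSE		(UINT64_C(1) << 7)
#define TOY_PAGE_PROTNONE	(UINT64_C(1) << 8)

/* Presence test that also accepts PSE, as the text describes for x86. */
static bool toy_pmd_present(uint64_t pmd)
{
	return (pmd & (TOY_PAGE_PRESENT | TOY_PAGE_PROTNONE | TOY_PAGE_PSE)) != 0;
}

/* Naive test that looks at the present bit only. */
static bool toy_present_bit_only(uint64_t pmd)
{
	return (pmd & TOY_PAGE_PRESENT) != 0;
}

/* Model of the transitory state: a huge pmd whose present bit was cleared. */
static uint64_t toy_mark_invalid(uint64_t huge_pmd)
{
	return huge_pmd & ~TOY_PAGE_PRESENT;
}
```

For a huge entry with PRESENT and PSE set, toy_mark_invalid() leaves only PSE. The naive test then reports the entry as absent, while toy_pmd_present() still returns true, which is the behavior the split path relies on.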
Covering all 4 cases for x86 (all checks done on the same pmd value):
1) pmd_present() && pmd_trans_huge()
   All we actually know here is that the PSE bit is set. Either:
   a) We aren't racing with __split_huge_page(), and PRESENT or PROTNONE
      is set.
      => huge-pmd
   b) We are currently racing with __split_huge_page(). The danger here
      is that we proceed as if we have a huge-pmd, but really we are
      looking at a pte-mapping-pmd. So, what is the risk of this
      danger?
      The only relevant path is:
         madvise_collapse() -> collapse_pte_mapped_thp()
      where we might just incorrectly report back "success" when really
      the memory isn't pmd-backed. This is fine, since a split could
      happen immediately after an (actually) successful madvise_collapse().
      So, it should be safe to just assume huge-pmd here.
2) pmd_present() && !pmd_trans_huge()
   Either:
   a) PSE not set and either PRESENT or PROTNONE is.
      => pte-table-mapping pmd (or PROT_NONE)
   b) devmap. This routine can be called immediately after
      unlocking/locking mmap_lock -- or called with no locks held (see
      khugepaged_scan_mm_slot()), so previous VMA checks have since been
      invalidated.
3) !pmd_present() && pmd_trans_huge()
   Not possible.
4) !pmd_present() && !pmd_trans_huge()
   Neither PRESENT nor PROTNONE set
   => not present
I've checked all archs that implement pmd_trans_huge() (arm64, riscv,
powerpc, loongarch, x86, mips, s390) and this logic roughly translates
(though devmap treatment is unique to x86 and powerpc, and (3) doesn't
necessarily hold in general -- but that doesn't matter since
!pmd_present() always takes the failure path).
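The four-way case split for x86 can be checked by brute force over a toy flag set. The sketch below is a userspace model under stated assumptions: the TOY_ names and bit positions are hypothetical, presence is modeled as any of PRESENT, PROTNONE, or PSE being set, and a huge pmd is modeled as PSE set with the devmap bit clear. Not every flag combination it enumerates is a state the kernel would actually produce.

```c
#include <stdbool.h>

/* Hypothetical flag bits for the model; positions are arbitrary. */
#define TOY_PRESENT	(1u << 0)
#define TOY_PSE		(1u << 1)
#define TOY_PROTNONE	(1u << 2)
#define TOY_DEVMAP	(1u << 3)
#define TOY_ALL_BITS	(TOY_PRESENT | TOY_PSE | TOY_PROTNONE | TOY_DEVMAP)

/* Assumed x86-style presence test: PSE alone counts as present. */
static bool toy_pmd_present(unsigned int pmd)
{
	return (pmd & (TOY_PRESENT | TOY_PROTNONE | TOY_PSE)) != 0;
}

/* Assumed huge test: PSE set and the devmap bit clear. */
static bool toy_pmd_trans_huge(unsigned int pmd)
{
	return (pmd & (TOY_PSE | TOY_DEVMAP)) == TOY_PSE;
}

/* Map a flag word onto the case numbers 1..4 used in the text. */
static int toy_case(unsigned int pmd)
{
	bool present = toy_pmd_present(pmd);
	bool huge = toy_pmd_trans_huge(pmd);

	if (present && huge)
		return 1;
	if (present && !huge)
		return 2;
	if (!present && huge)
		return 3;
	return 4;
}

/* Count how many of the 16 flag combinations land in the given case. */
static int toy_count_case(int which)
{
	int n = 0;

	for (unsigned int pmd = 0; pmd <= TOY_ALL_BITS; pmd++)
		if (toy_case(pmd) == which)
			n++;
	return n;
}
```

Under these assumptions no combination reaches case 3, because a huge pmd needs PSE and PSE alone already makes the entry present. A PSE-only word (the split-in-progress state) falls into case 1, and a present PSE entry with the devmap bit set falls into case 2.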
Also, add a comment above find_pmd_or_thp_or_none()
---truncated---
References
Impacted products
{ "containers": { "cna": { "affected": [ { "defaultStatus": "unaffected", "product": "Linux", "programFiles": [ "mm/khugepaged.c" ], "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git", "vendor": "Linux", "versions": [ { "lessThan": "96aaaf8666010a39430cecf8a65c7ce2908a030f", "status": "affected", "version": "34488399fa08faaf664743fa54b271eb6f9e1321", "versionType": "git" }, { "lessThan": "edb5d0cf5525357652aff6eacd9850b8ced07143", "status": "affected", "version": "34488399fa08faaf664743fa54b271eb6f9e1321", "versionType": "git" } ] }, { "defaultStatus": "affected", "product": "Linux", "programFiles": [ "mm/khugepaged.c" ], "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git", "vendor": "Linux", "versions": [ { "status": "affected", "version": "6.1" }, { "lessThan": "6.1", "status": "unaffected", "version": "0", "versionType": "semver" }, { "lessThanOrEqual": "6.1.*", "status": "unaffected", "version": "6.1.11", "versionType": "semver" }, { "lessThanOrEqual": "*", "status": "unaffected", "version": "6.2", "versionType": "original_commit_for_fix" } ] } ], "cpeApplicability": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*", "versionEndExcluding": "6.1.11", "versionStartIncluding": "6.1", "vulnerable": true }, { "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*", "versionEndExcluding": "6.2", "versionStartIncluding": "6.1", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "descriptions": [ { "lang": "en", "value": "In the Linux kernel, the following vulnerability has been resolved:\n\nmm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups\n\nIn commit 34488399fa08 (\"mm/madvise: add file and shmem support to\nMADV_COLLAPSE\") we make the following change to find_pmd_or_thp_or_none():\n\n\t- if (!pmd_present(pmde))\n\t- return SCAN_PMD_NULL;\n\t+ if (pmd_none(pmde))\n\t+ return SCAN_PMD_NONE;\n\nThis was for-use by MADV_COLLAPSE file/shmem codepaths, 
where\nMADV_COLLAPSE might identify a pte-mapped hugepage, only to have\nkhugepaged race-in, free the pte table, and clear the pmd. Such codepaths\ninclude:\n\nA) If we find a suitably-aligned compound page of order HPAGE_PMD_ORDER\n already in the pagecache.\nB) In retract_page_tables(), if we fail to grab mmap_lock for the target\n mm/address.\n\nIn these cases, collapse_pte_mapped_thp() really does expect a none (not\njust !present) pmd, and we want to suitably identify that case separate\nfrom the case where no pmd is found, or it\u0027s a bad-pmd (of course, many\nthings could happen once we drop mmap_lock, and the pmd could plausibly\nundergo multiple transitions due to intervening fault, split, etc). \nRegardless, the code is prepared install a huge-pmd only when the existing\npmd entry is either a genuine pte-table-mapping-pmd, or the none-pmd.\n\nHowever, the commit introduces a logical hole; namely, that we\u0027ve allowed\n!none- \u0026\u0026 !huge- \u0026\u0026 !bad-pmds to be classified as genuine\npte-table-mapping-pmds. One such example that could leak through are swap\nentries. The pmd values aren\u0027t checked again before use in\npte_offset_map_lock(), which is expecting nothing less than a genuine\npte-table-mapping-pmd.\n\nWe want to put back the !pmd_present() check (below the pmd_none() check),\nbut need to be careful to deal with subtleties in pmd transitions and\ntreatments by various arch.\n\nThe issue is that __split_huge_pmd_locked() temporarily clears the present\nbit (or otherwise marks the entry as invalid), but pmd_present() and\npmd_trans_huge() still need to return true while the pmd is in this\ntransitory state. 
For example, x86\u0027s pmd_present() also checks the\n_PAGE_PSE , riscv\u0027s version also checks the _PAGE_LEAF bit, and arm64 also\nchecks a PMD_PRESENT_INVALID bit.\n\nCovering all 4 cases for x86 (all checks done on the same pmd value):\n\n1) pmd_present() \u0026\u0026 pmd_trans_huge()\n All we actually know here is that the PSE bit is set. Either:\n a) We aren\u0027t racing with __split_huge_page(), and PRESENT or PROTNONE\n is set.\n =\u003e huge-pmd\n b) We are currently racing with __split_huge_page(). The danger here\n is that we proceed as-if we have a huge-pmd, but really we are\n looking at a pte-mapping-pmd. So, what is the risk of this\n danger?\n\n The only relevant path is:\n\n\tmadvise_collapse() -\u003e collapse_pte_mapped_thp()\n\n Where we might just incorrectly report back \"success\", when really\n the memory isn\u0027t pmd-backed. This is fine, since split could\n happen immediately after (actually) successful madvise_collapse().\n So, it should be safe to just assume huge-pmd here.\n\n2) pmd_present() \u0026\u0026 !pmd_trans_huge()\n Either:\n a) PSE not set and either PRESENT or PROTNONE is.\n =\u003e pte-table-mapping pmd (or PROT_NONE)\n b) devmap. 
This routine can be called immediately after\n unlocking/locking mmap_lock -- or called with no locks held (see\n khugepaged_scan_mm_slot()), so previous VMA checks have since been\n invalidated.\n\n3) !pmd_present() \u0026\u0026 pmd_trans_huge()\n Not possible.\n\n4) !pmd_present() \u0026\u0026 !pmd_trans_huge()\n Neither PRESENT nor PROTNONE set\n =\u003e not present\n\nI\u0027ve checked all archs that implement pmd_trans_huge() (arm64, riscv,\npowerpc, longarch, x86, mips, s390) and this logic roughly translates\n(though devmap treatment is unique to x86 and powerpc, and (3) doesn\u0027t\nnecessarily hold in general -- but that doesn\u0027t matter since\n!pmd_present() always takes failure path).\n\nAlso, add a comment above find_pmd_or_thp_or_none()\n---truncated---" } ], "providerMetadata": { "dateUpdated": "2025-05-04T07:46:19.066Z", "orgId": "416baaa9-dc9f-4396-8d5f-8c081fb06d67", "shortName": "Linux" }, "references": [ { "url": "https://git.kernel.org/stable/c/96aaaf8666010a39430cecf8a65c7ce2908a030f" }, { "url": "https://git.kernel.org/stable/c/edb5d0cf5525357652aff6eacd9850b8ced07143" } ], "title": "mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups", "x_generator": { "engine": "bippy-1.2.0" } } }, "cveMetadata": { "assignerOrgId": "416baaa9-dc9f-4396-8d5f-8c081fb06d67", "assignerShortName": "Linux", "cveId": "CVE-2023-52934", "datePublished": "2025-03-27T16:37:14.857Z", "dateReserved": "2024-08-21T06:07:11.020Z", "dateUpdated": "2025-05-04T07:46:19.066Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1", "vulnerability-lookup:meta": { "nvd": "{\"cve\":{\"id\":\"CVE-2023-52934\",\"sourceIdentifier\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\",\"published\":\"2025-03-27T17:15:43.207\",\"lastModified\":\"2025-03-28T18:11:49.747\",\"vulnStatus\":\"Awaiting Analysis\",\"cveTags\":[],\"descriptions\":[{\"lang\":\"en\",\"value\":\"In the Linux kernel, the following vulnerability has been resolved:\\n\\nmm/MADV_COLLAPSE: catch !none !huge 
!bad pmd lookups\\n\\nIn commit 34488399fa08 (\\\"mm/madvise: add file and shmem support to\\nMADV_COLLAPSE\\\") we make the following change to find_pmd_or_thp_or_none():\\n\\n\\t- if (!pmd_present(pmde))\\n\\t- return SCAN_PMD_NULL;\\n\\t+ if (pmd_none(pmde))\\n\\t+ return SCAN_PMD_NONE;\\n\\nThis was for-use by MADV_COLLAPSE file/shmem codepaths, where\\nMADV_COLLAPSE might identify a pte-mapped hugepage, only to have\\nkhugepaged race-in, free the pte table, and clear the pmd. Such codepaths\\ninclude:\\n\\nA) If we find a suitably-aligned compound page of order HPAGE_PMD_ORDER\\n already in the pagecache.\\nB) In retract_page_tables(), if we fail to grab mmap_lock for the target\\n mm/address.\\n\\nIn these cases, collapse_pte_mapped_thp() really does expect a none (not\\njust !present) pmd, and we want to suitably identify that case separate\\nfrom the case where no pmd is found, or it\u0027s a bad-pmd (of course, many\\nthings could happen once we drop mmap_lock, and the pmd could plausibly\\nundergo multiple transitions due to intervening fault, split, etc). \\nRegardless, the code is prepared install a huge-pmd only when the existing\\npmd entry is either a genuine pte-table-mapping-pmd, or the none-pmd.\\n\\nHowever, the commit introduces a logical hole; namely, that we\u0027ve allowed\\n!none- \u0026\u0026 !huge- \u0026\u0026 !bad-pmds to be classified as genuine\\npte-table-mapping-pmds. One such example that could leak through are swap\\nentries. 
The pmd values aren\u0027t checked again before use in\\npte_offset_map_lock(), which is expecting nothing less than a genuine\\npte-table-mapping-pmd.\\n\\nWe want to put back the !pmd_present() check (below the pmd_none() check),\\nbut need to be careful to deal with subtleties in pmd transitions and\\ntreatments by various arch.\\n\\nThe issue is that __split_huge_pmd_locked() temporarily clears the present\\nbit (or otherwise marks the entry as invalid), but pmd_present() and\\npmd_trans_huge() still need to return true while the pmd is in this\\ntransitory state. For example, x86\u0027s pmd_present() also checks the\\n_PAGE_PSE , riscv\u0027s version also checks the _PAGE_LEAF bit, and arm64 also\\nchecks a PMD_PRESENT_INVALID bit.\\n\\nCovering all 4 cases for x86 (all checks done on the same pmd value):\\n\\n1) pmd_present() \u0026\u0026 pmd_trans_huge()\\n All we actually know here is that the PSE bit is set. Either:\\n a) We aren\u0027t racing with __split_huge_page(), and PRESENT or PROTNONE\\n is set.\\n =\u003e huge-pmd\\n b) We are currently racing with __split_huge_page(). The danger here\\n is that we proceed as-if we have a huge-pmd, but really we are\\n looking at a pte-mapping-pmd. So, what is the risk of this\\n danger?\\n\\n The only relevant path is:\\n\\n\\tmadvise_collapse() -\u003e collapse_pte_mapped_thp()\\n\\n Where we might just incorrectly report back \\\"success\\\", when really\\n the memory isn\u0027t pmd-backed. This is fine, since split could\\n happen immediately after (actually) successful madvise_collapse().\\n So, it should be safe to just assume huge-pmd here.\\n\\n2) pmd_present() \u0026\u0026 !pmd_trans_huge()\\n Either:\\n a) PSE not set and either PRESENT or PROTNONE is.\\n =\u003e pte-table-mapping pmd (or PROT_NONE)\\n b) devmap. 
This routine can be called immediately after\\n unlocking/locking mmap_lock -- or called with no locks held (see\\n khugepaged_scan_mm_slot()), so previous VMA checks have since been\\n invalidated.\\n\\n3) !pmd_present() \u0026\u0026 pmd_trans_huge()\\n Not possible.\\n\\n4) !pmd_present() \u0026\u0026 !pmd_trans_huge()\\n Neither PRESENT nor PROTNONE set\\n =\u003e not present\\n\\nI\u0027ve checked all archs that implement pmd_trans_huge() (arm64, riscv,\\npowerpc, longarch, x86, mips, s390) and this logic roughly translates\\n(though devmap treatment is unique to x86 and powerpc, and (3) doesn\u0027t\\nnecessarily hold in general -- but that doesn\u0027t matter since\\n!pmd_present() always takes failure path).\\n\\nAlso, add a comment above find_pmd_or_thp_or_none()\\n---truncated---\"},{\"lang\":\"es\",\"value\":\"En el kernel de Linux, se ha resuelto la siguiente vulnerabilidad: mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups En el commit 34488399fa08 (\\\"mm/madvise: agregar soporte de archivo y shmem a MADV_COLLAPSE\\\") realizamos el siguiente cambio en find_pmd_or_thp_or_none(): - if (!pmd_present(pmde)) - return SCAN_PMD_NULL; + if (pmd_none(pmde)) + return SCAN_PMD_NONE; Esto era para uso de las rutas de c\u00f3digo de archivo/shmem MADV_COLLAPSE, donde MADV_COLLAPSE podr\u00eda identificar una hugepage asignada a pte, solo para que khugepaged entrara en carrera, liberara la tabla pte y borrara el pmd. Tales rutas de c\u00f3digo incluyen: A) Si encontramos una p\u00e1gina compuesta adecuadamente alineada de orden HPAGE_PMD_ORDER ya en el pagecache. B) En retract_page_tables(), si no logramos obtener mmap_lock para el mm/direcci\u00f3n objetivo. 
En estos casos, collapse_pte_mapped_thp() realmente espera un pmd none (no solo !present), y queremos identificar adecuadamente ese caso separado del caso donde no se encuentra ning\u00fan pmd, o es un bad-pmd (por supuesto, muchas cosas podr\u00edan suceder una vez que eliminamos mmap_lock, y el pmd podr\u00eda plausiblemente sufrir m\u00faltiples transiciones debido a la intervenci\u00f3n de un fallo, divisi\u00f3n, etc.). En cualquier caso, el c\u00f3digo est\u00e1 preparado para instalar un huge-pmd solo cuando la entrada pmd existente es un pte-table-mapping-pmd genuino, o el none-pmd. Sin embargo, la confirmaci\u00f3n introduce un agujero l\u00f3gico; Es decir, hemos permitido que los !none- \u0026amp;\u0026amp; !huge- \u0026amp;\u0026amp; !bad-pmds se clasifiquen como pte-table-mapping-pmds genuinos. Un ejemplo de fugas de informaci\u00f3n son las entradas de intercambio. Los valores de pmd no se comprueban de nuevo antes de su uso en pte_offset_map_lock(), que espera nada menos que un pte-table-mapping-pmd genuino. Queremos restablecer la comprobaci\u00f3n de !pmd_present() (debajo de la comprobaci\u00f3n de pmd_none()), pero debemos tener cuidado con las sutilezas en las transiciones y los tratamientos de pmd por parte de varias arquitecturas. El problema es que __split_huge_pmd_locked() borra temporalmente el bit presente (o marca la entrada como inv\u00e1lida), pero pmd_present() y pmd_trans_huge() a\u00fan deben devolver verdadero mientras el pmd est\u00e9 en este estado transitorio. Por ejemplo, pmd_present() de x86 tambi\u00e9n verifica _PAGE_PSE, la versi\u00f3n de riscv tambi\u00e9n verifica el bit _PAGE_LEAF y arm64 tambi\u00e9n verifica el bit PMD_PRESENT_INVALID. Cubriendo los 4 casos para x86 (todas las verificaciones realizadas en el mismo valor pmd): 1) pmd_present() y pmd_trans_huge(). Lo \u00fanico que sabemos es que el bit PSE est\u00e1 establecido. 
O bien: a) No estamos compitiendo con __split_huge_page(), y PRESENT o PROTNONE est\u00e1n establecidos. =\u0026gt; huge-pmd. b) Actualmente estamos compitiendo con __split_huge_page(). El peligro aqu\u00ed es que procedamos como si tuvi\u00e9ramos un huge-pmd, pero en realidad estamos viendo un pte-mapping-pmd. Entonces, \u00bfcu\u00e1l es el riesgo de este peligro? La \u00fanica ruta relevante es: madvise_collapse() -\u0026gt; colapso_pte_mapped_thp(). Donde podr\u00edamos informar incorrectamente de \\\"\u00e9xito\\\", cuando en realidad la memoria no est\u00e1 respaldada por pmd. Esto no tiene problema, ya que la divisi\u00f3n podr\u00eda ocurrir inmediatamente despu\u00e9s de una ejecuci\u00f3n (realmente) exitosa de madvise_collapse(). Por lo tanto, se puede asumir con seguridad que es huge-pmd. 2) pmd_present() \u0026amp;\u0026amp; !pmd_trans_huge(): a) PSE no definido y PRESENT o PROTNONE s\u00ed lo est\u00e1n. =\u0026gt; pte-table-mapping pmd (o PROT_NONE). b) devmap. Esta rutina puede llamarse inmediatamente despu\u00e9s de desbloquear/bloquear mmap_lock, o sin bloqueos (v\u00e9ase khugepaged_scan_mm_slot()), por lo que las comprobaciones VMA anteriores han sido invalidadas. 3) !pmd_present() \u0026amp;\u0026amp; pmd_trans_huge(): No es posible. 4) !pmd_present() \u0026amp;\u0026amp; !pmd_trans_huge() Ni PRESENT ni PROTNONE se establecen =\u0026gt; no presente. He revisado todas las arquitecturas que implementan pmd_trans_huge() (arm64, riscv, powerpc, longarch, x86, mips, s390) y esta l\u00f3gica se traduce aproximadamente ---truncated---\"}],\"metrics\":{},\"references\":[{\"url\":\"https://git.kernel.org/stable/c/96aaaf8666010a39430cecf8a65c7ce2908a030f\",\"source\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\"},{\"url\":\"https://git.kernel.org/stable/c/edb5d0cf5525357652aff6eacd9850b8ced07143\",\"source\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\"}]}}" } }