CVE-2025-37821 (GCVE-0-2025-37821)
Vulnerability from cvelistv5
Published: 2025-05-08 06:26
Modified: 2025-05-26 05:21
Severity ?
Summary
In the Linux kernel, the following vulnerability has been resolved:

sched/eevdf: Fix se->slice being set to U64_MAX and resulting crash

There is a code path in dequeue_entities() that can set the slice of a sched_entity to U64_MAX, which sometimes results in a crash.

The offending case is when dequeue_entities() is called to dequeue a delayed group entity, and then the entity's parent's dequeue is delayed. In that case:

1. In the if (entity_is_task(se)) else block at the beginning of dequeue_entities(), slice is set to cfs_rq_min_slice(group_cfs_rq(se)). If the entity was delayed, then it has no queued tasks, so cfs_rq_min_slice() returns U64_MAX.
2. The first for_each_sched_entity() loop dequeues the entity.
3. If the entity was its parent's only child, then the next iteration tries to dequeue the parent.
4. If the parent's dequeue needs to be delayed, then it breaks from the first for_each_sched_entity() loop _without updating slice_.
5. The second for_each_sched_entity() loop sets the parent's ->slice to the saved slice, which is still U64_MAX.

This throws off subsequent calculations with potentially catastrophic results. A manifestation we saw in production was:

6. In update_entity_lag(), se->slice is used to calculate limit, which ends up as a huge negative number.
7. limit is used in se->vlag = clamp(vlag, -limit, limit). Because limit is negative, vlag > limit, so se->vlag is set to the same huge negative number.
8. In place_entity(), se->vlag is scaled, which overflows and results in another huge (positive or negative) number.
9. The adjusted lag is subtracted from se->vruntime, which increases or decreases se->vruntime by a huge number.
10. pick_eevdf() calls entity_eligible()/vruntime_eligible(), which incorrectly returns false because the vruntime is so far from the other vruntimes on the queue, causing the (vruntime - cfs_rq->min_vruntime) * load calculation to overflow.
11. Nothing appears to be eligible, so pick_eevdf() returns NULL.
12. pick_next_entity() tries to dereference the return value of pick_eevdf() and crashes.

Dumping the cfs_rq states from the core dumps with drgn showed tell-tale huge vruntime ranges and bogus vlag values, and I also traced se->slice being set to U64_MAX on live systems (which was usually "benign" since the rest of the runqueue needed to be in a particular state to crash).

Fix it in dequeue_entities() by always setting slice from the first non-empty cfs_rq.
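The arithmetic in steps 1 and 6 to 8 can be illustrated with a small Python model of 64-bit integers. This is only a sketch, not kernel code: the helper names mirror the kernel's, but their bodies are simplified stand-ins. The concrete values of limit, vlag, and the scaling factor are assumptions chosen for illustration, and the clamp below checks the upper bound first, which is assumed here because it matches the behavior the description reports.

```python
MASK64 = (1 << 64) - 1
U64_MAX = MASK64


def to_s64(x):
    """Reinterpret the low 64 bits of x as a signed two's-complement value."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def cfs_rq_min_slice(queued_slices):
    """Toy stand-in: smallest queued slice, or U64_MAX for an empty queue."""
    return min(queued_slices, default=U64_MAX)


def clamp(val, lo, hi):
    """Clamp that tests the upper bound first and never checks lo <= hi."""
    if val >= hi:
        return hi
    return lo if val <= lo else val


# Steps 1-5: a delayed group entity has nothing queued, so the saved slice
# is U64_MAX, and that value is what the parent's ->slice ends up holding.
parent_slice = cfs_rq_min_slice([])

# Steps 6-7: ASSUMPTION: a stand-in for the "huge negative" limit. The real
# value depends on the entity's weight and is not reproduced here.
limit = -(1 << 62)
vlag = 3_000_000  # hypothetical, plausible lag in nanoseconds
new_vlag = clamp(vlag, -limit, limit)  # bounds are inverted: lo > hi

# Step 8: scaling by a hypothetical factor of 3 wraps around in s64.
scaled = to_s64(new_vlag * 3)

print(hex(parent_slice), new_vlag == limit, scaled > 0)
```

With inverted bounds, any vlag greater than limit collapses to limit itself, so the clamp stops bounding the lag and instead injects the bogus value. In this example, the wrapped product also changes sign, which corresponds to the "positive or negative" outcome in step 8.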
Impacted products
Vendor Product Version
Linux Linux Version: aef6987d89544d63a47753cf3741cabff0b5574c
Version: aef6987d89544d63a47753cf3741cabff0b5574c
Version: aef6987d89544d63a47753cf3741cabff0b5574c


{
  "containers": {
    "cna": {
      "affected": [
        {
          "defaultStatus": "unaffected",
          "product": "Linux",
          "programFiles": [
            "kernel/sched/fair.c"
          ],
          "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git",
          "vendor": "Linux",
          "versions": [
            {
              "lessThan": "86b37810fa1e40b93171da023070b99ccbb4ea04",
              "status": "affected",
              "version": "aef6987d89544d63a47753cf3741cabff0b5574c",
              "versionType": "git"
            },
            {
              "lessThan": "50a665496881262519f115f1bfe5822f30580eb0",
              "status": "affected",
              "version": "aef6987d89544d63a47753cf3741cabff0b5574c",
              "versionType": "git"
            },
            {
              "lessThan": "bbce3de72be56e4b5f68924b7da9630cc89aa1a8",
              "status": "affected",
              "version": "aef6987d89544d63a47753cf3741cabff0b5574c",
              "versionType": "git"
            }
          ]
        },
        {
          "defaultStatus": "affected",
          "product": "Linux",
          "programFiles": [
            "kernel/sched/fair.c"
          ],
          "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git",
          "vendor": "Linux",
          "versions": [
            {
              "status": "affected",
              "version": "6.12"
            },
            {
              "lessThan": "6.12",
              "status": "unaffected",
              "version": "0",
              "versionType": "semver"
            },
            {
              "lessThanOrEqual": "6.12.*",
              "status": "unaffected",
              "version": "6.12.29",
              "versionType": "semver"
            },
            {
              "lessThanOrEqual": "6.14.*",
              "status": "unaffected",
              "version": "6.14.5",
              "versionType": "semver"
            },
            {
              "lessThanOrEqual": "*",
              "status": "unaffected",
              "version": "6.15",
              "versionType": "original_commit_for_fix"
            }
          ]
        }
      ],
      "cpeApplicability": [
        {
          "nodes": [
            {
              "cpeMatch": [
                {
                  "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
                  "versionEndExcluding": "6.12.29",
                  "versionStartIncluding": "6.12",
                  "vulnerable": true
                },
                {
                  "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
                  "versionEndExcluding": "6.14.5",
                  "versionStartIncluding": "6.12",
                  "vulnerable": true
                },
                {
                  "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
                  "versionEndExcluding": "6.15",
                  "versionStartIncluding": "6.12",
                  "vulnerable": true
                }
              ],
              "negate": false,
              "operator": "OR"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "In the Linux kernel, the following vulnerability has been resolved:\n\nsched/eevdf: Fix se-\u003eslice being set to U64_MAX and resulting crash\n\nThere is a code path in dequeue_entities() that can set the slice of a\nsched_entity to U64_MAX, which sometimes results in a crash.\n\nThe offending case is when dequeue_entities() is called to dequeue a\ndelayed group entity, and then the entity\u0027s parent\u0027s dequeue is delayed.\nIn that case:\n\n1. In the if (entity_is_task(se)) else block at the beginning of\n   dequeue_entities(), slice is set to\n   cfs_rq_min_slice(group_cfs_rq(se)). If the entity was delayed, then\n   it has no queued tasks, so cfs_rq_min_slice() returns U64_MAX.\n2. The first for_each_sched_entity() loop dequeues the entity.\n3. If the entity was its parent\u0027s only child, then the next iteration\n   tries to dequeue the parent.\n4. If the parent\u0027s dequeue needs to be delayed, then it breaks from the\n   first for_each_sched_entity() loop _without updating slice_.\n5. The second for_each_sched_entity() loop sets the parent\u0027s -\u003eslice to\n   the saved slice, which is still U64_MAX.\n\nThis throws off subsequent calculations with potentially catastrophic\nresults. A manifestation we saw in production was:\n\n6. In update_entity_lag(), se-\u003eslice is used to calculate limit, which\n   ends up as a huge negative number.\n7. limit is used in se-\u003evlag = clamp(vlag, -limit, limit). Because limit\n   is negative, vlag \u003e limit, so se-\u003evlag is set to the same huge\n   negative number.\n8. In place_entity(), se-\u003evlag is scaled, which overflows and results in\n   another huge (positive or negative) number.\n9. The adjusted lag is subtracted from se-\u003evruntime, which increases or\n   decreases se-\u003evruntime by a huge number.\n10. 
pick_eevdf() calls entity_eligible()/vruntime_eligible(), which\n    incorrectly returns false because the vruntime is so far from the\n    other vruntimes on the queue, causing the\n    (vruntime - cfs_rq-\u003emin_vruntime) * load calulation to overflow.\n11. Nothing appears to be eligible, so pick_eevdf() returns NULL.\n12. pick_next_entity() tries to dereference the return value of\n    pick_eevdf() and crashes.\n\nDumping the cfs_rq states from the core dumps with drgn showed tell-tale\nhuge vruntime ranges and bogus vlag values, and I also traced se-\u003eslice\nbeing set to U64_MAX on live systems (which was usually \"benign\" since\nthe rest of the runqueue needed to be in a particular state to crash).\n\nFix it in dequeue_entities() by always setting slice from the first\nnon-empty cfs_rq."
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2025-05-26T05:21:36.453Z",
        "orgId": "416baaa9-dc9f-4396-8d5f-8c081fb06d67",
        "shortName": "Linux"
      },
      "references": [
        {
          "url": "https://git.kernel.org/stable/c/86b37810fa1e40b93171da023070b99ccbb4ea04"
        },
        {
          "url": "https://git.kernel.org/stable/c/50a665496881262519f115f1bfe5822f30580eb0"
        },
        {
          "url": "https://git.kernel.org/stable/c/bbce3de72be56e4b5f68924b7da9630cc89aa1a8"
        }
      ],
      "title": "sched/eevdf: Fix se-\u003eslice being set to U64_MAX and resulting crash",
      "x_generator": {
        "engine": "bippy-1.2.0"
      }
    }
  },
  "cveMetadata": {
    "assignerOrgId": "416baaa9-dc9f-4396-8d5f-8c081fb06d67",
    "assignerShortName": "Linux",
    "cveId": "CVE-2025-37821",
    "datePublished": "2025-05-08T06:26:15.535Z",
    "dateReserved": "2025-04-16T04:51:23.947Z",
    "dateUpdated": "2025-05-26T05:21:36.453Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.1",
  "vulnerability-lookup:meta": {
    "nvd": "{\"cve\":{\"id\":\"CVE-2025-37821\",\"sourceIdentifier\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\",\"published\":\"2025-05-08T07:15:53.333\",\"lastModified\":\"2025-05-18T07:15:17.197\",\"vulnStatus\":\"Awaiting Analysis\",\"cveTags\":[],\"descriptions\":[{\"lang\":\"en\",\"value\":\"In the Linux kernel, the following vulnerability has been resolved:\\n\\nsched/eevdf: Fix se-\u003eslice being set to U64_MAX and resulting crash\\n\\nThere is a code path in dequeue_entities() that can set the slice of a\\nsched_entity to U64_MAX, which sometimes results in a crash.\\n\\nThe offending case is when dequeue_entities() is called to dequeue a\\ndelayed group entity, and then the entity\u0027s parent\u0027s dequeue is delayed.\\nIn that case:\\n\\n1. In the if (entity_is_task(se)) else block at the beginning of\\n   dequeue_entities(), slice is set to\\n   cfs_rq_min_slice(group_cfs_rq(se)). If the entity was delayed, then\\n   it has no queued tasks, so cfs_rq_min_slice() returns U64_MAX.\\n2. The first for_each_sched_entity() loop dequeues the entity.\\n3. If the entity was its parent\u0027s only child, then the next iteration\\n   tries to dequeue the parent.\\n4. If the parent\u0027s dequeue needs to be delayed, then it breaks from the\\n   first for_each_sched_entity() loop _without updating slice_.\\n5. The second for_each_sched_entity() loop sets the parent\u0027s -\u003eslice to\\n   the saved slice, which is still U64_MAX.\\n\\nThis throws off subsequent calculations with potentially catastrophic\\nresults. A manifestation we saw in production was:\\n\\n6. In update_entity_lag(), se-\u003eslice is used to calculate limit, which\\n   ends up as a huge negative number.\\n7. limit is used in se-\u003evlag = clamp(vlag, -limit, limit). Because limit\\n   is negative, vlag \u003e limit, so se-\u003evlag is set to the same huge\\n   negative number.\\n8. 
In place_entity(), se-\u003evlag is scaled, which overflows and results in\\n   another huge (positive or negative) number.\\n9. The adjusted lag is subtracted from se-\u003evruntime, which increases or\\n   decreases se-\u003evruntime by a huge number.\\n10. pick_eevdf() calls entity_eligible()/vruntime_eligible(), which\\n    incorrectly returns false because the vruntime is so far from the\\n    other vruntimes on the queue, causing the\\n    (vruntime - cfs_rq-\u003emin_vruntime) * load calulation to overflow.\\n11. Nothing appears to be eligible, so pick_eevdf() returns NULL.\\n12. pick_next_entity() tries to dereference the return value of\\n    pick_eevdf() and crashes.\\n\\nDumping the cfs_rq states from the core dumps with drgn showed tell-tale\\nhuge vruntime ranges and bogus vlag values, and I also traced se-\u003eslice\\nbeing set to U64_MAX on live systems (which was usually \\\"benign\\\" since\\nthe rest of the runqueue needed to be in a particular state to crash).\\n\\nFix it in dequeue_entities() by always setting slice from the first\\nnon-empty cfs_rq.\"},{\"lang\":\"es\",\"value\":\"En el kernel de Linux, se ha resuelto la siguiente vulnerabilidad: sched/eevdf: Se soluciona que se-\u0026gt;slice se establezca en U64_MAX y el bloqueo resultante Hay una ruta de c\u00f3digo en dequeue_entities() que puede establecer el slice de un sched_entity en U64_MAX, lo que a veces resulta en un bloqueo. El caso ofensivo es cuando se llama a dequeue_entities() para sacar de la cola una entidad de grupo retrasada y luego se retrasa la salida de la cola del padre de la entidad. En ese caso: 1. En el bloque if (entity_is_task(se)) else al comienzo de dequeue_entities(), slice se establece en cfs_rq_min_slice(group_cfs_rq(se)). Si la entidad se retras\u00f3, entonces no tiene tareas en cola, por lo que cfs_rq_min_slice() devuelve U64_MAX. 2. El primer bucle for_each_sched_entity() saca de la cola la entidad. 3. 
Si la entidad era la \u00fanica hija de su padre, la siguiente iteraci\u00f3n intenta sacar de la cola al padre. 4. Si es necesario retrasar la salida de la cola del padre, se interrumpe desde el primer bucle for_each_sched_entity() _sin actualizar la porci\u00f3n_. 5. El segundo bucle for_each_sched_entity() establece la porci\u00f3n -\u0026gt; del padre en la porci\u00f3n guardada, que sigue siendo U64_MAX. Esto altera los c\u00e1lculos posteriores con resultados potencialmente catastr\u00f3ficos. Una manifestaci\u00f3n que vimos en producci\u00f3n fue: 6. En update_entity_lag(), se-\u0026gt;slice se usa para calcular el l\u00edmite, que termina como un n\u00famero negativo enorme. 7. limit se usa en se-\u0026gt;vlag = clamp(vlag, -limit, limit). Como limit es negativo, vlag \u0026gt; limit, por lo que se-\u0026gt;vlag se establece en el mismo n\u00famero negativo enorme. 8. En place_entity(), se-\u0026gt;vlag se escala, lo que se desborda y da como resultado otro n\u00famero enorme (positivo o negativo). 9. El retardo ajustado se resta de se-\u0026gt;vruntime, lo que aumenta o disminuye se-\u0026gt;vruntime considerablemente. 10. pick_eevdf() llama a entity_eligible()/vruntime_eligible(), que devuelve incorrectamente \\\"false\\\" debido a la gran distancia entre el vruntime y los dem\u00e1s vruntimes de la cola, lo que provoca un desbordamiento del c\u00e1lculo de carga (vruntime - cfs_rq-\u0026gt;min_vruntime) *. 11. Nada parece ser elegible, por lo que pick_eevdf() devuelve NULL. 12. pick_next_entity() intenta desreferenciar el valor de retorno de pick_eevdf() y se bloquea. 
Volcar los estados de cfs_rq desde los volcados de n\u00facleo con drgn mostr\u00f3 rangos de tiempo de ejecuci\u00f3n virtuales enormes y valores de vlag falsos, y tambi\u00e9n rastre\u00e9 que se-\u0026gt;slice se configuraba en U64_MAX en sistemas en vivo (lo cual sol\u00eda ser \\\"benigno\\\", ya que el resto de la cola de ejecuci\u00f3n deb\u00eda estar en un estado espec\u00edfico para fallar). Corr\u00edjalo en dequeue_entities() configurando siempre slice desde el primer cfs_rq no vac\u00edo.\"}],\"metrics\":{},\"references\":[{\"url\":\"https://git.kernel.org/stable/c/50a665496881262519f115f1bfe5822f30580eb0\",\"source\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\"},{\"url\":\"https://git.kernel.org/stable/c/86b37810fa1e40b93171da023070b99ccbb4ea04\",\"source\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\"},{\"url\":\"https://git.kernel.org/stable/c/bbce3de72be56e4b5f68924b7da9630cc89aa1a8\",\"source\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\"}]}}"
  }
}
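The semver ranges in the record above can be turned into a quick check for upstream release numbers. The following is a rough sketch under stated assumptions: the function names parse and is_affected are hypothetical, only plain upstream versions such as "6.12.28" are handled, and distribution kernels that backport the fix under a different version number are not covered. The 6.13 series has no fixed release listed in the record, so it falls under the record's "defaultStatus": "affected".

```python
def parse(version):
    """Turn '6.12.28' or '6.12.28-generic' into a tuple of integers."""
    base = version.split("-")[0]
    return tuple(int(part) for part in base.split("."))


def is_affected(version):
    """Apply the version ranges from the CVE record to an upstream version."""
    v = parse(version)
    series = v[:2]
    if series < (6, 12):
        return False  # "lessThan": "6.12" is unaffected
    if series >= (6, 15):
        return False  # 6.15 and later carry the original fix
    if series == (6, 12):
        return v < (6, 12, 29)  # fixed in 6.12.29
    if series == (6, 14):
        return v < (6, 14, 5)  # fixed in 6.14.5
    return True  # e.g. 6.13.x: no fixed release listed in the record


for candidate in ["6.11.9", "6.12", "6.12.29", "6.13.7", "6.14.4", "6.15"]:
    print(candidate, is_affected(candidate))
```

Comparing integer tuples rather than strings avoids ordering mistakes such as "6.12.9" sorting after "6.12.29".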

