fkie_nvd - fkie_cve-2021-47209

fkie_cve-2021-47209

Vulnerability from fkie_nvd

Published

2024-04-10 19:15

Modified

2025-03-27 21:16

Severity

5.5 (Medium) - CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
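
The 5.5 score can be reproduced from the vector string. The following is a minimal sketch, assuming the metric weights and the round-up rule published in the CVSS v3.1 specification; the names `WEIGHTS`, `roundup` and `base_score` are illustrative and not part of any NVD tooling, and only scope-unchanged (S:U) vectors are handled.

```python
import math

# Metric weights from the CVSS v3.1 specification. The PR values are the
# scope-unchanged ones, which is all this sketch supports.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place using the integer-based rule from CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Return (base, exploitability, impact) for a scope-unchanged v3.1 vector."""
    parts = dict(item.split(":") for item in vector.split("/"))
    if parts.get("CVSS") != "3.1" or parts.get("S") != "U":
        raise ValueError("this sketch only handles CVSS:3.1 vectors with S:U")
    w = {name: WEIGHTS[name][parts[name]] for name in WEIGHTS}
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    base = 0.0 if impact <= 0 else roundup(min(impact + exploitability, 10))
    # The sub-scores are rounded to one decimal for display only.
    return base, round(exploitability, 1), round(impact, 1)


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))
```

For this vector the sketch yields a base score of 5.5 with sub-scores 1.8 and 3.6, matching the `baseScore`, `exploitabilityScore` and `impactScore` fields in the JSON record.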

Summary

In the Linux kernel, the following vulnerability has been resolved:

sched/fair: Prevent dead task groups from regaining cfs_rq's

Kevin is reporting crashes which point to a use-after-free of a cfs_rq in update_blocked_averages(). Initial debugging revealed that we've live cfs_rq's (on_list=1) in an about to be kfree()'d task group in free_fair_sched_group(). However, it was unclear how that can happen.

His kernel config happened to lead to a layout of struct sched_entity that put the 'my_q' member directly into the middle of the object which makes it incidentally overlap with SLUB's freelist pointer. That, in combination with SLAB_FREELIST_HARDENED's freelist pointer mangling, leads to a reliable access violation in form of a #GP which made the UAF fail fast.

Michal seems to have run into the same issue[1]. He already correctly diagnosed that commit a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle") is causing the preconditions for the UAF to happen by re-adding cfs_rq's also to task groups that have no more running tasks, i.e. also to dead ones. His analysis, however, misses the real root cause and it cannot be seen from the crash backtrace only, as the real offender is tg_unthrottle_up() getting called via sched_cfs_period_timer() via the timer interrupt at an inconvenient time.

When unregister_fair_sched_group() unlinks all cfs_rq's from the dying task group, it doesn't protect itself from getting interrupted. If the timer interrupt triggers while we iterate over all CPUs or after unregister_fair_sched_group() has finished but prior to unlinking the task group, sched_cfs_period_timer() will execute and walk the list of task groups, trying to unthrottle cfs_rq's, i.e. re-add them to the dying task group. These will later -- in free_fair_sched_group() -- be kfree()'ed while still being linked, leading to the fireworks Kevin and Michal are seeing.

To fix this race, ensure the dying task group gets unlinked first. However, simply switching the order of unregistering and unlinking the task group isn't sufficient, as concurrent RCU walkers might still see it, as can be seen below:

    CPU1:                                      CPU2:
      :                                        timer IRQ:
      :                                          do_sched_cfs_period_timer():
      :                                            :
      :                                            distribute_cfs_runtime():
      :                                              rcu_read_lock();
      :                                              :
      :                                              unthrottle_cfs_rq():
    sched_offline_group():                             :
      :                                                walk_tg_tree_from(…,tg_unthrottle_up,…):
      list_del_rcu(&tg->list);                           :
 (1)  :                                                  list_for_each_entry_rcu(child, &parent->children, siblings)
      :                                                    :
 (2)  list_del_rcu(&tg->siblings);                         :
      :                                                    tg_unthrottle_up():
      unregister_fair_sched_group():                         struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
        :                                                    :
        list_del_leaf_cfs_rq(tg->cfs_rq[cpu]);               :
        :                                                    :
        :                                                    if (!cfs_rq_is_decayed(cfs_rq) || cfs_rq->nr_running)
 (3)    :                                                        list_add_leaf_cfs_rq(cfs_rq);
      :                                                      :
      :                                                    :
      :                                                  :
      :                                                :
      :

---truncated---

References

Source	URL	Tags
416baaa9-dc9f-4396-8d5f-8c081fb06d67	https://git.kernel.org/stable/c/512e21c150c1c3ee298852660f3a796e267e62ec	Patch
416baaa9-dc9f-4396-8d5f-8c081fb06d67	https://git.kernel.org/stable/c/b027789e5e50494c2325cc70c8642e7fd6059479	Patch
af854a3a-2127-422b-91ae-364da2661108	https://git.kernel.org/stable/c/512e21c150c1c3ee298852660f3a796e267e62ec	Patch
af854a3a-2127-422b-91ae-364da2661108	https://git.kernel.org/stable/c/b027789e5e50494c2325cc70c8642e7fd6059479	Patch

Impacted products

	Vendor	Product	Version
	linux	linux_kernel	*

JSON

{
  "configurations": [
    {
      "nodes": [
        {
          "cpeMatch": [
            {
              "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
              "matchCriteriaId": "172C15F0-CF2B-47F2-8931-3368DC97E4E2",
              "versionEndExcluding": "5.15.5",
              "versionStartIncluding": "5.13",
              "vulnerable": true
            }
          ],
          "negate": false,
          "operator": "OR"
        }
      ]
    }
  ],
  "cveTags": [],
  "descriptions": [
    {
      "lang": "en",
      "value": "In the Linux kernel, the following vulnerability has been resolved:\n\nsched/fair: Prevent dead task groups from regaining cfs_rq\u0027s\n\nKevin is reporting crashes which point to a use-after-free of a cfs_rq\nin update_blocked_averages(). Initial debugging revealed that we\u0027ve\nlive cfs_rq\u0027s (on_list=1) in an about to be kfree()\u0027d task group in\nfree_fair_sched_group(). However, it was unclear how that can happen.\n\nHis kernel config happened to lead to a layout of struct sched_entity\nthat put the \u0027my_q\u0027 member directly into the middle of the object\nwhich makes it incidentally overlap with SLUB\u0027s freelist pointer.\nThat, in combination with SLAB_FREELIST_HARDENED\u0027s freelist pointer\nmangling, leads to a reliable access violation in form of a #GP which\nmade the UAF fail fast.\n\nMichal seems to have run into the same issue[1]. He already correctly\ndiagnosed that commit a7b359fc6a37 (\"sched/fair: Correctly insert\ncfs_rq\u0027s to list on unthrottle\") is causing the preconditions for the\nUAF to happen by re-adding cfs_rq\u0027s also to task groups that have no\nmore running tasks, i.e. also to dead ones. His analysis, however,\nmisses the real root cause and it cannot be seen from the crash\nbacktrace only, as the real offender is tg_unthrottle_up() getting\ncalled via sched_cfs_period_timer() via the timer interrupt at an\ninconvenient time.\n\nWhen unregister_fair_sched_group() unlinks all cfs_rq\u0027s from the dying\ntask group, it doesn\u0027t protect itself from getting interrupted. If the\ntimer interrupt triggers while we iterate over all CPUs or after\nunregister_fair_sched_group() has finished but prior to unlinking the\ntask group, sched_cfs_period_timer() will execute and walk the list of\ntask groups, trying to unthrottle cfs_rq\u0027s, i.e. re-add them to the\ndying task group. 
These will later -- in free_fair_sched_group() -- be\nkfree()\u0027ed while still being linked, leading to the fireworks Kevin\nand Michal are seeing.\n\nTo fix this race, ensure the dying task group gets unlinked first.\nHowever, simply switching the order of unregistering and unlinking the\ntask group isn\u0027t sufficient, as concurrent RCU walkers might still see\nit, as can be seen below:\n\n    CPU1:                                      CPU2:\n      :                                        timer IRQ:\n      :                                          do_sched_cfs_period_timer():\n      :                                            :\n      :                                            distribute_cfs_runtime():\n      :                                              rcu_read_lock();\n      :                                              :\n      :                                              unthrottle_cfs_rq():\n    sched_offline_group():                             :\n      :                                                walk_tg_tree_from(\u2026,tg_unthrottle_up,\u2026):\n      list_del_rcu(\u0026tg-\u003elist);                           :\n (1)  :                                                  list_for_each_entry_rcu(child, \u0026parent-\u003echildren, siblings)\n      :                                                    :\n (2)  list_del_rcu(\u0026tg-\u003esiblings);                         :\n      :                                                    tg_unthrottle_up():\n      unregister_fair_sched_group():                         struct cfs_rq *cfs_rq = tg-\u003ecfs_rq[cpu_of(rq)];\n        :                                                    :\n        list_del_leaf_cfs_rq(tg-\u003ecfs_rq[cpu]);               :\n        :                                                    :\n        :                                                    if (!cfs_rq_is_decayed(cfs_rq) || cfs_rq-\u003enr_running)\n (3)    :                                                       
 list_add_leaf_cfs_rq(cfs_rq);\n      :                                                      :\n      :                                                    :\n      :                                                  :\n      :                                                :\n      :                           \n---truncated---"
    },
    {
      "lang": "es",
      "value": "En el kernel de Linux, se ha resuelto la siguiente vulnerabilidad: sched/fair: Evitar que los grupos de tareas inactivos recuperen cfs_rq Kevin informa fallos que apuntan a un use-after-free de un cfs_rq en update_blocked_averages(). La depuraci\u00f3n inicial revel\u00f3 que tenemos cfs_rq activos (on_list=1) en un grupo de tareas a punto de ser kfree() en free_fair_sched_group(). Sin embargo, no estaba claro c\u00f3mo puede suceder eso. Su configuraci\u00f3n del kernel result\u00f3 en un dise\u00f1o de struct sched_entity que coloca el miembro \u0027my_q\u0027 directamente en el medio del objeto, lo que hace que se superponga incidentalmente con el puntero de lista libre de SLUB. Eso, en combinaci\u00f3n con la manipulaci\u00f3n del puntero de lista libre de SLAB_FREELIST_HARDENED, conduce a una violaci\u00f3n de acceso confiable en forma de un #GP que hizo que el UAF fallara r\u00e1pidamente. Michal parece haberse topado con el mismo problema[1]. \u00c9l ya diagnostic\u00f3 correctamente que el commit a7b359fc6a37 (\"sched/fair: Insertar correctamente cfs_rq en la lista al desregular\") est\u00e1 causando que se cumplan las condiciones previas para que se produzca la UAF al volver a agregar cfs_rq tambi\u00e9n a los grupos de tareas que ya no tienen tareas en ejecuci\u00f3n, es decir, tambi\u00e9n a los que est\u00e1n inactivos. Sin embargo, su an\u00e1lisis no detecta la causa ra\u00edz real y no se puede ver solo desde el backtrace del bloqueo, ya que el verdadero infractor es tg_unthrottle_up() que se llama a trav\u00e9s de sched_cfs_period_timer() mediante la interrupci\u00f3n del temporizador en un momento inconveniente. Cuando unregister_fair_sched_group() desvincula todos los cfs_rq del grupo de tareas que est\u00e1 inactivo, no se protege a s\u00ed mismo de ser interrumpido. 
Si la interrupci\u00f3n del temporizador se activa mientras iteramos sobre todas las CPU o despu\u00e9s de que unregister_fair_sched_group() haya terminado pero antes de desvincular el grupo de tareas, sched_cfs_period_timer() se ejecutar\u00e1 y recorrer\u00e1 la lista de grupos de tareas, intentando liberar cfs_rq, es decir, volver a agregarlos al grupo de tareas moribundo. Estos ser\u00e1n posteriormente -- en free_fair_sched_group() -- kfree()\u0027ed mientras siguen vinculados, lo que lleva a los fuegos artificiales que Kevin y Michal est\u00e1n viendo. Para solucionar esta ejecuci\u00f3n, aseg\u00farese de que el grupo de tareas moribundo se desvincule primero. Sin embargo, simplemente cambiar el orden de anulaci\u00f3n del registro y desvinculaci\u00f3n del grupo de tareas no es suficiente, ya que los caminantes de RCU concurrentes a\u00fan podr\u00edan verlo, como se puede ver a continuaci\u00f3n: CPU1: CPU2: : timer IRQ: : do_sched_cfs_period_timer(): : : : distributed_cfs_runtime(): : rcu_read_lock(); : : : unthrottle_cfs_rq(): sched_offline_group(): : : walk_tg_tree_from(\u2026,tg_unthrottle_up,\u2026): list_del_rcu(\u0026amp;tg-\u0026gt;list); : (1) : list_for_each_entry_rcu(child, \u0026amp;parent-\u0026gt;children, brothers) : : (2) list_del_rcu(\u0026amp;tg-\u0026gt;siblings); : : tg_unthrottle_up(): anular_registro_justo_sched_group(): struct cfs_rq *cfs_rq = tg-\u0026gt;cfs_rq[cpu_of(rq)]; : : list_del_leaf_cfs_rq(tg-\u0026gt;cfs_rq[cpu]); : : : : si (!cfs_rq_est\u00e1_deca\u00eddo(cfs_rq) || cfs_rq-\u0026gt;nr_en_ejecuci\u00f3n) (3) : lista_agregar_hoja_cfs_rq(cfs_rq); : : : : : : : : : ---truncado---"
    }
  ],
  "id": "CVE-2021-47209",
  "lastModified": "2025-03-27T21:16:39.163",
  "metrics": {
    "cvssMetricV31": [
      {
        "cvssData": {
          "attackComplexity": "LOW",
          "attackVector": "LOCAL",
          "availabilityImpact": "HIGH",
          "baseScore": 5.5,
          "baseSeverity": "MEDIUM",
          "confidentialityImpact": "NONE",
          "integrityImpact": "NONE",
          "privilegesRequired": "LOW",
          "scope": "UNCHANGED",
          "userInteraction": "NONE",
          "vectorString": "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
          "version": "3.1"
        },
        "exploitabilityScore": 1.8,
        "impactScore": 3.6,
        "source": "nvd@nist.gov",
        "type": "Primary"
      }
    ]
  },
  "published": "2024-04-10T19:15:48.447",
  "references": [
    {
      "source": "416baaa9-dc9f-4396-8d5f-8c081fb06d67",
      "tags": [
        "Patch"
      ],
      "url": "https://git.kernel.org/stable/c/512e21c150c1c3ee298852660f3a796e267e62ec"
    },
    {
      "source": "416baaa9-dc9f-4396-8d5f-8c081fb06d67",
      "tags": [
        "Patch"
      ],
      "url": "https://git.kernel.org/stable/c/b027789e5e50494c2325cc70c8642e7fd6059479"
    },
    {
      "source": "af854a3a-2127-422b-91ae-364da2661108",
      "tags": [
        "Patch"
      ],
      "url": "https://git.kernel.org/stable/c/512e21c150c1c3ee298852660f3a796e267e62ec"
    },
    {
      "source": "af854a3a-2127-422b-91ae-364da2661108",
      "tags": [
        "Patch"
      ],
      "url": "https://git.kernel.org/stable/c/b027789e5e50494c2325cc70c8642e7fd6059479"
    }
  ],
  "sourceIdentifier": "416baaa9-dc9f-4396-8d5f-8c081fb06d67",
  "vulnStatus": "Analyzed",
  "weaknesses": [
    {
      "description": [
        {
          "lang": "en",
          "value": "CWE-416"
        }
      ],
      "source": "nvd@nist.gov",
      "type": "Primary"
    }
  ]
}
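
The `cpeMatch` entry in the JSON above narrows the `*` shown under "Impacted products" to a half-open version range. As a rough sketch, a kernel version string can be checked against that range as follows; `CPE_MATCH` copies the relevant fields from the record, while `parse_version` and `is_affected` are hypothetical helpers that only understand plain dotted numeric versions (no `-rc` or distribution suffixes).

```python
# Fields copied from the record's cpeMatch entry.
CPE_MATCH = {
    "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
    "versionStartIncluding": "5.13",
    "versionEndExcluding": "5.15.5",
    "vulnerable": True,
}


def parse_version(text):
    """Turn '5.15.5' into (5, 15, 5), padding short versions with zeros."""
    parts = tuple(int(part) for part in text.split("."))
    return parts + (0,) * (3 - len(parts))


def is_affected(version, match=CPE_MATCH):
    """Return True if version falls inside [start, end) of a vulnerable match."""
    if not match.get("vulnerable", False):
        return False
    start = parse_version(match["versionStartIncluding"])
    end = parse_version(match["versionEndExcluding"])
    return start <= parse_version(version) < end


for candidate in ("5.12.19", "5.13", "5.14.21", "5.15.4", "5.15.5"):
    print(candidate, is_affected(candidate))
```

The range describes upstream release numbers only; a distribution kernel with backported fixes may not be affected even if its version number falls inside the range.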

CVE-2021-47209 (GCVE-0-2021-47209)

Vulnerability from cvelistv5

Published

2024-04-10 19:01

Modified

2025-05-04 07:06

Severity

Summary

References

URL

Tags

	https://git.kernel.org/stable/c/512e21c150c1c3ee298852660f3a796e267e62ec
	https://git.kernel.org/stable/c/b027789e5e50494c2325cc70c8642e7fd6059479

Impacted products

Vendor

Product

Version

Linux

Version: a7b359fc6a37faaf472125867c8dc5a068c90982
Version: a7b359fc6a37faaf472125867c8dc5a068c90982

Linux

Version: 5.13
