Skip to content
Commit bf5678b5 authored by Laszlo Ersek's avatar Laszlo Ersek Committed by mergify[bot]
Browse files

OvmfPkg/PlatformInitLib: catch QEMU's CPU hotplug reg block regression

In QEMU v5.1.0, the CPU hotplug register block misbehaves: the negotiation
protocol is (effectively) broken such that it suggests that switching from
the legacy interface to the modern interface works, but in reality the
switch never happens. The symptom has been witnessed when using TCG
acceleration; KVM seems to mask the issue. The issue persists with the
following (latest) stable QEMU releases: v5.2.0, v6.2.0, v7.2.0. Currently
there is no stable release that addresses the problem.

The QEMU bug confuses the Present and Possible counting in function
PlatformMaxCpuCountInitialization(), in
"OvmfPkg/Library/PlatformInitLib/Platform.c". OVMF ends up with Present=0
Possible=1. This in turn further confuses MpInitLib in UefiCpuPkg (hence
firmware-time multiprocessing will be broken). Worse, CPU hot(un)plug with
SMI will be summarily broken in OvmfPkg/CpuHotplugSmm, which (considering
the privilege level of SMM) is not that great.

Detect the issue in PlatformCpuCountBugCheck(), and print an error message
and *hang* if the issue is present.

Users willing to take risks can override the hang with the experimental
QEMU command line option

  -fw_cfg name=opt/org.tianocore/X-Cpuhp-Bugcheck-Override,string=yes

(The "-fw_cfg" QEMU option itself is not experimental; its above argument,
as far it concerns the firmware, is experimental.)

The problem was originally reported by Ard [0]. We analyzed it at [1] and
[2]. A QEMU patch was sent at [3]; now merged as commit dab30fbef389
("acpi: cpuhp: fix guest-visible maximum access size to the legacy reg
block", 2023-01-08), to be included in QEMU v8.0.0.

[0] https://bugzilla.tianocore.org/show_bug.cgi?id=4234#c2

[1] https://bugzilla.tianocore.org/show_bug.cgi?id=4234#c3

[2] IO port write width clamping differs between TCG and KVM
    http://mid.mail-archive.com/aaedee84-d3ed-a4f9-21e7-d221a28d1683@redhat.com
    https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00199.html

[3] acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block
    http://mid.mail-archive.com/20230104090138.214862-1-lersek@redhat.com
    https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00278.html

NOTE: PlatformInitLib is used in the following platform DSCs:

  OvmfPkg/AmdSev/AmdSevX64.dsc
  OvmfPkg/CloudHv/CloudHvX64.dsc
  OvmfPkg/IntelTdx/IntelTdxX64.dsc
  OvmfPkg/Microvm/MicrovmX64.dsc
  OvmfPkg/OvmfPkgIa32.dsc
  OvmfPkg/OvmfPkgIa32X64.dsc
  OvmfPkg/OvmfPkgX64.dsc

but I can only test this change with the last three platforms, running on
QEMU.

Test results:

  TCG  QEMU     OVMF     override  result
       patched  patched
  ---  -------  -------  --------  --------------------------------------
  0    0        0        0         CPU counts OK (KVM masks the QEMU bug)
  0    0        1        0         CPU counts OK (KVM masks the QEMU bug)
  0    1        0        0         CPU counts OK (QEMU fix, but KVM masks
                                   the QEMU bug anyway)
  0    1        1        0         CPU counts OK (QEMU fix, but KVM masks
                                   the QEMU bug anyway)
  1    0        0        0         boot with broken CPU counts (original
                                   QEMU bug)
  1    0        1        0         broken CPU count caught (boot hangs)
  1    0        1        1         broken CPU count caught, bug check
                                   overridden, boot continues
  1    1        0        0         CPU counts OK (QEMU fix)
  1    1        1        0         CPU counts OK (QEMU fix)

Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Michael Brown <mcb30@ipxe.org>
Cc: Min Xu <min.m.xu@intel.com>
Cc: Oliver Steffen <osteffen@redhat.com>
Cc: Sebastien Boeuf <sebastien.boeuf@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=4250


Signed-off-by: default avatarLaszlo Ersek <lersek@redhat.com>
Message-Id: <20230119110131.91923-3-lersek@redhat.com>
Reviewed-by: default avatarArd Biesheuvel <ardb@kernel.org>
Hugely-appreciated-by: default avatarMichael Brown <mcb30@ipxe.org>
Acked-by: default avatarGerd Hoffmann <kraxel@redhat.com>
parent c3e128a4
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment