Bug#983923: linux-image-4.19.0-13-cloud-amd64: Please add CONFIG_MAXSMP to the linux-image-cloud-amd64 kernel
Package: src:linux
Version: 4.19.160-2
Severity: important
Dear maintainer,
Please enable the MAXSMP kernel config parameter for the
linux-image-cloud-amd64 kernel image. This is the configuration
currently used in the kernel 5.10 package.
On a server with the latest AMD EPYC 7543 processors (Milan), starting
a QEMU VM with more than 64 processors triggers a kernel panic in the
VM.
This has been tested on an Ubuntu Bionic host with QEMU versions 1.2.11
as well as 4.2.3 (sorry, but all of our QEMU hosts use Ubuntu).
Here is an example of the situation:
### Host configuration
# dmidecode -s processor-version
AMD EPYC 7543 32-Core Processor
# nproc
128
# wget http://cloud.debian.org/images/cloud/buster/daily/20210303-565/debian-10-genericcloud-amd64-daily-20210303-565.qcow2
# qemu-img create -F qcow2 -f qcow2 -b debian-10-genericcloud-amd64-daily-20210303-565.qcow2 buster.qcow2 10G
# cat startit
cp /usr/share/OVMF/OVMF_CODE.fd /tmp
cp /usr/share/OVMF/OVMF_VARS.fd /tmp
/usr/bin/qemu-system-x86_64 \
-enable-kvm \
-display none \
-monitor none \
-nodefaults \
-nographic \
-serial mon:stdio \
-cpu host \
-smp $2 \
-m 8G \
-drive if=pflash,format=raw,readonly,file=/tmp/OVMF_CODE.fd \
-drive if=pflash,format=raw,file=/tmp/OVMF_VARS.fd \
-drive file=./$1,if=none,id=disk0 \
-device virtio-blk-pci,drive=disk0,id=virblk0,bootindex=1 \
-netdev user,id=n1 \
-device virtio-net-pci,netdev=n1
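The script above passes $1 and $2 straight to QEMU without checking them. As a minimal sketch (the helper name check_startit_args is hypothetical and not part of the original script), the two arguments could be validated before the VM is started:

```shell
#!/bin/sh
# check_startit_args IMAGE NCPUS
# Hypothetical helper: validates the two arguments that the startit
# script expects (a disk image and a CPU count) before QEMU is launched.
check_startit_args() {
    if [ "$#" -ne 2 ]; then
        echo "usage: startit IMAGE NCPUS" >&2
        return 2
    fi
    # The disk image must exist and be readable.
    if [ ! -r "$1" ]; then
        echo "image not readable: $1" >&2
        return 1
    fi
    # The CPU count must consist of digits only.
    case "$2" in
        ''|*[!0-9]*)
            echo "NCPUS must be a positive integer: $2" >&2
            return 1
            ;;
    esac
    # ... and must be at least 1.
    if [ "$2" -lt 1 ]; then
        echo "NCPUS must be at least 1" >&2
        return 1
    fi
    return 0
}
```

Calling `check_startit_args "$@" || exit 1` at the top of startit would stop early on a typo instead of producing a confusing QEMU error.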
### Proof that the script is working as expected
# ./startit buster.qcow2 64
...
Debian GNU/Linux 10 debian ttyS0
debian login: root
password:
root@debian:~# nproc
64
### Example of the kernel panic situation with more than 64 CPUs
$ ./startit buster.qcow2 66
[ 0.007858] RSP: 0000:ffffaf5d40ebfea0 EFLAGS: 00010202
[ 0.007858] RAX: ffffffffa57bd440 RBX: 00000000000001ed RCX: 0000000000000002
[ 0.007858] RDX: ffff9bf33643c7b0 RSI: ffffaf5d40ebfec8 RDI: 00000000000001ed
[ 0.007858] RBP: ffffaf5d40ebff38 R08: ffff9bf336412000 R09: ffff9bf3360006d8
[ 0.007858] R10: ffff9bf336000700 R11: 0000000000000000 R12: ffffaf5d40ebfec8
[ 0.007858] R13: ffffffffa423e706 R14: 0000000000000000 R15: 0000000000000000
[ 0.007858] ? start_secondary+0x156/0x200
[ 0.007858] ? start_secondary+0x156/0x200
[ 0.007858] _get_random_bytes+0x7d/0x1b0
[ 0.007858] ? rcu_cpu_starting+0x136/0x150
[ 0.007858] ? cpumask_next+0x16/0x20
[ 0.007858] ? speculative_store_bypass_ht_init+0x6d/0xb0
[ 0.007858] start_secondary+0x156/0x200
[ 0.007858] secondary_startup_64+0xa4/0xb0
[ 0.007858] Modules linked in:
[ 0.007858] CR2: 000000000000022d
[ 0.007858] BUG: unable to handle kernel NULL pointer dereference at 000000000000022d
[ 0.007858] PGD 0 P4D 0
[ 0.007858] Oops: 0000 [#26] SMP NOPTI
[ 0.007858] CPU: 64 PID: 0 Comm: swapper/64 Tainted: G D 4.19.0-14-cloud-amd64 #1 Debian 4.19.171-2
[ 0.007858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
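When repeating this test with different CPU counts, it can help to scan a saved console log for the Oops rather than reading it by eye. The following is only a sketch: the function names oops_cpus and has_null_deref are made up for illustration, and the patterns are taken from the trace above.

```shell
#!/bin/sh
# oops_cpus: read a guest console log on stdin and print, one per line,
# the CPU number from every "CPU: N PID: ..." line of an Oops report.
oops_cpus() {
    sed -n 's/^.*] CPU: \([0-9][0-9]*\) PID: .*$/\1/p'
}

# has_null_deref: succeed if the log on stdin contains the NULL pointer
# dereference message seen in the trace above.
has_null_deref() {
    grep -q 'BUG: unable to handle kernel NULL pointer dereference'
}
```

For example, after `./startit buster.qcow2 66 2>&1 | tee console.log`, running `oops_cpus < console.log | sort -n | uniq` lists the CPUs that reported an Oops.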
### Example with a locally compiled kernel, built following the documentation at
https://wiki.debian.org/DebianKernel/GitBisect, with MAXSMP enabled:
$ ./startit buster.qcow2 128
...
Debian GNU/Linux 10 debian ttyS0
debian login: root
password:
root@debian:~# nproc
128
root@debian:~# uname -a
Linux debian 4.19.171-maxcpu #88 SMP Wed Mar 3 10:43:25 UTC 2021 x86_64 GNU/Linux
root@debian:~# egrep "NR_CPUS|MAXSMP" /boot/config-4.19.171-maxcpu
CONFIG_MAXSMP=y
CONFIG_NR_CPUS_RANGE_BEGIN=8192
CONFIG_NR_CPUS_RANGE_END=8192
CONFIG_NR_CPUS_DEFAULT=8192
CONFIG_NR_CPUS=8192
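The same check as the egrep above can be scripted to compare CONFIG_NR_CPUS against the CPU count planned for a VM. This is a sketch only: the function names are hypothetical, and it merely reads the config file; it does not establish that NR_CPUS is the root cause of the panic.

```shell
#!/bin/sh
# nr_cpus_from_config FILE: print the CONFIG_NR_CPUS value found in a
# kernel config file (prints nothing if the option is absent).
nr_cpus_from_config() {
    sed -n 's/^CONFIG_NR_CPUS=\([0-9][0-9]*\)$/\1/p' "$1"
}

# config_supports_cpus FILE COUNT: succeed if CONFIG_NR_CPUS in FILE is
# at least COUNT, fail otherwise (including when the option is missing).
config_supports_cpus() {
    limit=$(nr_cpus_from_config "$1")
    [ -n "$limit" ] && [ "$limit" -ge "$2" ]
}

# Example call inside the guest (not executed here):
#   config_supports_cpus "/boot/config-$(uname -r)" 128 && echo "128 CPUs fit"
```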
Please note that the same test on an older AMD EPYC CPU (namely the
AMD EPYC 7401P 24-Core Processor) does not trigger this problem.
-- Package-specific info:
** Version:
Linux version 4.19.0-13-cloud-amd64 (debian-kernel@lists.debian.org)
(gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.160-2 (2020-11-28)
** Command line:
BOOT_IMAGE=/boot/vmlinuz-4.19.0-13-cloud-amd64
root=UUID=daf85f0d-98b3-4a1c-870f-cd2b3cd58684 ro console=tty0
console=ttyS0,115200 earlyprintk=ttyS0,115200 scsi_mod.use_blk_mq=Y
** Not tainted
** Kernel log:
** Model information
[ 0.717177] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[ 0.718238] PCI: CLS 0 bytes, default 64
[ 0.718336] Unpacking initramfs...
[ 1.129371] Freeing initrd memory: 12788K
[ 1.131669] Initialise system trusted keyrings
[ 1.132620] Key type blacklist registered
[ 1.134165] workingset: timestamp_bits=40 max_order=19 bucket_order=0
[ 1.138488] zbud: loaded
[ 1.328162] Key type asymmetric registered
[ 1.333078] Asymmetric key parser 'x509' registered
[ 1.333768] Block layer SCSI generic (bsg) driver version 0.4 loaded
(major 250)
[ 1.334783] io scheduler noop registered (default)
[ 1.335519] io scheduler deadline registered
[ 1.336409] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[ 1.359657] 00:04: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200)
is a 16550A
[ 1.361880] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at
0x60,0x64 irq 1,12
[ 1.363695] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 1.364301] serio: i8042 AUX port at 0x60,0x64 irq 12
[ 1.365594] mousedev: PS/2 mouse device common for all mice
[ 1.366956] rtc_cmos 00:00: RTC can wake from S4
[ 1.369809] input: AT Translated Set 2 keyboard as
/devices/platform/i8042/serio0/input/input0
[ 1.371193] rtc_cmos 00:00: registered as rtc0
[ 1.371807] rtc_cmos 00:00: alarms up to one day, y3k, 114 bytes
nvram, hpet irqs
[ 1.373543] NET: Registered protocol family 10
[ 1.387199] Segment Routing with IPv6
[ 1.387915] mip6: Mobile IPv6
[ 1.388446] NET: Registered protocol family 17
[ 1.389619] mpls_gso: MPLS GSO support
[ 1.390365] sched_clock: Marking stable (1380063413,
9532741)->(1405923239, -16327085)
[ 1.392299] registered taskstats version 1
[ 1.393317] Loading compiled-in X.509 certificates
[ 1.438484] Loaded X.509 cert 'Debian Secure Boot CA:
6ccece7e4c6c0d1f6149f3dd27dfcc5cbb419ea1'
[ 1.439866] Loaded X.509 cert 'Debian Secure Boot Signer 2020:
00b55eb3b9'
[ 1.441296] AppArmor: AppArmor sha1 policy hashing enabled
[ 1.442664] rtc_cmos 00:00: setting system clock to 2021-03-03
13:18:02 UTC (1614777482)
[ 1.447643] Freeing unused kernel image memory: 1476K
[ 1.461328] Write protecting the kernel read-only data: 16384k
[ 1.465783] Freeing unused kernel image memory: 2028K
[ 1.468555] Freeing unused kernel image memory: 1340K
[ 1.470112] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 1.471715] Run /init as init process
[ 1.552150] cryptd: max_cpu_qlen set to 1000
[ 1.556847] SCSI subsystem initialized
[ 1.563239] AVX2 version of gcm_enc/dec engaged.
[ 1.563923] AES CTR mode by8 optimization enabled
[ 1.585243] libata version 3.00 loaded.
[ 1.587012] ata_piix 0000:00:01.1: version 2.13
[ 1.590667] PCI Interrupt Link [LNKB] enabled at IRQ 11
[ 1.601236] scsi host0: ata_piix
[ 1.617256] scsi host1: ata_piix
[ 1.618007] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc0a0 irq 14
[ 1.619083] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc0a8 irq 15
[ 1.621574] ata1: port disabled--ignoring
[ 1.621737] ata2: port disabled--ignoring
[ 1.622379] PCI Interrupt Link [LNKC] enabled at IRQ 10
[ 1.656281] PCI Interrupt Link [LNKD] enabled at IRQ 10
[ 1.664813] virtio_blk virtio2: [vda] 39062500 512-byte logical
blocks (20.0 GB/18.6 GiB)
[ 1.670353] scsi host2: Virtio SCSI HBA
[ 1.685505] virtio_net virtio0 ens2: renamed from eth0
[ 1.687573] GPT:Primary header thinks Alt. header is not at the end
of the disk.
[ 1.689664] GPT:19531249 != 39062499
[ 1.690683] GPT:Alternate GPT header not at the end of the disk.
[ 1.692325] GPT:19531249 != 39062499
[ 1.693429] GPT: Use GNU Parted to correct GPT errors.
[ 1.694393] vda: vda1 vda14 vda15
[ 1.819269] EXT4-fs (vda1): mounted filesystem with ordered data
mode. Opts: (null)
[ 2.233670] vda: vda1 vda14 vda15
[ 2.303780] EXT4-fs (vda1): mounted filesystem with ordered data
mode. Opts: (null)
[ 2.462096] systemd[1]: Inserted module 'autofs4'
[ 2.504107] systemd[1]: systemd 241 running in system mode. (+PAM
+AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP
+GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN
-PCRE2 default-hierarchy=hybrid)
[ 2.508210] systemd[1]: Detected virtualization kvm.
[ 2.509468] systemd[1]: Detected architecture x86-64.
[ 2.522989] systemd[1]: Set hostname to <debian>.
[ 2.525304] systemd[1]: Initializing machine ID from KVM UUID.
[ 2.526224] systemd[1]: Installed transient /etc/machine-id file.
[ 2.747923] systemd[1]: Reached target Swap.
[ 2.750773] systemd[1]: Reached target Remote File Systems.
[ 2.755331] systemd[1]: Created slice User and Session Slice.
[ 2.758412] systemd[1]: Started Dispatch Password Requests to Console
Directory Watch.
[ 2.832187] EXT4-fs (vda1): re-mounted. Opts: discard,errors=remount-ro
[ 2.927054] EXT4-fs (vda1): resizing filesystem from 491515 to
4850040 blocks
[ 2.928598] EXT4-fs (vda1): resizing filesystem from 491515 to
4849664 blocks
[ 3.012134] systemd-journald[259]: Received request to flush runtime
journal from PID 1
[ 3.451080] EXT4-fs (vda1): resized filesystem to 4849664
[ 4.079689] input: Power Button as
/devices/LNXSYSTM:00/LNXPWRBN:00/input/input2
[ 4.081046] ACPI: Power Button [PWRF]
[ 4.104903] EFI Variables Facility v0.08 2004-May-17
[ 4.113739] pstore: Using compression: deflate
[ 4.114669] pstore: Registered efi as persistent store backend
[ 4.331588] kvm: Nested Virtualization enabled
[ 4.332281] kvm: Nested Paging enabled
[ 4.627829] audit: type=1400 audit(1614777485.680:2):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="/usr/sbin/chronyd" pid=351 comm="apparmor_parser"
[ 4.635721] audit: type=1400 audit(1614777485.688:3):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="nvidia_modprobe" pid=350 comm="apparmor_parser"
[ 4.638012] audit: type=1400 audit(1614777485.688:4):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="nvidia_modprobe//kmod" pid=350 comm="apparmor_parser"
[ 4.796470] audit: type=1400 audit(1614777485.848:5):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="/usr/bin/man" pid=354 comm="apparmor_parser"
[ 4.799586] audit: type=1400 audit(1614777485.852:6):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="man_filter" pid=354 comm="apparmor_parser"
[ 4.803262] audit: type=1400 audit(1614777485.856:7):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="man_groff" pid=354 comm="apparmor_parser"
[ 4.809261] audit: type=1400 audit(1614777485.860:8):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="/usr/sbin/tcpdump" pid=353 comm="apparmor_parser"
[ 8.538753] EXT4-fs (vda1): resizing filesystem from 4849664 to
4850040 blocks
[ 8.540528] EXT4-fs (vda1): resizing filesystem from 4849664 to
4849664 blocks
sys_vendor: Scaleway
product_name: SCW-DEV1-S
product_version: pc-i440fx-bionic
chassis_vendor: QEMU
chassis_version: pc-i440fx-bionic
bios_vendor: EFI Development Kit II / OVMF
bios_version: 0.0.0
** Loaded modules:
nls_ascii
nls_cp437
kvm_amd
vfat
fat
kvm
irqbypass
crct10dif_pclmul
crc32_pclmul
ghash_clmulni_intel
evdev
serio_raw
efi_pstore
efivars
qemu_fw_cfg
button
efivarfs
ip_tables
x_tables
autofs4
ext4
crc16
mbcache
jbd2
crc32c_generic
fscrypto
ecb
virtio_net
net_failover
failover
virtio_blk
virtio_scsi
crc32c_intel
ata_generic
ata_piix
libata
aesni_intel
aes_x86_64
crypto_simd
cryptd
scsi_mod
glue_helper
virtio_pci
virtio_ring
virtio
** PCI devices:
00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC
[Natoma] [8086:1237] (rev 02)
Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0
00:01.0 ISA bridge [0601]: Intel Corporation 82371SB PIIX3 ISA
[Natoma/Triton II] [8086:7000]
Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
00:01.1 IDE interface [0101]: Intel Corporation 82371SB PIIX3 IDE
[Natoma/Triton II] [8086:7010] (prog-if 80 [ISA Compatibility mode-only
controller, supports bus mastering])
Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Region 0: [virtual] Memory at 000001f0 (32-bit, non-prefetchable) [size=8]
Region 1: [virtual] Memory at 000003f0 (type 3, non-prefetchable)
Region 2: [virtual] Memory at 00000170 (32-bit, non-prefetchable) [size=8]
Region 3: [virtual] Memory at 00000370 (type 3, non-prefetchable)
Region 4: I/O ports at c0a0 [size=16]
Kernel driver in use: ata_piix
Kernel modules: ata_piix, ata_generic
00:01.3 Bridge [0680]: Intel Corporation 82371AB/EB/MB PIIX4 ACPI
[8086:7113] (rev 03)
Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 9
00:02.0 Ethernet controller [0200]: Red Hat, Inc Virtio network device
[1af4:1000]
Subsystem: Red Hat, Inc Virtio network device [1af4:0001]
Physical Slot: 2
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 11
Region 0: I/O ports at c080 [size=32]
Region 1: Memory at 80002000 (32-bit, non-prefetchable) [size=4K]
Region 4: Memory at 800000000 (64-bit, prefetchable) [size=16K]
Expansion ROM at 80080000 [disabled] [size=512K]
Capabilities: <access denied>
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci
00:03.0 SCSI storage controller [0100]: Red Hat, Inc Virtio SCSI [1af4:1004]
Subsystem: Red Hat, Inc Virtio SCSI [1af4:0008]
Physical Slot: 3
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 10
Region 0: I/O ports at c040 [size=64]
Region 1: Memory at 80001000 (32-bit, non-prefetchable) [size=4K]
Region 4: Memory at 800004000 (64-bit, prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci
00:04.0 SCSI storage controller [0100]: Red Hat, Inc Virtio block device
[1af4:1001]
Subsystem: Red Hat, Inc Virtio block device [1af4:0002]
Physical Slot: 4
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 10
Region 0: I/O ports at c000 [size=64]
Region 1: Memory at 80000000 (32-bit, non-prefetchable) [size=4K]
Region 4: Memory at 800008000 (64-bit, prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci
** USB devices:
not available
-- System Information:
Debian Release: 10.7
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 4.19.0-13-cloud-amd64 (SMP w/2 CPU cores)
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE=C.UTF-8
(charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
Versions of packages linux-image-4.19.0-13-cloud-amd64 depends on:
ii initramfs-tools [linux-initramfs-tool] 0.133+deb10u1
ii kmod 26-1
ii linux-base 4.6
Versions of packages linux-image-4.19.0-13-cloud-amd64 recommends:
ii apparmor 2.13.2-10
pn firmware-linux-free <none>
Versions of packages linux-image-4.19.0-13-cloud-amd64 suggests:
pn debian-kernel-handbook <none>
pn grub-pc | grub-efi-amd64 | extlinux <none>
pn linux-doc-4.19 <none>
Versions of packages linux-image-4.19.0-13-cloud-amd64 is related to:
pn firmware-amd-graphics <none>
pn firmware-atheros <none>
pn firmware-bnx2 <none>
pn firmware-bnx2x <none>
pn firmware-brcm80211 <none>
pn firmware-cavium <none>
pn firmware-intel-sound <none>
pn firmware-intelwimax <none>
pn firmware-ipw2x00 <none>
pn firmware-ivtv <none>
pn firmware-iwlwifi <none>
pn firmware-libertas <none>
pn firmware-linux-nonfree <none>
pn firmware-misc-nonfree <none>
pn firmware-myricom <none>
pn firmware-netxen <none>
pn firmware-qlogic <none>
pn firmware-realtek <none>
pn firmware-samsung <none>
pn firmware-siano <none>
pn firmware-ti-connectivity <none>
pn xen-hypervisor <none>
-- no debconf information