
Bug#983923: linux-image-4.19.0-13-cloud-amd64: Please add CONFIG_MAXSMP to the linux-image-cloud-amd64 kernel



Package: src:linux
Version: 4.19.160-2
Severity: important

Dear maintainer,

Please enable the MAXSMP kernel config option (CONFIG_MAXSMP) for the
linux-image-cloud-amd64 kernel image. This is the configuration
currently used in the kernel 5.10 package.

On a server with the latest AMD EPYC 7543 (Milan) processors, starting
a QEMU VM with more than 64 processors triggers a kernel panic in the
VM.

This has been tested on an Ubuntu Bionic host with QEMU versions 1.2.11
as well as 4.2.3 (sorry, but all of our QEMU hosts use Ubuntu).

Here is an example of the situation:

### Host configuration

# dmidecode -s processor-version
AMD EPYC 7543 32-Core Processor
# nproc
128

# wget http://cloud.debian.org/images/cloud/buster/daily/20210303-565/debian-10-genericcloud-amd64-daily-20210303-565.qcow2
# qemu-img create -F qcow2 -f qcow2 -b debian-10-genericcloud-amd64-daily-20210303-565.qcow2 buster.qcow2 10G
# cat startit
cp /usr/share/OVMF/OVMF_CODE.fd /tmp
cp /usr/share/OVMF/OVMF_VARS.fd /tmp
/usr/bin/qemu-system-x86_64 \
-enable-kvm \
-display none \
-monitor none \
-nodefaults \
-nographic \
-serial mon:stdio \
-cpu host \
-smp $2 \
-m 8G \
-drive if=pflash,format=raw,readonly,file=/tmp/OVMF_CODE.fd \
-drive if=pflash,format=raw,file=/tmp/OVMF_VARS.fd \
-drive file=./$1,if=none,id=disk0 \
-device virtio-blk-pci,drive=disk0,id=virblk0,bootindex=1 \
-netdev user,id=n1 \
-device virtio-net-pci,netdev=n1

### Proof that the script is working as expected

# ./startit buster.qcow2 64
...
Debian GNU/Linux 10 debian ttyS0

debian login: root
password:

root@debian:~# nproc
64

### Example of the kernel panic with more than 64 CPUs
$ ./startit buster.qcow2 66
[    0.007858] RSP: 0000:ffffaf5d40ebfea0 EFLAGS: 00010202
[    0.007858] RAX: ffffffffa57bd440 RBX: 00000000000001ed RCX: 0000000000000002
[    0.007858] RDX: ffff9bf33643c7b0 RSI: ffffaf5d40ebfec8 RDI: 00000000000001ed
[    0.007858] RBP: ffffaf5d40ebff38 R08: ffff9bf336412000 R09: ffff9bf3360006d8
[    0.007858] R10: ffff9bf336000700 R11: 0000000000000000 R12: ffffaf5d40ebfec8
[    0.007858] R13: ffffffffa423e706 R14: 0000000000000000 R15: 0000000000000000
[    0.007858]  ? start_secondary+0x156/0x200
[    0.007858]  ? start_secondary+0x156/0x200
[    0.007858]  _get_random_bytes+0x7d/0x1b0
[    0.007858]  ? rcu_cpu_starting+0x136/0x150
[    0.007858]  ? cpumask_next+0x16/0x20
[    0.007858]  ? speculative_store_bypass_ht_init+0x6d/0xb0
[    0.007858]  start_secondary+0x156/0x200
[    0.007858]  secondary_startup_64+0xa4/0xb0
[    0.007858] Modules linked in:
[    0.007858] CR2: 000000000000022d
[    0.007858] BUG: unable to handle kernel NULL pointer dereference at 000000000000022d
[    0.007858] PGD 0 P4D 0
[    0.007858] Oops: 0000 [#26] SMP NOPTI
[    0.007858] CPU: 64 PID: 0 Comm: swapper/64 Tainted: G      D 4.19.0-14-cloud-amd64 #1 Debian 4.19.171-2
[    0.007858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
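
When repeating this test with several CPU counts, it may help to check a saved copy of the guest console output automatically instead of reading it by eye. The following is only a minimal sketch: the function name console_has_oops and the idea of saving the serial output to a file are assumptions, not part of the setup above. The matched strings are taken from the trace shown here.

```shell
# Hypothetical helper (not part of the original report): succeed if a
# saved guest console log contains the oops signature shown above.
console_has_oops() {
    # $1: path to a file holding the captured serial console output
    grep -q -e 'BUG: unable to handle kernel NULL pointer dereference' \
            -e 'Oops: ' "$1"
}
```

Since startit sends the serial console to stdout, one possible use is to redirect its output to a file such as console.log (a hypothetical name) and then run `console_has_oops console.log && echo "guest oopsed"`.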

### Example with a locally compiled kernel, built following the documentation at
https://wiki.debian.org/DebianKernel/GitBisect, with CONFIG_MAXSMP enabled:

$ ./startit buster.qcow2 128
...
Debian GNU/Linux 10 debian ttyS0

debian login: root
password:

root@debian:~# nproc
128

root@debian:~# uname -a
Linux debian 4.19.171-maxcpu #88 SMP Wed Mar 3 10:43:25 UTC 2021 x86_64 GNU/Linux

root@debian:~# egrep "NR_CPUS|MAXSMP" /boot/config-4.19.171-maxcpu
CONFIG_MAXSMP=y
CONFIG_NR_CPUS_RANGE_BEGIN=8192
CONFIG_NR_CPUS_RANGE_END=8192
CONFIG_NR_CPUS_DEFAULT=8192
CONFIG_NR_CPUS=8192
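
The same check can be scripted to compare kernels. This is a minimal sketch under assumptions: the helper names kernel_cpu_limit and kernel_has_maxsmp are made up for illustration, and the config file is assumed to use the usual CONFIG_NAME=value lines as in the output above.

```shell
# Hypothetical helper: print the value of CONFIG_NR_CPUS from a kernel
# config file (prints nothing if the option is absent).
kernel_cpu_limit() {
    # $1: path to a kernel config file, e.g. /boot/config-4.19.171-maxcpu
    sed -n 's/^CONFIG_NR_CPUS=\([0-9][0-9]*\)$/\1/p' "$1"
}

# Hypothetical helper: succeed only if the config has CONFIG_MAXSMP=y.
kernel_has_maxsmp() {
    grep -q '^CONFIG_MAXSMP=y$' "$1"
}
```

For example, `kernel_cpu_limit "/boot/config-$(uname -r)"` prints the limit of the running kernel, provided its config file is installed under /boot.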

Please note that the same test on an older AMD EPYC CPU (namely the
AMD EPYC 7401P 24-Core Processor) does not trigger this problem.



-- Package-specific info:
** Version:
Linux version 4.19.0-13-cloud-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.160-2 (2020-11-28)

** Command line:
BOOT_IMAGE=/boot/vmlinuz-4.19.0-13-cloud-amd64 root=UUID=daf85f0d-98b3-4a1c-870f-cd2b3cd58684 ro console=tty0 console=ttyS0,115200 earlyprintk=ttyS0,115200 scsi_mod.use_blk_mq=Y

** Not tainted

** Kernel log:
[    0.717177] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[    0.718238] PCI: CLS 0 bytes, default 64
[    0.718336] Unpacking initramfs...
[    1.129371] Freeing initrd memory: 12788K
[    1.131669] Initialise system trusted keyrings
[    1.132620] Key type blacklist registered
[    1.134165] workingset: timestamp_bits=40 max_order=19 bucket_order=0
[    1.138488] zbud: loaded
[    1.328162] Key type asymmetric registered
[    1.333078] Asymmetric key parser 'x509' registered
[    1.333768] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 250)
[    1.334783] io scheduler noop registered (default)
[    1.335519] io scheduler deadline registered
[    1.336409] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    1.359657] 00:04: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[    1.361880] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
[    1.363695] serio: i8042 KBD port at 0x60,0x64 irq 1
[    1.364301] serio: i8042 AUX port at 0x60,0x64 irq 12
[    1.365594] mousedev: PS/2 mouse device common for all mice
[    1.366956] rtc_cmos 00:00: RTC can wake from S4
[    1.369809] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[    1.371193] rtc_cmos 00:00: registered as rtc0
[    1.371807] rtc_cmos 00:00: alarms up to one day, y3k, 114 bytes nvram, hpet irqs
[    1.373543] NET: Registered protocol family 10
[    1.387199] Segment Routing with IPv6
[    1.387915] mip6: Mobile IPv6
[    1.388446] NET: Registered protocol family 17
[    1.389619] mpls_gso: MPLS GSO support
[    1.390365] sched_clock: Marking stable (1380063413, 9532741)->(1405923239, -16327085)
[    1.392299] registered taskstats version 1
[    1.393317] Loading compiled-in X.509 certificates
[    1.438484] Loaded X.509 cert 'Debian Secure Boot CA: 6ccece7e4c6c0d1f6149f3dd27dfcc5cbb419ea1'
[    1.439866] Loaded X.509 cert 'Debian Secure Boot Signer 2020: 00b55eb3b9'
[    1.441296] AppArmor: AppArmor sha1 policy hashing enabled
[    1.442664] rtc_cmos 00:00: setting system clock to 2021-03-03 13:18:02 UTC (1614777482)
[    1.447643] Freeing unused kernel image memory: 1476K
[    1.461328] Write protecting the kernel read-only data: 16384k
[    1.465783] Freeing unused kernel image memory: 2028K
[    1.468555] Freeing unused kernel image memory: 1340K
[    1.470112] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[    1.471715] Run /init as init process
[    1.552150] cryptd: max_cpu_qlen set to 1000
[    1.556847] SCSI subsystem initialized
[    1.563239] AVX2 version of gcm_enc/dec engaged.
[    1.563923] AES CTR mode by8 optimization enabled
[    1.585243] libata version 3.00 loaded.
[    1.587012] ata_piix 0000:00:01.1: version 2.13
[    1.590667] PCI Interrupt Link [LNKB] enabled at IRQ 11
[    1.601236] scsi host0: ata_piix
[    1.617256] scsi host1: ata_piix
[    1.618007] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc0a0 irq 14
[    1.619083] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc0a8 irq 15
[    1.621574] ata1: port disabled--ignoring
[    1.621737] ata2: port disabled--ignoring
[    1.622379] PCI Interrupt Link [LNKC] enabled at IRQ 10
[    1.656281] PCI Interrupt Link [LNKD] enabled at IRQ 10
[    1.664813] virtio_blk virtio2: [vda] 39062500 512-byte logical blocks (20.0 GB/18.6 GiB)
[    1.670353] scsi host2: Virtio SCSI HBA
[    1.685505] virtio_net virtio0 ens2: renamed from eth0
[    1.687573] GPT:Primary header thinks Alt. header is not at the end of the disk.
[    1.689664] GPT:19531249 != 39062499
[    1.690683] GPT:Alternate GPT header not at the end of the disk.
[    1.692325] GPT:19531249 != 39062499
[    1.693429] GPT: Use GNU Parted to correct GPT errors.
[    1.694393]  vda: vda1 vda14 vda15
[    1.819269] EXT4-fs (vda1): mounted filesystem with ordered data mode. Opts: (null)
[    2.233670]  vda: vda1 vda14 vda15
[    2.303780] EXT4-fs (vda1): mounted filesystem with ordered data mode. Opts: (null)
[    2.462096] systemd[1]: Inserted module 'autofs4'
[    2.504107] systemd[1]: systemd 241 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
[    2.508210] systemd[1]: Detected virtualization kvm.
[    2.509468] systemd[1]: Detected architecture x86-64.
[    2.522989] systemd[1]: Set hostname to <debian>.
[    2.525304] systemd[1]: Initializing machine ID from KVM UUID.
[    2.526224] systemd[1]: Installed transient /etc/machine-id file.
[    2.747923] systemd[1]: Reached target Swap.
[    2.750773] systemd[1]: Reached target Remote File Systems.
[    2.755331] systemd[1]: Created slice User and Session Slice.
[    2.758412] systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
[    2.832187] EXT4-fs (vda1): re-mounted. Opts: discard,errors=remount-ro
[    2.927054] EXT4-fs (vda1): resizing filesystem from 491515 to 4850040 blocks
[    2.928598] EXT4-fs (vda1): resizing filesystem from 491515 to 4849664 blocks
[    3.012134] systemd-journald[259]: Received request to flush runtime journal from PID 1
[    3.451080] EXT4-fs (vda1): resized filesystem to 4849664
[    4.079689] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input2
[    4.081046] ACPI: Power Button [PWRF]
[    4.104903] EFI Variables Facility v0.08 2004-May-17
[    4.113739] pstore: Using compression: deflate
[    4.114669] pstore: Registered efi as persistent store backend
[    4.331588] kvm: Nested Virtualization enabled
[    4.332281] kvm: Nested Paging enabled
[    4.627829] audit: type=1400 audit(1614777485.680:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/chronyd" pid=351 comm="apparmor_parser"
[    4.635721] audit: type=1400 audit(1614777485.688:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=350 comm="apparmor_parser"
[    4.638012] audit: type=1400 audit(1614777485.688:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=350 comm="apparmor_parser"
[    4.796470] audit: type=1400 audit(1614777485.848:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=354 comm="apparmor_parser"
[    4.799586] audit: type=1400 audit(1614777485.852:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=354 comm="apparmor_parser"
[    4.803262] audit: type=1400 audit(1614777485.856:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=354 comm="apparmor_parser"
[    4.809261] audit: type=1400 audit(1614777485.860:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/tcpdump" pid=353 comm="apparmor_parser"
[    8.538753] EXT4-fs (vda1): resizing filesystem from 4849664 to 4850040 blocks
[    8.540528] EXT4-fs (vda1): resizing filesystem from 4849664 to 4849664 blocks

** Model information
sys_vendor: Scaleway
product_name: SCW-DEV1-S
product_version: pc-i440fx-bionic
chassis_vendor: QEMU
chassis_version: pc-i440fx-bionic
bios_vendor: EFI Development Kit II / OVMF
bios_version: 0.0.0

** Loaded modules:
nls_ascii
nls_cp437
kvm_amd
vfat
fat
kvm
irqbypass
crct10dif_pclmul
crc32_pclmul
ghash_clmulni_intel
evdev
serio_raw
efi_pstore
efivars
qemu_fw_cfg
button
efivarfs
ip_tables
x_tables
autofs4
ext4
crc16
mbcache
jbd2
crc32c_generic
fscrypto
ecb
virtio_net
net_failover
failover
virtio_blk
virtio_scsi
crc32c_intel
ata_generic
ata_piix
libata
aesni_intel
aes_x86_64
crypto_simd
cryptd
scsi_mod
glue_helper
virtio_pci
virtio_ring
virtio

** PCI devices:
00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC [Natoma] [8086:1237] (rev 02)
	Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0

00:01.0 ISA bridge [0601]: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] [8086:7000]
	Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0

00:01.1 IDE interface [0101]: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] [8086:7010] (prog-if 80 [ISA Compatibility mode-only controller, supports bus mastering])
	Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Region 0: [virtual] Memory at 000001f0 (32-bit, non-prefetchable) [size=8]
	Region 1: [virtual] Memory at 000003f0 (type 3, non-prefetchable)
	Region 2: [virtual] Memory at 00000170 (32-bit, non-prefetchable) [size=8]
	Region 3: [virtual] Memory at 00000370 (type 3, non-prefetchable)
	Region 4: I/O ports at c0a0 [size=16]
	Kernel driver in use: ata_piix
	Kernel modules: ata_piix, ata_generic

00:01.3 Bridge [0680]: Intel Corporation 82371AB/EB/MB PIIX4 ACPI [8086:7113] (rev 03)
	Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 9

00:02.0 Ethernet controller [0200]: Red Hat, Inc Virtio network device [1af4:1000]
	Subsystem: Red Hat, Inc Virtio network device [1af4:0001]
	Physical Slot: 2
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 11
	Region 0: I/O ports at c080 [size=32]
	Region 1: Memory at 80002000 (32-bit, non-prefetchable) [size=4K]
	Region 4: Memory at 800000000 (64-bit, prefetchable) [size=16K]
	Expansion ROM at 80080000 [disabled] [size=512K]
	Capabilities: <access denied>
	Kernel driver in use: virtio-pci
	Kernel modules: virtio_pci

00:03.0 SCSI storage controller [0100]: Red Hat, Inc Virtio SCSI [1af4:1004]
	Subsystem: Red Hat, Inc Virtio SCSI [1af4:0008]
	Physical Slot: 3
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 10
	Region 0: I/O ports at c040 [size=64]
	Region 1: Memory at 80001000 (32-bit, non-prefetchable) [size=4K]
	Region 4: Memory at 800004000 (64-bit, prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: virtio-pci
	Kernel modules: virtio_pci

00:04.0 SCSI storage controller [0100]: Red Hat, Inc Virtio block device [1af4:1001]
	Subsystem: Red Hat, Inc Virtio block device [1af4:0002]
	Physical Slot: 4
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 10
	Region 0: I/O ports at c000 [size=64]
	Region 1: Memory at 80000000 (32-bit, non-prefetchable) [size=4K]
	Region 4: Memory at 800008000 (64-bit, prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: virtio-pci
	Kernel modules: virtio_pci


** USB devices:
not available


-- System Information:
Debian Release: 10.7
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.0-13-cloud-amd64 (SMP w/2 CPU cores)
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE=C.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages linux-image-4.19.0-13-cloud-amd64 depends on:
ii  initramfs-tools [linux-initramfs-tool]  0.133+deb10u1
ii  kmod                                    26-1
ii  linux-base                              4.6

Versions of packages linux-image-4.19.0-13-cloud-amd64 recommends:
ii  apparmor             2.13.2-10
pn  firmware-linux-free  <none>

Versions of packages linux-image-4.19.0-13-cloud-amd64 suggests:
pn  debian-kernel-handbook               <none>
pn  grub-pc | grub-efi-amd64 | extlinux  <none>
pn  linux-doc-4.19                       <none>

Versions of packages linux-image-4.19.0-13-cloud-amd64 is related to:
pn  firmware-amd-graphics     <none>
pn  firmware-atheros          <none>
pn  firmware-bnx2             <none>
pn  firmware-bnx2x            <none>
pn  firmware-brcm80211        <none>
pn  firmware-cavium           <none>
pn  firmware-intel-sound      <none>
pn  firmware-intelwimax       <none>
pn  firmware-ipw2x00          <none>
pn  firmware-ivtv             <none>
pn  firmware-iwlwifi          <none>
pn  firmware-libertas         <none>
pn  firmware-linux-nonfree    <none>
pn  firmware-misc-nonfree     <none>
pn  firmware-myricom          <none>
pn  firmware-netxen           <none>
pn  firmware-qlogic           <none>
pn  firmware-realtek          <none>
pn  firmware-samsung          <none>
pn  firmware-siano            <none>
pn  firmware-ti-connectivity  <none>
pn  xen-hypervisor            <none>

-- no debconf information

