
Bug#851641: linux-image-3.16.0-4-amd64 panic:double fault



Hi Ben,
Thanks very much for your reply.
> I looked for information on this hardware, and the first thing I found
> was that you previously reported several crashes to Debian on this same
> hardware:
> https://bugs.debian.org/834487
> https://bugs.debian.org/838658
> https://bugs.debian.org/847839

We don't see any hardware error messages or ECC errors. (We use mcelog,
and the servers have a BMC.)

We have seen this panic on many machines. If it were a hardware
problem, I would expect the panics to occur in different stacks.
I also found the same panic on different hardware:

[308495.512050] PANIC: double fault, error_code: 0x0
[308495.512077] CPU: 4 PID: 161103 Comm: parameter_serve Not tainted
3.16.0-4-amd64 #1 Debian 3.16.36-1+deb8u2
[308495.512079] Hardware name: Inspur SA5248M4/X10DRT-PS, BIOS 2.01 11/21/2016
[308495.512080] task: ffff883b5dfb0a20 ti: ffff883f3a9b0000 task.ti:
ffff883f3a9b0000
[308495.512082] RIP: 0010:[<ffffffff81518598>]  [<ffffffff81518598>]
sysret_check+0x1/0x4e
[308495.512088] RSP: 0018:ffffffffffffffd8  EFLAGS: 00010217
[308495.512090] RAX: 0000000000000000 RBX: 00000000816900f0 RCX:
0000000000000000
[308495.512091] RDX: ffff883f3a9b3fd8 RSI: ffff883f3a9b3d88 RDI:
ffff883f3c885040
[308495.512092] RBP: 0000000000000000 R08: ffff883f3a9b0000 R09:
000000000000b629
[308495.512092] R10: 000000010498cc7d R11: 0000000000000000 R12:
00000000c9ffe0b8
[308495.512093] R13: 00000000c9ffe000 R14: 0000000000000001 R15:
00000000ffffffff
[308495.512095] FS:  00007fb567de3700(0000) GS:ffff88407f900000(0000)
knlGS:0000000000000000
[308495.512096] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[308495.512097] CR2: ffffffffffffffc8 CR3: 0000003f44e30000 CR4:
00000000003407e0
[308495.512098] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[308495.512099] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[308495.512100] Stack:
[308495.512119] BUG: unable to handle kernel paging request at ffffffffffffffd8
[308495.512149] IP: [<ffffffff81016518>] show_stack_log_lvl+0x108/0x170
[308495.512181] PGD 1816067 PUD 1818067 PMD 0
[308495.512208] Oops: 0000 [#1] SMP
[308495.512224] Modules linked in: 8021q garp stp mrp llc tcp_westwood
x86_pkg_temp_thermal coretemp kvm_intel kvm iTCO_wdt
iTCO_vendor_support crc32_pclmul aesni_intel ast aes_x86_64 evdev lrw
joydev gf128mul ttm glue_helper drm_kms_helper ablk_helper cryptd drm
i2c_algo_bit pcspkr i2c_i801 lpc_ich mei_me i2c_core shpchp mei
mfd_core wmi tpm_tis tpm ipmi_watchdog processor thermal_sys
acpi_power_meter acpi_pad button ipmi_si ipmi_poweroff ipmi_devintf
ipmi_msghandler autofs4 ext4 crc16 mbcache jbd2 hid_generic usbhid hid
sg sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul
crct10dif_common crc32c_intel ahci libahci ehci_pci libata xhci_hcd
ehci_hcd ixgbe dca ptp usbcore scsi_mod pps_core usb_common mdio
[308495.512564] CPU: 4 PID: 161103 Comm: parameter_serve Not tainted
3.16.0-4-amd64 #1 Debian 3.16.36-1+deb8u2
[308495.512600] Hardware name: Inspur SA5248M4/X10DRT-PS, BIOS 2.01 11/21/2016
[308495.512627] task: ffff883b5dfb0a20 ti: ffff883f3a9b0000 task.ti:
ffff883f3a9b0000
[308495.512655] RIP: 0010:[<ffffffff81016518>]  [<ffffffff81016518>]
show_stack_log_lvl+0x108/0x170
[308495.512690] RSP: 0018:ffff88407f904e98  EFLAGS: 00010046
[308495.512711] RAX: ffffffffffffffe0 RBX: ffffffffffffffd8 RCX:
ffff88407f8fffc0
[308495.512738] RDX: 0000000000000000 RSI: ffff88407f904f58 RDI:
0000000000000000
[308495.512765] RBP: ffff88407f903fc0 R08: ffffffff81706753 R09:
00000000000005b4
[308495.512792] R10: 0000000000000000 R11: ffff88407f904c2e R12:
ffff88407f904f58
[308495.512819] R13: 0000000000000000 R14: ffffffff81706753 R15:
0000000000000000
[308495.512846] FS:  00007fb567de3700(0000) GS:ffff88407f900000(0000)
knlGS:0000000000000000
[308495.512876] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[308495.512898] CR2: ffffffffffffffd8 CR3: 0000003f44e30000 CR4:
00000000003407e0
[308495.512925] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[308495.512952] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[308495.512979] Stack:
[308495.512989]  ffffffff00000008 ffff88407f904ef0 ffff88407f904eb0
ffffffffffffffd8
[308495.513021]  ffff88407f904f58 ffffffffffffffd8 ffff883b5dfb0a20
0000000000000040
[308495.513052]  0000000000000001 00000000ffffffff ffffffff810165fe
ffff88407f904f58
[308495.513084] Call Trace:
[308495.513096]  <#DF>
[308495.513106]
[308495.513117]  [<ffffffff810165fe>] ? show_regs+0x7e/0x1f0
[308495.513136]  [<ffffffff810503af>] ? df_debug+0x1f/0x30
[308495.513158]  [<ffffffff81014ee8>] ? do_double_fault+0x78/0xf0
[308495.513181]  [<ffffffff8151a028>] ? double_fault+0x28/0x30
[308495.513204]  [<ffffffff81518598>] ? sysret_check+0x1/0x4e
[308495.513225]  <<EOE>>
[308495.513236]  <UNK>
[308495.513246] Code: 67 70 81 31 c0 89 54 24 08 48 89 0c 24 48 8b 5b
f8 e8 5b 93 4f 00 48 8b 0c 24 8b 54 24 08 85 d2 74 05 f6 c2 03 74 48
48 8d 43 08 <48> 8b 33 48 c7 c7 4b 67 70 81 89 54 24 14 48 89 4c 24 08
48 89
[308495.513381] RIP  [<ffffffff81016518>] show_stack_log_lvl+0x108/0x170
[308495.513408]  RSP <ffff88407f904e98>
[308495.513423] CR2: ffffffffffffffd8
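
To check whether the panics from different machines really fault at the same place, the saved logs can be compared by the symbol on the RIP line of the double fault. The following is only a minimal sketch: the function names `double_fault_symbol` and `count_symbols` are hypothetical, and it assumes each log has the same layout as the one above (a "PANIC: double fault" line followed by an "RIP:" line, possibly wrapped by the mail client).

```python
import re
from collections import Counter

# Matches "symbol+0xOFFSET/0xLENGTH" as printed after the RIP address.
SYMBOL_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def double_fault_symbol(log_text):
    """Return the RIP symbol of the first double fault in log_text, or None.

    Hypothetical helper: assumes the oops layout shown in this report.
    """
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    start = flat.find("PANIC: double fault")
    if start < 0:
        return None
    rip = flat.find("RIP:", start)
    if rip < 0:
        return None
    match = SYMBOL_RE.search(flat, rip)
    return match.group(1) if match else None


def count_symbols(logs):
    """Count how often each faulting symbol appears across several logs."""
    return Counter(double_fault_symbol(text) for text in logs)
```

If most machines report the same symbol (here `sysret_check`), that would be consistent with a software cause rather than random hardware faults, although it does not prove it.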

> Are you using KVM?
We don't use KVM; the application just does computation and transfers data.

BRs
Yongsu


---------- Forwarded message ----------
From: Ben Hutchings <ben@decadent.org.uk>
Date: 2017-01-18 1:18 GMT+08:00
Subject: Re: Bug#851641: linux-image-3.16.0-4-amd64 panic:double fault
To: 张永肃 <zhangyongsu@bytedance.com>, 851641@bugs.debian.org


Control: tag -1 moreinfo

On Tue, 2017-01-17 at 15:25 +0800, 张永肃 wrote:
> Package:linux-image-3.16.0-4-amd64
> Version:3.16.36-1+deb8u1
>
> The 3.16.36-1+deb8u1 kernel (Debian stable package) panics with a double fault.
>
> [952650.981869] PANIC: double fault, error_code: 0x0
> [952650.981909] CPU: 4 PID: 14945 Comm: parameter_serve Not tainted
> 3.16.0-4-amd64 #1 Debian 3.16.36-1+deb8u1
> [952650.981911] Hardware name: Powerleader PR2760TG/X10DRT-PT, BIOS
> 2.0 12/18/2015

I looked for information on this hardware, and the first thing I found
was that you previously reported several crashes to Debian on this same
hardware:

https://bugs.debian.org/834487
https://bugs.debian.org/838658
https://bugs.debian.org/847839

Are you sure the hardware is stable?  Does it have ECC RAM?  (I know
the processor supports ECC.)

[...]
> It is similar to this issue, which happened on 4.4.0, but the patch does not
> work on 3.16: http://linux-kernel.2935.n7.nabble.com/PANIC-double-fault-error-code-0x0-in-4-0-0-rc3-2-kvm-related-td1064080.html

Are you using KVM?

Ben.

--
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
                                                              - Albert Camus
