
Bug#621072: linux-image-2.6.32-5-amd64: 2.6.32-33 fails to boot as PV domU on Xen



Hi,

I can also reproduce the issue with 2.6.32-33:

overlord3:~$ sudo xm dmesg
(XEN) Xen version 4.0.1 (Debian 4.0.1-2) (waldi@debian.org) (gcc version 4.4.5 (Debian 4.4.5-10) ) Wed Jan 12 14:04:06 UTC 2011
(XEN) Bootloader: GRUB 1.98+20100804-14
(XEN) Command line: placeholder
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN) Disc information:
(XEN)  Found 4 MBR signatures
(XEN)  Found 4 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 000000000009ac00 (usable)
(XEN)  000000000009ac00 - 00000000000a0000 (reserved)
(XEN)  00000000000e4000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 00000000dfe90000 (usable)
(XEN)  00000000dfe90000 - 00000000dfea8000 (ACPI data)
(XEN)  00000000dfea8000 - 00000000dfed0000 (ACPI NVS)
(XEN)  00000000dfed0000 - 00000000dff00000 (reserved)
(XEN)  00000000ffe00000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000000420000000 (usable)
(XEN) ACPI: RSDP 000FBED0, 0024 (r2 ACPIAM)
(XEN) ACPI: XSDT DFE90100, 005C (r1 082410 XSDT1804 20100824 MSFT       97)
(XEN) ACPI: FACP DFE90290, 00F4 (r3 082410 FACP1804 20100824 MSFT       97)
(XEN) ACPI: DSDT DFE90460, F42B (r1  A1595 A1595000        0 INTL 20060113)
(XEN) ACPI: FACS DFEA8000, 0040
(XEN) ACPI: APIC DFE90390, 0088 (r1 082410 APIC1804 20100824 MSFT       97)
(XEN) ACPI: MCFG DFE90420, 003C (r1 082410 OEMMCFG  20100824 MSFT       97)
(XEN) ACPI: OEMB DFEA8040, 0072 (r1 082410 OEMB1804 20100824 MSFT       97)
(XEN) ACPI: SRAT DFE9F8B0, 0108 (r1 AMD    FAM_F_10        2 AMD         1)
(XEN) ACPI: HPET DFE9F9C0, 0038 (r1 082410 OEMHPET  20100824 MSFT       97)
(XEN) ACPI: SSDT DFE9FA00, 0156 (r1 A M I  POWERNOW        1 AMD         1)
(XEN) System RAM: 16382MB (16775336kB)
(XEN) Domain heap initialised
(XEN) Processor #0 0:10 APIC version 16
(XEN) Processor #1 0:10 APIC version 16
(XEN) Processor #2 0:10 APIC version 16
(XEN) Processor #3 0:10 APIC version 16
(XEN) Processor #4 0:10 APIC version 16
(XEN) Processor #5 0:10 APIC version 16
(XEN) IOAPIC[0]: apic_id 6, version 33, address 0xfec00000, GSI 0-23
(XEN) IOAPIC[1]: apic_id 7, version 33, address 0xfec20000, GSI 24-55
(XEN) Enabling APIC mode:  Flat.  Using 2 I/O APICs
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 3210.862 MHz processor.
(XEN) Initing memory sharing.
(XEN) HVM: ASIDs enabled.
(XEN) HVM: SVM enabled
(XEN) HVM: Hardware Assisted Paging detected.
(XEN) AMD-Vi: IOMMU not found!
(XEN) I/O virtualisation disabled
(XEN) Total of 6 processors activated.
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using new ACK method
(XEN) TSC is reliable, synchronization unnecessary
(XEN) Platform timer appears to have unexpectedly wrapped 10 or more times.
(XEN) Platform timer is 14.318MHz HPET
(XEN) Allocated console ring of 16 KiB.
(XEN) Brought up 6 CPUs
(XEN) *** LOADING DOMAIN 0 ***
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x16b8000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000000408000000->0000000410000000 (4081161 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff816b8000
(XEN)  Init. ramdisk: ffffffff816b8000->ffffffff83104a00
(XEN)  Phys-Mach map: ffffffff83105000->ffffffff85068048
(XEN)  Start info:    ffffffff85069000->ffffffff850694b4
(XEN)  Page tables:   ffffffff8506a000->ffffffff85097000
(XEN)  Boot stack:    ffffffff85097000->ffffffff85098000
(XEN)  TOTAL:         ffffffff80000000->ffffffff85400000
(XEN)  ENTRY ADDRESS: ffffffff81508200
(XEN) Dom0 has maximum 6 VCPUs
(XEN) Scrubbing Free RAM: .done.
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 176kB init memory.
(XEN) traps.c:2308:d0 Domain attempted WRMSR 00000000c0010004 from 00006412:d4175407 to 00000000:00000000.
(XEN) traps.c:2308:d0 Domain attempted WRMSR 00000000c0010000 from 00000107:6e90187f to 00000000:00430076.
(XEN) save.c:72:d0 Domain 2 expects freq 3210MHz but host's freq is 3210MHz: trap and emulate rdtsc
(XEN) d64:v0: unhandled page fault (ec=0000)
(XEN) Pagetable walk from ffffffff8305a000:
(XEN)  L4[0x1ff] = 000000038e68a067 0000000000001003
(XEN)  L3[0x1fe] = 000000038ef4e067 0000000000001007
(XEN)  L2[0x018] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 64 (vcpu#0) crashed on cpu#4:
(XEN) ----[ Xen-4.0.1  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    4
(XEN) RIP:    e033:[<ffffffff8100c2af>]
(XEN) RFLAGS: 0000000000000216   EM: 1   CONTEXT: pv guest
(XEN) rax: ffffffff8305a000   rbx: 8000000000000063   rcx: 8000000000000163
(XEN) rdx: 0000000020000000   rsi: 0000000000000000   rdi: 0000000000000000
(XEN) rbp: 0000000000000000   rsp: ffffffff8142db90   r8:  00000000000001ff
(XEN) r9:  0000000000000003   r10: 0000000000202000   r11: 0000000000100000
(XEN) r12: 8000000000000163   r13: 0000000000000000   r14: 0000000020000000
(XEN) r15: 0000000020000000   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 000000038f60a000   cr2: ffffffff8305a000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=ffffffff8142db90:
(XEN)    8000000000000163 0000000000100000 0000000000000000 ffffffff8100c2af
(XEN)    000000010000e030 0000000000010016 ffffffff8142dbd8 000000000000e02b
(XEN)    0000000000000000 ffffffff8100c2c2 ffffffff8100c33c 0000000003c00000
(XEN)    ffffffff8100c3da ffffffff8100c1c9 0000000000100000 0000000000202000
(XEN)    0000000000000003 00000000000001ff 8000000003c00063 0000000000000000
(XEN)    0000000020000000 8000000000000163 ffffffff812f8981 ffffffffff400000
(XEN)    0000000100202000 ffffffffff400000 000000010000049d 0000000000100000
(XEN)    0000000100000000 ffffffff8100dbe3 0000000000100000 ffff8800010060f0
(XEN)    ffffffff8142dd38 0000000003c00000 ffffffffff400000 ffff8800010060f0
(XEN)    8000000000000163 0000000003c00000 0000000020000000 0000000000000000
(XEN)    ffffffff812f8bc7 0000003700000009 0000000100000008 80000000000001e3
(XEN)    0000000000000000 000000002020205b 0000000020000000 ffff880001006000
(XEN)    ffffffff8100dbe3 ffffffff8142decc 8000000000000163 000000000000001e
(XEN)    0000000000100000 0000000000100000 0000000020000000 ffff880001002000
(XEN)    0000000000000000 8000000000000163 0000000020000000 0000000000000000
(XEN)    ffffffff812f8d8f 0000003000000020 0000000000000000 0000000000000000
(XEN)    0000000020000000 ffff880001002000 0000000000000000 0000000000000000
(XEN)    0000000000100000 0000000000202000 0000000020000000 ffffffff81001880
(XEN)    0000000000000000 ffff880000000000 0000000020000000 ffff880020000000
(XEN)    ffffffff812f8feb ffffffffff400000 0000000000000000 ffff880020000000
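
As a cross-check of the pagetable walk in the log above, the faulting address in cr2 (ffffffff8305a000) can be split into its per-level table indices. The following is only a minimal sketch assuming standard x86-64 4-level paging with 4 KiB pages; pt_index() and the macro names are hypothetical helpers made up for illustration, not kernel or Xen functions.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 4-level paging: every level indexes 512 entries (9 bits),
 * and the low 12 bits are the offset inside a 4 KiB page. */
#define SKETCH_PT_INDEX_BITS 9
#define SKETCH_PT_INDEX_MASK ((1u << SKETCH_PT_INDEX_BITS) - 1)
#define SKETCH_PAGE_SHIFT    12

/* Hypothetical helper: table index used at a given level
 * (4 = L4 ... 1 = L1) when translating virtual address va. */
static unsigned pt_index(uint64_t va, int level)
{
        int shift;

        assert(level >= 1 && level <= 4);
        shift = SKETCH_PAGE_SHIFT + SKETCH_PT_INDEX_BITS * (level - 1);
        return (unsigned)(va >> shift) & SKETCH_PT_INDEX_MASK;
}
```

Calling pt_index(0xffffffff8305a000ULL, level) for levels 4, 3 and 2 yields 0x1ff, 0x1fe and 0x018, which are the indices Xen prints. The L2 entry is all zeroes, so the walk stops there and the address is simply not mapped in the guest.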

I tried to get a core dump of the crashed domain by setting

on_crash = 'coredump-destroy'

in the domU configuration, but unfortunately the core dump itself fails:

[2011-04-12 09:50:33 2099] WARNING (XendDomainInfo:2071) Domain has crashed: name=squeeze64 id=66.
[2011-04-12 09:50:33 2099] ERROR (XendDomainInfo:2326) core dump failed: id = 66 name = squeeze64: (1, 'Internal error', 'p2m_size < nr_pages -1 (0 < 1ffff')
[2011-04-12 09:50:33 2099] DEBUG (XendDomainInfo:3053) XendDomainInfo.destroy: domid=66

I then installed linux-image-2.6.32-5-amd64-dbg 2.6.32-33 and took a
look at the addresses in the above backtrace.

> (XEN) RIP:    e033:[<ffffffff8100c2af>]

ffffffff8100c28b <get_phys_to_machine>:
ffffffff8100c28b:       48 83 c8 ff             or     $0xffffffffffffffff,%rax
ffffffff8100c28f:       48 81 ff ff ff 7f 00    cmp    $0x7fffff,%rdi
ffffffff8100c296:       77 1b                   ja     ffffffff8100c2b3 <get_phys_to_machine+0x28>
ffffffff8100c298:       48 89 f8                mov    %rdi,%rax
ffffffff8100c29b:       81 e7 ff 01 00 00       and    $0x1ff,%edi
ffffffff8100c2a1:       48 c1 e8 09             shr    $0x9,%rax
ffffffff8100c2a5:       89 c0                   mov    %eax,%eax
ffffffff8100c2a7:       48 8b 04 c5 00 e0 42    mov    -0x7ebd2000(,%rax,8),%rax
ffffffff8100c2ae:       81
ffffffff8100c2af:       48 8b 04 f8             mov    (%rax,%rdi,8),%rax   <==
ffffffff8100c2b3:       c3                      retq

unsigned long get_phys_to_machine(unsigned long pfn)
{
        unsigned topidx, idx;

        if (unlikely(pfn >= MAX_DOMAIN_PAGES))
                return INVALID_P2M_ENTRY;

        topidx = p2m_top_index(pfn);
        idx = p2m_index(pfn);
        return p2m_top[topidx][idx];   <==
}
EXPORT_SYMBOL_GPL(get_phys_to_machine);
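
The two-level lookup can be mirrored in user space to see which slot a given pfn hits. This is a rough sketch only: the constants are read off the disassembly above (cmp $0x7fffff gives the pfn limit, shr $0x9 and and $0x1ff give 512 entries per leaf page), and the sketch_ names are hypothetical stand-ins, not the kernel's own definitions.

```c
#include <assert.h>

/* Constants inferred from the disassembly, not copied from kernel headers:
 *   cmp $0x7fffff,%rdi     -> pfn must be below 0x800000
 *   shr $0x9 / and $0x1ff  -> 512 entries per leaf page */
#define SKETCH_MAX_DOMAIN_PAGES     0x800000UL
#define SKETCH_P2M_ENTRIES_PER_PAGE 512UL

/* Index into the top-level array of leaf-page pointers. */
static unsigned sketch_p2m_top_index(unsigned long pfn)
{
        assert(pfn < SKETCH_MAX_DOMAIN_PAGES);
        return (unsigned)(pfn / SKETCH_P2M_ENTRIES_PER_PAGE);
}

/* Index of the entry inside the selected leaf page. */
static unsigned sketch_p2m_index(unsigned long pfn)
{
        return (unsigned)(pfn % SKETCH_P2M_ENTRIES_PER_PAGE);
}
```

For an illustrative pfn of 0x3c00 (chosen only as an example) this gives top index 0x1e and leaf index 0. In the crash dump rdi is 0 and rax equals cr2 (ffffffff8305a000) at the faulting instruction, so the leaf pointer read from p2m_top[topidx] itself appears to point at an unmapped address.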


> (XEN)    0000000000000000 ffffffff8100c2c2 ffffffff8100c33c 0000000003c00000

ffffffff8100c2b4 <pfn_to_mfn>:
ffffffff8100c2b4:       80 3d c7 c6 4c 00 00    cmpb   $0x0,0x4cc6c7(%rip)        # ffffffff814d8982 <xen_features+0x2>
ffffffff8100c2bb:       75 15                   jne    ffffffff8100c2d2 <pfn_to_mfn+0x1e>
ffffffff8100c2bd:       e8 c9 ff ff ff          callq  ffffffff8100c28b <get_phys_to_machine>  <==
ffffffff8100c2c2:       48 89 c7                mov    %rax,%rdi
ffffffff8100c2c5:       48 b8 ff ff ff 7f ff    mov    $0xffffffff7fffffff,%rax
ffffffff8100c2cc:       ff ff ff
ffffffff8100c2cf:       48 21 c7                and    %rax,%rdi
ffffffff8100c2d2:       48 89 f8                mov    %rdi,%rax
ffffffff8100c2d5:       c3                      retq

ffffffff8100c31b <pte_pfn_to_mfn>:
ffffffff8100c31b:       40 f6 c7 01             test   $0x1,%dil
ffffffff8100c31f:       53                      push   %rbx
ffffffff8100c320:       74 24                   je     ffffffff8100c346 <pte_pfn_to_mfn+0x2b>
ffffffff8100c322:       48 bb ff 0f 00 00 00    mov    $0xffffc00000000fff,%rbx
ffffffff8100c329:       c0 ff ff
ffffffff8100c32c:       48 21 fb                and    %rdi,%rbx
ffffffff8100c32f:       48 c1 e7 12             shl    $0x12,%rdi
ffffffff8100c333:       48 c1 ef 1e             shr    $0x1e,%rdi
ffffffff8100c337:       e8 78 ff ff ff          callq  ffffffff8100c2b4 <pfn_to_mfn> <==
ffffffff8100c33c:       48 89 c7                mov    %rax,%rdi
ffffffff8100c33f:       48 c1 e7 0c             shl    $0xc,%rdi
ffffffff8100c343:       48 09 df                or     %rbx,%rdi
ffffffff8100c346:       48 89 f8                mov    %rdi,%rax
ffffffff8100c349:       5b                      pop    %rbx
ffffffff8100c34a:       c3                      retq
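
The bit manipulation in pte_pfn_to_mfn can be written out in C to make the register contents easier to read. This is a sketch derived purely from the instructions above (the 0xffffc00000000fff mask and the shl $0x12 / shr $0x1e pair); the sketch_ function names are hypothetical and the real kernel code uses its own macros.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits of a PTE: everything outside the frame-number field
 * (bits 12..45), as given by the mask in the disassembly. */
#define SKETCH_PTE_FLAGS_MASK 0xffffc00000000fffULL

static uint64_t sketch_pte_flags(uint64_t pteval)
{
        return pteval & SKETCH_PTE_FLAGS_MASK;
}

/* shl $0x12 drops bits 46..63, shr $0x1e then drops bits 0..11,
 * leaving the frame number from bits 12..45. */
static uint64_t sketch_pte_pfn(uint64_t pteval)
{
        return (pteval << 18) >> 30;
}

/* Recombine a frame number with the saved flags (shl $0xc, or). */
static uint64_t sketch_make_pte(uint64_t frame, uint64_t flags)
{
        assert((flags & ~SKETCH_PTE_FLAGS_MASK) == 0);
        return (frame << 12) | flags;
}
```

Applied to the value 8000000003c00063 that appears in the guest stack, the flags come out as 8000000000000063, which matches rbx in the register dump, and the frame number as 0x3c00. Whether that stack slot really holds the argument of the faulting call is an assumption I have not verified.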

> (XEN)    ffffffff8100c3da ffffffff8100c1c9 0000000000100000 0000000000202000

ffffffff8100c3d5 <xen_make_pte>:
ffffffff8100c3d5:       e8 41 ff ff ff          callq  ffffffff8100c31b <pte_pfn_to_mfn> <==
ffffffff8100c3da:       c3                      retq

ffffffff8100c1b8 <__raw_callee_save_xen_make_pte>:
ffffffff8100c1b8:       51                      push   %rcx
ffffffff8100c1b9:       52                      push   %rdx
ffffffff8100c1ba:       56                      push   %rsi
ffffffff8100c1bb:       57                      push   %rdi
ffffffff8100c1bc:       41 50                   push   %r8
ffffffff8100c1be:       41 51                   push   %r9
ffffffff8100c1c0:       41 52                   push   %r10
ffffffff8100c1c2:       41 53                   push   %r11
ffffffff8100c1c4:       e8 0c 02 00 00          callq  ffffffff8100c3d5 <xen_make_pte> <==
ffffffff8100c1c9:       41 5b                   pop    %r11
ffffffff8100c1cb:       41 5a                   pop    %r10
ffffffff8100c1cd:       41 59                   pop    %r9
ffffffff8100c1cf:       41 58                   pop    %r8
ffffffff8100c1d1:       5f                      pop    %rdi
ffffffff8100c1d2:       5e                      pop    %rsi
ffffffff8100c1d3:       5a                      pop    %rdx
ffffffff8100c1d4:       59                      pop    %rcx
ffffffff8100c1d5:       c3                      retq

> (XEN)    0000000000000003 00000000000001ff 8000000003c00063 0000000000000000
> (XEN)    0000000020000000 8000000000000163 ffffffff812f8981 ffffffffff400000

ffffffff812f88a8 <phys_pte_init>:
ffffffff812f88a8:       41 57                   push   %r15
ffffffff812f88aa:       48 89 f0                mov    %rsi,%rax
ffffffff812f88ad:       49 89 d7                mov    %rdx,%r15
ffffffff812f88b0:       48 c1 e8 0c             shr    $0xc,%rax
ffffffff812f88b4:       41 56                   push   %r14
ffffffff812f88b6:       25 ff 01 00 00          and    $0x1ff,%eax
ffffffff812f88bb:       49 89 d6                mov    %rdx,%r14
ffffffff812f88be:       48 8d 3c c7             lea    (%rdi,%rax,8),%rdi
...
ffffffff812f8964:       4c 89 e7                mov    %r12,%rdi
ffffffff812f8967:       48 23 3d d2 da 1d 00    and    0x1ddad2(%rip),%rdi        # ffffffff814d6440 <__supported_pte_mask>
ffffffff812f896e:       48 89 d8                mov    %rbx,%rax
ffffffff812f8971:       48 25 00 f0 ff ff       and    $0xfffffffffffff000,%rax
ffffffff812f8977:       48 09 c7                or     %rax,%rdi
ffffffff812f897a:       ff 14 25 70 6c 46 81    callq  *0xffffffff81466c70 <==
ffffffff812f8981:       48 89 c6                mov    %rax,%rsi
ffffffff812f8984:       48 8b 7c 24 10          mov    0x10(%rsp),%rdi
ffffffff812f8989:       ff 14 25 30 6c 46 81    callq  *0xffffffff81466c30
ffffffff812f8990:       48 89 d8                mov    %rbx,%rax

I then compared this to the output of

debdiff linux-2.6_2.6.32-31.diff.gz linux-2.6_2.6.32-33.diff.gz

and looked for potentially related hunks:

+index 8451908..166b824 100644
+--- a/mm/mremap.c
++++ b/mm/mremap.c
+@@ -92,9 +92,7 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
+                */
+               mapping = vma->vm_file->f_mapping;
+               spin_lock(&mapping->i_mmap_lock);
+-              if (new_vma->vm_truncate_count &&
+-                  new_vma->vm_truncate_count != vma->vm_truncate_count)
+-                      new_vma->vm_truncate_count = 0;
++              new_vma->vm_truncate_count = 0;
+       }
+
+       /*

# upstream a3e8cc643d22d2c8ed36b9be7d9c9ca21efcf7f7

+diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
+index 350a3de..6ec047d 100644
+--- a/arch/x86/xen/mmu.c
++++ b/arch/x86/xen/mmu.c
+@@ -1658,9 +1658,6 @@ static __init void xen_map_identity_early(pmd_t *pmd, unsigned long max_pfn)
+               for (pteidx = 0; pteidx < PTRS_PER_PTE; pteidx++, pfn++) {
+                       pte_t pte;
+
+-                      if (pfn > max_pfn_mapped)
+-                              max_pfn_mapped = pfn;
+-
+                       if (!pte_none(pte_page[pteidx]))
+                               continue;
+
+@@ -1704,6 +1701,12 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
+       pud_t *l3;
+       pmd_t *l2;
+
++      /* max_pfn_mapped is the last pfn mapped in the initial memory
++       * mappings. Considering that on Xen after the kernel mappings we
++       * have the mappings of some pages that don't exist in pfn space, we
++       * set max_pfn_mapped to the last real pfn mapped. */
++      max_pfn_mapped = PFN_DOWN(__pa(xen_start_info->mfn_list));
++
+       /* Zap identity mapping */
+       init_level4_pgt[0] = __pgd(0);
+
+@@ -1767,9 +1770,7 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
+ {
+       pmd_t *kernel_pmd;
+
+-      max_pfn_mapped = PFN_DOWN(__pa(xen_start_info->pt_base) +
+-                                xen_start_info->nr_pt_frames * PAGE_SIZE +
+-                                512*1024);
++      max_pfn_mapped = PFN_DOWN(__pa(xen_start_info->mfn_list));
+
+       kernel_pmd = m2v(pgd[KERNEL_PGD_BOUNDARY].pgd);
+       memcpy(level2_kernel_pgt, kernel_pmd, sizeof(pmd_t) * PTRS_PER_PMD);

# upstream 14988a4d350ce3b41ecad4f63c4f44c56f5ae34d

If you have the time and skills, you might want to try reverting these two changes.

-Timo


