
Re: Install fails in LDOM



On Tue, Dec 10, 2013 at 12:19 AM, Anatoly Pugachev <matorola@gmail.com> wrote:
> Jurij, Rainer,
>
> I just installed Debian 7.2 sparc into an LDOM, booting over the network
> with the help of http://tech.libresoft.es/doku.php/en:debianontoldom
>
> I also tried installing from an externally attached ISO. It doesn't work
> for me either; same symptoms on this system, an inaccessible cdrom device:
>
> # modprobe sunvnet
> # modprobe sunvdc
> # tail /var/log/syslog
>
> Dec  9 16:29:25 kernel: [460613.520627] sunvdc.c:v1.0 (June 25, 2007)
> Dec  9 16:29:25 kernel: [460613.521418] sunvdc: vdiska: 33398784
> sectors (16308 MB)
> Dec  9 16:29:25 kernel: [460613.521938]  vdiska: unknown partition table
> Dec  9 16:29:25 kernel: [460613.523047] sunvdc: vdiskb: 1310720 sectors (640 MB)
> Dec  9 16:29:25 kernel: [460613.523470]  vdiskb: vdiskb1 vdiskb2
> vdiskb3 vdiskb4 vdiskb5 vdiskb6 vdiskb7
> Dec  9 16:30:26 udevd[72]: timeout '/sbin/blkid -o udev -p
> /dev/.tmp-block-254:8'
> Dec  9 16:30:27 udevd[72]: timeout: killing '/sbin/blkid -o udev -p
> /dev/.tmp-block-254:8' [982]
> Dec  9 16:30:28 udevd[72]: timeout: killing '/sbin/blkid -o udev -p
> /dev/.tmp-block-254:8' [982]
> Dec  9 16:30:29 udevd[72]: timeout: killing '/sbin/blkid -o udev -p
> /dev/.tmp-block-254:8' [982]
>
> vdiskb is the attached ISO; vdiska is the virtual hard disk.
>
> The network installation succeeded and, after a reboot, the LDOM boots
> into the installed Debian, but somehow a few package files got broken
> (checked with debsums, so I have to reinstall perl, perl-base and a few
> others)... I'm probably going to try another installation, just to see
> whether the file corruption on install is reproducible.

I reinstalled a few times today and get random file corruption during
installation. Sometimes it is not even possible to partition vdiska:
with LVM it reports "Incorrect metadata area header checksum", and
sometimes it is not possible to create a filesystem on a partition.
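
To compare corruption between reinstalls, the debsums check quoted above
could be scripted. The following is only a rough sketch: it assumes that
`debsums -c` prints one changed file per line and that `dpkg -S` prints
"package: /path" lines. The function names are made up for illustration.

```python
import subprocess


def packages_from_dpkg_search(dpkg_output):
    """Collect package names from `dpkg -S` output lines ("pkg: /path")."""
    packages = set()
    for line in dpkg_output.splitlines():
        owner, sep, _path = line.partition(": ")
        if not sep or owner.startswith("diversion by"):
            continue  # skip blank lines and diversion notices
        for name in owner.split(", "):
            packages.add(name.strip())
    return sorted(packages)


def run_and_capture(argv):
    """Run a command and return its stdout as text, ignoring the exit code."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            universal_newlines=True)
    out, _ = proc.communicate()
    return out


def corrupted_packages():
    """Assumed workflow on a live Debian system: debsums, then dpkg -S."""
    files = [f for f in run_and_capture(["debsums", "-c"]).splitlines()
             if f.strip()]
    if not files:
        return []
    return packages_from_dpkg_search(run_and_capture(["dpkg", "-S"] + files))
```

The resulting list could then be passed to
`apt-get install --reinstall`, although on a disk that keeps returning
I/O errors the reinstalled files may get corrupted again.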

When the installation does complete, the system boots with the root
filesystem mounted read-only. Syslog is filled with:

Dec 10 18:49:04 debian7 kernel: [541063.292335] EXT4-fs warning
(device dm-0): ext4_end_bio:250: I/O error writing to inode 16649
(offset 0 size 8192 starting block 41485)
Dec 10 18:49:04 debian7 kernel: [541063.292360] EXT4-fs warning
(device dm-0): ext4_end_bio:250: I/O error writing to inode 16650
(offset 0 size 8192 starting block 41487)
Dec 10 18:49:04 debian7 kernel: [541063.292408] EXT4-fs warning
(device dm-0): ext4_end_bio:250: I/O error writing to inode 16651
(offset 0 size 24576 starting block 41489)
Dec 10 18:49:04 debian7 kernel: [541063.292453] EXT4-fs warning
(device dm-0): ext4_end_bio:250: I/O error writing to inode 16652
(offset 0 size 24576 starting block 41495)
Dec 10 18:49:04 debian7 kernel: [541063.292476] EXT4-fs warning
(device dm-0): ext4_end_bio:250: I/O error writing to inode 16653
(offset 0 size 8192 starting block 41501)
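
To get an overview of how many writes fail, warnings like the ones above
can be summarized per device. This is a minimal sketch, assuming the
message format stays exactly as in this log (the regular expression and
the function name are mine, not part of any tool). It tolerates the line
wrapping introduced by the mail client.

```python
import re
from collections import defaultdict

# Matches one "EXT4-fs warning ... I/O error writing to inode" message,
# allowing arbitrary whitespace (including newlines) between its parts.
EXT4_IO_ERROR = re.compile(
    r"EXT4-fs warning\s+\(device (?P<device>[^)]+)\):\s+ext4_end_bio:\d+:\s+"
    r"I/O error writing to inode (?P<inode>\d+)\s+"
    r"\(offset (?P<offset>\d+) size (?P<size>\d+) "
    r"starting block (?P<block>\d+)\)"
)


def summarize_ext4_write_errors(log_text):
    """Return {device: {"errors": n, "bytes": total, "inodes": set}}."""
    summary = defaultdict(lambda: {"errors": 0, "bytes": 0, "inodes": set()})
    for match in EXT4_IO_ERROR.finditer(log_text):
        entry = summary[match.group("device")]
        entry["errors"] += 1
        entry["bytes"] += int(match.group("size"))
        entry["inodes"].add(int(match.group("inode")))
    return dict(summary)
```

Feeding it the contents of /var/log/syslog gives, per device, the number
of failed writes, the total bytes affected and the set of inodes involved.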

After leaving the machine running, I got the following on the Linux console:

root@(none):~# [550888.991063] EXT4-fs (dm-0): re-mounted. Opts:
errors=remount-ro
[550931.870863] ds-0: Machine description update.
[550931.870885] ------------[ cut here ]------------
[550931.870900] WARNING: at
/build/linux-lrlTWh/linux-3.2.51/mm/page_alloc.c:1335
get_page_from_freelist+0x28c/0x4c4()
[550931.870909] Modules linked in: loop flash ext4 crc16 jbd2 mbcache
dm_mod sunvnet sunvdc
[550931.870932] Call Trace:
[550931.870939]  [00000000004dba7c] get_page_from_freelist+0x28c/0x4c4
[550931.870946]  [00000000004dc3c0] __alloc_pages_nodemask+0x13c/0x798
[550931.870957]  [000000000077cc54] cache_alloc_refill+0x330/0x5a8
[550931.870964]  [000000000050de44] __kmalloc+0xb4/0x17c
[550931.870973]  [0000000000437bcc] mdesc_kmalloc+0x10/0x5c
[550931.870979]  [0000000000437e08] mdesc_update+0x2c/0x19c
[550931.870986]  [0000000000446a48] md_update_data+0x18/0x64
[550931.870992]  [0000000000446954] ds_thread+0x198/0x1e4
[550931.870999]  [000000000048133c] kthread+0x5c/0x70
[550931.871006]  [000000000042a9cc] kernel_thread+0x30/0x48
[550931.871011]  [0000000000481678] kthreadd+0xe0/0x124
[550931.871016] ---[ end trace 2c263e75fdb3fa95 ]---
[550931.871082] VIO: Adding device vnet-port-0-1
[550931.872083] sunvnet: eth0: PORT ( remote-mac 00:21:f6:00:00:7d )
[550931.872149]               \|/ ____ \|/
[550931.872151]               "@'/ .. \`@"
[550931.872152]               /_| \__/ |_\
[550931.872153]                  \__U_/
[550931.872166] kldomd(44): Kernel illegal instruction [#1]
[550931.872174] TSTATE: 0000004480e01601 TPC: 00000000009052ac TNPC:
00000000009052b0 Y: 00000000    Tainted: G        W
[550931.872191] TPC: <mdesc_memblock_free+0x0/0x6c>
[550931.872197] g0: 0000000000837260 g1: 00000000009052ac g2:
0000000000000000 g3: 0000000000000000
[550931.872206] g4: fffff8042d71a7a0 g5: fffff80039adc000 g6:
fffff8042c840000 g7: 0000000000000001
[550931.872215] o0: fffff8042f9ea000 o1: fffff8042f9ea020 o2:
0000000000000001 o3: 0000000000445280
[550931.872223] o4: 00000000000002d2 o5: ffffffffffffffff sp:
fffff8042c8433d1 ret_pc: 0000000000437f2c
[550931.872232] RPC: <mdesc_update+0x150/0x19c>
[550931.872238] l0: 0000000000027afc l1: 00000000008b26e8 l2:
00000000008b26e8 l3: 0000000000000000
[550931.872247] l4: 00000000f027a972 l5: 00000000fed01981 l6:
00000000f027af4c l7: 0000000000000000
[550931.872255] i0: 0000000000000038 i1: 0000000000000000 i2:
00000000008ac800 i3: 0000000000000000
[550931.872263] i4: 0000000000000000 i5: fffff8042f9ea000 i6:
fffff8042c843491 i7: 0000000000446a48
[550931.872272] I7: <md_update_data+0x18/0x64>
[550931.872276] Call Trace:
[550931.872282]  [0000000000446a48] md_update_data+0x18/0x64
[550931.872289]  [0000000000446954] ds_thread+0x198/0x1e4
[550931.872296]  [000000000048133c] kthread+0x5c/0x70
[550931.872303]  [000000000042a9cc] kernel_thread+0x30/0x48
[550931.872310]  [0000000000481678] kthreadd+0xe0/0x124
[550931.872315] Disabling lock debugging due to kernel taint
[550931.872323] Caller[0000000000446a48]: md_update_data+0x18/0x64
[550931.872330] Caller[0000000000446954]: ds_thread+0x198/0x1e4
[550931.872337] Caller[000000000048133c]: kthread+0x5c/0x70
[550931.872345] Caller[000000000042a9cc]: kernel_thread+0x30/0x48
[550931.872352] Caller[0000000000481678]: kthreadd+0xe0/0x124
[550931.872357] Instruction DUMP: 00000000  00000000  00000000
[550931.872370]  00000000  00000000  00000000  00000000  00000000


So, given all of the above, Linux in LDOMs is currently unusable,
at least on T5-2 hardware with Solaris 11:

sysadmin@deimos:~$ pkg info pkg:/entire | grep FMRI
          FMRI: pkg://solaris/entire@0.5.11,5.11-0.175.1.13.0.6.0:20131108T211557Z

sysadmin@deimos:~$ ldm -V

Logical Domains Manager (v 3.1.0.0.24)
        Hypervisor control protocol v 1.11
        Using Hypervisor MD v 1.4

System PROM:
        Hostconfig      v. 1.3.3.a      @(#)Hostconfig 1.3.3.a 2013/08/12 10:46
        Hypervisor      v. 1.12.3.a     @(#)Hypervisor 1.12.3.a 2013/08/12 10:21
        OpenBoot        v. 4.35.3       @(#)OpenBoot 4.35.3 2013/08/05 11:37

with Linux in the LDOM:
 - Debian 7.2, 3.2.0-4-sparc64-smp kernel




PS: If any of the developers are interested, we could try to debug this
now. Otherwise, I plan to set up a dedicated T1000/T2000 later (that will
take me a month or so), install it with Solaris 10 (not sure about 11)
and Linux (on another hdd), and create a test environment accessible
from the internet, in case some of the developers would like to
test/debug.

