Re: Install fails in LDOM
On Tue, Dec 10, 2013 at 12:19 AM, Anatoly Pugachev <matorola@gmail.com> wrote:
> Jurij, Rainer,
>
> I just installed Debian 7.2 sparc into an LDOM, booting over the network
> with the help of http://tech.libresoft.es/doku.php/en:debianontoldom
>
> I tried installing from an externally attached ISO; it doesn't work for
> me either. Same symptoms on this system, an inaccessible cdrom device:
>
> # modprobe sunvnet
> # modprobe sunvdc
> # tail /var/log/syslog
>
> Dec 9 16:29:25 kernel: [460613.520627] sunvdc.c:v1.0 (June 25, 2007)
> Dec 9 16:29:25 kernel: [460613.521418] sunvdc: vdiska: 33398784
> sectors (16308 MB)
> Dec 9 16:29:25 kernel: [460613.521938] vdiska: unknown partition table
> Dec 9 16:29:25 kernel: [460613.523047] sunvdc: vdiskb: 1310720 sectors (640 MB)
> Dec 9 16:29:25 kernel: [460613.523470] vdiskb: vdiskb1 vdiskb2
> vdiskb3 vdiskb4 vdiskb5 vdiskb6 vdiskb7
> Dec 9 16:30:26 udevd[72]: timeout '/sbin/blkid -o udev -p
> /dev/.tmp-block-254:8'
> Dec 9 16:30:27 udevd[72]: timeout: killing '/sbin/blkid -o udev -p
> /dev/.tmp-block-254:8' [982]
> Dec 9 16:30:28 udevd[72]: timeout: killing '/sbin/blkid -o udev -p
> /dev/.tmp-block-254:8' [982]
> Dec 9 16:30:29 udevd[72]: timeout: killing '/sbin/blkid -o udev -p
> /dev/.tmp-block-254:8' [982]
>
> vdiskb is the attached ISO; vdiska is the virtual hard disk.
>
> The network installation succeeded and, after a reboot, the LDOM boots
> into the installed Debian, but somehow a few package files got corrupted
> (checked with debsums, so I have to reinstall perl, perl-base and a few
> others)... I'm probably going to try another installation, just to see
> whether the file corruption on install is reproducible.
I reinstalled a few times today and got random file corruption during
installation. Sometimes it's not even possible to partition vdiska; with
LVM, it reports "Incorrect metadata area header checksum", and sometimes
it is not possible to create a filesystem on a partition.
When the installation does complete, the system boots with the root
filesystem mounted read-only. Syslog is filled with:
Dec 10 18:49:04 debian7 kernel: [541063.292335] EXT4-fs warning
(device dm-0): ext4_end_bio:250: I/O error writing to inode 16649
(offset 0 size 8192 starting block 41485)
Dec 10 18:49:04 debian7 kernel: [541063.292360] EXT4-fs warning
(device dm-0): ext4_end_bio:250: I/O error writing to inode 16650
(offset 0 size 8192 starting block 41487)
Dec 10 18:49:04 debian7 kernel: [541063.292408] EXT4-fs warning
(device dm-0): ext4_end_bio:250: I/O error writing to inode 16651
(offset 0 size 24576 starting block 41489)
Dec 10 18:49:04 debian7 kernel: [541063.292453] EXT4-fs warning
(device dm-0): ext4_end_bio:250: I/O error writing to inode 16652
(offset 0 size 24576 starting block 41495)
Dec 10 18:49:04 debian7 kernel: [541063.292476] EXT4-fs warning
(device dm-0): ext4_end_bio:250: I/O error writing to inode 16653
(offset 0 size 8192 starting block 41501)
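For anyone who wants to compare runs, the warnings above can be tallied with a short script. This is only a rough sketch: the function names (parse_ext4_io_errors, summarize) are made up for illustration, and the pattern assumes the exact message wording shown in this excerpt, which other kernel versions may not share.

```python
import re

# Matches the ext4_end_bio warning as worded in the excerpt above
# (assumption: other kernels may format this message differently).
EXT4_IO_ERROR = re.compile(
    r"EXT4-fs warning \(device (?P<dev>[\w-]+)\): ext4_end_bio:\d+: "
    r"I/O error writing to inode (?P<inode>\d+) "
    r"\(offset (?P<offset>\d+) size (?P<size>\d+) "
    r"starting block (?P<block>\d+)\)"
)


def parse_ext4_io_errors(log_text):
    """Return one dict per ext4 write error found in log_text."""
    # Mail clients hard-wrap long log lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    return [
        {
            "device": m.group("dev"),
            "inode": int(m.group("inode")),
            "offset": int(m.group("offset")),
            "size": int(m.group("size")),
            "block": int(m.group("block")),
        }
        for m in EXT4_IO_ERROR.finditer(flat)
    ]


def summarize(errors):
    """Per device: number of failed writes, total bytes, block range."""
    summary = {}
    for e in errors:
        s = summary.setdefault(
            e["device"],
            {"count": 0, "bytes": 0,
             "min_block": e["block"], "max_block": e["block"]},
        )
        s["count"] += 1
        s["bytes"] += e["size"]
        s["min_block"] = min(s["min_block"], e["block"])
        s["max_block"] = max(s["max_block"], e["block"])
    return summary
```

Feeding it the contents of /var/log/syslog (or a pasted copy like the one above) gives a per-device count, which makes it easier to tell whether the failed writes cluster in one block range or are spread over the whole disk.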
After leaving the machine running, I got the following on the Linux console:
root@(none):~# [550888.991063] EXT4-fs (dm-0): re-mounted. Opts:
errors=remount-ro
[550931.870863] ds-0: Machine description update.
[550931.870885] ------------[ cut here ]------------
[550931.870900] WARNING: at
/build/linux-lrlTWh/linux-3.2.51/mm/page_alloc.c:1335
get_page_from_freelist+0x28c/0x4c4()
[550931.870909] Modules linked in: loop flash ext4 crc16 jbd2 mbcache
dm_mod sunvnet sunvdc
[550931.870932] Call Trace:
[550931.870939] [00000000004dba7c] get_page_from_freelist+0x28c/0x4c4
[550931.870946] [00000000004dc3c0] __alloc_pages_nodemask+0x13c/0x798
[550931.870957] [000000000077cc54] cache_alloc_refill+0x330/0x5a8
[550931.870964] [000000000050de44] __kmalloc+0xb4/0x17c
[550931.870973] [0000000000437bcc] mdesc_kmalloc+0x10/0x5c
[550931.870979] [0000000000437e08] mdesc_update+0x2c/0x19c
[550931.870986] [0000000000446a48] md_update_data+0x18/0x64
[550931.870992] [0000000000446954] ds_thread+0x198/0x1e4
[550931.870999] [000000000048133c] kthread+0x5c/0x70
[550931.871006] [000000000042a9cc] kernel_thread+0x30/0x48
[550931.871011] [0000000000481678] kthreadd+0xe0/0x124
[550931.871016] ---[ end trace 2c263e75fdb3fa95 ]---
[550931.871082] VIO: Adding device vnet-port-0-1
[550931.872083] sunvnet: eth0: PORT ( remote-mac 00:21:f6:00:00:7d )
[550931.872149] \|/ ____ \|/
[550931.872151] "@'/ .. \`@"
[550931.872152] /_| \__/ |_\
[550931.872153] \__U_/
[550931.872166] kldomd(44): Kernel illegal instruction [#1]
[550931.872174] TSTATE: 0000004480e01601 TPC: 00000000009052ac TNPC:
00000000009052b0 Y: 00000000 Tainted: G W
[550931.872191] TPC: <mdesc_memblock_free+0x0/0x6c>
[550931.872197] g0: 0000000000837260 g1: 00000000009052ac g2:
0000000000000000 g3: 0000000000000000
[550931.872206] g4: fffff8042d71a7a0 g5: fffff80039adc000 g6:
fffff8042c840000 g7: 0000000000000001
[550931.872215] o0: fffff8042f9ea000 o1: fffff8042f9ea020 o2:
0000000000000001 o3: 0000000000445280
[550931.872223] o4: 00000000000002d2 o5: ffffffffffffffff sp:
fffff8042c8433d1 ret_pc: 0000000000437f2c
[550931.872232] RPC: <mdesc_update+0x150/0x19c>
[550931.872238] l0: 0000000000027afc l1: 00000000008b26e8 l2:
00000000008b26e8 l3: 0000000000000000
[550931.872247] l4: 00000000f027a972 l5: 00000000fed01981 l6:
00000000f027af4c l7: 0000000000000000
[550931.872255] i0: 0000000000000038 i1: 0000000000000000 i2:
00000000008ac800 i3: 0000000000000000
[550931.872263] i4: 0000000000000000 i5: fffff8042f9ea000 i6:
fffff8042c843491 i7: 0000000000446a48
[550931.872272] I7: <md_update_data+0x18/0x64>
[550931.872276] Call Trace:
[550931.872282] [0000000000446a48] md_update_data+0x18/0x64
[550931.872289] [0000000000446954] ds_thread+0x198/0x1e4
[550931.872296] [000000000048133c] kthread+0x5c/0x70
[550931.872303] [000000000042a9cc] kernel_thread+0x30/0x48
[550931.872310] [0000000000481678] kthreadd+0xe0/0x124
[550931.872315] Disabling lock debugging due to kernel taint
[550931.872323] Caller[0000000000446a48]: md_update_data+0x18/0x64
[550931.872330] Caller[0000000000446954]: ds_thread+0x198/0x1e4
[550931.872337] Caller[000000000048133c]: kthread+0x5c/0x70
[550931.872345] Caller[000000000042a9cc]: kernel_thread+0x30/0x48
[550931.872352] Caller[0000000000481678]: kthreadd+0xe0/0x124
[550931.872357] Instruction DUMP: 00000000 00000000 00000000
[550931.872370] 00000000 00000000 00000000 00000000 00000000
So, given all of the above, Linux in LDOMs is currently unusable,
at least on T5-2 hardware with Solaris 11:
sysadmin@deimos:~$ pkg info pkg:/entire | grep FMRI
FMRI:
pkg://solaris/entire@0.5.11,5.11-0.175.1.13.0.6.0:20131108T211557Z
sysadmin@deimos:~$ ldm -V
Logical Domains Manager (v 3.1.0.0.24)
Hypervisor control protocol v 1.11
Using Hypervisor MD v 1.4
System PROM:
Hostconfig v. 1.3.3.a @(#)Hostconfig 1.3.3.a 2013/08/12 10:46
Hypervisor v. 1.12.3.a @(#)Hypervisor 1.12.3.a 2013/08/12 10:21
OpenBoot v. 4.35.3 @(#)OpenBoot 4.35.3 2013/08/05 11:37
with Linux in the LDOM:
- Debian 7.2, 3.2.0-4-sparc64-smp kernel
PS: If any of the developers are interested, we could try to debug this.
Otherwise, I plan to set up a dedicated T1000/T2000 later (that will take
me a month or so), install Solaris 10 (not sure about 11) and Linux (on
another HDD) on it, and create a test environment accessible from the
internet, in case some of the developers would like to test/debug.