
Cannot load Wheezy in a "virgin" desktop -- long



I apologize in advance for the length of this post.  Since, however, I do not know what information is necessary to determine why this installation failed, I am including everything that I have the least suspicion may be contributing to the failure.

To begin I describe the essentials of my recently purchased custom-built desktop.  It consists of a Gigabyte GA-Z87N mainboard, the Z87 being one of Intel's latest chipsets for its Haswell processors.  Integrated in the processor is Intel's HD Graphics 4600, the mainboard having the required graphics outputs.  Two 2-terabyte hard drives are designated as a RAID1.

I started out with the intention of installing only the basic system, thereby shortening the reinstallation process if for any reason I needed to start the installer over again -- such as to change the partitioning.  At this stage, for example, I tried to install in UEFI mode.  Since I was unsuccessful I reverted to the standard BIOS.  My experience trying to install under UEFI will be the subject of another thread.

Since Dec 31 when I started this process I have lost count of the number of false starts.  My guess is that since then I ended up initiating the installer about fifteen times.   

When I thought everything was ready to install the desktop environment (DE) and other things I tried to do so manually by installing xserver-xorg and xfce meta packages.  This method was unsuccessful. Examination of log files indicated that much was missing, including whatever package contains startx.  (Running startx in a virtual terminal returned the message "command not found".)
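For the record, my later reading suggests that on Debian startx is shipped by the xinit package rather than by xserver-xorg, so the manual route would presumably have needed something like the following (the exact package selection is my assumption for Wheezy):

```shell
# Assumed package names for Debian Wheezy; xinit is what provides startx.
apt-get update
apt-get install xorg xinit xfce4
# Or let tasksel pull in the whole desktop task instead:
tasksel install xfce-desktop
```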

I then decided to allow the installer to install the DE and other selected tasks; to do so, however, required initiating the installer once again -- in fact twice.  First, in the partitioning section of the installer, I chose the xfs file system for most of my partitions, with lilo as the boot loader, but the installer would not install lilo.  (Does xfs still require lilo instead of grub?)

The second time I used the ext4 file system with grub as the boot loader.  The boot process aborted with a "failed" message.  To see what would happen I typed Control-D to resume the boot.  In due course the xfce login window appeared, but after I entered my user name and password the monitor went blank.  At that point the computer was totally unresponsive to any input, and I had to reboot.

This boot ran to the point where the line containing Control-D appeared.  This time I entered the root password in order to examine the output of the dmesg command, various log files and the information which scrolls by on the monitor while the OS is loading -- and perhaps glean some idea of why the DE would not load.  

The first problem encountered was that something was detecting and enabling the Logitech USB optical mouse many times, even to the point of interrupting other commands and their outputs.  I solved that problem -- at least temporarily -- by disconnecting the mouse; it is not needed in a virtual terminal anyway.

The first part of the dmesg output which scares me reads as follows.
-------------------------------------------------------------------------
[    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.2.0-4-amd64 root=/dev/mapper/TG1-root ro single
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[    0.000000] xsave/xrstor: enabled xstate_bv 0x7, cntxt size 0x340
[    0.000000] Checking aperture...
[    0.000000] No AGP bridge found
[    0.000000] Calgary: detecting Calgary via BIOS EBDA area
[    0.000000] Calgary: Unable to locate Rio Grande table in EBDA - bailing!
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: at /build/linux-rrsxby/linux-3.2.51/drivers/iommu/dmar.c:492 warn_invalid_dmar+0x77/0x85()
[    0.000000] Hardware name: Z87N-WIFI
[    0.000000] Your BIOS is broken; DMAR reported at address 0!
[    0.000000] BIOS vendor: American Megatrends Inc.; Ver: F4; Product Version: To be filled by O.E.M.
[    0.000000] Modules linked in:
[    0.000000] Pid: 0, comm: swapper Not tainted 3.2.0-4-amd64 #1 Debian 3.2.51-1
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff81046cbd>] ? warn_slowpath_common+0x78/0x8c
[    0.000000]  [<ffffffff81046d1f>] ? warn_slowpath_fmt_taint+0x3d/0x42
[    0.000000]  [<ffffffff81348632>] ? __pte+0x7/0x8
[    0.000000]  [<ffffffff816bde4c>] ? __early_set_fixmap+0x89/0x8d
[    0.000000]  [<ffffffff812757e5>] ? warn_invalid_dmar+0x77/0x85
[    0.000000]  [<ffffffff816df6d4>] ? check_zero_address+0xad/0xdc
[    0.000000]  [<ffffffff81358b26>] ? bad_to_user+0x620/0x620
[    0.000000]  [<ffffffff816df714>] ? detect_intel_iommu+0x11/0xaf
[    0.000000]  [<ffffffff816b1e38>] ? pci_iommu_alloc+0x3f/0x67
[    0.000000]  [<ffffffff816bdc2b>] ? mem_init+0x14/0xe5
[    0.000000]  [<ffffffff816ab94e>] ? start_kernel+0x1d0/0x3c3
[    0.000000]  [<ffffffff816ab140>] ? early_idt_handlers+0x140/0x140
[    0.000000]  [<ffffffff816ab3c4>] ? x86_64_start_kernel+0x104/0x111
[    0.000000] ---[ end trace 01021c3814caad1d ]---
----------------------------------------------------------------------------
The first thing that scares me is the line "Your BIOS is broken; DMAR reported at address 0!"  After some online research I discovered that this phenomenon also occurs with other mainboard brands besides Gigabyte.  Three possible ways of removing it were mentioned in online posts: disabling VT-d, turning off the IOMMU, or updating the BIOS.

The closest thing to VT-d I could find in the BIOS was something called "Intel Virtualization Technology".  I am not sure what that means or what it does; disabling it, however, made no difference, so I reënabled it.  I did not try turning off the IOMMU or updating the BIOS.
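If I understand those online posts correctly, turning off the IOMMU would mean adding a kernel parameter in /etc/default/grub, roughly as follows.  (The parameter name intremap=off is my assumption from those posts; intel_iommu=off was also mentioned.  I have not verified either.)

```shell
# Sketch only -- in /etc/default/grub, append the parameter described online:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intremap=off"
# then regenerate the grub configuration and reboot:
update-grub
```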

So I suppose the real questions at this point are the following.  What purpose does this DMAR table serve?  Is the invalidity of the DMAR referred to in the "WARNING" line above sufficient to prevent the DE from loading?

The other part of the dmesg output which concerns me follows.
---------------------------------------------------------------------------
[    1.240960]  sdb: sdb1 sdb2
[    1.241103] sd 1:0:0:0: [sdb] Attached SCSI disk
[    1.260609]  sda: sda1 sda2
[    1.260755] sd 0:0:0:0: [sda] Attached SCSI disk
[    1.593645] md: md0 stopped.
[    1.594503] md: bind<sdb1>
[    1.594659] md: bind<sda1>
[    1.595242] md: raid1 personality registered for level 1
[    1.595394] bio: create slab <bio-1> at 1
[    1.595484] md/raid1:md0: active with 2 out of 2 mirrors
[    1.595541] md0: detected capacity change from 0 to 248315904
[    1.596423]  md0: unknown partition table
[    1.683228] Refined TSC clocksource calibration: 3392.144 MHz.
[    1.683278] Switching to clocksource tsc
[    1.797451] md: md1 stopped.
[    1.797959] md: bind<sdb2>
[    1.798118] md: bind<sda2>
[    1.798620] md/raid1:md1: not clean -- starting background reconstruction
[    1.798673] md/raid1:md1: active with 2 out of 2 mirrors
[    1.798731] md1: detected capacity change from 0 to 1499865088000
[    1.806447]  md1: unknown partition table
[    1.999928] device-mapper: uevent: version 1.0.3
[    2.000006] device-mapper: ioctl: 4.22.0-ioctl (2011-10-19) initialised: dm-devel@redhat.com
[    2.195467] EXT4-fs (dm-0): INFO: recovery required on readonly filesystem
[    2.195518] EXT4-fs (dm-0): write access will be enabled during recovery
[    2.263170] md: resync of RAID array md1
[    2.263216] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[    2.263264] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
[    2.263330] md: using 128k window, over a total of 1464712000k.
[    2.277910] EXT4-fs (dm-0): recovery complete
[    2.319337] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
--------------------------------------------------------------------------
The lines above which I do not understand are those at 1.596423 and 1.806447, both of which say that the system is not aware of partition tables for md0 and md1.  Both are part of a RAID1; md0 contains only the /boot partition, which happens to be empty because the boot loader is in the MBR; and md1 is the only physical volume in LVM volume group TH1.  All the other partitions are logical volumes in that volume group.
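In case it helps, these are the commands I would expect to confirm that each md device is used whole -- that is, that it carries a filesystem or an LVM physical volume directly rather than a partition table, which as I understand it would make the "unknown partition table" lines harmless.  (I have not yet run all of them from the maintenance shell.)

```shell
mdadm --detail /dev/md0 /dev/md1   # RAID1 state of both arrays
file -s /dev/md0                   # should report a filesystem, not a partition table
pvs                                # should list /dev/md1 as the volume group's physical volume
lvs                                # the logical volumes carved out of that volume group
```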

The following quote neither comes from the output of dmesg nor is part of syslog.  Instead it appears at the end of the information which scrolls by on the screen as part of the boot process.
--------------------------------------------------------------------------
[ ok ] setting up LVM Volume Groups ... done.
[ .... ] Starting remaining crypto disks .... [info] TG1-swap_crypt (starting) ... TG1 -swap_crypt (started) ... TG1-swap_crypt (running) ... [info] TG1-tmp_crypt (starting) ...
[  ok  mp_crypt (started ) ... done.  {sic}
[ ok ] Activating lvm and md swap ... done.
[....]  Checking file systems ... fsck from util-linux 2.20.1
fsck.ext4: Unable to resolve 'UUID=a5fdb692-2b34-4e18-8fd5-c24dde957071'
fsck.ext4: No such file or directory while trying to open /dev/mapper/TH1-ken
Possibly non-existent device?
fsck.ext4: No such file or directory while trying to open /dev/mapper/TH1-martin
Possibly non-existent device?
fsck.ext2: No such file or directory while trying to open /dev/mapper/TH1-tmp_crypt
Possibly non-existent device?
fsck.ext4: No such file or directory while trying to open /dev/mapper/TH1-var
Possibly non-existent device?
fsck died with exit status 8
failed (code 8).  {code 8 means "an operational error"  -- my comment.}
[....]  File system check failed.  A log is being saved in /var/log/fsck.checkfs if
[FAIL] the location is writable.  Please repair the file system manually. ... failed!
[....] A maintenance shell will now be started.  CONTROL-D will terminate this [warning] shell and resume system boot. ... (warning).
Give root password for maintenance
(or type Control-D to continue):
----------------------------------------------------------------------------
I am reasonably certain that this failure is the main -- possibly the only -- reason the boot process fails to complete and load the DE.  I am also at a loss as to how to fix it.  The /etc/fstab file shows that those four partitions -- with file system ext4 -- are to be mounted in accordance with the partitions created during installation.  The output of the blkid command also shows the same information correctly.  In maintenance mode I was able to access all the "failed" mount points and write files to them.
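For anyone willing to check my work, here is a small sketch that lists every device or UUID that an fstab-style file expects, so that each entry can be compared by hand with the output of blkid and with /dev/mapper.  (The default path is an assumption; point it at the real /etc/fstab.)

```shell
#!/bin/sh
# Print the first field (device node, UUID=..., or LABEL=...) of every
# non-comment, non-blank line of an fstab-style file.
awk '!/^[[:space:]]*#/ && NF >= 2 { print $1 }' "${1:-/etc/fstab}"
```

Each line printed should correspond either to a device node under /dev/mapper or to a UUID shown by blkid; in my case the four "failed" entries appear in both, which is exactly what puzzles me.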

I would appreciate any advice from anybody out there as to how to make this computer operational.  Once again I apologize for the length of this post.

Regards, Ken Heard.

