
Re: Intresting dd fsck grub uuid fstab action




From: deblis@lionunicorn.co.uk

On Fri 26 May 2017 at 17:52:33 (-0400), Fungi4All wrote:
> From: deblis@lionunicorn.co.uk
>> On Thu 25 May 2017 at 16:41:37 (-0400), Fungi4All wrote:
>> > On 25/05/2017 at 05:11, Fungi4All wrote:
>> > > I experimented in switching a clone of my sid installation to an experimental, that was the plan.
>> > > Since I was doing other things I thought I'll let the cloning take place unattended.
>> > > Let's say sda5/6/7/8/9 were to be cloned to sdb5-9 (5 / 6 var 7 sw 8 tmp 9 home) all b partitions were slightly larger.
>> > > I used dd bs=1M for each one and thought all I had to do is log in sda5 edit the sdb5 fstab and then update-grub.
>>
>> … and I get the impression that that's exactly what you did:
>> 1 dd
>> 2a "log in sda5" (is that mount, or boot?)
>>
> Boot
>>
>> 2b edit an fstab file (to what purpose, at this point?)
>>
> Why would I edit the fstab in the original installation that I copied from?

How would I know which fstab you edited? You have this situation:

fsUUIDs   Disk A                           Disk B

          MBR points to any fsUUID ABCD    unknown MBR
          some other partitions            some other partitions
ABCD      sda5       identical with        sdb5
EF23      sda6       identical with        sdb6
4567      sda7       identical with        sdb7
89AB      sda8       identical with        sdb8
CDEF      sda9       identical with        sdb9

or do you? Why not:

fsUUIDs   Disk A                           Disk B

          MBR points to any fsUUID ABCD    unknown MBR
          some other partitions            some other partitions
ABCD      sdb5       identical with        sda5
EF23      sdb6       identical with        sda6
4567      sdb7       identical with        sda7
89AB      sdb8       identical with        sda8
CDEF      sdb9       identical with        sda9

IOW how did you decide which disk the kernel had decided to call sda?
How did you tell which was the original and which was the identical copy?
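
The situation in the two tables can be checked mechanically before anything is booted. What follows is only a minimal sketch: the device names and short UUIDs are the placeholder values from the tables, the function name is hypothetical, and on a real system the mapping would have to be filled in from a tool such as blkid.

```python
from collections import defaultdict


def find_duplicate_fs_uuids(uuid_by_device):
    """Group devices by filesystem UUID; return only UUIDs seen more than once."""
    devices_by_uuid = defaultdict(list)
    for device, fs_uuid in uuid_by_device.items():
        # UUIDs are compared case-insensitively.
        devices_by_uuid[fs_uuid.lower()].append(device)
    return {
        fs_uuid: sorted(devices)
        for fs_uuid, devices in devices_by_uuid.items()
        if len(devices) > 1
    }


# Placeholder values in the spirit of the tables above (not real UUIDs).
example = {
    "/dev/sda5": "ABCD", "/dev/sdb5": "ABCD",
    "/dev/sda6": "EF23", "/dev/sdb6": "EF23",
    "/dev/sda7": "4567", "/dev/sdb7": "0F0F",  # e.g. swap created fresh, not cloned
}
duplicates = find_duplicate_fs_uuids(example)
for fs_uuid, devices in sorted(duplicates.items()):
    print(fs_uuid, "is shared by", ", ".join(devices))
```

Any UUID reported here cannot tell the original and the copy apart, whichever disk the kernel happens to call sda.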

Because the source was on hd0 (physical unit) and the target on hd1,
I suspect grub does not worry much about the sda/sdb labels
but knows the master and slave from the BIOS.  Gparted
is not confused about what is where, and from contents and
size alone it is clear which is which, even when it decides to recognize a
usb-stick as sda.  It does care about uuids.
What is strange is that my error of not paying attention to
uniqueness did not have consistent results: not all 5 partitions had the same uuids.

> From sda5 I edited the new copy on sdb5 but sda5 did not boot my first
> time around.

I don't know what "my first time around" means. I've struggled to
reconstruct what you did from a list of numbered questions/comments
in your OP (not a strict list of actions) and some responses to
comments in your follow-up.

Let's say I used a live debian to copy 1 partition to another
(let's eliminate the confusion of distributed installations).
Let's say sda3 (uuid=abcd) to sdb3 (****) and I go read
the uuid for sdb3 and write it to the fstab on sdb3.
Then I intend to boot sda3 and run update-grub to include a
boot entry for sdb3.  When I attempt to do so, fsck finds
errors and confusion, sda3 boots in emergency mode and
does not see /var /tmp /home /swp.
So now, on this "first time around", I am trying to understand
why anything changed and why sda3 (a functional installation)
does not boot when the fstab on it matches the uuids, which
I never changed.  So I double check and I am wrong: they
have changed.  Was it because fsck, while booting, assigned
new uuids to stop the confusion?
Rounds 2 and 3 eventually result in booting the original.
Then grub is updated (maybe again).  Then I try to boot sdb3
and it boots sda3 but with the rest of the partitions (var tmp
home swp) from sdb.  By now I have placed dummy indicator
files on each desktop so I can tell which is which, and I can also tell
by looking at which partitions the file manager shows as mountable
but not yet mounted (partitions from other installations).

I asked "to what purpose do you edit fstab" because at this
stage I see no sense in it. The original disk has an MBR that
allegedly boots partition 5. Presumably it then mounts the
partitions of interest, 6-9, using its fstab.

Well, that was my initial question.  I could understand something
going bad with the copy but not the source.  The only logical explanation is that
fsck asked to replace the uuids and I said "y" and it did, without my
necessarily understanding fully what the problem was with
the partitions.  I remember it having to do with alignment and covering
up free space between two partitions.  Possibly this creates new uuids?
Sometimes when you create a subsequent partition with gparted
there is 1MB in between that can't be covered.

The "other" disk has an unknown MBR but is otherwise the same.
Were you to disconnect the original disk's data cable, you
could boot a rescue/live stick and install the correct MBR/grub
on the "other" disk and be able to boot *its* partition 5.
Its fstab would remain the same. All the UUIDs would still tie
up. Everything would be hunky dory so long as the disks never
"met" each other.

The disks have coexisted for some time now.  The sdb MBR is not used.
Grub in sda's MBR is responsible for booting all systems installed.
It does so by reading the grub.cfg from sda5.

> The fsck messages I could only think were pertaining to
> something wrong in the new partitions but even IT did not make sense.

I have no evidence to suggest that problems could be caused by
the trailing garbage in the copied partitions (which you said
were bigger), but it would spur me into finding out.

Knowing you cannot copy with dd a partition that has a size of 8gb and 4gb of
data into a 5gb partition, I made the target 8.5gb (for example), so now there
would be 4.5gb free instead of the 4 of the source.  I don't think this is a
problem.  Even one byte more than the source is adequate.  1 byte less causes
an error.


>>
>> 3 run update-grub
>>
> Yes, thinking that at least this would fix the original booting up.
> Either source or target wouldn't matter.

But I think this would screw up the (which?) grub.cfg royally.
And, unless you're actually a forensic scientist prepared to
carry out experiments and examine the code (because there's
some seriously valuable deductions to be made), picking over
the messed up file is probably a waste of time.

I don't understand a single bit of your worries here.  I had no intention
of ever editing grub manually, just doing a normal and regular update so
it finds all other bootable systems and updates itself.  The problem
was that grub recreated, over and over again, a wrong entry for sdb5, refusing to
have a correct one even when ALL UUIDs were unique and
properly registered in their corresponding fstabs.

The mess is very specific at this point, and I included the actual point of the problem:
in the one entry for sdb5 two different uuids were used, and as a result
the second, new installation would boot, according to grub's instructions,
sda5 with all the other sdb6-9 partitions.  When I edited those lines and wrote
the correct uuids into grub.cfg manually, the problem ended and everything
worked as it should.  Simple!

>>
>> Assuming that your copies are good, and that the apparent
>> "alterations" to sda… arise from misunderstanding or misinterpreting
>> UUIDs (or that they actually post-date your running update-grub), then
>> your problems arise from one or both of:
>>
> I had no intention to alter anything in sda, why would I. It was just a
> source to copy from an already functional system. The only change
> would have been an additional entry to grub.cfg in sda5 for booting to
> sdb5.

Computers don't take notice of intentions, only actions.

And this was done by updating grub which mixed and matched ids even though
it clearly found and recognized the installation on sdb5.

>> Booting into sda5 with both devices sda and sdb connected,
>> Running update-grub with both devices sda and sdb connected.
>>
> That only happened after I had already copied the "new" uuids on sdb5
> to the fstab there.

That's no help. You've already booted a system with duplicate UUIDs
using a version of grub that finds things by UUID (choosing between
original and copied partitions at random?, who knows?) and a kernel
that guesses at random which disk to label /dev/sda and which to
label /dev/sdb.

OK, since you blame every single bit of it on what I did, why don't you
tell me what the correct steps are for doing exactly what I wanted to
do: copy an installation onto another hard disk and have it booted by
the same original grub that handles all the other systems.
Because I am clueless at this point about what exactly I did wrong and
why such a simple procedure becomes such an enormous source of problems.

Also please explain the logic of the developer of dd in "copying and transferring"
a UUID that is supposedly meant to be unique.  Why not always assign a new
one every time?  Because it is meant for backup work and not for copying?
Why would grub, when updated, write a new entry (grub's terminology) for the
new installation and read its UUID, but put a different, mismatched UUID in the same
entry?

I really like to know so I don't waste anyone's time writing a bug report.
And I really think this list is meant for that, saving time writing silly bug-reports.

> Simple copy-paste of 5 uuids from /dev/disk/by-uuid

I don't understand this. You seem to be fiddling round with the
UUIDs in the fstab(s) without being concerned about the actual
duplicate fsUUIDs embedded in the filesystems.

I am telling you over and over again, I had no clue and never in my
experience had I realised that dd "sometimes" does this.  Not consistently!
I discovered this in practice and "fixed it" by using gparted and making
new uuids for anything duplicated (which I believe only happened in 2
out of 5 partitions).  That is when I fixed the fstabs and then booted
sda5 and asked it to update its grub.  Then the problem became one of
grub alone.  I don't know how else I can explain it.  If I say I do this
in step 4, you tell me I shouldn't because there was a problem in step 2.
I tell you what I did to fix it to get to 4, and you tell me that what I did in 2
caused 4.

It shouldn't take 5 hours of guessing to copy a 5gb system from one
disk to the next and expect to be able to run both in the next hour
or so.  This is ridiculously complex and inconsistent.  Why so much
redundancy?  And this UUID thing has got me thinking in ways that
I didn't before.  Why, within a single system (not a WAN), does uniqueness
have to be in the order of trillions and trillions?


>>
>> > > 1 - Does dd leave uuid targets as they are, does it create new ones, or does it copy uuid from source to target?
>>
>> I assume that's your reporting of step 1.
>>
> ???

As I said, your OP didn't give a strict timeline of actions and
consequences, but just a numbered list of questions. Thus I wrote:

… and I get the impression that that's exactly what you did:
1 dd

If you hadn't asked this question, I might have written something
like:

… and I get the impression that this might be what you did:
1a dd
1b tune2fs -U … to create new UUIDs on the target disk

IOW in the absence of a list of actions, I was reconstructing
one in order to get a handle on what might have happened.
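
On the first question itself (does a raw copy carry the UUID across?), here is a small self-contained sketch. It never touches a real disk: it builds a fake in-memory image, assuming the ext2/3/4 layout in which the superblock starts 1024 bytes into the partition and holds the 16-byte filesystem UUID at offset 0x68. The function names and the UUID value are made up for illustration.

```python
import uuid

SUPERBLOCK_OFFSET = 1024  # ext2/3/4: superblock starts 1024 bytes into the partition
MAGIC_OFFSET = 0x38       # s_magic field (0xEF53, little-endian)
UUID_OFFSET = 0x68        # s_uuid field (16 bytes)


def make_fake_ext_image(fs_uuid, size=4096):
    """Build a zero-filled image carrying only the magic number and the UUID."""
    image = bytearray(size)
    magic_at = SUPERBLOCK_OFFSET + MAGIC_OFFSET
    image[magic_at:magic_at + 2] = (0xEF53).to_bytes(2, "little")
    uuid_at = SUPERBLOCK_OFFSET + UUID_OFFSET
    image[uuid_at:uuid_at + 16] = fs_uuid.bytes
    return bytes(image)


def read_fs_uuid(image):
    """Read the filesystem UUID back out of the (fake) superblock."""
    uuid_at = SUPERBLOCK_OFFSET + UUID_OFFSET
    return uuid.UUID(bytes=image[uuid_at:uuid_at + 16])


def raw_copy(source, target_size):
    """Byte-for-byte copy into a target of the given size, in the manner of dd."""
    if target_size < len(source):
        raise ValueError("target is smaller than source")
    target = bytearray(target_size)    # the space beyond the source stays unused
    target[:len(source)] = source
    return bytes(target)


original = make_fake_ext_image(uuid.UUID("01234567-89ab-cdef-0123-456789abcdef"))
clone = raw_copy(original, len(original) + 512)   # target slightly larger
print(read_fs_uuid(original) == read_fs_uuid(clone))
```

A byte-for-byte copy knows nothing about filesystems, so the UUID stored inside the filesystem travels with every other byte. Giving the copy a new identity is a separate step on the target (the tune2fs -U of step 1b above).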

>>
>> > > 2 - When I rebooted sda5 was not booting up properly. Eventually I checked
>> > its fstab and corrected it but at this point the system ran an fsck and I got
>> > some errors and hit y,y,y,y, for fixing. I don't know if the errors came from
>> > uuids existing in the system twice.
>>
> Something about partition inconsistencies but the print was too small for me to read
> in detail, never mind remembering.
>>
>> Is that "log in"? What did you correct in fstab? What was wrong?
>>
> The only thing I did to fstab up to this point was at the target installation, not the original.

How do you know that? You haven't presented any evidence that you knew
which disk was which when you first started making changes. All your
actions are predicated on knowing which disk was which. When I was
cloning systems 15 years ago, that was relatively easy. /dev/hdaX
corresponded with the cables and jumpers, masters and slaves.
Now, you can't make those assumptions.
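
One way to gather that evidence, sketched here under assumptions: on a Linux system running udev, /dev/disk/by-id holds symlinks named after the drive model and serial number, each pointing at whatever kernel name the disk received on this boot. The link names in the example are invented, and the function names are hypothetical.

```python
import os
import posixpath


def kernel_names(link_targets):
    """Map each stable by-id name to the kernel name its symlink target ends in."""
    return {name: posixpath.basename(target)
            for name, target in link_targets.items()}


def read_by_id(directory="/dev/disk/by-id"):
    """Collect the symlinks found in the by-id directory (assumed udev layout)."""
    return {
        name: os.readlink(os.path.join(directory, name))
        for name in sorted(os.listdir(directory))
        if os.path.islink(os.path.join(directory, name))
    }


# Invented link names; on a real machine use kernel_names(read_by_id()).
example = {
    "ata-EXAMPLE_DISK_SERIAL-A": "../../sdb",
    "ata-EXAMPLE_DISK_SERIAL-A-part5": "../../sdb5",
    "ata-EXAMPLE_DISK_SERIAL-B": "../../sda",
}
names = kernel_names(example)
for stable, kernel in sorted(names.items()):
    print(stable, "->", kernel)
```

In this invented example the disk with serial A, which might have been sda on an earlier boot, has come up as sdb. Only the serial-based name says which physical disk is meant.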

> It took a few reboots trying to figure out why the source was so affected. At this point
> I think I got the target to boot but what it was doing as the entry of Debian on sdb5 it
> would boot sda5 while using sdb6,7,8,9. The reason I guess is because sda5 and sdb5
> still had the same uuid. At some point when I used gparted to check all 10 partitions and
> create new uuids and labels, and redid both fstab to the new uuids I went back to sda5
> and reupdated grub. Then the entry for sdb5 was still booting up from sda5!
> That is when I went into grub.cfg and discovered the two uuids used for that entry had
> 2 different uuids, for the same entry. The first was sdb5 the next was sda5's uuid.

As I said, by now things could be so mixed up that you either put on a
forensic hat to work out what happened or you do as you did and just
put things straight manually once you have a set of unique UUIDs
(forgive the tautology).

Unless you take the forensic approach, there is no justification for
blaming Grub or Fsck for screwing things up (or apparently doing so).
You just have to accept that some of your assumptions are not based
on evidence, and are probably incorrect.

>>
>> Is sda the same disk as it was when you ran dd? There's no reason why
>> it should be if the kernel races to label them sda and sdb. How did
>> you check?
>>
> Yes, dd ran from a stick sdc "dd if=sda5 of=sdb5 bs=1M" then 6,8,9
> 7 was swap created with gparted, why copy it?

Well, you implied you cloned it in your very first paragraph. What
reason do I have to doubt you? Shaky ground, I know.

So let's assume now that you didn't copy it, and partition 7 on the
target was empty/FAT/something/whatever. Each time the system ran,
did you observe the swap file being started? Either of the fstab files
would point to the correct partition because this UUID is one
that didn't get duplicated. So you've either got to examine dmesg
carefully for a line like
  Adding 979960k swap on /dev/sda4. Priority:-1 extents:1 across:979960k FS
and check that the device named there is on sda and not on sdb,
or notice this fact as it flashes by. Why would you be looking out for it?
At this point in time, you were confident that the source was still
sda and the copy sdb.
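
Picking that line out of a saved dmesg log can be sketched as below. The pattern is built only around the sample line quoted above, so other kernels may word it differently, and the function names are hypothetical.

```python
import re

# Matches lines such as: "Adding 979960k swap on /dev/sda4. Priority:-1 ..."
SWAP_LINE = re.compile(r"Adding (\d+)k swap on (/dev/\S+?)\.\s")


def swap_devices(dmesg_text):
    """Return (device, size_in_kib) for every swap activation in the log."""
    return [(m.group(2), int(m.group(1))) for m in SWAP_LINE.finditer(dmesg_text)]


def parent_disk(partition):
    """Strip the trailing partition number (enough for sdXN-style names only)."""
    return re.sub(r"\d+$", "", partition)


sample = (
    "Adding 979960k swap on /dev/sda4. "
    "Priority:-1 extents:1 across:979960k FS\n"
)
devices = swap_devices(sample)
for device, size_kib in devices:
    print(device, "on disk", parent_disk(device), "-", size_kib, "KiB")
```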

>>
>> Why does a system suddenly run an fsck?
>>
> It ran in recovery mode so I could run update-grub before a full boot-up.
> Before the root login prompt it reported disk problems that I never
> had before with the same disks. But I'm pretty sure it checks filesystems
> on every boot because I remember it says hit ctrl-C to skip file checks.

My question was asked before you said that you had rebooted many
times, so no surprise to me anymore.

↓ that's a "3".
>>
>> > > 3 - Updating grub seemed to have made a bigger mess as now I could
>> > boot up but the partitions were mixed up between sda and sdb parts. It
>> > would boot-up on root / of sda and have a home in sdb9 ....
>>
>> Yes. I would expect grub to be completely confused about what to
>> write in …wherever/grub/grub.cfg and possibly have no idea which
>> MBR it related to, with two disks present.
>>
> With the same 2 disks grub had been updated before, at least 2 times
> recently. I don't understand your reasoning, but again: After all uuid's
> were unique and both fstabs have been edited and double-checked.

In your OP, no such thing had yet been done. Correcting things was your
"4" and my comment here is about "3" (see above).

> update-grub runs and on the next reboot it is asked on menu entry
> Debian on sdb5 to boot and it boots sda5 because on its cfg file it has
> 2 different uuids for the same entry.
> Any explanations?

That's very good of you to make the UUIDs unique, but I'm not sure
what role you think fstab plays in getting grub.cfg sorted out.
Also bear in mind that the copied (and possibly mangled/out-of-date)
grub.cfg can yield input to grub-mkconfig being run on the original
partition.

In the absence of all the information about both disks, and the
then existing grub.cfg files, it wouldn't be worth even speculating
about what grub-mkconfig would come up with.

>>
>> > The first thing I did after dd was done, was to make a txt table of all
>> > the sda* sdb* in order with uuids and rewrote the fstab on sdb, I thought
>> > the sda would be as it was. But it wasn't. Not all uuids were transferred,
>> > but seemingly enough for me not to notice they were the same and for sda5
>> > to boot up and discover the discrepancy. I speculate that fsck then, instead
>> > of giving sdb a new uuid, altered sda. By the time the check was done
>> > and I ran grub-upd some of the uuids in grub belonged to sda partitions and
>> > some to sdb.
>> > By copying from sda to sdb I never thought the original would be affected,
>> > only the target needed fixing. That threw me off. But in the new grub.cfg
>> > both entries would boot from sda but with the remaining partitions being
>> > a mismatch.
>>
>> This makes me wonder whether you've taken on board Pascal's comments
>> re different types of UUIDs.
>>
>> And when did fsck start handing out UUIDs to filesystems, let alone devices.
>>
> I have no clue what fsck did, I assumed at that point that leaving gaps between
> partitions in an extended one was a problem and extended the end of a partition
> to avoid the gap. I am clueless of fsck's capabilities.
>>
>> > > 4 - Eventually to stay safe I booted from live usb and edited all the
>> > partitions, gave them separate labels to avoid any confusion (as "Deb
>> > on Sda6 var..."_) and switched to all new uuids to clear up the mess.
>> > Re-edited fstab on both sda and sdb (hd0 and hd1 dos5) and rebooted
>> > sda and updated-grub again.
>>
>> And we're left guessing what the contents are.
>> Do remind me, what is msdos3 here?
>>
> I am not on that system now, but it is a different system and it is a logical part.
> I remember 4 is the extended partition and the 5-11 are within it.

I don't think 3 can be a logical partition number.

>>
>> > > 5 - Here is the unexplained part: My sda5 system booted up fine and normal, when I would try sdb5 the second line would say debian clean on sda5!!!
>> > > So I went into grub.cfg to see what is going on. On the menu entry of the second installation the first line of each entry would have the proper sdb5 uuid, the following line - which I believe is the actual command - had the uuid from sda5. So, am I to assume that grub when it updates only replaces the first line of each entry expecting the next uuid will be the same, even if it is not?
>>
>> No idea. Again, we don't know what you've written where, when you say
>> "switched to all new uuids to clear up the mess". And what's the
>> "second installation"? What are you calling an entry, and a menu entry?
>>
> Instead of straining my eyes to read uuids digit by digit, and since once published
> they are unique, at about the 3rd round I used gparted and asked it to issue new
> uuids, to make sure they are unique.
> Grub at boot comes up and has some entries, ie
> Windows7
> Debian 8
> Debian 8 recovery
> memcheck
> Debian 9
> Debian 9 recovery
> Debian 10
> FreeBSD
> Manjaro
>
> This is what I call menu entries. They are separated in the cfg file by groups 00
> 10 20 30 ... and contain sub-entries of each installation, as recovery modes
> and booting other linux images. If they are not called menu entries this is what
> I mean by entry.

The OS that's running is what populates the 10 section. 20 is always
empty for me. 30 contains other OSes, and some of that information is
copied from their grub.cfg files, as Pascal has already pointed out.
This may be why there was odd-looking information in your new grub.cfg.
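
A generated grub.cfg can be scanned for entries of the kind described, where the UUID on the search line and the UUID passed to the kernel as root=UUID=… disagree. This is a rough sketch only: it assumes the usual layout of a generated file (menuentry, then a search line with --fs-uuid, then a linux line), the sample text and UUIDs are invented, and a mismatch is not automatically an error, since an installation with a separate /boot partition legitimately has two different UUIDs there.

```python
import re

MENUENTRY = re.compile(r"""^\s*menuentry\s+(['"])(.*?)\1""")
SEARCH_UUID = re.compile(r"^\s*search\b.*--fs-uuid\b.*\s([0-9A-Fa-f-]+)\s*$")
ROOT_UUID = re.compile(r"\broot=UUID=([0-9A-Fa-f-]+)")


def mismatched_entries(grub_cfg_text):
    """Return (title, search_uuid, root_uuid) for entries whose UUIDs differ."""
    results = []
    title = search_uuid = None
    for line in grub_cfg_text.splitlines():
        entry = MENUENTRY.match(line)
        if entry:
            # A new menu entry starts: forget the previous entry's search UUID.
            title, search_uuid = entry.group(2), None
            continue
        if title is None:
            continue
        found = SEARCH_UUID.match(line)
        if found:
            search_uuid = found.group(1).lower()
            continue
        root = ROOT_UUID.search(line)
        if root and search_uuid and root.group(1).lower() != search_uuid:
            results.append((title, search_uuid, root.group(1).lower()))
    return results


# Invented sample; the UUIDs are placeholders, not real ones.
sample_cfg = """
menuentry 'Debian GNU/Linux' --class debian {
    search --no-floppy --fs-uuid --set=root aaaa-1111
    linux /boot/vmlinuz root=UUID=aaaa-1111 ro quiet
}
menuentry 'Debian GNU/Linux 9 (on /dev/sdb5)' --class gnu-linux {
    search --no-floppy --fs-uuid --set=root bbbb-2222
    linux /boot/vmlinuz root=UUID=aaaa-1111 ro quiet
}
"""
suspects = mismatched_entries(sample_cfg)
for title, search_uuid, root_uuid in suspects:
    print(title, ": search", search_uuid, "but root", root_uuid)
```

Run against the grub.cfg of each installation, a report like this shows which file an odd-looking line could have been copied from.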

>>
>> > I know if I boot
>> > up something on sda12 and install or reconfigure grub in there then 12 will
>> > take control of booting all other bootable partitions in the system.
>>
>> Well, that depends on whether you install it or just configure it.
>>
> If it exists in the mbr and grub as a package is installed in sda12 then updating
> takes control of booting from sda5 and on the next boot the grub.cfg that matters
> is on sda12, right? If I monitor, as I do, to only allow sda5 to ever update
> grub the rest are there installed in case I ever need to use them.

No. Your language is too imprecise to convey any unambiguous meaning.
Grub as a package should be installed in all the instances of linux.
Each will maintain its own /boot/grub/grub.cfg (assuming you don't
put this in a separate partition and share it between them).

In the straightforward case, the last system that "installed grub
in the MBR" is the system whose grub.cfg will be consulted at boot.
"installed grub in the MBR" has nothing to do with installing the
grub package (like apt-get install grub*) and everything to do with
running the program called grub-install.

>>
>> When you generate a new grub.cfg file, grub collects bits and pieces
>> from the grub.cfg files it finds on the various partitions it guesses
>> are linux systems. This new grub.cfg will normally be written in
>> whatever filesystem contains /boot/grub/. However, that's not
>> necessarily the grub.cfg that was read and acted upon when you just
>> booted, nor necessarily the grub.cfg that will be active when you next
>> boot. You can demonstrate this to yourself by adding a word, say, to
>> one of the id strings.
>>
> So let's say I install lubuntu in a single new partition sda100, and reboot the
> system Debian on sda1 for example, and flubuntu is not even there, there is
> a /boot but no /boot/grub and I go and manually create a boot/grub/grub.cfg
> and I write "all work and no play makes Jack a dull boy" will it make a difference
> to Debian's grub? I believe it will find the /boot and the images there and
> write a "menu entry" to properly boot flubuntu and even display the available
> kernels to boot from. Now if I boot flumuntu, install grub, then the "...Jack" will
> be replaced with new entries. This has been my experience.

It may well be, but the language here is too ambiguous to be of any
help if, say, one step didn't work properly and this was your explanation
of what you did.

> Nothing to do with auto-updating and replacing one uuid and leaving an old
> one in there for the same "entry" "pick" single booting option which caused
> unnecessary havoc.
>>
>> A grub-install, the one that writes typically to the MBR, will make
>> future boots read the new configuration file.
>>
>> Thus, if you maintain a backup linux installation on your machine,
>> and the kernel is updated on the backup, the new grub.cfg that is
>> then configured will have no direct effect on booting. However, if
>> your backup system gets an update of its grub package, that will
>> write a new boot image, and now the backup's grub.cfg will be the
>> active one on next booting.
>>
> Perfect agreement, by experience this is exactly how it has worked
> from day one of ever seeing a grub screen. I have not even hinted
> otherwise.

I'm not trying to suggest that you said otherwise, but only that your
language is too ambiguous for somebody to be convinced that you
understand it. You wrote "I know if I boot up something on sda12 and
install or reconfigure grub in there then 12 will take control of
booting all other bootable partitions in the system."
If you "reconfigure grub in there", 12 will *not* take control.
Now you may know what you mean by "install or reconfigure" but you
cannot expect somebody else (me) to read your mind.

In the paragraph you wrote about lubuntu in sda100, you wrote
"will it make a difference to Debian's grub". Again, you know
what you mean. I don't. What's Debian's grub?

Grub is a booting system, like LILO.
Grub is a suite of programs in a Debian .deb file used to configure
and install such a booting system.
Grub is some pieces of code in the MBR and elsewhere (some poorly
defined) that run in the CPU at boot time.
Using the word Grub involves making plain which part of grub
you're talking about, both in space and time.

>>
>> > [various complaints] I think this is grub's fault.
>>
>> I'm not sure blaming grub will help you straighten out your system.
>> I only hope that one of your disks still carries a suitable clone
>> of the other.
>>
> 1st of all my problem was all figured out and both work fine, is not
> a matter of trying to fix anything. I had it all fixed the night/morning
> before I wrote the message.
>
> What I am looking for is an explanation why grub on its own will
> blend two different uuids for the same "entity" and booting action.
> It labeled the entry on its own, Debian 9 on sdb5 and its single
> action contained two different uuids that I did not edit. It would
> then proceed with booting and end up on mounting sda5.
> Because the one id it had was from sdb5 the second
> was from sda5. Sda5 was all correct at that point, but on grub's
> primary and residing entry it does not even use a uuid. Right?

Well, it now appears that you are an experienced practitioner of
blending systems with non-unique UUIDs. All I can do is sit on
the sidelines and challenge your assumptions about what happened.
When you conduct a post-mortem, the first thing you have to do
is throw away all your assumptions and consider only the evidence
and the possibilities.

From your last paragraph, particularly the "grub on its own"
expression, I don't think you've yet come to terms with that.
From your last word, it appears you just want someone sitting
on the other side of the world to agree with your assumptions,
someone who has only read reported fragments of evidence
filtered through your assumptions.

Cheers,
David.


