[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: q re transferring mysql db from redhat to debian



On Wed, Jul 28, 2004 at 11:39:40AM +1000, Kevin Littlejohn wrote:
> >that's bizarre....and could easily lead to a hopelessly corrupted database
> >when other tables refer to that id field.
> >
> >how are you supposed to restore a mysql db from backup then?
> 
> Two answers:
> 
> 1) Why are you relying on the auto_increment field to increment from highest
> point each time?  So long as it gives you a unique value (and it should
> always do that), it shouldn't matter if it's re-using an old value (if it
> does, you shouldn't have deleted the old value...). 

i'm not.  i was just curious.

btw, sometimes it does matter if record ids are re-used.  e.g. one reason not
to re-use id numbers is if it's a search field on a web database.  if someone
bookmarks a particular search (e.g. for id=99) then returning to that bookmark
should either return the same record or it should return "no such record" if it
has been deleted.  it should never return a completely different record.

actually, this is true for any kind of app, not only for web databases.  e.g.
if your sales staff are used to entering product ids from memory, or if your
customers quote their customer ID, this can lead to serious confusion or
problems.  at best, some time will be wasted sorting out the mess.  at worst,
the wrong product may be shipped or the wrong customer may be billed....or the
wrong medical records may be referred to when consulting with a patient.

in short, unique IDs need to be unique forever(*), not just unique for the
present moment.

(*) or at least a reasonable facsimile of "forever" :)

> Certinaly, if you're referring to those IDs elsewhere, and you've 
> deleted the record it was referring to, good database design would be to 
> not leave the references lying around, imnsho.

true enough.

more to the point, good database design wouldn't LET you leave them lying
around.  note: i mean database design here, not application design or schema
design.  i mean the database engine itself should not allow this to happen, it
is not something that can or should be left up to the application to enforce,
it has to be enforced by the database engine itself.



> 2) You can set the point to increment from, in a fairly hackish way, by 
> doing a "alter table tbl_name auto_increment = x" where x is the highest 
> number in use.  Requires scripting around your backup/restore process, 
> unfortunately.

no big deal.  some scripting is almost inevitable in database backup and restore.

> With regard 1, the actual definition of auto_increment doesn't preclude
> re-use of numbers as far as I know, so if you're relying on it not to, you've
> got broken code anyway.  That means the mysqldump is doing the correct thing,
> according to spec for auto_increment - there's no requirement in there to
> retain the highest number.  The name of auto_increment is misleading,
> obviously ;)

ok. "works as designed" - it's not an implementation bug it's a design bug :)


> With regard Craig's comment, if your database leaves hanging references to
> non-existant data around, you've got a broken database, whether you've
> realised it yet or not.

true, i didn't think about that at the time.  it was just my initial reaction
to the idea that there was weirdness with restoring a mysql dump.  since
dumping to text (or other re-importable format) is the only good way of backing
up a database, it seems like a major problem....being able to *reliably* backup
and restore a database is, IMO, an essential feature of any database.  you need
to be certain that what you will restore is *identical* to what you backed up.

whether it actually is a major problem or not, i don't know.  that's why i was
asking.  the alter table workaround you mentioned seems reasonable.


OTOH, since mysql doesn't actually do transactions(*) or check referential
integrity, it's quite possible to have such references in the db.  and in this
case, an import like this will convert dangling references which point to
non-existent records into references that point to records that actually exist
(but aren't the right ones).

(*) yes, i know about innodb....but hardly anyone actually uses it because that
means giving up the only feature that mysql users (mistakenly) care about - raw
speed.  not that mysql is actually any faster in the real world with multiple
simultaneous readers and writers, but that's the mythology.

> General note:  We make a policy of using auto_increment _only_ to create
> sequence tables, which we manage ourselves.  This is in line with postgres
> and oracle's use of sequence tables, and makes porting easier.  We don't
> bother with ensuring that the next ID is higher than all previous ones - as
> long as they're unique, that's sufficient, any references to a defunct entry
> are removed when the entry is removed.

postgres sequences (and serial fields) are what i'm used to.

craig

-- 
craig sanders <cas@taz.net.au>

The next time you vote, remember that "Regime change begins at home"



Reply to: