
Re: Data loss: suggestions for handling



Matthew Palmer (2003-08-01 19:51:46 +1000) :

> The latest upstream version of a package I've begun to maintain,
> IRM, has a problem in that a portion of the data in the system
> (relating to software and licence assignment) can't be upgraded
> along with the rest of the database - the schema is totally
> different.

Do you have an upgrade script?  Like a set of SQL commands that will
convert from one schema to the other?  More importantly, do you have a
set of criteria to check that the upgrade went smoothly and is now
complete?  If so, then I've done that successfully.

> I've thought about it for a while, and I can't come up with any good
> way to make the change.  The best I've come up with so far is to put
> a question in the postinst which warns the user and allows them to
> continue if they're sure, or they can CTRL-C out and backup.  If
> it's running in non-interactive mode, the install aborts.  I really
> want to make sure the user doesn't lose all their data.

  I faced the same problems with sourceforge and gforge.

> A couple of questions:
>
> * Am I being too paranoid?

  Probably not.  Maybe some of your users won't mind too much if they
lose data, but most of them probably will.  Then there's the personal
pride in building a crash-proof system even if nobody notices.

> * Can anyone think of another way of handling this?  I can think of
> a couple of other ways:
>
> 	- create an irm1.4 package, but that is, shall we say, ugly

  Ugly indeed.

> 	- dump the old software tables and store the dump somewhere,
> 	giving pointers to the dump in all sorts of useful places.

  Could be an option.

> I appreciate any comments or suggestions anyone has as to how I
> could proceed.

  The way I did it for the sourceforge and gforge packages is this: I
have a special table (called debian_metadata), in which I can store
key-value pairs.  One of the keys (okay, the only one normally) is
"db-version", and the corresponding value is a version number (with
the same ordering semantics as the version numbers dpkg compares).
When I need to upgrade something, I go through the following steps:

,----[ One upgrade chunk ]
| Begin transaction
| Do stuff
| Check that stuff went fine (and raise an exception/abort if not)
| Update the value for db-version
| Commit transaction
`----
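
  As a rough illustration, here is a minimal sketch of one such chunk
in Python with sqlite3.  This is not the gforge code (that one is Perl):
the function name run_chunk, the column names "name" and "value", and
the check callback are all assumptions made for the example.

```python
import sqlite3


def run_chunk(conn, target, statements, check):
    """Apply one upgrade chunk atomically; undo everything on failure.

    Assumes `conn` was opened with isolation_level=None so that the
    explicit BEGIN/COMMIT below are the only transaction control, and
    that debian_metadata(name, value) already holds a 'db-version' row.
    """
    cur = conn.cursor()
    cur.execute("BEGIN")                      # Begin transaction
    try:
        for sql in statements:                # Do stuff
            cur.execute(sql)
        if not check(cur):                    # Check that stuff went fine
            raise RuntimeError(f"check failed for db-version {target}")
        cur.execute(                          # Update the value for db-version
            "UPDATE debian_metadata SET value = ? WHERE name = 'db-version'",
            (target,),
        )
        cur.execute("COMMIT")                 # Commit transaction
    except Exception:
        if conn.in_transaction:
            cur.execute("ROLLBACK")
        raise
```

  Because the db-version update sits inside the same transaction as the
schema changes, a failed check leaves both the data and the recorded
version exactly as they were before the chunk started.
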

This is of course only executed if the current db-version is less than
the target version.  So I have a series of upgrade chunks, arranged
like this:

,----[ Upgrade script ]
| $target-version = 1.0
| if ($current-version < $target-version) {
|    Do the upgrade chunk for target=$target-version
| }
| 
| $target-version = 1.1
| if ($current-version < $target-version) {
|    Do the upgrade chunk for target=$target-version
| }
| 
| $target-version = 1.4
| if ($current-version < $target-version) {
|    Do the upgrade chunk for target=$target-version
| }
`----

  The postinst script always runs this upgrade script.  So all the
steps that were previously completed are not replayed, and you'll
start at the first one that hasn't been successfully run yet.  If one
step fails, the transaction is aborted and the user is presented with
an error message giving out info such as the current db-version, the
SQL statement that went wrong, and so on.  The user is then asked to
report the bug with this info :-)
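
  To make the skip-and-resume behaviour concrete, here is a hedged
sketch of such a driver in Python with sqlite3.  Again, this is not
what db-upgrade.pl contains: the names upgrade, UpgradeError and
parse_version are hypothetical, the metadata columns "name" and "value"
are assumed, and parse_version only handles dotted integers rather
than the full dpkg version ordering.

```python
import sqlite3

SET_VERSION = "UPDATE debian_metadata SET value = ? WHERE name = 'db-version'"


class UpgradeError(Exception):
    """Raised with enough context for a useful bug report."""


def parse_version(text):
    # Simplification: dotted integers only, not real dpkg ordering.
    return tuple(int(part) for part in text.split("."))


def current_version(conn):
    row = conn.execute(
        "SELECT value FROM debian_metadata WHERE name = 'db-version'"
    ).fetchone()
    return row[0]


def upgrade(conn, chunks):
    """Run, in order, each (target, statements) chunk newer than db-version.

    Assumes `conn` was opened with isolation_level=None.
    """
    for target, statements in chunks:
        if parse_version(current_version(conn)) >= parse_version(target):
            continue                      # completed on an earlier run
        sql = "BEGIN"
        conn.execute(sql)
        try:
            for sql in statements:
                conn.execute(sql)
            sql = SET_VERSION
            conn.execute(sql, (target,))
            conn.execute("COMMIT")
        except sqlite3.Error as exc:
            if conn.in_transaction:
                conn.execute("ROLLBACK")
            raise UpgradeError(
                f"upgrade to {target} failed; db-version is still "
                f"{current_version(conn)}; statement: {sql!r}; error: {exc}"
            ) from exc
```

  A postinst could call upgrade() unconditionally on every run: chunks
whose target is not newer than the stored db-version are skipped, and a
failure stops at the first broken chunk with the earlier ones still
committed.
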

  All this requires a transactional database, obviously, but they're a
dime a dozen these days.

  For more details, apt-get source gforge and have a look at the
deb-specific/db-upgrade.pl script.

Roland.
-- 
Roland Mas

You can't second-guess ineffability, I always say.
  -- Aziraphale, in Good Omens (Terry Pratchett and Neil Gaiman)


