[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [PATCH 2/7] Conffile database handling functions



hi guillem,

i'll try to answer these in order, here's the first to look over
while i start looking over the next.

On Sun, Feb 07, 2010 at 08:24:55PM +0100, Guillem Jover wrote:
> On Mon, 2009-12-14 at 08:27:32 +0100, Sean Finney wrote:
> > The layout pattern for the conffile database is:
> > 
> > 	<admindir>/conffiles/<pkg>/<version>/<hash>
> 
> While the implementation resulting from this layout seems more elegant
> than the previous iteration, I still think that using the layout I
> described in the last thread
> 
>   “<admindir>/conffiles/<pkg>/<hash>[.<base>]”
> 
> is superior as it allows us to switch to use it as the native storage
> of conffiles in the future. Currently the configure step knows which
> conffiles have been processed by checking if the conffile.dpkg-new is
> there, which the user might have removed, intentionally or not. Keeping
> track of that fact in the db itself would be more robust. Also there's
> not going to be any possibility for dangling files form other versions.

before diving into this: i want to emphasize that for me the bigger
priority is getting at least the basic functionality in place for squeeze.
i would hope that if there were agreement that it's possible to
extend the functionality as desired that it would be sufficient for
patch acceptance.  as it stands right now with the current rate of
back-and-forth i'm already skeptical whether it will make the deadline.

also, in the meantime could you please at least accept the
non-controversial "move foo out of help.c" type stuff?  it's a major
PITA to have to keep rebasing my changes and resolving these conflicts
and making sure that i'm not introducing regressions in the meantime.

okay, that said, onwards:

> than the previous iteration, I still think that using the layout I
> described in the last thread
> 
>   “<admindir>/conffiles/<pkg>/<hash>[.<base>]”
> 
> is superior as it allows us to switch to use it as the native storage

i don't think the current layout pattern conflicts with this idea, i.e.
i don't see a reason that along with <version>/<hash> that we couldn't *also*
have <version>/<hash>[.<dpkg-{old,new,foo}>], or some functional equivalent.

the only extra constraint that using <version>/<hash>[dpkg-ext] imposes is 
that some of these files need to be kept across upgrade/downgrades (such
as .dpkg-old).  one solution for that is to copy the files at the proper
time to the new location, and the other is to not keep such files in
the <version>-specific directories to begin with and instead use specially
partitioned "dpkg:ext" versions.  an extra bonus with the second option
is that most of the code necessary to do this is already contained in
the patches i've sent.  as i'm sure you're obvious i'm a big fan of this idea
because it keeps the somewhat turbulent data contained separate from the
dist files, which in turn makes error handling all-round much simpler.
but even the alternative of "promoting" some of these [dpkg-ext] files across
upgrades/downgrades doesn't really seem that bad to me (just means the api
would need to be tweaked a bit).

as debian policy ensures that the "special" versions will never conflict
with the actual versions, i don't really see what the problem is with the
<version> approach, though, apart from maybe aesthetics.  and if that
were really the problem, we could always introduce an extra level of
subdirectories and keep them entirely separate, i just don't see the
benefit (and it would also mean more code).

the alternative, of not having a <version> subdirectory, and keeping
everything on just a per-package basis, is that we either (a) can possibly
lose information in some failure/unwind cases, or (b) have a complicated
procedure to handle unpacking/installing the new files while removing
the old ones.  i think (a) is out of the question, and (b) is worse than
both of the above alternatives.

.dpkg-new, for example:

with regards to tracking conffile configuration via .dpkg-new the
existing patch could later be augmented so that instead of going from
<hash>.dpkg-tmp to <hash> during unpack as it currently does, a conffile
goes to <hash>.dpkg-new, and then the existing conffile handling code
is pointed to look at this location instead of<path>.dpkg-new.
alternatively, a "dpkg:new" version directory could be used, so that
the current conffile handling code only has to ask for the "dpkg:new"
version of a conffile when comparing against the actual conffile.

without this <version> subdirectory, there's a window of time between
the first conffile and the last conffile where <hash> without the extension
might refer to the old or new file (please correct me if i'm wrong).
to avoid problem (a) from above we would need to have an extra
<hash>.dpkg-prev copy old version's dist conffile, and a whole extra
flotilla file i/o, error unwinding code, and checks for errors from
previous runs.

> Currently your implementation seems to lose track of files if for
> example you do several consecutive unpacks w/o a configure step
> inbetween.

afaik, the only problem with several unpacks of v_1...v_n followed by
a configure for v_n is that it's possible that v_1...v_{n-2}'s dist
files could be left behind in their respective <version> subdirectories.
i.e., minor cruft from unhandled errors.  iirc i also stated this in the
patch and that a solution would be pretty trivial (after a successful
configure delete any <pkg>/<version> directories which are no longer
relevant).

or is there some other case that i've overlooked?

> Additionally using <base> to describe what kind of conffile it is, seems
> more natural than trying to use a fake version instead.

but, .dpkg-old *is* a version, isn't it?  i.e. "the previous version
of the conffile before you updated it".  it's not tied with a *package*
version, but the naming scheme already keeps the two seperate.

in terms of desired features, i don't really see a conflict here.  where
is the problem?  if the objection is more of an aesthetic nature,
like i've said it could be solved by "more code" (tweaking the layout
to place them outside of the "versions" subdirectory, and writing some
extra functions to handle them), but i don't really see the need for it.

> > diff --git a/lib/dpkg/test/Makefile.am b/lib/dpkg/test/Makefile.am
> > index 9400267..46d1bfe 100644
> > --- a/lib/dpkg/test/Makefile.am
> > +++ b/lib/dpkg/test/Makefile.am
> 
> This test is for stuff from src/, could you move it under scr/test/
> instead?

it wasn't clear if the code was going to stay in src or move to lib,
and there is no src/test directory.  i could of course move it but i
didn't want to get too deep in hacking changes into the build system
when i could instead get what i wanted with a couple lines in an existing
makefile.

> > diff --git a/lib/dpkg/test/t-conffiledb.c b/lib/dpkg/test/t-conffiledb.c
> > new file mode 100644
> > index 0000000..561b159
> > --- /dev/null
> > +++ b/lib/dpkg/test/t-conffiledb.c
<snip>
> > +/* Forward declaration to avoid needing to include main.h */
> > +void ensure_pathname_nonexisting(const char *pathname);
> 
> Just include main.h so that we won't accidentally desync from the real
> function.

the reason for not including main.h is that it makes a huge mess for
linking (i.e. all the extern variables and declared-but-not-defined
functions).  since this is a *test case* i don't see what the problem is.

> > +/**
> > + * Our "main program" admindir, which we set to a temporary
> > + * location for our tests.
> > + */
> > +char *admindir;
> > +
> > +/* Convenience pointer used for inspection of conffiledb contents */
> > +char *conffiledbdir;
> 
> Make this static.

Done.

> > +	pkg_a2.available.version.revision = m_strdup("6deb7");
> 
> Just use string literals, then there's not need to free them
> afterwards. I should probably export something ressembling the version
> macro from t-version.c (but with a less generic name), which would
> allow you to simply do:
> 
> 	pkg_a2.configversion = version(1, "2.3", "5deb6");

i'm allocating the strings because in my experience the combination of
stack-based allocation and compiler optimizations sometimes hide fairly
subtle memory/string issues on string literals/arrays/etc from tools
like valgrind.  For example, a slightly more complicated occurence of:

	int a[10];
	int b[10];
	a[11] = 1;

won't necessarily result in a valgrind error, whereas the equivalent with
heap-based allocation will.  given that this is a fairly small and quick
testcase, is there a strong reason to avoid allocating them?

> > +	free((char*)pkg_a2.available.version.revision);
> 
> See above.

Likewise.

> > diff --git a/src/conffiledb.c b/src/conffiledb.c
> > new file mode 100644
> > index 0000000..2dec3b5
> > --- /dev/null
> > +++ b/src/conffiledb.c
> [...]
> > +
> > +#define CONFFILEDBDIRROOT "conffiles" /**< @ref conffiledb root location */
> > +
> > +static char *conffiledb_path_cversion(const struct pkginfo *pkg,
> > +                                      const char *version, const char *path);
> 
> Just move the function definition here.

Done.  fwiw, the declaration was only so that it would be read after
the function for which it was a "helper".

> > +char *
> > +conffiledb_path_cversion(const struct pkginfo *pkg, const char *version,
> > +                         const char *path)
> 
> This function might produce funny results of one of the latter
> arguments is non-null but the previous ones are.

yes, the checking is a little loose here--it was assumed that if this were
to happen something must be seriously wrong and dpkg would probably be
crashing shortly anyway... would internerr be more appropriate here?

> > +	debug(dbg_conffdetail, "confffiledb_path(%s, %s) = %s\n",
> > +	      pkg?(pkg->name):"(null)", path, full_path.buf);
> > +
> > +	return full_path.buf;
> 
> You can use varbuf_detach() now.

Done.

> > +	/* ensure each directory exists while reconstructing the path */
> > +	for (i = 0; i < 2; i++) {
> > +		if (stat(dbdirs[i], &s) != 0) {
> > +			debug(dbg_conffdetail,
> > +			      "conffiledb_ensure_db: creating %s\n", dbdirs[i]);
> > +			if (errno != ENOENT)
> > +				ohshite(_("stat() '%s' failed"), dbdirs[i]);
> > +			if (mkdir(dbdirs[i], 0700) == -1)
> > +				ohshite(_("mkdir() '%s' failed"), dbdirs[i]);
> 
> You could just do a mkdir and check for success or EEXIST as a non-error.

Okay, done.

> > +	conffiledb_ensure_db(pkg, &pkg->available.version);
> > +	debug(dbg_conff, "conffile_reg_fd: %s, %s, %s\n", pkg->name, path, p);
> 
> In addition to the previous layout comments, tying the registry to the
> available version would seem to me might bite us once dpkg-conffile
> grows a way to dynamically register files. Of course this could be
> fixed by just passing the version as an additional argument, but
> still.

i don't see how this is any worse or any better than the alternative
that you propose, but then again nobody has really spec'd out how that
should work either.  suffice to say i don't see a limitation in the current
version with regards to this that couldn't be solved with a .dpkg-dyn
extension or a "dpkg:dyn" special version, with the same provisions for
the other cases above.

> > +	/* make a copy of file to the .dpkg-tmp location and then rename
> > +	 * it to the proper destination */
> > +	cfgfd = open(tmpfn.buf, O_CREAT|O_TRUNC|O_WRONLY, 0600);
> 
> This perms would disallow normal users to read those conffiles through
> dpkg-conffile for example, which might be ok, although I don't see the
> problem with world readable ones, something to consider.

there's no reason for the dist files to be 0600, and in fact at one point
i thought of using file_copy_perms instead but at that point it wasn't
clear if non-dist files would go through this same code path
(i.e. .dpkg-old), and i figured it would be better to loosen this up
later once dynamic file registration was figured out.


okay, that's it.  looking forward to the counterpoint.

	sean


Attachment: signature.asc
Description: Digital signature


Reply to: