
Re: Automated testing - design and interfaces



Anthony Towns writes ("Re: Automated testing - design and interfaces"):
> On Thu, Nov 17, 2005 at 06:43:32PM +0000, Ian Jackson wrote:
> >   The source package provides a test metadata file debian/tests/
> >   control. This is a file containing zero or more RFC822-style
> >   stanzas, along these lines:
> > 	  Tests: fred bill bongo
> > 	  Restrictions: needs-root breaks-computer
> >   This means execute debian/tests/fred, debian/tests/bill, etc.,
> 
> Seems like:
> 
>   debian/tests/bar:
>     #!/bin/sh
>     # Restrictions: needs-root trashes-system
>     # Requires: foo

Urgh.  I'm really not a fan of those files which mix up different
languages.  We'll end up with a complicated scheme for separating out
the test metadata from other stuff appearing in the comments at the
top of files (Emacs and vim modes, #! lines, different comment
syntaxes in different languages, etc.)

Also, we want to be able to share the actual tests - that is, the meat
of the work - with non-Debian systems.  So we should separate out the
metadata (which describes when the test should be run and where it is,
and is Debian-specific) from the actual tests (which need not be
Debian-specific).
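
As a rough illustration of keeping the metadata in its own file, here is
a minimal Python sketch of reading RFC822-style stanzas such as those
proposed for debian/tests/control.  The function name parse_control and
the second stanza ("Tests: quick") are made up for this example; this is
not an existing tool, and the field handling is an assumption about how
such a file might be read.

```python
def parse_control(text):
    """Split RFC822-style text into stanzas, each a dict of field -> value."""
    stanzas = []
    current = {}
    last_field = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current stanza.
            if current:
                stanzas.append(current)
            current, last_field = {}, None
        elif line[0] in " \t" and last_field is not None:
            # Continuation line: append to the previous field's value.
            current[last_field] += " " + line.strip()
        else:
            name, _, value = line.partition(":")
            last_field = name.strip()
            current[last_field] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


# Hypothetical file contents; only the first stanza comes from the draft.
example = """\
Tests: fred bill bongo
Restrictions: needs-root breaks-computer

Tests: quick
"""

for stanza in parse_control(example):
    tests = stanza["Tests"].split()
    restrictions = stanza.get("Restrictions", "").split()
    print(tests, restrictions)
```

The test scripts themselves (debian/tests/fred and so on) stay free of
any Debian-specific header, so they can be shared unchanged.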

>  Is the "Depends:" line meant to refer to other Debian packages (and
> thus be a lower level version of Restrictions:) or is it meant to
> indicate test interdependencies? If it's meant to be for debian
> packages, maybe
>   # Restrictions: deb:xvncviewer
> might be better.

Yes, Depends is semantically much like Restrictions but refers to a
Debian package (which must be installed on the test system).  However,
Depends might have version numbers etc. - it's just like a Depends
field.  I don't want to try to mix that with the simple syntax of
Restrictions.

IMO it's better to have two fields if the structure (and hence the
syntax) of the information is going to be significantly different,
even if there's a certain similarity to the semantics.
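
To make the difference in structure concrete, here is a small Python
sketch.  It assumes Restrictions is a whitespace-separated list of
words, and that each Depends entry is a package name with an optional
parenthesised version constraint.  The regular expression is a
simplification written for this example (it ignores alternatives with
"|" and architecture qualifiers), not the real dependency grammar.

```python
import re

# Assumed shape of one Depends entry: "name" or "name (OP version)".
DEP_RE = re.compile(
    r"^\s*([a-z0-9][a-z0-9+.-]*)\s*"
    r"(?:\(\s*(<<|<=|=|>=|>>)\s*([^)\s]+)\s*\))?\s*$"
)


def parse_depends(value):
    """Return a list of (package, operator, version) tuples."""
    deps = []
    for item in value.split(","):
        m = DEP_RE.match(item)
        if m is None:
            raise ValueError(f"bad dependency: {item!r}")
        deps.append(m.groups())
    return deps


def parse_restrictions(value):
    """Restrictions are plain words, so splitting on whitespace is enough."""
    return value.split()


print(parse_depends("xvncviewer, bar (>= 3.14)"))
print(parse_restrictions("needs-root breaks-computer"))
```

Folding both into one field would force the simple word list to carry
the version syntax as well.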

> Note that it's often better to have a single script run many tests, so
> you probably want to allow tests to pass back some summary information,
> or include the last ten lines of its output or similar. Something like:
> 
>   foo FAIL:
>     FAILURE: testcase 231
>     FAILURE: testcase 289
>     FAILURE: testcase 314
>     3/512 test cases failed

This is no good because we want the test environment to be able to
tell which tests failed, so the test cases have to be enumerated in
the test metadata file.

You do have a point about not necessarily starting a single process
for each test.  An earlier version of my draft had something like
  Test: .../filename+
where the + meant to execute filename and it would print
   138: PASS
   231: FAIL
   289: FAIL
   314: SKIP: no X11
or some similar standard format.
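
A test runner could read such output along these lines.  This is only a
sketch in Python: the exact line format was never fixed, so the pattern
below (name, colon, PASS/FAIL/SKIP, optional colon and reason) and the
name parse_results are assumptions based on the four lines above.

```python
import re

# Assumed line format: "<name>: <STATUS>" or "<name>: <STATUS>: <reason>".
RESULT_RE = re.compile(r"^\s*(\S+):\s*(PASS|FAIL|SKIP)(?::\s*(.*))?$")


def parse_results(output):
    """Map each test name to a (status, reason) pair; reason may be None."""
    results = {}
    for line in output.splitlines():
        m = RESULT_RE.match(line)
        if m:
            name, status, reason = m.groups()
            results[name] = (status, reason)
    return results


sample = """\
   138: PASS
   231: FAIL
   289: FAIL
   314: SKIP: no X11
"""

results = parse_results(sample)
failed = [name for name, (status, _) in results.items() if status == "FAIL"]
print(failed)
```

Lines that do not match the pattern are ignored here; a stricter runner
might treat them as an error instead.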

> >   A basic test could be simply running the binary and checking the
> >   result status (or other variants of this). Eventually every
> >   package would need to be changed to include at least one test.
> 
> These sorts of tests are better done as part of debian/rules, I would've
> thought -- the advantage of that is that the problems get caught even
> when users rebuild the package themselves, and you don't need to worry
> about special test infrastructure like you're talking about for the
> simple case.

You can't check that the binary works _when the .deb is installed_
without installing it.

> >   Ideally eventually where possible the upstream regression tests
> >   could be massaged so that they test the installed version. Whether
> >   this is possible and how best to achieve it has to be decided on a
> >   per-package basis.
> 
> Having
>   Restrictions: package-installed
> and
>   Restrictions: build-tree

Hrm, that's an interesting idea.  I really think that concentrating on
testing as-installed is going to yield much richer results - that is,
more test failures :-).  So I want to provide that interface straight
away.

Also, a `Restriction' isn't the right mechanism: if a test has neither
of those Restrictions then presumably it can run either way, but how
would it know which?

> >   Even integration tests can be represented like this: if one
> >   package's tests Depend on the other's, then they are effectively
> >   integration tests. The actual tests can live in whichever package
> >   is most convenient.
> 
> Going from build/foo-1.0/debian/tests/x to
> projects/bar-3.14/debian/tests/y seems difficult.

No, I mean that if the tests live (say) in
build/foo-1.0/debian/tests/x then build/foo-1.0/debian/tests/control
could say
 Depends: bar
which would mean bar would have to be installed, effectively making it
an integration test.

> Anyway, something that can be run with minimal amounts of setup seems
> most likely to be most useful: so running as part of the build without
> installing the package, running without anything special installed but the
> package being tested and a script that parses the control information,
> stuff that can be run on a user's system without root privs and without
> trashing the system, etc.

My idea is that the test runner will do `something sensible' by
default - i.e., normally it will do the kind of things that a maintainer
would want to do just before uploading.  Tests that are too disruptive
for that would be skipped unless you told the test-runner it was OK.
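
Roughly, the default selection could look like the following Python
sketch.  The set of restrictions counted as disruptive, the function
name select_tests and the "allowed" parameter are all assumptions for
illustration; stanzas are taken to be dicts with a Tests field and an
optional Restrictions field.

```python
# Assumption: these restriction names mark a test as too disruptive
# to run by default.
DISRUPTIVE = {"needs-root", "breaks-computer"}


def select_tests(stanzas, allowed=frozenset()):
    """Return (tests to run, tests skipped) given the permitted restrictions."""
    run, skipped = [], []
    for stanza in stanzas:
        restrictions = set(stanza.get("Restrictions", "").split())
        # A test is blocked by any disruptive restriction not explicitly allowed.
        blocked = (restrictions & DISRUPTIVE) - set(allowed)
        target = skipped if blocked else run
        target.extend(stanza["Tests"].split())
    return run, skipped


# Hypothetical stanzas for the example.
stanzas = [
    {"Tests": "fred bill", "Restrictions": "needs-root"},
    {"Tests": "quick"},
]

print(select_tests(stanzas))
print(select_tests(stanzas, allowed={"needs-root"}))
```

With no arguments only the harmless tests run; the caller has to opt in
to anything disruptive.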

> If there's going to be a "debian/rules check" command, "debian/tests/*"
> probably should just be a suggested standard, or vice-versa --

If we make debian/tests/control apply to build-time as well as
as-installed tests then debian/rules check is pretty much obsolete.

I much prefer the debian/tests/control interface to a single check
target for the reasons I've explained earlier.

Regards,
Ian.


