
Re: archive rebuilds, step 3



Hi,

Let's discuss Step 3, which I originally described as:
     Look at each failure and submit bugs (that's done
     semi-automatically)

Once you get used to them, Steps 1 and 2 will take less than 15
minutes (not counting the long waiting period between Step 1 and Step 2).
Step 3 is the most time-consuming part.

At the end of Step 2, you should have:
- all the logs for failed builds available somewhere on the web
- all the logs for failed builds available locally
- the list of failures, merged with the previous list, in e.g.
  collab-qa/archive-rebuilds/2013-05-09-unstable-amd64/failed.2013-05-09.txt
  That list should contain lines with "TODO". Those are the bugs that need to
  be filed.

Step 3 itself is conceptually simple: it consists of going through each log and
filing the corresponding bug. Of course, it's partially automated for efficiency.

Let's start by setting up your local environment. I recommend that you use mutt
for bug filing, because it starts quickly, together with a local MTA such as
nullmailer. nullmailer is great because its queue is available in a directory,
so you can still remove mails from the queue before you flush it. That helps
in the "oh, I didn't realize that all those failures I just filed are actually
the same one and should not be filed" case.
Also, install collab-qa-tools locally.
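
To make the "remove mails from the queue before you flush it" part concrete,
here is a minimal sketch of how one could review what is waiting in the queue.
The helper name list_queued_subjects is hypothetical, and the default path
/var/spool/nullmailer/queue is an assumption (the usual location on Debian;
reading it normally requires root or the mail user).

```shell
# Hypothetical helper: print each queued file together with its Subject line,
# so that reports filed by mistake can be spotted and deleted by hand.
# The default queue directory is an assumption; pass another one as $1.
list_queued_subjects() {
    queue_dir="${1:-/var/spool/nullmailer/queue}"
    for f in "$queue_dir"/*; do
        [ -f "$f" ] || continue
        # Only the first Subject: header of each queued message is shown.
        printf '%s\t%s\n' "$f" "$(grep -m 1 -i '^Subject:' "$f")"
    done
}

# Usage (commented out so nothing is touched by accident):
#   list_queued_subjects
#   rm /var/spool/nullmailer/queue/<file of the mail you do not want to send>
```

Deleting a file from that directory is what "removing a mail from the queue"
amounts to; do it before the queue is flushed.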

Go to the directory with the logs.

$ export TODOFILE=~/collab-qa/archive-rebuilds/2013-05-09-unstable-amd64/failed.2013-05-09.txt
(point to the list of failures)

$ export DATE=2013/05/14
that date is used to generate the path to the log files, and the BTS usertag.

$ cqa-scanlogs -t $TODOFILE
You already know this tool. Invoked that way, it only lists bugs still marked TODO.

$ cqa-fetchbugs -t $TODOFILE
That fetches the list of known bugs matching some regexp and stores it in
.bugs.srcpkg files. Those files will be used by cqa-annotate. To refresh this
"cache", just rm .bugs.*

The goal here is to identify bugs to file, and also bugs NOT to file
(typically, when you have 20 failures with an identical error message, you
should check whether you really need to file 20 bugs, rather than just one
against a common dependency). Use grep, common sense, and some manual analysis.
Generally, the easiest bugs are the ones tagged GCC_ERROR or LD_ERROR.
The most interesting ones (in terms of analysis and "wtf potential") are
the ones tagged UNKNOWN.
BUILDDEPS bugs are rarely filed as is: they are often failures caused by
another package.
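
As a starting point for that triage, the sketch below counts how many TODO
entries carry each of the tags mentioned above. The function name
summarize_todo_tags is hypothetical, and the only assumption about the file
format is that the tag and the string "TODO" both appear literally on the line.

```shell
# Hypothetical helper: count the entries still marked TODO for each tag.
# Assumes the tag name and "TODO" both appear literally on the same line.
summarize_todo_tags() {
    for tag in GCC_ERROR LD_ERROR BUILDDEPS UNKNOWN; do
        # grep -c prints 0 and exits non-zero when nothing matches,
        # hence the "|| true".
        n=$(grep 'TODO' "$1" | grep -c "$tag" || true)
        printf '%s %s\n' "$tag" "$n"
    done
}

# Usage:
#   summarize_todo_tags "$TODOFILE"
```

A large count for a single tag is a hint to look for a common cause before
filing anything.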

To file bugs, use
$ cqa-annotate -t $TODOFILE -r REGEXP
with REGEXP being, e.g.:
- GCC_ERROR
- "ommand not found"
- ... any string from the one-line bug summary

cqa-annotate is a hackish interactive script. For each failure matching REGEXP,
it presents you with a prompt. Here is a real-life example:

| ######## argparse_1.2.1-2_unstable.log ########
^ name of the log
| --------- Error:
|  fakeroot debian/rules clean
| pyversions: computed set of supported versions is empty
| dh_testdir
| dh_testroot
| [ ! -e html ] || rm -rf html
| [ ! -e doc.orig ] || mv doc.orig doc
| set -e; for pyver in ; do \
| 		python$pyver setup.py clean --all; \
| 	done
| [ -f argparse.pyc ] && rm -f argparse.pyc
| make: *** [clean] Error 1
^ multi-line log extract. that's the one that would end up in the mail body.

| ----------------
| XXX
^ one-line log extract. that's the one that would end up in the mail subject.
if it's XXX, it means that nothing meaningful could be found, and you need to
write your own summary (usually copy/pasting the appropriate line from
the log).

| ----------------
| package: argparse
^ name of the source package

| lines: 11
^ number of lines in the log extract. if it's too long, it's a good idea to edit
it manually. Don't send megabytes of logs to the BTS.

| 1: 556262 minor argparse: FTBFS of unstable/testing sources on lenny || 
| 2: 707117 serious argparse: FTBFS: make: *** [clean] Error 1 ||
^ that's the list of possible bugs that could match this failure, as fetched
by cqa-fetchbugs.

At this point, you can press:
's' to skip that bug. It remains TODO in $TODOFILE.
'v 1' or 'v 2' to view bug 1 or 2 from the list above using w3m -dump.
'1' or '2' to confirm that this failure is actually bug 1 or 2 from the list
  above. The bug number will replace TODO in $TODOFILE.
'sev 1' or 'sev 2' to indicate that this failure is actually bug 1 or 2 from
  the list above, and that the bug severity must be raised to 'serious'
  (using bts severity nnn serious).
'r' to report a new bug.

If you press 'r', mutt opens. Edit the email as needed, then send it.
Once the mail is sent, cqa-annotate asks for confirmation:
edit TODOFILE? ('n' if not!)
(just press Enter)
At this point, TODO is replaced by NNN in $TODOFILE.

When you are done filing bugs, you can replace all those NNN with the actual bug
numbers. To do that, use e.g.:
$ cqa-importbugnumbers debian-qa@lists.debian.org qa-ftbfs-20130509 < $TODOFILE > tmp
The two parameters are the BTS user and the usertag.
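
Since the result lands in tmp rather than in $TODOFILE, it is worth looking at
it before overwriting the list. A minimal sketch, where apply_bug_numbers is a
hypothetical helper and the "same number of lines" check is my own assumption
about what a sane result looks like (only placeholders substituted):

```shell
# Hypothetical helper: review the rewritten list, then move it into place.
# $1 = current list, $2 = rewritten list (e.g. tmp).
# Assumption: the rewrite only substitutes placeholders, so both files
# should have the same number of lines.
apply_bug_numbers() {
    old="$1"
    new="$2"
    if [ "$(wc -l < "$old")" -ne "$(wc -l < "$new")" ]; then
        echo "line counts differ, keeping $old untouched" >&2
        return 1
    fi
    # diff exits non-zero when the files differ, which is expected here.
    diff -u "$old" "$new" || true
    mv "$new" "$old"
}

# Usage:
#   apply_bug_numbers "$TODOFILE" tmp
```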

Your mission this time:
- pick ~100 random packages from the list of packages that failed to build on
  2013-05-09 and are still marked TODO. It's a good idea to ignore the remaining
  GCC_ERROR bugs: those don't need to be filed.
  grep TODO failed.* | grep -v GCC_ERROR | cut -d ' ' -f 1 | shuf | head -n 100
  should give you something.
- rebuild them. (use generate-tasks-rebuild with -i pkglist)
- file 10 bugs using everything described above.
  (make sure to double-check the content of the emails, esp. the URLs.
  Those are hardcoded, so you could hack ./lib/collab-qa/log-parser.rb
  to put your own URL for now. In the future, it should be customizable
  differently (env variables? a small config file?))
- commit the bug numbers to the list for 20130509.
- report here. :)
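
For the first two items, the selection can be wrapped so that it writes the
pkglist file directly. This is only a sketch: pick_packages is a hypothetical
name, and it relies on the same assumption as the pipeline above, namely that
the package name is the first space-separated field of each line. grep -h is
used so that no "filename:" prefix ends up in the package names if several
files are passed.

```shell
# Hypothetical helper: pick N random packages that are still TODO and are
# not tagged GCC_ERROR.
# $1 = failures list, $2 = number of packages to pick.
pick_packages() {
    grep -h 'TODO' "$1" | grep -v 'GCC_ERROR' | cut -d ' ' -f 1 | shuf | head -n "$2"
}

# Usage:
#   pick_packages failed.2013-05-09.txt 100 > pkglist
```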

We will do a full archive rebuild "for real" in a few days, which will involve
reserving more build nodes, too.
We will also do the gcc 4.8 rebuild with more nodes.

Three final words:
- one difficult aspect is finding a good balance between
  spending too much time yourself on each bug, and making mistakes
  that make other people spend too much time on the bugs you filed.
  I'm sure you will learn :-)
- when people can't reproduce the failures, a good way to move forward
  is to ask for a full build log using pbuilder, and for a diff between
  your log and their log. Usually, this solves 80% of the unreproducible
  failures. Also remember that masternode.rb automatically restarts
  failed builds once, so you know it failed two times in a row if it
  ends up in your list.
- sometimes (rarely) people get angry. Don't get discouraged by that.
  Most people are grateful for the bug filing.

Lucas

