
Bug#750149: [www] some improvement of validate



Package: www.debian.org
Severity: wishlist
Tags: patch 

part 1:
- currently validate is invoked with 20 files at a time, and
  validate checks those files one by one in its own loop.
  so if a language has 4k files,
  the overhead of creating a new process is paid 200 times (4k / 20).
  about 10% of the run time appears to be lost to that overhead.

- on the other hand, validate is a perl script,
  so it can do the file iteration itself.

- checking all files in one loop instead of two nested loops therefore saves time.

- exclude the release notes docs explicitly in the perl script
  to avoid false positives.
  - btw, 5 release pages in the Japanese translation have errors
    that currently go unchecked because of the false positives.

- remove the useless ja-specific hack.

- test results on my vm (wheezy on virtualbox on vista), ja only,
  10 runs each, times in min:sec:
  pass 20 w/ xargs: 5:40 5:39 5:41 5:39 5:37 5:42 5:38 5:39 6:06 5:37
  loop all in perl: 5:06 5:08 5:06 5:05 5:30 5:05 5:27 5:28 5:31 5:05
  - average of ranks 2-9 (fastest and slowest run dropped):
    pass 20 w/ xargs: 5:39
    loop all in perl: 5:14
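
The averages above can be reproduced with a short script. This is only a sketch in Python for illustration (it is not part of the patch); the function names are made up here, and it assumes "rank 2-9" means sorting the 10 runs and dropping the fastest and the slowest.

```python
def to_seconds(mmss):
    """Convert a 'min:sec' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(total_seconds):
    """Convert a number of seconds back to 'min:sec'."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


def trimmed_mean(times):
    """Average of ranks 2..n-1: sort, drop the best and worst run."""
    secs = sorted(to_seconds(t) for t in times)
    middle = secs[1:-1]
    return to_mmss(round(sum(middle) / len(middle)))


# Timings copied from the test results above.
xargs_runs = "5:40 5:39 5:41 5:39 5:37 5:42 5:38 5:39 6:06 5:37".split()
perl_runs = "5:06 5:08 5:06 5:05 5:30 5:05 5:27 5:28 5:31 5:05".split()

print("pass 20 w/ xargs:", trimmed_mean(xargs_runs))  # 5:39
print("loop all in perl:", trimmed_mean(perl_runs))   # 5:14
```

With these numbers the single loop saves about 25 seconds per run on the ja tree.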

--
part 2:
- check only updated files right after every build.
- checking only the updated files takes just a few minutes,
  so it can be run on every build.
spec:
- the regular check covers only files that had errors in the previous check
  and files updated since then.
- the full check runs once a day.
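
The selection rule in the spec can be sketched as follows. This is a hedged illustration in Python, not the code from the attached patch: the function name, its parameters, and the use of modification times to detect "updated" files are all assumptions made for the example.

```python
import os


def select_files(all_files, previous_failures, last_check_time,
                 full=False, mtime=os.path.getmtime):
    """Return the files to validate in this run.

    full=True  -> daily full check: every file.
    full=False -> regular check: files that failed last time,
                  plus files modified since the last check.
    `mtime` is injectable so the rule can be tried without real files
    (an assumption of this sketch; the patch may detect updates differently).
    """
    if full:
        return sorted(all_files)
    selected = set()
    for path in all_files:
        if path in previous_failures or mtime(path) > last_check_time:
            selected.add(path)
    return sorted(selected)


# Hypothetical example data: file name -> modification time.
mtimes = {"a.html": 100, "b.html": 200, "c.html": 50}
print(select_files(mtimes, {"c.html"}, 150, mtime=mtimes.get))
```

In this example b.html is picked because it changed after the last check, and c.html because it failed in the previous check; a.html is left for the daily full check.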

--
part 2 is based on part 1.
the attached bz2 archive contains 3 patches:
  diff between vcs and part 1 (validate1.diff)
  diff between part 1 and part 2 (validate1-2.diff)
  diff between vcs and part 2 (validate2.diff)

-- 
victory
http://userscripts.org/scripts/show/102724 0.0.1.4
http://userscripts.org/scripts/show/163846 0.0.1
http://userscripts.org/scripts/show/163848 0.0.1

Attachment: validate.tbz
Description: Binary data

