Re: First approximation of source line count for potato
> I wrote a new script and ran it on a potato mirror a couple of days
> old. The temporary results are in http://liw.iki.fi/liw/foo.html
> (WATCH OUT! it's an 800 kilobyte table, which is quite slow to display
> on Netscape). The totals are below:
>  Files       Size     Lines   AWK        C     C++   Perl  Python
> 714315  7497103.3  228096.6  39.0  80457.0  7500.0  693.0   595.0
> Size is in kilobytes, line counts are in units of 1000 lines. That is,
> there are about 7.5 gigabytes of files in source packages, making about
> 230 million lines, of which about 80 million lines are C.
> If anyone has suggestions for better statistics, let me hear them.
I finished off my line counter enough to get a second measurement.
My counter also works off of the file suffixes. I covered main, non-free,
and contrib.
We correlate well for C and C++; it was closer before I put in contrib and
non-free. I'm guessing Lars did just main? The other file types are harder
to identify by suffix, and there we apparently disagreed.
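
For readers wondering what counting "off of the file suffixes" looks like in
practice, here is a rough sketch in Python. It is not the counter used for the
numbers in this message: the rule table, the function names classify and
count_lines, and the particular suffixes are all assumptions made up for
illustration. It also shows why the results are fuzzy, since a .h file is
counted as C here even when it belongs to a C++ package.

```python
import os
import re
from collections import Counter

# Hypothetical suffix rules, checked in order. The real counter's regexps
# are not shown in this thread; these are illustrative guesses only.
SUFFIX_RULES = [
    ("C", re.compile(r"\.[ch]$")),
    ("C++", re.compile(r"\.(cc|cpp|cxx|C|hh|hpp)$")),
    ("Perl", re.compile(r"\.(pl|pm)$")),
    ("Python", re.compile(r"\.py$")),
    ("AWK", re.compile(r"\.awk$")),
]


def classify(filename):
    """Return the language name for a file name, or None if no rule matches."""
    for language, pattern in SUFFIX_RULES:
        if pattern.search(filename):
            return language
    return None


def count_lines(root):
    """Walk an unpacked source tree and sum line counts per language."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            language = classify(name)
            if language is None:
                continue  # skipped as non-source
            # Read as bytes so odd encodings do not abort the count.
            with open(os.path.join(dirpath, name), "rb") as handle:
                totals[language] += sum(1 for _ in handle)
    return totals
```

Calling count_lines on the top of an unpacked source directory returns a
Counter keyed by language name. Files that match no rule are left out
entirely, which is one way two counters run over the same archive can end up
with very different grand totals.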
One large note: my total line count is much smaller, 154M against 228M.
Many things I skipped as non-source were evidently counted by Lars.
Interested folk can drill over to http://folk.federated.com/~jim/debcount/
to see the language breakdown, see how their favorite packages fared, and
suggest new regexps so their favorite baby gets counted correctly.
I will set this up to regenerate nightly after the archive updates.
The current tally sits like...
        837  Audio        <-- files, not lines
     66,539  Image        <-- files, not lines
      1,408  Postscript   <-- files, not lines
     79,362  debian/      <-- not in a diff
  5,306,637  diff         <-- just the debian .diff files
153,936,320  Grand Total
Jim Studt, President
The Federated Software Group, Inc.