
WARC-file Incompatibility of the Debian web sites

WARC files originated at the Internet Archive.
They are essentially a persistent hash table of the form

key   --- <URL the way it is in the Wild-Wild-Web>
value --- <thefile>
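
As a rough illustration of that key/value view, here is a minimal Python sketch that writes a few simplified WARC-style records and reads them back into a dictionary keyed by URL. The helper names (make_record, index_records) and the example.org URLs are made up for this illustration. The records are deliberately incomplete: real WARC records carry further mandatory fields such as WARC-Record-ID and WARC-Date, and real archives are often gzip-compressed, so this is not a parser for production files.

```python
import io


def make_record(url: str, payload: bytes) -> bytes:
    """Serialize one simplified WARC-style record (illustrative, not spec-complete)."""
    header = (
        "WARC/1.0\r\n"
        "WARC-Type: resource\r\n"
        f"WARC-Target-URI: {url}\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("utf-8")
    # A record block is followed by two CRLF pairs.
    return header + payload + b"\r\n\r\n"


def index_records(stream) -> dict:
    """Read simplified records from a binary stream into {URL: payload}."""
    table = {}
    while True:
        line = stream.readline()
        if not line:
            break  # end of stream
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError("expected a WARC version line, got %r" % line)
        # Collect the named header fields up to the empty line.
        fields = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("utf-8").partition(":")
            fields[name.strip().lower()] = value.strip()
        # Content-Length tells us exactly how many payload bytes follow.
        payload = stream.read(int(fields["content-length"]))
        table[fields["warc-target-uri"]] = payload
    return table


# Hypothetical example data: two "captured" URLs.
archive = (
    make_record("http://example.org/", b"<html>hi</html>")
    + make_record("http://example.org/a.txt", b"line1\r\n\r\nline2")
)
table = index_records(io.BytesIO(archive))
print(sorted(table))
```

The lookup table[url] then plays the role of the "value" above: the bytes of the file as it was fetched from that URL.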


The issue with the current Debian sites seems to be
that tools like the


create files like  (~30MiB)


that cannot be viewed with a tool like the


With the exception of large files


the warc-proxy actually works fine, and the WARC
creation and viewing tools that I use can be downloaded from


However, some sites, including the Debian web sites,
fail to be "WARC-able". It would be nice if this were fixed,
especially given that one never knows when
something will become censored. Please keep in mind that
there is no limit to the absurdity of censorship.
Some day photos of pigeons might be banned, because
maybe some religious sect or political party finds
them offensive or otherwise endangering their ability
to keep the dumb ones working as slaves for them, paying taxes, etc.

The warc-proxy works fine with files of about 200MiB,
meaning that the aforementioned


is not "too big".

