
Bug#966649: Request for feedback on upload_history re-implementation



On 21/08/20 at 00:04 -0700, Asheesh Laroia wrote:
> Great!
> 
> It sounds to me like if we use the *mtime* of
> /srv/udd.debian.org/email-archives/debian-devel-changes/debian-devel-changes.current
> (but not its contents), that would smoothly and solidly overcome the
> worries about unnecessary polling. If the file's mtime is the same as the
> last time the tool ran, then no need to run any import process (no matter
> if the import process involves HTTP or not). If we are only relying on
> mtime (and we're the only consumer), we can truncate it before looking at
> it. :)
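
To make the mtime idea concrete, here is a minimal sketch, not the actual
importer. It assumes the last-seen mtime is kept in a small stamp file; the
stamp file, the function names and the do_import callback are all
hypothetical and only there for illustration.

```python
import os
from pathlib import Path


def read_stamp(stamp_path):
    """Return the mtime (in ns) recorded by the previous run, or None."""
    try:
        return int(Path(stamp_path).read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def run_if_changed(mbox_path, stamp_path, do_import):
    """Call do_import(mbox_path) only if the mbox mtime changed since last run.

    stamp_path and do_import are hypothetical names for this sketch.
    Returns True if an import was run, False if it was skipped.
    """
    # Capture the mtime before importing, so that a delivery arriving
    # during the import is picked up by the next run.
    seen = os.stat(mbox_path).st_mtime_ns
    if seen == read_stamp(stamp_path):
        return False
    do_import(mbox_path)
    # Record the stamp only after the import completed without raising.
    Path(stamp_path).write_text(str(seen))
    return True
```

Only os.stat() is needed to decide whether to do anything, so the check is
cheap whether or not the import itself involves HTTP.
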
> 
> For the XZ mbox files, it sounds to me like they're currently not reliable
> -- their existence on ullmann.debian.org depends on disk space stuff.

Well, their existence relies on manually rsyncing them from
master.debian.org from time to time.

I think that the possible designs are:

A) Use a copy of the mbox archives on ullmann. This requires a manual (or,
better, automatic) rsync. If the rsync is manual or infrequent, this also
needs debian-devel-changes.current. Main downside: requires disk space on
ullmann to hold the copy of the mbox archives.
That's the current implementation.

B) Export the mbox archives from master.d.o to ullmann (using NFS, for
example). No need for local disk space on ullmann. Downside: requires
support from DSA.

C) Rely on the HTTP archives. Only requires a local cache. Downside: lots of
HTTP requests to lists.d.o.

> Related question: My code requires a 4-5GB cache file if it stores all data
> from all years of debian-devel-changes. Is that workable? If not, it's easy
> to cut it down. Is 1GB appropriate to expect as cache storage space? If
> not, what about approx 400MB? (Context: 2020's data so far takes up 208MB
> in my cache format.)

That's a bit strange: the full copy of the debian-devel-changes archives on
ullmann is currently 869M (but it's gzipped or xz-compressed).
But that's a question to ask DSA. /srv is 20G, with 8.8G free currently.
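
For a rough sense of how the cache sizes mentioned above compare with the
free space on /srv, here is a small sketch. It reads the figures from this
thread as decimal MB, which is an assumption (df may well be reporting
binary units), and the helper names are only illustrative.

```python
import shutil


def share_of_free(cache_mb, free_mb):
    """Fraction of the free space that a cache of cache_mb would consume."""
    return cache_mb / free_mb


def free_mb(path):
    """Free space on the filesystem holding path, in decimal MB."""
    return shutil.disk_usage(path).free / 1_000_000


# Figures quoted in this thread, read as decimal MB (an assumption).
FREE_ON_SRV_MB = 8800
CANDIDATES_MB = {
    "all years (upper estimate)": 5000,
    "1GB": 1000,
    "approx 400MB": 400,
}

if __name__ == "__main__":
    for label, size in CANDIDATES_MB.items():
        share = share_of_free(size, FREE_ON_SRV_MB)
        print(f"{label}: {share:.0%} of free space")
```

On ullmann itself, free_mb("/srv") would give the live number instead of
the hard-coded one.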

Lucas

