
Bug#781459: udd: please provide dumps more often



On Sun, Mar 29, 2015 at 6:42 PM, Lucas Nussbaum <lucas@debian.org> wrote:
> On 29/03/15 at 17:47 +0200, Mattia Rizzolo wrote:
>> package: qa.debian.org
>> user: qa.debian.org@packages.debian.org
>> usertag: udd
>>
>> From #debian-qa@OFTC
>> <mapreri> lucas: might I ask you to provide udd dumps more often than daily?
>>           like, every 6 hours? this would greatly help the public udd mirror to
>>           stay in sync (and I guess it could also help others, to avoid doing
>>           things like what formorer complained the other day)
>> <lucas>   mapreri: yes, but it would also be useful to switch to another dump
>>           format
>> <lucas>   mapreri: could you ask by mail? that way I could Cc the other users
>>           of the UDD dumps, and see if we can come up with something that works
>>           for everyone
>>
>>
>> So, here it is.
>> Other than more frequent dumps I'd also welcome a different dump format, maybe
>> smaller, to avoid downloading nearly 900MB every time I have to test something.
>
> Hi,
>
> Currently three dumps are generated every day:
> - usql.sql.gz, with all the data except the ldap, really_active_dds, and
>   pts relations, which are considered "private" data and not suitable for
>   wide exposure.
> - udd-bugs.sql.xz, with only the bugs data (both archived and
>   unarchived -- Andreas Tille was the main user of that -- is this still
>   needed?)
> - udd-popcon.sql.xz, with only the popcon data (codesearch.d.n needed
>   that -- is this still needed?)

codesearch still uses udd-popcon with no plans to switch away. If you
want codesearch to use something different, I’m open to that, but
it’ll take a while until we can make any switch :).

>
> Before changing that, I'd like to understand:
>
> 1) what is the rationale for the public UDD mirror. Is there a way this
>    could be provided from Debian infrastructure, for example by
>    whitelisting specific hosts that need UDD access? Is there something
>    here that could be acceptable for DSA (Cced)?
>
> 2) what is the rationale for the more frequent dumps. It's currently
>    being dumped once a day. It's never going to be "in sync" with the
>    live instance, unfortunately.
>
> 3) Would dumps in "custom format" (pg_dump -Fc) work for you? they allow
>    parallel restore with pg_restore.
>
> 4) Could some tables be excluded from the dumps?
>
> 5) Couldn't you trigger the dumps from your side, by calling pg_dump
>    inside an SSH connection to ullmann.d.o?
>
> Lucas
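
For reference, points 3 to 5 above (custom-format dumps, excluded tables,
and a dump triggered over SSH) could be sketched roughly as follows. This is
only an illustration, not how the UDD dumps are actually produced: the
database name "udd", the output file name, and the helper function names are
assumptions. The pg_dump and pg_restore options used (--format=custom,
--exclude-table, --file, --jobs, --dbname) are standard ones. The helpers
only build the command lines; they do not run anything.

```python
import shlex

def dump_command(dbname, outfile=None, exclude_tables=()):
    """Build a pg_dump command for a custom-format (-Fc) dump.

    With outfile=None the dump goes to stdout, which is what you
    want when the command runs on the far side of an SSH connection.
    """
    cmd = ["pg_dump", "--format=custom"]
    if outfile is not None:
        cmd.append(f"--file={outfile}")
    for table in exclude_tables:
        cmd.append(f"--exclude-table={table}")
    cmd.append(dbname)
    return cmd

def restore_command(dbname, dumpfile, jobs=4):
    """Build a parallel pg_restore command.

    --jobs needs a custom- or directory-format dump and a target
    database given with --dbname.
    """
    return ["pg_restore", f"--jobs={jobs}", f"--dbname={dbname}", dumpfile]

def remote_dump_command(host, dbname, exclude_tables=()):
    """Wrap the dump command in ssh; the dump arrives on ssh's stdout."""
    remote = shlex.join(dump_command(dbname, None, exclude_tables))
    return ["ssh", host, remote]

if __name__ == "__main__":
    # Hypothetical values: database name "udd", local file "udd.dump".
    private = ["ldap", "really_active_dds", "pts"]
    print(shlex.join(dump_command("udd", "udd.dump", private)))
    print(shlex.join(restore_command("udd", "udd.dump", jobs=4)))
```

A caller would pass the resulting list to subprocess.run and, for the SSH
variant, redirect stdout to a local file before restoring from it.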



-- 
Best regards,
Michael

