
Re: binary -> source mapping file from the BTS



On 12244 March 1977, Don Armstrong wrote:
>> > This is what the code in binary_to_source does that we expose via
>> > SOAP, and it uses the binsrc.idx bdb database file, which you can
>> > rsync directly from the BTS

>> > (rsync://bugs-mirror.debian.org/bts-versions/indices/binsrc.idx). [You
>> > should also be able to directly rsync the sources file using
>> > rsync://bugs-mirror.debian.org/bts-spool-indexes/sources .]
>> This sounds like it entirely should be a file provided by dak, not by
>> the BTS.

>> There are various files we have that list various relations. One is
>> pkg-file-mapping used by snapshot, then whatever the bts is using,
>> but whatever, this should be exported by us if another service needs
>> it. After all we are the canonical location for that kind of data.
>> So, what exactly do you need?
> It would be ideal if dak could provide it, but I'm not sure if dak has
> the historical data required to generate that file.

> Basically, it's just a mapping of the (binary,arch,version) triple to
> the (source,version) double for all packages debian has distributed in
> a very long time. [That's what the BTS needs.]

Have a look at the changes table. It holds information on all
changes ever accepted into the archive (minus a very few old unparseable
ones).
On merkel:
   psql -p 5433 projectb
   \d changes
   select source, binaries, version from changes limit 20;

This is obviously what was in the .changes files, so it is only as
accurate as those files were. There are other ways to get this from the
database, but this table is the only one going back years and years.
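To make the idea concrete, here is a minimal sketch of turning rows from
that query into a binary -> (source, version) map. It assumes the
binaries column is a whitespace-separated list of package names, as in
the Binary: field of a .changes file; the function name and the sample
rows are made up for illustration. Note that this only yields the
version recorded in the .changes file, not a full (binary,arch,version)
triple.

```python
from collections import defaultdict


def binary_to_source(rows):
    """Expand (source, binaries, version) rows into binary -> {(source, version)}.

    Assumption: `binaries` is a whitespace-separated list of binary
    package names, as in the Binary: field of a .changes file.
    """
    mapping = defaultdict(set)
    for source, binaries, version in rows:
        # A NULL/empty binaries value simply contributes nothing.
        for binary in (binaries or "").split():
            mapping[binary].add((source, version))
    return mapping


# Made-up rows in the shape returned by
#   select source, binaries, version from changes;
sample_rows = [
    ("eglibc", "libc6 libc6-dev libc-bin", "2.11.2-5"),
    ("eglibc", "libc6 libc6-dev", "2.11.2-6"),
    ("dpkg", "dpkg dpkg-dev", "1.15.8.5"),
]

mapping = binary_to_source(sample_rows)
for binary in sorted(mapping):
    print(binary, sorted(mapping[binary]))
```

Keeping a set per binary means a package that moved between source
packages over time would show up with more than one source.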


On 12244 March 1977, Raphael Hertzog wrote:
> On Mon, 20 Sep 2010, Don Armstrong wrote:
>> It would be ideal if dak could provide it, but I'm not sure if dak has
>> the historical data required to generate that file.
 
>> Basically, it's just a mapping of the (binary,arch,version) triple to
>> the (source,version) double for all packages debian has distributed in
>> a very long time. [That's what the BTS needs.]
 
>> I'm not sure exactly what qa.debian.org needs, but I'm assuming it's
>> similar to what the BTS requires.

> My needs are rather limited. I want a list of binary package names
> generated by a given source package: it's used to aggregate some stats
> that we have by binary packages (gift bugs, help bugs, BTS stats on the
> right). It's also used to map a description per binary package to the
> corresponding source package entry.

> I want a mapping "binary package name -> source package generating it"
> so that people who type http://packages.qa.debian.org/libc6 get redirected
> to http://packages.qa.debian.org/e/eglibc.html

> I don't care about versions, I just want the current sid/experimental
> information. I don't care about the arch of a given package but I want to
> see all binary packages generated on all arches.

So you don't actually care about old data. We could give you an export
per suite containing only "live" data, i.e. what's currently in the
suite. Or just one for the whole archive, whatever.

See projectb (see above) with

select b.package, s.source from binaries b left join source s ON (b.source = s.id) where s.source='eglibc' limit 10;

or, with bin_associations:

select b.package, s.source from binaries b left join source s ON (b.source = s.id) left join bin_associations ba on (b.id=ba.bin) where s.source='eglibc' and ba.suite=1 limit 10;

though I think src_associations is better (though it's the same in this case):

select b.package, s.source from binaries b left join source s ON (b.source = s.id) left join src_associations sa on (s.id=sa.source) where s.source='eglibc' and sa.suite=5 limit 10;

Just some quick hacks; more can surely be done (and the SQL should be
checked to see whether it really fits all cases).
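As a rough illustration of what a per-suite "binary -> source" export
could look like, here is a sketch that runs the bin_associations variant
against an in-memory SQLite stand-in. Only the table and column names
used in the queries above are taken from projectb; the rest of the
schema, the sample rows, the suite ids and the export_suite helper are
assumptions for the sake of a runnable example. It uses plain joins
rather than left joins, which should give the same rows here because the
where clause filters on the joined table anyway.

```python
import sqlite3

# Toy stand-in for projectb: only the columns used in the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (id INTEGER PRIMARY KEY, source TEXT);
    CREATE TABLE binaries (id INTEGER PRIMARY KEY, package TEXT, source INTEGER);
    CREATE TABLE bin_associations (bin INTEGER, suite INTEGER);
    INSERT INTO source VALUES (1, 'eglibc'), (2, 'dpkg');
    -- ids 10 and 12 model the same package name built on two arches
    INSERT INTO binaries VALUES (10, 'libc6', 1), (11, 'libc6-dev', 1),
                                (12, 'libc6', 1), (20, 'dpkg', 2),
                                (21, 'dpkg-old', 2);
    INSERT INTO bin_associations VALUES (10, 5), (11, 5), (12, 5),
                                        (20, 5), (21, 1);
""")


def export_suite(conn, suite_id):
    """Return sorted, de-duplicated (binary, source) pairs live in one suite."""
    return conn.execute(
        """
        SELECT DISTINCT b.package, s.source
        FROM binaries b
        JOIN source s ON b.source = s.id
        JOIN bin_associations ba ON b.id = ba.bin
        WHERE ba.suite = ?
        ORDER BY b.package, s.source
        """,
        (suite_id,),
    ).fetchall()


# Suite ids 5 and 1 are arbitrary values in this toy data set.
for package, source in export_suite(conn, 5):
    print(package, source)
```

The DISTINCT matters because the same binary package name appears once
per architecture and version in binaries, while the export only needs
one line per name.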

-- 
bye, Joerg
First you don’t want me to get the pony, now you want me to take it
back. Make up your mind!

