
Re: [GSoC] Adding information to UDD and inject the rendering to tasks,py



Hi Akshita,

On Sat, Apr 18, 2015 at 08:15:50PM +0530, Akshita Jha wrote:
> > The thing is that the resulting debian.bib and debian.tex should include
> > only those citations that belong to packages which are just *inside*
> > Debian.
> 
> Does this mean that debian.bib and debian.tex should include citations
> only for packages that are in the Debian package list ('source' column
> of the 'sources' table)? The 'bibref' table may have a 'source' that is
> not in 'source' of the 'sources' table, since bibref_gatherer.py gets
> its data from VCS. We do not want the citations of these 'source's to be
> included in debian.bib and debian.tex files. Am I correct?

Yes.  I have announced debian.bib as a BibTeX file covering all Debian
packages.  It could be confusing if it contained additional citations;
however, I guess not many people are using it currently.  So this is
not a strict requirement, but I think it should be implemented.
 
> > That's the reason why it is done inside the bibref_gatherer.py.
> 
> I do not clearly understand this statement. Does this mean that all the
> sources and their citations are included from VCS into the bibref table?

The bibref_gatherer.py consumes the data from Umegaya, which contains
only Debian packages.  If debian.bib is generated here, this ensures
that we get only references for existing packages.  The problem is
that, as I said, Umegaya is not reliable and we need to inject the data
from VCS, which is done later.
 
> > So our later corrections will not come into effect.  Blends-prospective
> > will be called later as you can see in scripts/cron_ftpnew_blends.sh
> > which has the only reason to define a sequence.
> 
> I do not understand this very clearly either. Why will update not work?
> 
>   ufile = upath+'/'+source+'.upstream'
>   if exists(ufile):
>       cur.execute("EXECUTE check_reference (%s)", (source,))
>       if cur.fetchone()[0] == 0:
>           upstream = upstream_reader(ufile, source, self.log)
>           if upstream.references:
>               if not prospective:
>                   # Valid references found in this upstream file and it is
>                   # valid YAML even if it was not fetched by bibref gatherer
>                   self.log.warning("%s of %s has upstream file but no references in UDD" % (source, sprosp['blend']))
>               upstream.parse()
>               for ref in upstream.get_bibrefs():
>                   bibrefs.append(ref)
> 
> The code snippet above from blends_prospective_gatherer.py checks if there
> is a .upstream file for the source and then appends it to bibrefs. The data
> is from VCS and this is what is inserted in bibref table. So, are you
> saying that this insert will not matter as bibref_gatherer.py runs before
> Blends-prospective according to scripts/cron_ftpnew_blends.sh and so it
> will not be reflected in .tex and .bib files ?

Yes.  Even if the citation data are updated, they will no longer be
included in debian.bib, since that file is created before the data is
injected, at least as it is currently coded.
 
> > If we mix this with data from prospective packages and create
> > debian.{bib,tex} afterwards, too many references would be included.  On
> > the other hand, we could do this if we added a join to the sources
> > table and included only those citations that have a matching source
> > package name.
>
> Does this mean that creating a debian.bib or debian.tex file afterwards,
> will include references for 'source's that are there in bibref table but
> not necessarily present in 'source' column of 'sources' table, implying
> that the package is not *inside* Debian ? This can be prevented if we join
> on 'source' column of sources and bibref table.
> 
>     "SELECT distinct s.source from bibref b join sources s
>      on s.source = b.source;"
> 
> mentioned in Proof Of Principle (https://wiki.debian.org/UpstreamMetadata).

Yes, that's correct.
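
A minimal sketch of what that join does.  Note the assumptions: it uses
an in-memory SQLite database instead of the real UDD PostgreSQL
instance, the table layouts are heavily simplified, and the package
names, the ref_key/ref_value columns and the DOI values are made up
purely for illustration.

```python
import sqlite3

# Stand-in for UDD: simplified, hypothetical versions of the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sources (source TEXT);
    CREATE TABLE bibref  (source TEXT, ref_key TEXT, ref_value TEXT);

    -- Only one of the two cited packages is actually inside Debian.
    INSERT INTO sources VALUES ('pkg-in-debian');
    INSERT INTO bibref  VALUES ('pkg-in-debian',   'doi', '10.1000/example1');
    INSERT INTO bibref  VALUES ('pkg-prospective', 'doi', '10.1000/example2');
""")

# Without the join: every source that has a citation in bibref.
all_sources = [row[0] for row in conn.execute(
    "SELECT DISTINCT source FROM bibref ORDER BY source")]

# With the join: only sources that also appear in the sources table.
in_debian = [row[0] for row in conn.execute(
    "SELECT DISTINCT s.source FROM bibref b "
    "JOIN sources s ON s.source = b.source ORDER BY s.source")]

print(all_sources)
print(in_debian)
```

In this toy data set the prospective package only shows up in the
unjoined query, which is the behaviour we want to avoid by default in
debian.bib.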

> > This would mean:
> >
> >    split the generation of debian.{bib,tex} from bibref_gatherer.py
> >    This could even be a separate script apart from all the udd
> >    configuration stuff.  Looks like a good idea anyway, perhaps with
> >    an extra parameter if somebody wants to have all the citations.
> >    (= by default use a join to the sources table but provide an option
> >    to include all)
> >
> 
> Does this roughly mean moving the class (with changes):
> 
>     class bibref_gatherer(gatherer):
>         """
>         Bibliographic references from debian/upstream files
>         """
> 
> from bibref_gatherer.py to a different .py file, and inserting the
> references based on the 'join' condition mentioned above? Also, including
> a parameter to allow anyone who wants all the citations (irrespective of
> whether the 'source' is in the Debian package list or not) to ignore the
> 'join' and get all the citations from the bibref table for a particular
> 'source'?

That's exactly what I mean.
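
Just to illustrate the idea, here is a rough sketch of how such a
separate script could pick its query.  Everything in it is an
assumption for illustration, not existing UDD code: the --all option
name, the function names and the exact queries.

```python
import argparse

# Default: restrict citations to source packages present in the sources table.
QUERY_DEBIAN_ONLY = (
    "SELECT DISTINCT b.source FROM bibref b "
    "JOIN sources s ON s.source = b.source"
)
# Optional: every source that has an entry in bibref, inside Debian or not.
QUERY_ALL = "SELECT DISTINCT source FROM bibref"


def build_parser():
    """Command line interface of the hypothetical generator script."""
    parser = argparse.ArgumentParser(
        description="Generate debian.bib and debian.tex from the bibref table")
    parser.add_argument(
        "--all", dest="include_all", action="store_true",
        help="include citations of sources that are not in the sources table")
    return parser


def choose_query(include_all):
    """Return the query to run; the join is used unless all is requested."""
    return QUERY_ALL if include_all else QUERY_DEBIAN_ONLY


def main(argv):
    args = build_parser().parse_args(argv)
    query = choose_query(args.include_all)
    print(query)
    return query


# Demo with explicit argument lists instead of the real command line.
main([])          # default: joined against sources
main(["--all"])   # explicit request for all citations
```

Keeping the join as the default means debian.bib stays limited to
packages inside Debian unless somebody explicitly asks for more.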

> > Is my description of the problem clear enough?
> 
> I think so. I will keep troubling you with more questions though :)

:-)

Kind regards

      Andreas.

-- 
http://fam-tille.de

