
Re: [GSoC] blends-gen-control hints (Was: blends-dev, gsoc 2013)



Hello Andreas,

I was also offline yesterday: I had no internet access because I moved to my home village. I will make up for the lost day of work over the weekend.

On Thu, Jul 4, 2013 at 12:41 AM, Andreas Tille <andreas@an3as.eu> wrote:

As promised some comments to your patches.  While being offline (in
some talk rooms) I was bound to some local checkouts of tasks files that
were featuring a syntax bug (forgotten ',' to separate dependencies).  I
handled this by an error message inside the importer log and in addition
tried to import this as well.  So the importer code has changed and I
also cherry-picked from your patches and committed to UDD git (so please
`git pull`).

Moreover I tried to run your code for filling up the alternatives table
and think it is not what you finally want to use to create the
metapackages.  As far as I see you are injecting only those dependencies
into the table that actually are containing alternatives (== are
containing a '|').  IMHO this will just create trouble for the
metapackage creation because you finally need to look into two tables to
assemble the dependency string and you also need to make sure that you
will not duplicate things.  That's not what I would call easy or
reliable.  My idea was to fill *all* dependencies into the new table and
you can look up single dependencies and alternative dependencies
straight from there.

Yes, I store only the dependencies which contain "|" (the alternatives). I will try to explain the basic idea behind that:
Currently we have all the packages in the blends_dependencies table, one package per row. The good thing about this is that with one query (blends-gsoc/sql/blendsd) we get all the info we need to check whether the packages are available or not (distribution, component, architecture). With a single pass over that query's result set I end up with a dictionary containing, for each task, all the verified (all the needed checks are done) Depends, Recommends etc. packages as a list:

e.g. Depends : [ 'a' , 'b', 'c', 'd', 'e' , 'f', 'g', 'h' , 'i', 'j', 'm', 'n' ]

The only problem with the above list is that we do not know which of these packages are alternatives of each other. This is where the blends_dependencies_alternatives table comes in: it provides strings per blend/task/dependency such as:

alternatives = ['p | e | a', 'j | n'] 

Using the above alternatives relation we can easily convert the previous Depends list : 

[ 'a' , 'b', 'c', 'd', 'e' , 'f', 'g', 'h' , 'i', 'j', 'm', 'n' ]
+
[ 'p | e | a' ,  'j | n' ] (here, for example, the package "p" is missing from the previous list because it may not have been available)
=
[ 'e | a' , 'j | n', 'b', 'c', 'd', 'f', 'g', 'h', 'i', 'm']

Then we would end up with a list containing the packages (with their alternatives if any). 
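
To make the conversion above more concrete, here is a minimal Python sketch of it. The function name apply_alternatives is only something I made up for illustration, it is not existing blends-dev code, and it assumes that the Depends list already contains only verified packages:

```python
def apply_alternatives(available, alternatives):
    """Group already verified packages according to the alternatives strings.

    available    -- list of package names that passed all checks
    alternatives -- list of strings such as 'p | e | a'
    """
    available_set = set(available)
    used = set()
    grouped = []
    for alt in alternatives:
        # Keep only the alternatives that are available and not yet grouped.
        members = [p.strip() for p in alt.split("|")]
        members = [p for p in members if p in available_set and p not in used]
        if members:
            grouped.append(" | ".join(members))
            used.update(members)
    # Packages that belong to no alternatives group stay as single entries.
    singles = [p for p in available if p not in used]
    return grouped + singles


depends = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'm', 'n']
alternatives = ['p | e | a', 'j | n']
result = apply_alternatives(depends, alternatives)
print(result)
```

The order inside a group follows the alternatives string, so 'p | e | a' becomes 'e | a' once the unavailable "p" is dropped.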

An example with 3 rows in blends_dependencies:

blend      | task | package | dependency | distribution | component | architecture
-----------+------+---------+------------+--------------+-----------+-------------
debian-med | bio  | pkg1    | d          | debian       | main      | i386
debian-med | bio  | pkg2    | d          | debian       |           |
debian-med | bio  | pkg3    | d          | debian       | main      | i386

And let's assume that the above packages are alternatives, so we will have an entry in blends_dependencies_alternatives such as:

blend      | task | alternatives       | dependency
-----------+------+--------------------+-----------
debian-med | bio  | pkg1 | pkg2 | pkg3 | d
      
From blends_dependencies we will get something like this:

Depends : [ "pkg1" , "pkg3" ]
Suggests : [ "pkg2" ]

and using the alternatives table we convert the above to:

Depends :[ "pkg1" , "pkg3" ] (using  "pkg1  | pkg2 | pkg3" ) = [ "pkg1 | pkg3" ]
Suggests : [ "pkg2" ]

So as you see, we use the blends_dependencies_alternatives strings only to define the alternative relation between packages which have already been processed (to find out which are available or not etc.).
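
The two steps for this small example could look roughly like the sketch below. It is only an illustration under some assumptions of mine: the function names are hypothetical, the rows are hard-coded instead of coming from a query, only one blend/task is handled, the letters 'r' and 's' in the mapping are guessed by analogy with 'd', and the rule "not in main for the architecture means Suggests" is simply read off the example above:

```python
from collections import defaultdict

# Assumed mapping from the dependency column to control fields;
# only 'd' appears in the example above.
FIELD = {"d": "Depends", "r": "Recommends", "s": "Suggests"}


def verified_fields(rows, arch):
    """Step 1: one pass over the blends_dependencies rows."""
    fields = defaultdict(list)
    for _blend, _task, package, dep, _dist, component, architecture in rows:
        if component == "main" and architecture == arch:
            fields[FIELD[dep]].append(package)
        else:
            # Assumption: packages not in main for this arch become Suggests.
            fields["Suggests"].append(package)
    return fields


def group_alternatives(fields, alternatives_rows):
    """Step 2: use the alternatives strings only to group verified packages."""
    result = {}
    for name, packages in fields.items():
        remaining = list(packages)
        grouped = []
        for _blend, _task, alternatives, dep in alternatives_rows:
            if FIELD[dep] != name:
                continue
            members = [p.strip() for p in alternatives.split("|")]
            members = [p for p in members if p in remaining]
            if members:
                grouped.append(" | ".join(members))
                remaining = [p for p in remaining if p not in members]
        result[name] = grouped + remaining
    return result


rows = [
    ("debian-med", "bio", "pkg1", "d", "debian", "main", "i386"),
    ("debian-med", "bio", "pkg2", "d", "debian", None, None),
    ("debian-med", "bio", "pkg3", "d", "debian", "main", "i386"),
]
alternatives_rows = [("debian-med", "bio", "pkg1 | pkg2 | pkg3", "d")]

control = group_alternatives(verified_fields(rows, "i386"), alternatives_rows)
print(control)
```

Here pkg2 ends up in Suggests in the first step already, so the alternatives string only joins pkg1 and pkg3.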

To some point I had this implemented on my local instance but being
offline sometimes enables you to think twice before committing and so I
stumbled upon the question of how to later calculate the resulting dependency
of a set of alternatives.

I have not fully made up my mind about this.  Finally the decision is
whether at least *one* of the alternatives is in Debian main for a
given architecture to put the "Recommends" label onto this.  The
thing is that if we are storing strings inside the column of the
database you can not really easily query for it using SQL.

If we would decide to store an array (PostgreSQL type) we could
possibly handle this better via a single query.  On the other hand we
could stick to the "store a string of alternatives" into the database
and do the verification whether for a given architecture at least one of
the alternatives exists later on inside the Python code that creates the
control file (for the tasksel input file we are out of trouble because
we can use the existing table without the alternatives).
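
The second option, checking the stored string later in Python, might be sketched as follows. This is only an illustration: the function name and the availability dictionary (package name mapped to a set of (component, architecture) pairs) are assumptions, not existing code or the real query result:

```python
def has_main_alternative(alternatives, availability, arch):
    """True if at least one alternative is in main for the given architecture.

    alternatives -- string as stored in the table, e.g. 'pkg1 | pkg2 | pkg3'
    availability -- hypothetical dict: package -> set of (component, arch)
    """
    names = [p.strip() for p in alternatives.split("|")]
    return any(("main", arch) in availability.get(n, set()) for n in names)


availability = {
    "pkg1": {("main", "i386")},
    "pkg3": {("main", "i386"), ("main", "amd64")},
}
print(has_main_alternative("pkg1 | pkg2 | pkg3", availability, "amd64"))
print(has_main_alternative("pkg2", availability, "i386"))
```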

I did not know about the array type of PostgreSQL; it may be interesting for our case. I hope I have explained in proper words my idea and the reason why I stored the alternatives that way. It may be done in two steps, but keep in mind that first we do the verification of each package, ending up with the correct list of dependencies, and then, using the alternatives strings, we just define the alternative relation between them.

I hope I made my point clear - which I personally doubt ;-).  Because of
this doubt I am considering implementing the importer according to my vision
of how it might be helpful for the control file creation and letting you check
whether you agree with my argument that this table content is most
helpful for the given purpose.  I think I will easily find the time
to do so by tomorrow.
 
OK, let me know if you like my approach (or if it has any weak parts). If not, and you prefer to have the info in one table, we can implement the importer according to your vision; that's no problem :-)

Meanwhile I'd recommend you look into the source of the openjdk7
package to see how the different control files for the different architectures
are created there.  Perhaps this might give you an idea because it seems
even after my rewording on debian-mentors there was no helpful answer to
your question.

Yes I will do that :-)

Kind regards

Emmanouil
