
Re: udd/blends_metadata_gathener.py hints



Hello Andreas,

I updated blends_metadata_gathener.py

From first intuition I would think it might make sense to add single paragraphs to
the configfile, like

  blends-all
  blend-med
  blend-edu
  blend-gis
  blend-...

I added the above paragraphs to config-ullman.yaml.
With blends-all the gatherer runs for every available Blend; otherwise it runs only for the selected Blend.

I created the single blend paragraphs using <<: *blends-conf in case we need to override any of the blends-all attributes.
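
To illustrate what the merge key does, here is a minimal Python sketch of how a paragraph written with <<: *blends-conf resolves: it starts from the shared attributes and lets its own keys win. Only the anchor name blends-conf comes from the config; the attribute names and values below are made up for illustration.

```python
# Hypothetical shared attributes; the real keys in config-ullman.yaml may differ.
blends_conf = {
    "type": "blends-metadata",
    "update-command": "fetch-all-blends",
}


def merged(base, overrides):
    """Return what a paragraph with `<<: *base` plus its own keys resolves to."""
    result = dict(base)       # keys pulled in through the merge key
    result.update(overrides)  # keys written in the paragraph itself take precedence
    return result


# A single-Blend paragraph that only adds a key (hypothetical key "blend").
blend_med = merged(blends_conf, {"blend": "debian-med"})

# A single-Blend paragraph that also overrides a shared attribute.
blend_edu = merged(
    blends_conf,
    {"blend": "debian-edu", "update-command": "fetch-edu-only"},
)
```

The shared paragraph itself stays untouched, so an override in one Blend never leaks into the others.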

Each Blend now has its own log file, named:
blends_metadata_gatherer-BLEND.log

If the gatherer fails before it starts updating any Blend, it logs to blends_metadata_gatherer-default.log instead.
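
The naming rule can be sketched as follows. This is only an illustration of the scheme described above, not the code in the gatherer; the function names and the Blend name "debian-med" are assumptions.

```python
import logging


def log_file_name(blend=None):
    """Build the log file name; fall back to "default" when no Blend is known yet."""
    return "blends_metadata_gatherer-%s.log" % (blend or "default")


def make_logger(blend=None):
    """Return a logger that writes to the per-Blend (or default) log file."""
    logger = logging.getLogger("blends_metadata_gatherer.%s" % (blend or "default"))
    if not logger.handlers:
        # delay=True postpones opening the file until the first message is logged.
        logger.addHandler(logging.FileHandler(log_file_name(blend), delay=True))
    return logger
```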

To detect whether a task file has changed, I added a "hashkey" column to blends_tasks. When a task is imported, I store an MD5 hash for it in that column. Before deleting a task and re-adding it from scratch, I check whether its hashkey has changed. So once you have run the new gatherer a first time to store the initial hashkeys, later runs will delete and re-add only the tasks that changed.
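
The check boils down to something like the sketch below. It assumes the hash is computed over the full text of the task file and that the stored hashkeys are available as a dict; both the helper names and that data layout are assumptions for illustration.

```python
import hashlib


def task_hashkey(task_text):
    """MD5 hex digest of a task file's content."""
    return hashlib.md5(task_text.encode("utf-8")).hexdigest()


def tasks_to_reimport(task_files, stored_hashkeys):
    """Return {task name: new hashkey} for tasks that are new or have changed.

    task_files:      {task name: current content of the task file}
    stored_hashkeys: {task name: hashkey saved in blends_tasks by the last run}
    """
    changed = {}
    for name, text in task_files.items():
        new_key = task_hashkey(text)
        if stored_hashkeys.get(name) != new_key:
            changed[name] = new_key
    return changed
```

On the very first run stored_hashkeys is empty, so every task is reported as changed; that is the run that stores the initial hashkeys.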

With this approach I can no longer delete and re-add the Blend entry in the blends_metadata table (other tables such as blends_tasks reference it), so I check whether the Blend already exists. If it does, I update the entry to pick up any changes; otherwise I use blends_metadata_insert to create a new entry.
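
A minimal sketch of this update-or-insert logic follows. To keep it self-contained it uses an in-memory SQLite database as a stand-in for UDD's PostgreSQL, a plain INSERT in place of blends_metadata_insert, and hypothetical column names (blend, title, task).

```python
import sqlite3


def save_blend(conn, blend, title):
    """Update the Blend entry if it exists, otherwise insert it."""
    cur = conn.cursor()
    cur.execute("SELECT 1 FROM blends_metadata WHERE blend = ?", (blend,))
    if cur.fetchone():
        # The row is referenced from blends_tasks, so change it in place.
        cur.execute(
            "UPDATE blends_metadata SET title = ? WHERE blend = ?", (title, blend)
        )
        action = "updated"
    else:
        cur.execute(
            "INSERT INTO blends_metadata (blend, title) VALUES (?, ?)", (blend, title)
        )
        action = "inserted"
    conn.commit()
    return action


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE blends_metadata (blend TEXT PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE blends_tasks ("
    "blend TEXT REFERENCES blends_metadata(blend), task TEXT)"
)
```

Because the existing row is updated rather than deleted, the rows in blends_tasks that point to it stay valid.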

You can test the gatherer. Any feedback or comments are more than welcome :-).

I will now check on the following (quoting from a previous mail of yours):

c) try to make the insertion procedure itself more efficient by for
     instance:
      - check, whether we could speed up the check for a package that
        just exists in UDD
      - inject all packages in one rush
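
One possible reading of these two points, as a rough sketch only: fetch the known package names once into a set instead of querying per package, and hand all rows to the database in a single executemany call. SQLite stands in for PostgreSQL here, and the table and column names (packages, blends_dependencies, task, package) are assumptions.

```python
import sqlite3


def insert_known_packages(conn, task_packages):
    """Insert (task, package) pairs in one batch, skipping packages unknown to UDD.

    task_packages: list of (task name, package name) tuples.
    Returns the number of rows inserted.
    """
    # One query for all names replaces one existence query per package.
    known = {row[0] for row in conn.execute("SELECT package FROM packages")}
    rows = [(task, pkg) for task, pkg in task_packages if pkg in known]
    # All rows go to the database in a single call.
    conn.executemany(
        "INSERT INTO blends_dependencies (task, package) VALUES (?, ?)", rows
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages (package TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE blends_dependencies (task TEXT, package TEXT)")
conn.executemany(
    "INSERT INTO packages (package) VALUES (?)", [("emboss",), ("clustalw",)]
)
conn.commit()
```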


Kind regards

Emmanouil
