Re: Using BibRef from upstream-metadata.yaml (Was: Multiple publication data in upstream-metadata.yaml)

To: debian-med@lists.debian.org
Subject: Re: Using BibRef from upstream-metadata.yaml (Was: Multiple publication data in upstream-metadata.yaml)
From: Andreas Tille <andreas@an3as.eu>
Date: Sun, 15 Jan 2012 18:35:03 +0100
Message-id: <20120115173503.GE11734@an3as.eu>
In-reply-to: <20120115130906.GA8149@merveille.plessy.net>
References: <20120110075953.GB28931@an3as.eu> <4F0D5F51.3010303@rostlab.org> <20120111110652.GE28988@an3as.eu> <4F0D71A9.9090402@rostlab.org> <20120111120129.GF28988@an3as.eu> <20120112004804.GA6927@merveille.plessy.net> <20120115082319.GC11734@an3as.eu> <20120115130906.GA8149@merveille.plessy.net>

On Sun, Jan 15, 2012 at 10:09:06PM +0900, Charles Plessy wrote:
> Sorry, the data was actually rotten for multiple reasons.  First, the machine
> running upstream-metadata.debian.net stopped keeping deb-src entries in its
> sources.list, so debcheckout was not working anymore, and my rudimentary
> scripts did not catch the error.  I added error-catching to the TODO list.
> Second, when packages change their repository URL, which is not supposed to
> happen often, they have to be refreshed by hand.  Third, I hardcoded the
> erroneous git url git://git.debian.org/git that we now correct to
> git://git.debian.org/.
> 
> I have reloaded the data from scratch, by deleting the database and
> running the following command for each package med-bio depends on.
> 
>   curl http://upstream-metadata.debian.net/$package/YAML-URL

Hmmm, I wonder to what extent you consider only med-bio as a target for
Ume(ga)ya.  Several Debian Science packages would profit from this as
well.

While I thought I had a brilliant idea, namely to simply check on Alioth with

  find /git/debian-med -name upstream-metadata.yaml
  find /git/debian-science -name upstream-metadata.yaml

I learned about another trick of Git, which hides the debian/ dir in the
repository clone on Alioth.  That's unfortunate for my idea.

> I am now injecting all the fields related to bibliography.  By the way, I
> regret that I have put PMID and DOI outside the Reference-* namespace.
> Would you mind if I correct this?

I remember being astonished by this decision, but I simply
assumed you had your reasons.  I don't mind fixing something
that should be fixed before it sees any relevant usage - so it should
be fixed now.  I assume you could then simply iterate over everything in
Reference, which sounds quite reasonable.  Would you take care of
fixing the existing upstream-metadata.yaml files in our repository, or do
you think we should do this step by step manually?
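
To illustrate the renaming, here is a minimal sketch.  It assumes the
corrected fields would be called Reference-PMID and Reference-DOI; those
names and both helper functions are my assumption and are not fixed
anywhere in this thread.  It moves the two top-level fields into the
namespace and then iterates over everything that starts with Reference-.

```python
# Hypothetical helpers: the target names "Reference-PMID" and
# "Reference-DOI" are assumed, not taken from the thread.
LEGACY_FIELDS = ("PMID", "DOI")
PREFIX = "Reference-"


def move_into_reference_namespace(metadata):
    """Return a copy of metadata with PMID/DOI renamed to Reference-*."""
    fixed = {}
    for key, value in metadata.items():
        if key in LEGACY_FIELDS:
            key = PREFIX + key
        fixed[key] = value
    return fixed


def reference_fields(metadata):
    """Return only the fields inside the Reference-* namespace."""
    return {k: v for k, v in metadata.items() if k.startswith(PREFIX)}


# Placeholder record; only the PMID value appears in this mail.
old = {
    "Reference-Title": "Some title",
    "PMID": "15073005",
    "Homepage": "http://example.org",
}
new = move_into_reference_namespace(old)
print(reference_fields(new))
```

Whether such a rename is scripted or done by hand, fields outside the
namespace (Homepage in the placeholder above) are left untouched.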

> The file used for injection, http://upstream-metadata.debian.net/for_UDD/biblio.yaml,
> is valid YAML; this is why I managed to write the loader.  It
> is a series of records, which all contain an array of three fields.
> Altogether, they are loaded as a table of three columns.

Well, I don't mind if you want to keep it this way.  It looks more
complicated than the way I would have implemented it - but if you
volunteer to maintain it that way, that's perfectly fine with me.
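
For what it is worth, the three-column layout can be sketched in a few
lines.  This is only an illustration: the function name and the
in-memory dictionary are hypothetical, and the real biblio.yaml may
differ in detail.

```python
def to_rows(metadata_by_package):
    """Flatten {package: {field: value}} into (package, field, value) rows."""
    rows = []
    for package in sorted(metadata_by_package):
        fields = metadata_by_package[package]
        for field in sorted(fields):
            # Values are stored as text so that every row has the same shape.
            rows.append((package, field, str(fields[field])))
    return rows


# The PMID is the perlprimer example quoted in this mail.
rows = to_rows({"perlprimer": {"PMID": 15073005}})
for package, field, value in rows:
    print(package, field, value)
```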

> upstream-metadata.debian.net stores its data in a Berkeley database, where the
> field names are the concatenation of the package name and the
> upstream-metadata.yaml field name, that is, if in the perlprimer package, there
> is “PMID: 15073005”, the Berkeley DB will contain “15073005” for the field
> “perlprimer:PMID”.  In the whole information chain, the structure is always
> ‘package - field - value’.
> 
> I do not know where the perlprimer duplication came from.  Perhaps there was an
> invisible character somewhere?  On the server side, there is a command line
> tool to manipulate field values directly; I may have made a typo when making
> tests.  This said, I agree that the output should be sanitized.  Also, I
> definitely agree to use PRIMARY KEY (package,key) as an extra safety net.
> Should it be added to udd/sql/bibref.sql?

Yes.  Just put it there and ping me once you have added a means to cope
with injection problems caused by it.  I can push it to UDD after testing
on blends.d.n.
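
To make the safety net concrete, here is a minimal sketch that uses
SQLite from the Python standard library as a stand-in for the PostgreSQL
database behind UDD.  The table layout (package, key, value) is assumed
from the description above and is not copied from udd/sql/bibref.sql;
the inject() helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE bibref (
        package TEXT NOT NULL,
        key     TEXT NOT NULL,
        value   TEXT NOT NULL,
        PRIMARY KEY (package, key)
    )
    """
)


def inject(rows):
    """Insert rows one by one; return the rows rejected as duplicates."""
    rejected = []
    for row in rows:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO bibref VALUES (?, ?, ?)", row)
        except sqlite3.IntegrityError:
            rejected.append(row)
    return rejected


rows = [
    ("perlprimer", "PMID", "15073005"),
    ("perlprimer", "PMID", "15073005"),  # simulated duplicate
]
rejected = inject(rows)
print("rejected:", rejected)
```

With the key in place, a duplicated (package, key) pair makes the insert
fail instead of silently producing two rows, so the loader has to catch
that error in some way; collecting the rejected rows is just one option.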

Kind regards

        Andreas.

-- 
http://fam-tille.de
