
Bug#509935: decide whether Uploaders is parsed per RFC 5322



* Russ Allbery <rra@debian.org>, 2008-12-27, 12:27:
Policy currently says the following about the Maintainer field, which
applies by reference to the Uploaders field:

   The package maintainer's name and email address. The name should come
   first, then the email address inside angle brackets <> (in RFC822
   format).

   If the maintainer's name contains a full stop then the whole field
   will not work directly as an email address due to a misfeature in the
   syntax specified in RFC822; a program using this field as an address
   must check for this and correct the problem if necessary (for example
   by putting the name in round brackets and moving it to the end, and
   bringing the email address forward).

Most software has taken this to mean that the e-mail address should be
in RFC822 format, not that the whole field should be.

This is primarily posing a problem for people who have commas in their
name.  The main example to date is Adam C. Powell, IV, but it can happen
with various other name qualifiers and honorifics.

I propose the following simple solution to this bug:
- Let's forget about RFC 822/5322 compatibility, as it would only introduce needless complexity.
- Let's allow any punctuation characters in maintainer names and e-mail addresses *except* "<" and ">".

This way the comma is completely disambiguated: it splits the field if and only if it's preceded by the ">" character. I.e. you can use the following Perl regex to split the field: /\>\K\s*,\s*/.
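
For tools not written in Perl, the same rule could be sketched in Python roughly as follows. This is only an illustration under my own assumptions: the function name split_uploaders and the example.org addresses are made up, and a lookbehind stands in for Perl's \K.

```python
import re

# A comma is a separator only when it follows ">" (optionally with
# whitespace in between); this mirrors the Perl regex /\>\K\s*,\s*/.
_SEPARATOR = re.compile(r"(?<=>)\s*,\s*")


def split_uploaders(field):
    """Split an Uploaders-style field into individual 'Name <email>' entries.

    Hypothetical helper: commas inside names are left alone, and empty
    pieces (e.g. from a trailing comma) are dropped.
    """
    pieces = (piece.strip() for piece in _SEPARATOR.split(field))
    return [piece for piece in pieces if piece]


if __name__ == "__main__":
    example = "Adam C. Powell, IV <adam@example.org>, Jane Doe <jane@example.org>"
    for entry in split_uploaders(example):
        print(entry)
```

With this rule the comma in "Adam C. Powell, IV" stays part of the name, because it is not preceded by ">".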

One can easily check that this method does the right thing for parsing Uploaders fields of the existing packages: you could e.g. try this on ries:
$ zcat /srv/ftp.debian.org/mirror/dists/*/*/source/Sources.gz | grep-dctrl -ns Maintainer,Uploaders -e '' | perl -pe 's/\>\K\s*,\s*/\n/g' | sort -u

Incidentally, this is (almost) the same method dak uses to split Uploaders:

$ grep -r uploaders.*split daklib/
daklib/dbconn.py:        for up in u.pkg.dsc["uploaders"].replace(">, ", ">\t").split("\t"):
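
One difference that can be read off the two snippets alone: the dak expression only recognises the exact three-character sequence ">, ", while the regex accepts any whitespace (or none) around the comma. A small sketch of that difference, assuming a made-up field value whose entries are separated by a comma followed by a newline; whether dak ever sees such input is not something I have checked:

```python
import re


def split_like_dak(field):
    # Same expression as in the daklib line quoted above.
    return field.replace(">, ", ">\t").split("\t")


def split_with_regex(field):
    # Python equivalent of the Perl regex /\>\K\s*,\s*/.
    return re.split(r"(?<=>)\s*,\s*", field)


# Hypothetical input: same-line separator versus a line break after the comma.
same_line = "Jane Doe <jane@example.org>, John Roe, Jr. <john@example.org>"
folded = "Jane Doe <jane@example.org>,\n John Roe, Jr. <john@example.org>"

if __name__ == "__main__":
    for value in (same_line, folded):
        print(len(split_like_dak(value)), len(split_with_regex(value)))
```

Both approaches agree on the same-line case and both keep "John Roe, Jr." intact; only the regex splits the value with the line break.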

Currently, the only way to express such a name that works with our existing tools is to drop the comma, since several programs blindly split on commas when parsing the field.

Let's fix them, then. :) I volunteer to fix lintian and dd-list. Do you know any other tools that parse Uploaders?

--
Jakub Wilk


