
Re: Parser for Debian's RFC822-based formats



Hi,

On Wednesday, 22 Feb 2012 at 22:06 -0600, Jeremy Shaw wrote:
> The Debian library perhaps?
> 
> 
> http://hackage.haskell.org/package/debian-3.61
> 
> 
> The ByteString versions are quite fast.. can load the sid in a second
> or two.

The speed is OK, but not great. For SAT-Britney I initially used the
debian package, but I ended up hand-rolling the parser (only for the
fields of the Sources file that I needed):
http://git.nomeata.de/?p=sat-britney.git;a=commitdiff;h=e7f64e9f3db32a08d69e8b774eb56dd272f2cbd7
The 25% improvement is in overall runtime; I think the parsing itself
was sped up by much more.

The main reason why it's faster is, I think, that it can reference chunks
of the input ByteString instead of copying them, which Parsec-based
parsers don’t do, as far as I know. I think there are newer parsing
libraries built on ByteString that are also able to do so (attoparsec,
maybe?).
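
To illustrate what I mean by referencing chunks of the input, here is a
minimal sketch. It is not the SAT-Britney parser from the commit above;
the names Field, parseField and paragraphs are made up for this example,
and continuation lines are simply ignored. It only relies on the fact
that B.lines, B.break, B.drop and B.dropWhile return slices that share
the buffer of the input ByteString rather than copying it.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Maybe (mapMaybe)

-- | A "Name: value" pair. Both parts are slices of the input buffer
-- (hypothetical type, only for this sketch).
type Field = (B.ByteString, B.ByteString)

-- | Parse a single line of the form "Name: value".
-- Empty lines, continuation lines (leading space or tab) and lines
-- without a colon yield Nothing in this simplified sketch.
parseField :: B.ByteString -> Maybe Field
parseField line
  | B.null line = Nothing
  | B.head line == ' ' || B.head line == '\t' = Nothing
  | otherwise =
      case B.break (== ':') line of
        (name, rest)
          | B.null rest -> Nothing
          | otherwise   -> Just (name, B.dropWhile (== ' ') (B.drop 1 rest))

-- | Split the input into paragraphs separated by blank lines and
-- collect the fields of each paragraph. No field data is copied.
paragraphs :: B.ByteString -> [[Field]]
paragraphs = go . B.lines
  where
    go ls =
      case break B.null (dropWhile B.null ls) of
        ([], _)      -> []
        (para, rest) -> mapMaybe parseField para : go rest
```

With something like `paragraphs <$> B.readFile "Sources"`, a field can
then be picked out with `lookup (B.pack "Package")` on each paragraph.
Note that the slices keep the whole input buffer alive; if only a few
values are retained for a long time, B.copy can detach them.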

In any case, try the debian package first. If it is good enough, it is
certainly the best option, and if not, maybe it can be improved.

That would also help our haskell-pkg-debcheck.hs, which still uses the
debian package and spends most of its runtime parsing.

Greetings,
Joachim


-- 
Joachim "nomeata" Breitner
Debian Developer
  nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C
  JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata
