
Bug#738342: lintian: checks/cruft - GFDL check is slow




On 9 Feb 2014 13:54, "Niels Thykier" <niels@thykier.net> wrote:
>
> Package: lintian
> Version: 2.5.21
> Severity: normal
>
> A quick benchmark suggests that lintian spends nearly 2 minutes on the
> Linux source package (I tested with linux/3.10~rc7-1~exp1).  Profiling
> Lintian with perl -d:NYTProf suggests that the vast majority of the time
> is spent in:
>
> """
>             if ($cleanedblock =~ $gfdlpattern) {
> """
>
> Where $gfdlpattern is one of:
>
> """
>             # classical gfdl matching pattern
>             my $normalgfdlpattern = qr/
>                  (?'contextbefore'(?:
>                     (?:(?!a \s+ copy \s+ of \s+ the \s+ license \s+ is).){1024}|
>                     (?:\s+ copy \s+ of \s+ the \s+ license \s+ is.{0,1024}?)))
>                  gnu \s+ free \s+ documentation \s+ license
>                  (?'rawgfdlsections'(?:(?!gnu \s+ free \s+ documentation \s+ license).){0,1024}?)
>                  a \s+ copy \s+ of \s+ the \s+ license \s+ is
>                 /xsmo;
>
>             # for first block we get context from the beginning
>             my $firstblockgfdlpattern = qr/
>                  (?'rawcontextbefore'(?:
>                     (?:(?!a \s+ copy \s+ of \s+ the \s+ license \s+ is).){1024}|
>                   \A(?:(?!a \s+ copy \s+ of \s+ the \s+ license \s+ is).){0,1024}|
>                     (?:\s+ copy \s+ of \s+ the \s+ license \s+ is.{0,1024}?)
>                   )
>                  )
>                  gnu \s+ free \s+ documentation \s+ license
>                  (?'rawgfdlsections'(?:(?!gnu \s+ free \s+ documentation \s+ license).){0,1024}?)
>                  a \s+ copy \s+ of \s+ the \s+ license \s+ is
>                  /xsmo;
> """
>
>
> The profiler suggests that 60% of the runtime is spent in the
> "CORE:match" operations inside "license_check" from c/cruft.  The
> regex appears to be hit "only" 2452 times, but each match takes an
> average of 55.9ms, totalling 137s.
>
> Bastien, do you have any ideas for reducing the cost of the regex?

Yes, I do.

Run these regexes only on blocks that first match the plain phrase
"gnu free documentation license".
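
A rough sketch of that prefilter, not the actual checks/cruft code. The
names $gfdl_prefilter, $expensive_pattern and block_mentions_gfdl are made
up for illustration, and $expensive_pattern is a shortened stand-in for the
two patterns quoted above. It assumes the block is already lowercased and
cleaned, as $cleanedblock is. The cheap pattern starts with a literal word,
so Perl can usually reject non-matching blocks quickly before the heavy
regex is ever tried.

```perl
use strict;
use warnings;

# Cheap prefilter: the phrase that every full GFDL pattern requires.
my $gfdl_prefilter = qr/gnu \s+ free \s+ documentation \s+ license/xs;

# Hypothetical, shortened stand-in for the expensive patterns in the report.
my $expensive_pattern = qr/
    gnu \s+ free \s+ documentation \s+ license
    (?'rawgfdlsections'(?:(?!gnu \s+ free \s+ documentation \s+ license).){0,1024}?)
    a \s+ copy \s+ of \s+ the \s+ license \s+ is
/xs;

# Returns 1 if the block looks like a GFDL notice, 0 otherwise.
sub block_mentions_gfdl {
    my ($cleanedblock) = @_;

    # Blocks without the phrase stop here, so the costly regex is skipped.
    return 0 unless $cleanedblock =~ $gfdl_prefilter;

    return $cleanedblock =~ $expensive_pattern ? 1 : 0;
}
```

Since only a small share of the blocks in a source tree is likely to contain
the phrase, most of the 2452 expensive matches should be avoided, but that
would need to be confirmed with another NYTProf run.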

Bastien
>
> ~Niels
>

