Re: RFC: (ab)using autopkgtest for benchmarking

To: debian-ci@lists.debian.org
Subject: Re: RFC: (ab)using autopkgtest for benchmarking
From: Christian Kastner <ckk@debian.org>
Date: Mon, 4 May 2020 12:18:30 +0200
Message-id: <77a20bd1-cc1b-59c1-ac5e-11e8afb49c06@debian.org>
In-reply-to: <20200504084544.GA1625@piware.de>
References: <77e6dcdf-ac4a-720d-3f0c-8e1ff8ddf53a@debian.org> <20200504084544.GA1625@piware.de>

Hi Martin,

On 2020-05-04 10:45, Martin Pitt wrote:
> Christian Kastner [2020-05-01 22:09 +0200]:
>> Specifically, I would propose using the following new restrictions
>> (unknown to autopkgtest, and therefore skipped by default):
>>
>>    benchmark-task       A typical task for which the package is used
>>    benchmark-io         An I/O intensive task
>>    benchmark-network    A task that requires connectivity
> 
> That's the bit that I'd strongly recommend against. Declaring these as a
> restriction doesn't fit the spirit of restrictions defining testbed
> capabilities [1].

> If you want to go into the "Restrictions" direction [...]

I'm ashamed to admit that I didn't know that Features even existed, so
I went with Restrictions, which I was familiar with. Please don't read
too much into my Restrictions suggestion.

Regarding the particulars of integration into autopkgtest, I'd honestly
just defer to you, the autopkgtest / debci Maintainers. I'm just
interested in the (powerful, and broadly established) machinery.

> Something like "benchmark-io" is at the same time too generic
> (what does it mean or guarantee exactly from the point of the runner, which has
> to check for and provide this?) and too specific (what if I need to benchmark
> slightly related, but not identical things, like graphics I/O performance?)

Indeed. Even for the purposefully vague "benchmark-task" (which I
envisioned as simply measuring wall-time of a particular task), one
would still need some mechanism to differentiate between setup/teardown
time and actual benchmarking time.
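
To illustrate the distinction, here is a minimal sketch in Python of how
a test script could report setup, benchmark, and teardown wall-time
separately. The names timed() and run_benchmark() are hypothetical and
not part of autopkgtest; the workload is a placeholder.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(timings, name):
    """Add the wall-clock duration of the enclosed block to timings[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = timings.get(name, 0.0) + (time.perf_counter() - start)


def run_benchmark(setup, task, teardown):
    """Run one benchmark and return per-phase wall-clock times in seconds.

    Hypothetical helper: only the "benchmark" entry would count as the
    result; "setup" and "teardown" are reported so they can be excluded.
    """
    timings = {}
    with timed(timings, "setup"):
        state = setup()
    try:
        with timed(timings, "benchmark"):
            task(state)
    finally:
        # Teardown runs (and is timed) even if the task raises.
        with timed(timings, "teardown"):
            teardown(state)
    return timings


if __name__ == "__main__":
    # Placeholder workload: build a reversed list, sort it, discard it.
    result = run_benchmark(
        setup=lambda: list(range(100_000, 0, -1)),
        task=lambda data: data.sort(),
        teardown=lambda data: data.clear(),
    )
    for name, seconds in result.items():
        print(f"{name}: {seconds:.6f}s")
```

A specification would still have to say how such per-phase numbers are
reported to the runner; the sketch only shows that the separation itself
is cheap to implement inside a test.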

I believe that some sort of "Benchmarks Specification" would be needed
to address this, and my next step would have been to initiate a
discussion on -devel with the goal of seeking consensus on a simple v1.

I'd start with a "high-level" goal, namely to require no more than that
these tests be informative to the average user. The requirement is not
to achieve a publication-quality comparison (there's specialized
software for that), but to give users a rough indication.

However, as previously stated, I wouldn't want to go further down this
path without the autopkgtest / debci Maintainers' blessing, as the merit
of this idea rests entirely on their toolchain.

> These cannot be well-defined from autopkgtest's specification, and adding these
> to your package will mean that they will just cause these tests to be skipped
> on the CI infra. Maybe that's what you want, but it seems that most benchmark
> tests should at least be able to *run* on our CI without failing. Of course the
> numbers they produce are not very useful.

I did expect them to be skipped on CI infra, since the results would not
be useful there while the runs would consume (possibly significant)
resources.

Christian
