
Bug#1032899: unblock: rocm-hipamd/5.2.3-6



Hi Paul,

On 2023-03-16 10:31, Paul Gevers wrote:
> Control: tags -1 moreinfo
>
> On 16-03-2023 00:16, Christian Kastner wrote:
>
> For next time, can you please contact us earlier? We could have
> solved the earlier problems in testing-proposed-updates (in
> January), then we would now be in a better position.

I didn't think of that solution because the RC-blocked dependency was
only available in unstable and, admittedly, because I thought this would
resolve itself in time.

But in any case: yes, earlier contact would have been helpful, and I'll
do so in future.

> + * Reduce arch to amd64, arm64, ppc64el
> 
> But it fails on ppc64el; so why this selection?

Because those are the only architectures for which the required amdgpu
kernel driver is available [2].

> Also, as the other architectures FTBFS, we prefer in Debian to *not*
>  limit the architectures, but just let them fail [1]. This eases 
> porter efforts.

Thanks for pointing this out, I thought it was the other way around
(prefer *to* limit to avoid failures). Well, with ppc64el, we followed
that strategy.

> If the packages really don't make sense on some architectures, 
> consider using some of the "properties" provided by 
> bin:architecture-properties in your Build-Depends.

I wasn't aware of this package, and I don't think it'll help us here
because we're specifically tracking [2]. But it'll be very useful for
some of my other packages, thanks!

> By the way, I checked, but none of the ci.d.n host will run any of 
> your tests, as none of them has an amdgpu (is that a thing you could 
> expect on non-amd architectures by the way?).

Correct! Tests will be skipped on official infra.

It's not just a matter of the missing hardware (we have it, but DSA has
understandable concerns), it's also about how to even express that a
package needs a GPU to run its tests (build-time or autopkgtest).

I recently initiated a discussion about this [3]. For now, the idea is
to run parallel debci infra with guaranteed GPU presence, gather
experience, and eventually share proposals on how a GPU dependency
could be expressed in d/control and d/tests/control.
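
To illustrate the skipping mentioned above, here is a minimal sketch of
how a test could detect a missing GPU and report itself as skipped. It
assumes the autopkgtest "skippable" restriction (exit status 77 is
reported as a skip) and uses the /dev/kfd device node of the amdkfd
driver as the presence check. The function names are hypothetical, and
this is not necessarily how the package's own tests do it.

```python
import os
import sys

# With "Restrictions: skippable" in d/tests/control, autopkgtest
# reports exit status 77 as a skipped test rather than a failure.
SKIP_EXIT_CODE = 77


def gpu_available(device_path="/dev/kfd"):
    """Assumption: the amdkfd device node exists only when a usable GPU is present."""
    return os.path.exists(device_path)


def main(device_path="/dev/kfd"):
    """Return the exit status the test script should finish with."""
    if not gpu_available(device_path):
        print("no amdgpu device found, skipping tests", file=sys.stderr)
        return SKIP_EXIT_CODE
    # Placeholder: the real GPU tests would run here.
    return 0

# A real test script would end with: sys.exit(main())
```

On hosts without the device node this yields 77, so the test shows up
as skipped instead of failed.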

> One thing I spotted along the way; the (Build-)Depends on llvm 
> related packages use the *versioned* ones. Is there a reason not to 
> use the unversioned ones from src:llvm-defaults? That would make llvm
> transitions a bit easier.

I'd have to check with the co-maintainers who added it, but from what I
gather so far, the ROCm stack needs a very recent llvm because of many
changes being upstreamed there.

> Overall, the diff is a bit long (and has some irrelevant stuff), so 
> I'm hesitant to offer t-p-u now (to avoid waiting for 
> llvm-toolchain-15).

Understood. Yeah, the diff is long, unfortunately, as the packaging
fixes accumulated over time.

Is this something that you could consider at a later point in time, if I
also break down the diff into more reviewable fragments (dependencies,
build, metadata, ...)? Because I do think that most changes are just
fixes of one sort or another - no features added.

Best,
Christian


> [1] https://lists.debian.org/debian-devel/2022/09/msg00105.html and follow-up 

[2] https://github.com/torvalds/linux/blob/v6.2/drivers/gpu/drm/amd/amdkfd/Kconfig#L6-L8
[3] https://lists.debian.org/debian-ai/2023/03/msg00038.html

