Segfault in hsa_init on Debian-derived systems
Hello,
One of the things that has vexed me about the 'universe' packages for
ROCm on Ubuntu is that they simply don't work. On Ubuntu 22.04 and
22.10, even just a simple `rocminfo` invocation would crash with a
segfault. I assumed that there was something that Ubuntu was doing to
the Debian packages that was breaking them, but I was wrong.
I've finally figured out what is happening. This is the static
initialization order fiasco [1]. Inside libhsa-runtime64, there are two
variables with static storage duration in different translation units,
and there is an implicit expectation that one will be initialized before
the other. That expectation happens to be satisfied when rocr-runtime is
built for Debian, but the order comes out differently when the package
is built for Ubuntu, and the expectation is violated. When the
variables are initialized out-of-order, some data that is copied from
one static variable to another is copied before it is initialized
(including pointers which are therefore unexpectedly null).
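To illustrate the pattern described above, here is a minimal sketch of
one common way to make such a copy independent of initialization order,
the construct-on-first-use idiom. It is not taken from the patch [4] or
from the rocr-runtime sources: CoreTable, ApiTable, core_table, real_init
and g_api are hypothetical names invented for this example. The source
data lives in a function-local static, which is initialized the first
time the function is called, so a static object in another translation
unit that copies from it during its own initialization always sees a
fully constructed value.

```cpp
#include <cassert>

// Hypothetical stand-in for a table of function pointers.
// This is not a real ROCr type.
struct CoreTable {
    int (*init)();
};

// Hypothetical implementation that the table points at.
int real_init() { return 0; }

// Construct on first use: the function-local static is initialized the
// first time control passes through its declaration, so every caller,
// including another object's static initializer, gets a built table.
const CoreTable& core_table() {
    static const CoreTable table{real_init};
    return table;
}

// Hypothetical consumer that copies a pointer out of the table while it
// is itself being initialized as a static object.
struct ApiTable {
    int (*init)();
    ApiTable() : init(core_table().init) {}
};

// With a plain namespace-scope CoreTable defined in another translation
// unit, this copy could run first and capture a null pointer. Going
// through core_table() removes the dependency on link order.
ApiTable g_api;

// True if the copied pointer is usable.
bool api_is_ready() { return g_api.init != nullptr; }

// Calls through the copied pointer, as a library entry point might.
int call_init() {
    assert(g_api.init != nullptr);
    return g_api.init();
}
```

Calling api_is_ready() from main returns true regardless of the order in
which the object files are linked, because g_api never reads a
namespace-scope variable that might not have been initialized yet.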
This problem technically does not affect the Debian binary package, but
only by coincidence. Any change to the build toolchain could potentially
change the initialization order and thereby introduce this defect into
the library the next time it is built from source.
I'm not sure what priority this issue has for Debian. In fact, I'm not
sure if it even is technically a bug on Debian given that it hasn't
manifested itself yet. Nevertheless, I hope we can upload a fix for this
relatively quickly given that both Debian [2] and Ubuntu [3] will be
freezing packages soon.
I've prepared a patch to ensure the correct initialization order [4]. If
there's anything more that I can do to help, please let me know.
Sincerely,
Cory Bloor
[1]: https://en.cppreference.com/w/cpp/language/siof
[2]: https://release.debian.org/bookworm/freeze_policy.html
[3]: https://discourse.ubuntu.com/t/lunar-lobster-release-schedule/27284
[4]: https://salsa.debian.org/rocm-team/rocr-runtime/-/commit/b4bfd3e07426034ba65a5f4adf05b41c23c3eb81