Re: DEP 17: Improve support for directory aliasing in dpkg
Hi,
On 6/21/23 20:33, Guillem Jover wrote:
> I don't think we disagree (?), I probably didn't express myself clearly.
> The fact that no package ships those symlinks *is* and *has* been a
> problem, and what I've been saying all along, this will be the only
> correct way to let dpkg know whether there will be aliasing in play.
I've looked into building a dpkg-alias tool that would work similarly to
dpkg-divert, and currently that looks like it might be a viable solution.
Rules:
- some package will need to register the alias in its preinst and
remove it in its postrm (to make piuparts happy and provide symmetry).
My favourite for that would be systemd-sysv, because that is literally
the package that brings in the requirement, but there might be problems
with that approach for containers, so I suspect that's not a good choice.
- unpacking a symlink over a registered alias is fine if the symlink
and the alias match.
This way, we can ship the symlink in a package for bootstrapping. Terms
and conditions apply.
- dpkg keeps track of the name of files in the .tar.gz, but also
recognizes aliased names as referring to the same file
This can be done inside dpkg's file database -- whenever an entry is
created, additional entries for its aliases are generated alongside it,
so the file can be looked up using any aliased path.
- circular aliases are not allowed
This would break the requirement that it is possible to generate an
exhaustive set of all names a file may be found under.
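To make the "exhaustive set of names" requirement concrete, here is a minimal Python sketch. It is not dpkg code: the function names, the alias table layout (a dict of alias -> target) and the MAX_NAMES bound are all assumptions made for illustration. Circular setups are detected by a bounded search rather than a proper graph check.

```python
MAX_NAMES = 64  # arbitrary bound for this sketch, not a dpkg constant

def _under(path, prefix):
    """Return the part of path below prefix, or None if it is not below it."""
    if path == prefix:
        return ""
    if path.startswith(prefix + "/"):
        return path[len(prefix):]
    return None

def canonical(path, aliases):
    """Follow aliases forwards until none applies; raise on a loop."""
    seen = {path}
    while True:
        for alias, target in aliases.items():
            rest = _under(path, alias)
            if rest is not None:
                path = target + rest
                break
        else:
            return path  # no alias matched, so this is the canonical name
        if path in seen or len(seen) >= MAX_NAMES:
            raise ValueError("circular alias")
        seen.add(path)

def all_names(path, aliases):
    """Return every name under which the file at path can be found."""
    names = {path}
    todo = [path]
    while todo:
        current = todo.pop()
        for alias, target in aliases.items():
            # Follow each alias forwards and backwards: both directions
            # produce names that refer to the same file.
            for old, new in ((alias, target), (target, alias)):
                rest = _under(current, old)
                if rest is None:
                    continue
                candidate = new + rest
                if candidate not in names:
                    if len(names) >= MAX_NAMES:
                        raise ValueError("set of names does not close")
                    names.add(candidate)
                    todo.append(candidate)
    return names

def register_alias(aliases, alias, target):
    """Add alias -> target unless names could no longer be enumerated."""
    trial = dict(aliases)
    trial[alias] = target
    canonical(alias, trial)    # raises ValueError on a forward loop
    all_names(target, trial)   # raises ValueError if the set is unbounded
    aliases[alias] = target
```

With /bin -> /usr/bin registered, all_names("/bin/ls", aliases) yields both /bin/ls and /usr/bin/ls, while a later attempt to register /usr/bin -> /bin is rejected.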
- the newly created dpkg-alias tool is responsible for moving files,
if necessary
This is a separate tool, so we don't need to extend the unpacking logic,
and we can build an algorithm here that includes error recovery.
- if an alias is registered for a symlink that already exists, that is
not an error
This way, we accept the status quo silently.
- registering an existing alias or unregistering a nonexistent alias
is not an error
This allows future releases to change the list of aliases without
requiring complex logic in maintainer scripts.
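The registration rules so far (repeated registration is fine, a matching symlink is accepted, unregistering an unknown alias is fine) could look roughly like this in Python. This is a sketch under assumptions: alias_add, alias_remove, the registry dict and the returned status strings are hypothetical, and treating a different target for an already registered alias as an error is my reading of the conflict rule, not something decided yet.

```python
import os

def _symlink_matches(root, alias, target):
    """True if root+alias is a symlink that resolves to root+target."""
    link = os.path.join(root, alias.lstrip("/"))
    if not os.path.islink(link):
        return False
    dest = os.readlink(link)
    if os.path.isabs(dest):
        # Absolute link targets are interpreted relative to the chroot.
        dest = os.path.join(root, dest.lstrip("/"))
    else:
        dest = os.path.join(os.path.dirname(link), dest)
    wanted = os.path.join(root, target.lstrip("/"))
    return os.path.normpath(dest) == os.path.normpath(wanted)

def alias_add(registry, root, alias, target):
    """Register alias -> target; repeating a registration is not an error."""
    if registry.get(alias) == target:
        return "already registered"
    if alias in registry:
        # Assumption: same alias with a different target is a conflict.
        raise ValueError(f"{alias} is already aliased to {registry[alias]}")
    registry[alias] = target
    if _symlink_matches(root, alias, target):
        return "accepted existing symlink"
    # A real tool would move files and create the symlink at this point.
    return "registered"

def alias_remove(registry, alias):
    """Unregister an alias; an unknown alias is silently ignored."""
    return registry.pop(alias, None) is not None
```

A maintainer script could then call the add operation unconditionally on every upgrade, which is the point of making both operations idempotent.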
- Files remain in the same place during the trixie cycle
We only shift responsibility for moving files and creating the symlinks
during this cycle, but bootstrapping will have to go through an unmerged
phase in the beginning, and files are then moved into merged paths from
the preinst of the key package, after the initial unpack.
Whether bootstrapping tools prefer to create the symlinks themselves
does not really matter at this point -- ideally they wouldn't, because
we'd need to keep track of any symlinks typically created by bootstrap
tools and explicitly remove these if the actual system should not have
them (e.g. if an architecture specific symlink is created on an arch
that doesn't have it).
- Symlinks cannot yet be shipped in data.tar during the trixie cycle
Because the unpack phase during bootstrap creates an unmerged file
system, the symlinks cannot be unpacked here.
- In trixie+1, the symlinks are then created from data.tar, and files
can then be moved.
This allows bootstrap to create merged filesystems directly during the
unpack phase.
- dpkg-alias can fail if there is a conflict during alias registration
This should not actually happen, but protects people who may have local
packages that use unmerged paths.
- dpkg-divert and dpkg-statoverride act on the normalized path
- diversions and statoverrides are registered with un-normalized
paths
- it is an error to register a diversion or statoverride with
non-matching data
This should allow most of the handover scenarios. For a diversion, it is
sufficient if the normalized destinations match, so we can have a
handover that registers /usr/lib/x -> /usr/lib/y from the preinst of a
new package and then unregisters /lib/x -> /lib/y from the postrm of an
older one.
The package would need to unregister on upgrade in the postrm though,
but that is standard for removed diversions.
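The handover scenario can be modelled in a few lines of Python. This is only an illustration of the three rules above, assuming a single alias /lib -> /usr/lib; the Diversions class and its methods are hypothetical and do not mirror dpkg-divert's real database format.

```python
ALIASES = {"/lib": "/usr/lib"}  # assumed alias table for this example

def normalize(path):
    """Rewrite a path below an aliased directory to its canonical name."""
    for alias, target in ALIASES.items():
        if path == alias or path.startswith(alias + "/"):
            return target + path[len(alias):]
    return path

class Diversions:
    def __init__(self):
        # normalized source -> list of (source, destination) as registered
        self.by_path = {}

    def add(self, source, dest):
        """Register a diversion; non-matching data for the same file fails."""
        entries = self.by_path.setdefault(normalize(source), [])
        for _, other in entries:
            if normalize(other) != normalize(dest):
                raise ValueError(f"{source} is already diverted to {other}")
        entries.append((source, dest))

    def remove(self, source):
        """Drop the registration made under exactly this un-normalized path."""
        key = normalize(source)
        entries = self.by_path.get(key, [])
        self.by_path[key] = [e for e in entries if e[0] != source]

    def lookup(self, path):
        """Return the normalized destination that applies to path, if any."""
        entries = self.by_path.get(normalize(path))
        return normalize(entries[0][1]) if entries else None

d = Diversions()
d.add("/lib/x", "/lib/y")          # old package, registered long ago
d.add("/usr/lib/x", "/usr/lib/y")  # preinst of the new package
d.remove("/lib/x")                 # postrm of the old package
```

After these three steps the diversion is still in effect for both /lib/x and /usr/lib/x, and at no point was the file undiverted.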
- dpkg-query returns the package name if any aliased name matches
There should also be a flag controlling whether the file name from the
data.tar is reported as well, defaulting to "no", because that's what
scripts expect.
> But given these mentioned constraints
> it cannot be made to support (as in accept) unpacking files inside
> aliased directories (it should be able to unpack the symlinks creating
> those aliased directories though!).
I think that can be done. I have already successfully made it report a
conflict between /bin/testfile and /usr/bin/testfile, with a meaningful
error message, and runtime overhead isn't too bad -- a factor of
log_{262144} 2 on the lookup time for a single path, but inserts got a
bit more expensive because these now have prefix comparisons on the
path. The latter could probably be improved with another hash on the
first N bytes of the path.
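For readers who want to see the shape of the check, here is a small Python model of a file database that stores an entry under every aliased name, so lookups stay a plain hash lookup and the prefix comparisons happen on insert. It is not the dpkg prototype: FileDb, names_for and the show_shipped flag are made-up names, and only one level of aliasing (/bin -> /usr/bin) is assumed.

```python
ALIASES = {"/bin": "/usr/bin"}  # assumed alias table for this example

def names_for(path):
    """Return path plus its aliased spelling (one level of aliasing only)."""
    names = {path}
    for alias, target in ALIASES.items():
        for old, new in ((alias, target), (target, alias)):
            if path.startswith(old + "/"):
                names.add(new + path[len(old):])
    return names

class FileDb:
    def __init__(self):
        # any known name -> (package, name as shipped in data.tar)
        self.entries = {}

    def add(self, package, path):
        """Record a shipped file, refusing it if another package owns it."""
        names = names_for(path)
        for name in names:
            owner = self.entries.get(name)
            if owner is not None and owner[0] != package:
                raise ValueError(
                    f"{package} ships {path}, which is the same file as "
                    f"{owner[1]} from {owner[0]}")
        for name in names:
            self.entries[name] = (package, path)

    def search(self, path, show_shipped=False):
        """Return the owning package, optionally with the shipped name."""
        owner = self.entries.get(path)
        if owner is None:
            return None
        return owner if show_shipped else owner[0]
```

If one package ships /bin/testfile, a second package shipping /usr/bin/testfile is rejected by add(), and search() finds the owner under either name.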
> dpkg-divert distinguishes between local and package level changes, it
> is true that dpkg-statoverride does not have (currently) that
> distinction, although it is primarily an admin tool where I don't
> think it makes much sense to support something like declarative
> package statoverrides TBH once we can ship fsys metadata (perhaps
> conditional one though).
This interface could be provided independently of the implementation, by
essentially pretending that maintainer scripts contain calls to
dpkg-statoverride if a specific control file is present (and the same
would work for dpkg-divert and dpkg-alias). The change for that would be
fairly localized, around the maintainer script calls.
This can then be optimized later, keeping the same interface, if needed.
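As a sketch of what "pretending the maintainer scripts contain the calls" could mean, the following Python function turns a declarative file into the equivalent dpkg-statoverride command lines. The control file itself and its "owner group mode path" line format are assumptions for this example (borrowed from the layout of the statoverride database); only the --add, --update and --remove options are existing dpkg-statoverride options.

```python
def statoverride_commands(control_text, action):
    """Translate a hypothetical declarative control file into argv lists.

    action is "install" (what a preinst would run) or "remove" (what a
    postrm would run); both names are assumptions of this sketch.
    """
    if action not in ("install", "remove"):
        raise ValueError(f"unknown action: {action}")
    commands = []
    for line in control_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        owner, group, mode, path = line.split(None, 3)
        if action == "install":
            commands.append(["dpkg-statoverride", "--update", "--add",
                             owner, group, mode, path])
        else:
            commands.append(["dpkg-statoverride", "--remove", path])
    return commands
```

Because the translation lives in one place, it could later be replaced by direct database updates without packages noticing.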
I'd like to see a mechanism that ensures that dpkg understands those
control files, though -- like a "critical" flag.
I suspect that for trixie, this will have to be an archive side check
that any package using one of the declarative interfaces depends on an
appropriate version of dpkg, and/or its use disallowed until trixie+1
for the convenience of backporters.
Simon