Re: Emdebian
Neil,
Thanks for reading and responding to my post. Your points are excellent
and have increased my overall understanding of the Emdebian effort.
First off, let me say that my main motivation in writing my points was
not to try to in any way challenge or alter the course of the Emdebian
effort, but rather to shed light on how Emdebian _may_ be used in ways
some of the Emdebian developers have not considered. In effect, I
expect that Emdebian will be upstream of many embedded development
projects (hopefully many of which will be able to contribute back to the
main effort over time). In other words, Emdebian will be exactly what
you describe, Debian for embedded devices. Very tightly integrated and
leveraging the overall Debian effort. However, what you may not see is
the many development teams that use Emdebian as the basis for products
(in essence "downstream" of the Emdebian distribution, as Emdebian is
downstream of Debian).
I hoped to explain what happens in these kinds of "downstream" projects
and why. Bear in mind as you read my responses, that this is all from
the perspective of a development team working on a product that has an
embedded processor that uses Linux as a base OS. Such a project would
have a team of some size that is responsible for developing the product,
providing updates and fixes over a product lifecycle, etc. This is the
classic TiVo model, in which a product uses Linux as its OS while
performing some other main function. In my opinion, this is one of the
things that makes Linux great: its ability to be harnessed to create new
and interesting products that might not otherwise have existed. Embedded
Linux is especially strong in this respect, and for this reason I think
Emdebian has great promise. I will try to clarify what I am talking
about.
Neil Williams wrote:
On 08/11/06 21:06:28, Jim Heck - Sun Microsystems wrote:
2. I'm confused by "local source control" - is this a private VCS
(CVS/SVN/arch etc.) (if so, why?) or is this some form of control in
terms of restrictions of freedom, isolation of the source from the
main Debian archives and redistribution with non-free software?
Yes, SVN is exactly what I'm talking about. Product development teams
must be able to reliably and reproducibly build specific software
revisions for products. This means freezing the software at a point in
time and keeping it under source control in order to deliver those
product releases. So, using SVN or another system, they put all the
sources and tools needed to build the product under source control.
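
As a rough illustration of what "fixing software in time" can mean in
practice, here is a minimal Python sketch that records a checksum for
every file in a directory of source packages and later reports anything
that has drifted. The function names and the manifest layout are my own
assumptions for illustration; this is not an Emdebian or Debian tool,
and a real team would commit the manifest alongside the sources in SVN.

```python
import hashlib
from pathlib import Path


def build_manifest(source_dir):
    """Return sorted (relative path, SHA-256 hex digest) pairs for every file."""
    root = Path(source_dir)
    entries = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            entries.append((path.relative_to(root).as_posix(), digest))
    return entries


def write_manifest(source_dir, manifest_path):
    """Write 'digest  path' lines (hypothetical layout, similar to sha256sum)."""
    lines = [f"{digest}  {name}" for name, digest in build_manifest(source_dir)]
    Path(manifest_path).write_text("\n".join(lines) + "\n")


def verify_manifest(source_dir, manifest_path):
    """Return the set of paths that were added, removed or changed."""
    recorded = {}
    for line in Path(manifest_path).read_text().splitlines():
        digest, name = line.split("  ", 1)
        recorded[name] = digest
    current = dict(build_manifest(source_dir))
    return {name for name in recorded.keys() | current.keys()
            if recorded.get(name) != current.get(name)}
```

The manifest should live outside the directory it describes, otherwise
it would be listed in itself. An empty result from verify_manifest means
the frozen source set is byte-for-byte what was recorded.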
3. A fixed distribution with restricted updates sounds similar to
Debian's definition of the 'stable' distribution. There will be no
direct security team support for Emdebian - at least until emdebian
becomes an accepted port within Debian itself. emdebian is likely to
continue tracking Debian unstable or possibly testing, rather than
stable, until such time.
No, I don't expect a stable distribution. I agree that this is not the
job of the Emdebian developers. The fixed distribution I speak of is
created downstream by the product development teams taking Emdebian into
their own environments. They are ultimately responsible for putting
together a system that works for their application.
Specifically, I think embedded Linux systems are unlike classic Linux
systems in a few important ways. They are used as the foundation for
many embedded products, where one of the prerequisites in the
development process is the ability to tightly monitor and control
changes to the OS,
I would disagree, personally, because I want emdebian to work the
Debian way: easy and transparent updates to always have the most
recent packages available - across the board. If users choose not to
upgrade, fine, but when they do upgrade, the entire distribution is
available for updates and is upgraded in a single, transparent and
largely non-interactive manner.
Understood, but I'll just point out that what you are describing is an
end user model where the end user directly uses the operating system
(and maintains it). In the case of many embedded development efforts,
this task is performed by a development team and not the end user.
emdebian does it differently - emdebian, like debian, is a constantly
rolling update. Updates are always available, updates are implemented
across the entire package list according to the results of
reproducible builds within Debian.
Yes, I understand this is how Emdebian will and should work. What I've
described is what may happen downstream of Emdebian.
usually involving some form of source control for
both the sources and the toolchain that are used to build the
specific embedded product. This is necessary, so that a product
based on the OS will have high stability and have been rigorously
tested before being deployed.
Until emdebian has a release team that can manage such a process, I'm
not sure that is practical. That release team, IMHO, is likely to be
the main Debian release team when/if emdebian is accepted into the
fold. Duplicating that enormous effort is, IMHO, a complete waste of
time.
See above; this is not the Emdebian team's problem. It all happens
downstream, in project teams using Emdebian.
Often following initial development, changes to these systems follow
a very conservative approach where it is desirable to be selective in
pulling in updates and changing fundamental system tools such as
toolchains. As such, there is a great desire to be flexible and
current in source and tools during initial development (like a
classic Linux model leveraging package management tools), but once an
embedded product is "baked", there is real value in gaining finer
control over how one brings new source and new tools (toolchains) into
the mix.
This sounds much more like the attitude of Fedora and RHEL.
Personally, I moved to Debian to get away from "baked" distributions
because once baked, they are incredibly difficult to update and upgrade.
Baking interrupts the flow of updates that Debian provides.
I'd compare it to:
1. striving to push a boat out of a fast river and towards the shore.
2. Lifting the boat out of the water and putting it on a trolley.
3. Leaving it on the trolley for a while.
4. At some future date, moving the trolley downstream and trying to
relaunch the boat AND then catch up with where the boat would have
been if you hadn't taken it out.
That kind of hiatus breaks things - badly.
Not once, prior to moving to Debian, was I able to successfully
upgrade a GNU/Linux installation. apt-get dist-upgrade solves *all*
those problems and apt-get upgrade keeps things in sync.
I'll agree with you wholeheartedly. Your analogy is good too.
Unfortunately, this is how embedded development teams working on
embedded products are forced to work. Once again, however, it is not a
problem for Emdebian. I don't expect that to ever be a 'baked'
distribution.
1. Be able to easily put a set of Emdebian source packages specific
to a given release of their embedded system under local source control.
Emdebian is primarily concerned with binaries of reduced size -
managed by a set of tools and patches that strip out unwanted data,
documentation and components. The source remains upstream and emdebian
will keep up to date with the upstream changes.
This stripping out is exactly what makes Emdebian very attractive to
embedded development teams considering Linux as an OS. There are
companies who provide Linux distributions that have already done this,
but as you said, these baked distributions cause their own problems.
They take embedded developers further away from the upstream sources,
and cause long lags in the availability of patches and fixes coming from
those sources. This is why I think the Emdebian effort is interesting
and will be popular as a basis for embedded development.
2. Be able to leverage package management tools to intelligently
bring in a minimal set of sources and libraries to move forward to
get new features or bugfixes (this is in the stage following the
fluid initial development stage, when the embedded system is in its
conservative update mode).
apt-get update && apt-get upgrade are not restricted - without a
release team to limit the flow of packages from testing into stable,
all available updates are always installed, each time apt-get upgrade
is run.
Actually, tools like aptitude give you fairly fine control over which
pieces of your system you upgrade, so long as you turn off automatic
upgrading. apt-get update refreshes your cache, but no packages are
installed unless you choose to install them. In this manner it may be
possible to populate a local mirror with packages built from a set of
sources that have been refreshed. Using aptitude, the development team
could then test their root filesystem by incorporating only the
libraries and packages they pick (with aptitude resolving
dependencies).
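
To make the "pick one package and let the tool resolve dependencies"
idea concrete, here is a small Python sketch of the calculation
involved: given what is installed and what the refreshed mirror offers,
work out which other packages are forced to move when one package is
upgraded. It is a toy model under loud assumptions: versions are plain
tuples of integers, every dependency is a simple "at least this version"
constraint, and the package names in the usage note are hypothetical.
Real apt/aptitude resolution also handles conflicts, alternatives and
Debian version ordering, none of which is modelled here.

```python
def forced_upgrades(target, installed, candidates):
    """Return {package: new_version} for the upgrades forced by upgrading `target`.

    installed:  {name: version}, with versions as tuples of ints, e.g. (1, 2)
    candidates: {name: (version, {dependency_name: minimum_version})}
    """
    to_upgrade = {}
    pending = [target]
    while pending:
        name = pending.pop()
        if name in to_upgrade:
            continue  # already scheduled
        new_version, depends = candidates[name]
        to_upgrade[name] = new_version
        for dep, minimum in depends.items():
            current = installed.get(dep)
            if dep not in to_upgrade and current is not None and current >= minimum:
                continue  # the installed version already satisfies this dependency
            if dep not in candidates or candidates[dep][0] < minimum:
                raise ValueError(f"{name} needs {dep} >= {minimum}, which is unavailable")
            pending.append(dep)
    return to_upgrade
```

For example, with installed = {"app": (1, 0), "libfoo": (1, 0),
"libbar": (2, 0)} and a mirror offering app (1, 1) that needs libfoo at
(1, 2) or later and libbar at (2, 0) or later, only app and libfoo are
scheduled; libbar stays where it is even if a newer libbar is on the
mirror. That is the conservative behaviour a product team wants when
chasing a single bug fix.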
3. Have those pieces of the host environment that pertain to managing
and building a set of packages be self contained and well known
enough to be themselves placed under local source control.
That phrase again. In your opinion, is 'apt' under 'local source
control' and what precisely do you mean by local source control?
Once again, this all comes back to the development team needing a
reproducible software image, and I'm talking about SVN for those pieces.
4. Be able to construct a toolchain that is largely independent of
the package sources, since changing toolchains following initial
development is far more risky than bringing in new sources.
Debian manages that transition and emdebian will benefit from the
updated compiler and toolchain.
New Debian packages will necessitate the use of the current Debian
toolchain - the new packages won't build otherwise. That debian
toolchain is upgraded as time moves by. It's just the way Debian (and
emdebian) work.
If you want the latest package, you cannot expect it to compile under
an old toolchain - it might, it might not, Debian won't care either
way. The Debian build instructions were built, tested on the other
ports then reproducibly retested again and again using the *new*
toolchain. Debian packages are always required to build on whatever is
the current Debian 'Default Compiler'. Currently, that is gcc-4.1. 4.2
is in the wings and once uploaded to Debian unstable, all packages in
Debian unstable will be required to build reproducibly using 4.2 on
*all* supported architectures. emdebian seeks to become one of those
supported ports. To do so, emdebian *must* keep in step with the
parent: Debian.
The buildd logs for any source package are plain to see in Debian - as
is the reproducible nature of each build, over and over again.
Good points all. Yes, this is of course the way it has to be for
Emdebian. I'm just pointing out that it is the complete opposite for
downstream embedded development teams, but that is once again their
problem to solve. Historically, the belief in product development and
testing has been that more time is lost to changing toolchains than to
the occasional problem of new software that won't work properly with
older compilers. Teams do change tools over time, but not on a constant
basis.
Many times these
systems may not be a Debian environment (bureaucracy...). The ability
to easily integrate the building of source packages into such an
environment would be nice.
I'm not sure it is something emdebian will support. emdebian *wants*
ever closer ties to Debian - that is why we're here. We want embedded
Debian GNU/Linux, not embedded GNU/Linux independent of Debian. That
is what OE and GPE already do.
I'm just describing something that happens downstream. Not Emdebian's
problem to solve.
There may be an easy way to achieve point 1 already. I am looking at
how I might setup my own source repository (under source control)
where I can store Emdebian packages that I pull from the main Debian
mirrors. Perhaps the tools being discussed can help with this (or at
least not make it harder).
Private repositories are the easy part. Managing a restricted set of
updates so that the whole continues to build is tricky. Debian does
this job for us - for the most part - why dump that method?
Point 2 is one that I personally find to be of most interest. I
really want to be able to leverage the Debian packaging tools and
apt's dependency management to upgrade pieces of my target root
filesystem,
emdebian will do that. emdebian also seeks to modify the Debian apt
type tools to foster easier cross-building - as a necessary precursor
to even closer ties to emdebian's parent group: Debian.
but do so by bringing in the necessary sources for
libraries and dependent packages in a minimally inclusive source set
that can then be built and the sources put under source control.
This, at least to me, seems like an enormous amount of wasted effort.
Once again I'm not advocating this for Emdebian, but describing what
would happen downstream in a development group. By and large I agree
that it would probably be better to update as much stuff as possible on
an occasional basis and have a consistent snapshot of Emdebian. This is
probably what is best to strive for. Unfortunately what happens in a
product development group is that one has to weigh the QA risks of
changing more than you need to change to get one part of a system past a
nasty bug. Overall things usually get refreshed every so often on a
wide scale. The problem comes when there is a critical bug that needs
some new version of a single package.
What confuses
me is that there are really two kinds of dependencies. Build
dependencies and installation dependencies, and I'm not sure how the
two can be used to the effect I am describing. Ultimately both need
to be managed. I haven't heard anyone propose the cross equivalent
functionality of 'apt-get build-dep <package>'.
The likelihood is that by keeping up with the rolling Debian update
set, emdebian will be able to keep up with the Debian build-time
dependencies and therefore the runtime dependencies. There will be
differences in the 'essential' package list in emdebian compared to
Debian but other than that, package names will remain the same in
emdebian and Debian.
Runtime dependencies are what matter to users but they mainly arise
from build-time dependencies. Build-time dependencies matter to the
maintainers and are usually only installed by those wanting to build
packages.
emdebian hopes to simplify the dependency situation within debian
itself by a process of fully qualifying all dependencies and stripping
out unnecessary dependencies - as well as making default components
into separate, optional, packages with their own dependencies. This
includes build-time and runtime dependencies.
Good stuff.
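
For anyone following along who has not looked at the two kinds of
dependency side by side: both are declared in a source package's
debian/control file, with Build-Depends in the source stanza and Depends
in each binary package stanza. The Python sketch below pulls the two
lists apart. It is only an illustration: the helper names are mine, it
keeps just the first alternative of each "a | b" clause, drops version
constraints, and skips substitution variables such as ${shlibs:Depends}
(which are filled in at build time). It is not a replacement for
'apt-get build-dep' or for a proper control-file parser.

```python
import re


def parse_control_fields(stanza):
    """Parse one control stanza into {field: value}, joining continuation lines."""
    fields = {}
    current = None
    for line in stanza.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and current:
            fields[current] += " " + line.strip()
        elif ":" in line:
            current, value = line.split(":", 1)
            fields[current] = value.strip()
    return fields


def package_names(dependency_field):
    """Reduce 'a (>= 1.0), b | c' to the first alternative of each clause."""
    names = []
    for clause in dependency_field.split(","):
        first = clause.split("|")[0].strip()
        if first and not first.startswith("${"):
            # Strip version constraints "(...)" and architecture lists "[...]".
            names.append(re.sub(r"\s*[\(\[].*", "", first))
    return names


def split_dependencies(control_text):
    """Return (build_time_names, runtime_names) for a whole debian/control file."""
    build, runtime = [], []
    for stanza in re.split(r"\n\s*\n", control_text.strip()):
        fields = parse_control_fields(stanza)
        build += package_names(fields.get("Build-Depends", ""))
        runtime += package_names(fields.get("Depends", ""))
    return build, runtime
```

A cross-building equivalent of build-dep would need the first list
resolved for the build machine (and partly for the target architecture),
while only the second list matters for what lands on the target root
filesystem.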
Point 3 is a tricky one, but central and also one I find to be of
interest. In a "normal" linux system, the system is its own source
control if you are building Debian packages for local installation.
For cross development, this is inherently not the case, as we all
realize, and an overriding concern is being able to reproduce build
results in the future.
How is that different to Debian? Debian buildd's and scripts like
pbuilder are used for exactly this purpose - to ensure that Debian
builds are reproducible at all times and on all supported
architectures. That is what is holding back Etch - each time there is
a failure to match a build to the results from a comparable
(successful) build, a FTBFS bug is generated. The package Fails To
Build From Source - a release critical bug that delays the movement of
that package into the next "baked" release. In doing so, it may also
block the move of any package that depends on the one failing to build
in a reproducible manner.
emdebian follows this principle - a package won't move into emdebian
until it builds cleanly in Debian. i.e. emdebian relies on
reproducible builds even during the constant flow of updates.
To this end, the pieces integral to
reproducibly building target packages need to be as separable as
possible from the host Debian installation (i.e. no weird or unknown
dependencies on the host Debian environment).
Emdebian has a host of intricate and complex dependencies on Debian -
emdebian is debian stripped down. The dependencies will not be unknown
(this is free software, everything is public) - I can't say if these
dependencies could be construed as 'weird'. It depends who is making
that judgement.
What I'm talking about here is that when building on a host in a cross
environment for a target embedded system, two Debian environments are
involved, and the process is either separable or it isn't. If it is not
separable, then there are dependencies of the installed
libraries/packages on the host system that link it to the final target
system. In a non-separable system you will only ever be able to build
new software for the target using the same host (the two are linked by
"weird" dependencies). If the system is separable, then you could come
along with a different host (still Debian), move some set of stuff over
to the new host, and continue to update and build software for the
target system. The set of things you would move from one host to the
second host is the state that really belongs to the target. At minimum
this is the set of source packages being built for the target. The
question I was asking, really, is whether there are any host
dependencies beyond that set of source packages.
emdebian is not about being removed from Debian - quite the opposite,
emdebian seeks to become an integral part of the Debian release cycle
alongside the existing ports like amd64, arm, powerpc and m68k etc..
emdebian seeks to join its (eventual) Policy with Debian Policy. To
merge the buildd work into the overall Debian buildd network.
Why this is important,
is so that if Emdebian is part of a larger build environment that
serves the purpose of building the OS, it is at least a reliably
reproducible part, and those pieces critical to reproducibility can
be properly source controlled and managed.
I think you may be thinking of Emdebian in terms of OpenEmbedded. To
me, at least, your approach seems to suit their methods rather than
the methods within emdebian.
OpenEmbedded is an OS builder. Emdebian is just embedded Debian - a
prebuilt OS. Emdebian isn't about building a new OS, it's about
stripping down an *EXISTING* OS into the resource limits of an
embedded device.
I'll check out OpenEmbedded. I see your point about Emdebian having its
roots in stripping down the existing OS and not about trying to be a new
OS, but I don't think that's all it will ever be used for. What is
Debian if not full-featured Linux with great package and package
dependency handling? That is precisely what is needed as the basis for
many embedded development projects.
Point 4 is a fact of life. It is never ever easy to change a
toolchain once the product ships.
Debian does this for us. It's happening now with gcc-4.2 in experimental.
New toolchains need to be
requalified against all of the software, and thoroughly regression
tested.
This is done by debian maintainers building their packages and fixing
FTBFS bugs caused by the transition. It's not limited to gcc, there
has been an enormous effort to get the python 2.3 -> 2.4 transition
completed, before that there were C++ transitions related to gcc,
there was the transition from XFree86 to XOrg. Debian has a long
history of "requalifying" the entire unstable distribution against
each and every transition in the normal flow of the Debian update
stream from experimental to unstable to testing and (eventually) to
stable.
Emdebian will benefit from all these fixes and will only take packages
that have completed whatever transition is relevant at that time.
Understood.
In due course, emdebian hopes to implement a few transitions of our
own - the separation of translation support into dedicated packages
and tools to manage them, the resolution of a host of untidy and
unruly dependencies that are unclear, unwanted and over complicated, a
solution to the differing requirements for the 'essential' package
list; the list goes on.
More good stuff.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
-Jim Heck