
Re: Emdebian



Neil,

Thanks for reading and responding to my post. Your points are excellent and have increased my overall understanding of the Emdebian effort. First off, let me say that my main motivation in writing my points was not to challenge or alter the course of the Emdebian effort in any way, but rather to shed light on how Emdebian _may_ be used in ways some of the Emdebian developers have not considered. In effect, I expect that Emdebian will be upstream of many embedded development projects (hopefully many of which will be able to contribute back to the main effort over time). In other words, Emdebian will be exactly what you describe: Debian for embedded devices, very tightly integrated with and leveraging the overall Debian effort. However, what you may not see are the many development teams that use Emdebian as the basis for products (in essence "downstream" of the Emdebian distribution, as Emdebian is downstream of Debian).

I hoped to explain what happens in these kinds of "downstream" projects and why. Bear in mind as you read my responses that this is all from the perspective of a development team working on a product with an embedded processor that uses Linux as its base OS. Such a project would have a team of some size that is responsible for developing the product, providing updates and fixes over the product lifecycle, etc. This is the classic TiVo model, in which a product leverages Linux as its OS while performing some other main function. In my opinion, this is one of the things that makes Linux great: its ability to be harnessed to create new and interesting products that might not otherwise have existed. Embedded Linux is especially well suited to this, and for this reason I think Emdebian has great promise. I will try to clarify somewhat what I am talking about.

Neil Williams wrote:
On 08/11/06 21:06:28, Jim Heck - Sun Microsystems wrote:

2. I'm confused by "local source control" - is this a private VCS (CVS/SVN/arch etc.) (if so, why?) or is this some form of control in terms of restrictions of freedom, isolation of the source from the main Debian archives and redistribution with non-free software?

Yes, SVN is exactly what I'm talking about. Product development teams must be able to reliably and reproducibly build specific software revisions for products. This necessitates freezing the software at a point in time and placing it under source control in order to deliver those product releases. So, using SVN or another system, they put all the sources and tools needed to build the product under source control.
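
To make the reproducibility point concrete, here is a minimal sketch of how a downstream team might detect that a frozen set of sources has drifted. It assumes the team records a manifest of SHA-256 digests next to the sources it checks in; the file names and the functions build_manifest and changed_files are hypothetical, made up for illustration, and are not part of any Emdebian or Debian tool.

```python
import hashlib


def build_manifest(sources):
    """Map each source file name to the SHA-256 digest of its contents."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in sources.items()}


def changed_files(manifest, sources):
    """Return the names whose contents no longer match the recorded manifest."""
    current = build_manifest(sources)
    names = set(manifest) | set(current)
    return sorted(n for n in names if manifest.get(n) != current.get(n))


# Hypothetical contents of one product release; a real script would read
# the bytes of each source file from the checked-out working copy.
release = {
    "foo_1.0.orig.tar.gz": b"upstream tarball bytes",
    "foo_1.0-1.diff.gz": b"debian diff bytes",
}
manifest = build_manifest(release)

# Later, one file is refreshed without the manifest being updated.
working_copy = dict(release)
working_copy["foo_1.0-1.diff.gz"] = b"newer debian diff bytes"

print(changed_files(manifest, release))
print(changed_files(manifest, working_copy))
```

The manifest itself would live under the same source control as the sources, so a given product revision identifies exactly which inputs went into the build.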
3. A fixed distribution with restricted updates sounds similar to Debian's definition of the 'stable' distribution. There will be no direct security team support for Emdebian - at least until emdebian becomes an accepted port within Debian itself. emdebian is likely to continue tracking Debian unstable or possibly testing, rather than stable, until such time.

No I don't expect a stable distribution. I agree that this is not the job of the Emdebian developers. The fixed distribution I speak of is created downstream by the product development teams taking Emdebian into their own environments. They are ultimately responsible for putting together a system that works for their application.
Specifically, I think embedded Linux systems are unlike classic Linux systems in a few important ways. They are used as the foundation for many embedded products, where one of the prerequisites in the development process is the ability to tightly monitor and control changes to the OS,

I would disagree, personally, because I want emdebian to work the Debian way: easy and transparent updates to always have the most recent packages available - across the board. If users choose not to upgrade, fine, but when they do upgrade, the entire distribution is available for updates and is upgraded in a single, transparent and largely non-interactive manner.

Understood, but I'll just point out that what you are describing is an end user model where the end user directly uses the operating system (and maintains it). In the case of many embedded development efforts, this task is performed by a development team and not the end user.

emdebian does it differently - emdebian, like debian, is a constantly rolling update. Updates are always available, updates are implemented across the entire package list according to the results of reproducible builds within Debian.

Yes I understand this is how Emdebian will and should work. What I've described is what may happen downstream of Emdebian.
 usually involving some form of source control for
both the sources and the toolchain that are used to build the specific embedded product. This is necessary, so that a product based on the OS will have high stability and have been rigorously tested before being deployed.

Until emdebian has a release team that can manage such a process, I'm not sure that is practical. That release team, IMHO, is likely to be the main Debian release team when/if emdebian is accepted into the fold. Duplicating that enormous effort is, IMHO, a complete waste of time.
See above, not the Emdebian team's problem. This is all downstream in project teams using Emdebian.

Often, following initial development, changes to these systems follow a very conservative approach where it is desirable to be selective in pulling in updates and changing fundamental system tools such as toolchains. As such, there is a great desire to be flexible and current in source and tools during initial development (like a classic Linux model leveraging package management tools), but once an embedded product is "baked", there is real value in gaining finer control over how one brings new source and new tools (toolchains) into the mix.

This sounds much more like the attitude of Fedora and RHEL. Personally, I moved to Debian to get away from "baked" distributions because once baked, they are incredibly difficult to update and upgrade.

Baking interrupts the flow of updates that Debian provides.

I'd compare it to:
1. striving to push a boat out of a fast river and towards the shore.
2. Lifting the boat out of the water and putting it on a trolley.
3. Leaving it on the trolley for a while.
4. At some future date, moving the trolley downstream and trying to relaunch the boat AND then catch up with where the boat would have been if you hadn't taken it out.

That kind of hiatus breaks things - badly.

Not once, prior to moving to Debian, was I able to successfully upgrade a GNU/Linux installation. apt-get dist-upgrade solves *all* those problems and apt-get upgrade keeps things in sync.

I'll agree with you wholeheartedly. Your analogy is good too. Unfortunately, this is how embedded development teams working on embedded products are forced to work. Once again, however, it is not a problem for Emdebian. I don't expect that to ever be a 'baked' distribution.
1. Be able to easily put a set of Emdebian source packages specific to a given release of their embedded system under local source control.

Emdebian is primarily concerned with binaries of reduced size - managed by a set of tools and patches that strip out unwanted data, documentation and components. The source remains upstream and emdebian will keep up to date with the upstream changes.

This stripping out is exactly what makes Emdebian very attractive to embedded development teams considering Linux as an OS. There are companies who provide Linux distributions that have already done this, but as you said, these baked distributions cause their own problems. They take embedded developers further away from the upstream sources, and cause long lags in the availability of patches and fixes coming from those sources. This is why I think the Emdebian effort is interesting and will be popular as a basis for embedded development.
2. Be able to leverage package management tools to intelligently bring in a minimal set of sources and libraries to move forward to get new features or bugfixes (this is in the stage following the fluid initial development stage, when the embedded system is in its conservative update mode).

apt-get update && apt-get upgrade are not restricted - without a release team to limit the flow of packages from testing into stable, all available updates are always installed, each time apt-get upgrade is run.

Actually, tools like aptitude allow you to pretty well control what pieces of your system you upgrade, so long as you turn off automatic upgrading. apt-get update refreshes your cache, but the packages are not installed unless you decide to. In this manner it may be possible to populate a local mirror with packages built from a set of sources that have been refreshed. Using aptitude, the development team could test their root filesystem by incorporating only those libraries and packages they pick and choose (with aptitude resolving dependencies).
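
As a rough illustration of what "pick and choose, with dependencies resolved" means, the sketch below computes the smallest set of packages that has to change in order to upgrade one chosen package. It is a toy model and not how apt or aptitude are implemented: the package names, the version tuples and the function minimal_upgrade are all hypothetical, and it assumes the mirror always offers a version that satisfies each minimum.

```python
def minimal_upgrade(wanted, depends, installed, available):
    """Return {package: new_version} for the smallest set that must change.

    depends:   package -> list of (dependency, minimum_version) at its new version
    installed: package -> version currently in the target root filesystem
    available: package -> version offered by the local mirror
    Versions are tuples of ints so that they compare naturally.
    """
    selected = {}
    pending = [wanted]
    while pending:
        pkg = pending.pop()
        if pkg in selected:
            continue
        selected[pkg] = available[pkg]
        for dep, min_version in depends.get(pkg, []):
            # Pull in a dependency only when the installed copy is too old
            # (or missing); anything already new enough is left untouched.
            if installed.get(dep, ()) < min_version:
                pending.append(dep)
    return selected


# Hypothetical package data for illustration only.
installed = {"app": (1, 9), "libnet": (0, 9, 7), "libcore": (2, 3), "libzip": (1, 2)}
available = {"app": (1, 10), "libnet": (0, 9, 8), "libcore": (2, 3), "libzip": (1, 2, 3)}
depends = {
    "app": [("libnet", (0, 9, 8)), ("libcore", (2, 3))],
    "libnet": [("libzip", (1, 2)), ("libcore", (2, 3))],
}

print(minimal_upgrade("app", depends, installed, available))
```

In this example libzip has a newer version on the mirror, but it is not selected because the installed version already satisfies the minimum that libnet requires. That is the conservative behaviour a product team is after once the system is in maintenance.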
3. Have those pieces of the host environment that pertain to managing and building a set of packages be self contained and well known enough to be themselves placed under local source control.

That phrase again. In your opinion, is 'apt' under 'local source control' and what precisely do you mean by local source control?

Once again, this all comes back to the development team trying to have a reproducible software image, and I'm talking about SVN for those pieces.
4. Be able to construct a toolchain that is largely independent of the package sources, since changing toolchains following initial development is far more risky than bringing in new sources.

Debian manages that transition and emdebian will benefit from the updated compiler and toolchain.

New Debian packages will necessitate the use of the current Debian toolchain - the new packages won't build otherwise. That debian toolchain is upgraded as time moves by. It's just the way Debian (and emdebian) work.

If you want the latest package, you cannot expect it to compile under an old toolchain - it might, it might not, Debian won't care either way. The Debian build instructions were built, tested on the other ports then reproducibly retested again and again using the *new* toolchain. Debian packages are always required to build on whatever is the current Debian 'Default Compiler'. Currently, that is gcc-4.1. 4.2 is in the wings and once uploaded to Debian unstable, all packages in Debian unstable will be required to build reproducibly using 4.2 on *all* supported architectures. emdebian seeks to become one of those supported ports. To do so, emdebian *must* keep in step with the parent: Debian.

The buildd logs for any source package are plain to see in Debian - as is the reproducible nature of each build, over and over again.
Good points all. Yes this is of course the way it has to be for Emdebian. I'm just pointing out that it is the complete opposite for downstream embedded development teams, but that is once again their problem to solve. Historically, it is believed that in product development/testing more time is incurred by changing toolchains than the occasional problem of having new software that won't work properly with older compilers. Teams do change tools over time, but not on a constant basis.
Many times these
systems may not be a Debian environment (bureaucracy...). The ability
to easily integrate the building of source packages into such an
environment would be nice.

I'm not sure it is something emdebian will support. emdebian *wants* ever closer ties to Debian - that is why we're here. We want embedded Debian GNU/Linux, not embedded GNU/Linux independent of Debian. That is what OE and GPE already do.
I'm just describing something that happens downstream. Not Emdebian's problem to solve.

There may be an easy way to achieve point 1 already. I am looking at how I might setup my own source repository (under source control) where I can store Emdebian packages that I pull from the main Debian mirrors. Perhaps the tools being discussed can help with this (or at least not make it harder).

Private repositories are the easy part. Managing a restricted set of updates so that the whole continues to build is tricky. Debian does this job for us - for the most part - why dump that method?

Point 2 is one that I personally find to be of most interest. I really want to be able to leverage the Debian packaging tools and apt's dependency management to upgrade pieces of my target root filesystem,

emdebian will do that. emdebian also seeks to modify the Debian apt type tools to foster easier cross-building - as a necessary precursor to even closer ties to emdebian's parent group: Debian.

but do so by bringing in the necessary sources for
libraries and dependent packages in a minimally inclusive source set that can then be built and the sources put under source control.

This, at least to me, seems like an enormous amount of wasted effort.

Once again I'm not advocating this for Emdebian, but describing what would happen downstream in a development group. By and large I agree that it would probably be better to update as much stuff as possible on an occasional basis and have a consistent snapshot of Emdebian. This is probably what is best to strive for. Unfortunately what happens in a product development group is that one has to weigh the QA risks of changing more than you need to change to get one part of a system past a nasty bug. Overall things usually get refreshed every so often on a wide scale. The problem comes when there is a critical bug that needs some new version of a single package.
What confuses
me is that there are really two kinds of dependencies. Build dependencies and installation dependencies, and I'm not sure how the two can be used to the effect I am describing. Ultimately both need to be managed. I haven't heard anyone propose the cross equivalent functionality of 'apt-get build-dep <package>'.

The likelihood is that by keeping up with the rolling Debian update set, emdebian will be able to keep up with the Debian build-time dependencies and therefore the runtime dependencies. There will be differences in the 'essential' package list in emdebian compared to Debian but other than that, package names will remain the same in emdebian and Debian.

Runtime dependencies are what matter to users but they mainly arise from build-time dependencies. Build-time dependencies matter to the maintainers and are usually only installed by those wanting to build packages.

emdebian hopes to simplify the dependency situation within debian itself by a process of fully qualifying all dependencies and stripping out unnecessary dependencies - as well as making default components into separate, optional, packages with their own dependencies. This includes build-time and runtime dependencies.

Good stuff.
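
To illustrate the two kinds of dependencies discussed above, the following sketch pulls the Build-Depends and Depends fields out of a debian/control style text. The control text and the package names are made up for the example, and the parser is deliberately simplified: it drops version constraints, keeps only the first alternative of "a | b", and skips substitution variables such as ${shlibs:Depends}.

```python
def parse_control(text):
    """Split control-file text into a list of {field: value} stanzas."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields, last = {}, None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and last:
                # Continuation line: append to the previous field.
                fields[last] += " " + line.strip()
            elif ":" in line:
                last, value = line.split(":", 1)
                fields[last] = value.strip()
        stanzas.append(fields)
    return stanzas


def package_names(value):
    """Reduce a dependency field to bare package names (simplified)."""
    names = []
    for item in value.split(","):
        first = item.split("|")[0].strip()
        if first and not first.startswith("$"):
            names.append(first.split()[0].split("(")[0])
    return names


# Hypothetical control file for illustration only.
CONTROL = """\
Source: hello-embedded
Build-Depends: debhelper (>= 5), libfoo-dev,
 pkg-config

Package: hello-embedded
Architecture: any
Depends: ${shlibs:Depends}, libfoo1 (>= 1.2), busybox | coreutils
"""

source, binary = parse_control(CONTROL)
print("build-time:", package_names(source["Build-Depends"]))
print("run-time:  ", package_names(binary["Depends"]))
```

In a cross-build the two lists are needed in different places: the build-time set has to be available to the build, while the run-time set has to end up on the target root filesystem. A cross equivalent of 'apt-get build-dep' would have to manage the first list with that split in mind.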
Point 3 is a tricky one, but central and also one I find to be of interest. In a "normal" Linux system, the system is its own source control if you are building Debian packages for local installation. For cross development, this is inherently not the case, as we all realize, and an overriding concern is being able to reproduce build results in the future.

How is that different to Debian? Debian buildd's and scripts like pbuilder are used for exactly this purpose - to ensure that Debian builds are reproducible at all times and on all supported architectures. That is what is holding back Etch - each time there is a failure to match a build to the results from a comparable (successful) build, a FTBFS bug is generated. The package Fails To Build From Source - a release critical bug that delays the movement of that package into the next "baked" release. In doing so, it may also block the move of any package that depends on the one failing to build in a reproducible manner.

emdebian follows this principle - a package won't move into emdebian until it builds cleanly in Debian. i.e. emdebian relies on reproducible builds even during the constant flow of updates.

To this end, the pieces integral to
reproducibly building target packages need to be as separable as possible from the host Debian installation (i.e. no weird or unknown dependencies on the host Debian environment).

Emdebian has a host of intricate and complex dependencies on Debian - emdebian is debian stripped down. The dependencies will not be unknown (this is free software, everything is public) - I can't say if these dependencies could be construed as 'weird'. It depends who is making that judgement.

What I'm talking about here is that when building on a host in a cross environment for a target embedded system, two Debian environments are involved, and the process is either separable or it isn't. If it is not separable, then there are dependencies of the installed libraries/packages on the host system that link it to the final target system. In a non-separable system you will only ever be able to build new software for the target using the same host (the two are linked by "weird" dependencies). If the system is separable, then you could come along with a different host (still Debian), move some set of stuff over to the new host, and continue to update and build software for the target system. The set of things you would move from one host to the second host is the state that really belongs to the target. At minimum this is the set of source packages being built for the target. The question I was asking, really, is whether there are any host dependencies beyond that set of source packages.
emdebian is not about being removed from Debian - quite the opposite, emdebian seeks to become an integral part of the Debian release cycle alongside the existing ports like amd64, arm, powerpc and m68k etc. emdebian seeks to join its (eventual) Policy with Debian Policy and to merge the buildd work into the overall Debian buildd network.

Why this is important,
is so that if Emdebian is part of a larger build environment that serves the purpose of building the OS, it is at least a reliably reproducible part, and those pieces critical to reproducibility can be properly source controlled and managed.

I think you may be thinking of Emdebian in terms of OpenEmbedded. To me, at least, your approach seems to suit their methods rather than the methods within emdebian.

OpenEmbedded is an OS builder. Emdebian is just embedded Debian - a prebuilt OS. Emdebian isn't about building a new OS, it's about stripping down an *EXISTING* OS into the resource limits of an embedded device.

I'll check out OpenEmbedded. I see your point about Emdebian having its roots in stripping down the existing OS and not about trying to be a new OS, but I don't think that's all it will ever be used for. What is Debian if not full-featured Linux with great package and package dependency handling? That is precisely what is needed as the basis for many embedded development projects.
Point 4 is a fact of life. It is never ever easy to change a toolchain once the product ships.

Debian does this for us. It's happening now with gcc-4.2 in experimental.

New toolchains need to be
requalified against all of the software, and thoroughly regression tested.

This is done by debian maintainers building their packages and fixing FTBFS bugs caused by the transition. It's not limited to gcc, there has been an enormous effort to get the python 2.3 -> 2.4 transition completed, before that there were C++ transitions related to gcc, there was the transition from XFree86 to XOrg. Debian has a long history of "requalifying" the entire unstable distribution against each and every transition in the normal flow of the Debian update stream from experimental to unstable to testing and (eventually) to stable.

Emdebian will benefit from all these fixes and will only take packages that have completed whatever transition is relevant at that time.

Understood.
In due course, emdebian hopes to implement a few transitions of our own - the separation of translation support into dedicated packages and tools to manage them, the resolution of a host of untidy and unruly dependencies that are unclear, unwanted and over complicated, a solution to the differing requirements for the 'essential' package list - the list goes on.

More good stuff.

--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/


-Jim Heck


