
Re: itp: static bins / resolving static debian issues



Justin Wells <jread@semiotek.com> writes:

> On Tue, Aug 24, 1999 at 12:20:03PM +0200, goswin.brederlow@student.uni-tuebingen.de wrote:
> > I like dynamic linking and it saved me several times so far.
> > Again, that's an opinion.
> 
> Yes. Now you need to tell me how it saved you, and why this is
> relevant to static recovery tools (the current proposal; nobody
> is proposing that dynamic tools be removed!)

I can easily do "dpkg --root /test -i libc" and set LD_PRELOAD to
that lib. Then I can restart the services one by one and see if they
work, or restart all of them using wildcards and a for loop. That way
it takes me 2 seconds to test a new lib. With static bins I would have
to extract each one and test each one, probably in a complete chroot
environment, which is a lot more work.

The case where I really needed to test the libc before using it was on
my m68k system when Debian updated to libc6. The lib and the kernel had
certain bugs that got triggered by most usage of the lib but weren't
always happening, so I had to be able to run the complete system
under the new lib and switch back to the old one instantly by exiting
the shell and thereby killing the LD_PRELOAD.
Static binaries would have meant a second system, for which I didn't
have the space.
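(Roughly, that test cycle looks like this; the package file name and
library path are illustrative:)

    # unpack the candidate libc into a scratch root, leaving /lib alone
    dpkg --root=/test -i libc6_2.1.2-5.deb
    # start a shell in which everything runs under the new lib
    LD_PRELOAD=/test/lib/libc.so.6 bash
    # restart the services one by one and watch them
    for s in /etc/init.d/*; do $s restart; done
    # exiting the shell kills the LD_PRELOAD; restart the services
    # once more and they run under the old libc again
    exit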
 
> > But if Debian eliminates that opinion, that will be a bug.
> 
> Presumably you meant that option (static linking): yes it would be

No, I meant dynamically linked boot programs, and that for all of
them. If Debian forces me to use a static ls, mount, test, fsck, that's
not an option. If it asks me, that's fine.

> > Does static linking make downtime cheaper or tolerable?
> 
> It makes downtime cheaper and more tolerable in several ways:
> 
>   -- It is cheaper and more tolerable if the clients don't notice that
>      there was a failure at all--dynamic linking would require a reboot, 
>      killing the servers the client is using. Live recovery keeps the 
>      servers up, and for example clients may not ever know there was 
>      a problem with the NFS server, since to them it appeared to 
>      continue working.

Exactly the same will happen with a static and a dynamic NFS
server. If any needed lib is broken, both will fail. Do you agree on
that? If yes, all your further arguments are nil; if not, that's
something you'll have to convince me about.

> > > Proposition #4-- Failure of the C library can occur under Debian
> > 
> > Sure, but does static linking prevent that? No, it doesn't.
> > As soon as a new libc is released, the autobuild daemons will compile
> > it and all statically linked binaries and upload them. You type
> > "apt-get upgrade" and then you have them all installed and your system
> > goes down.
> 
> Yes, static linking prevents the situation you describe. The new
> binaries that get installed on your system, and which fail, will not
> result in an apparent failure to your NFS and HTTP clients (except for
> CGI). The existing, old versions of your servers are still running
> and are linked and loaded. They are unaffected by the catastrophic
> failure you have just wreaked on your system. You can calmly pull
> out your static recovery tools and fix it.

If the NFS server is dynamically linked, it will continue to run. If
it's a static binary, you will be asked whether to update and restart
it, which you can skip. Then the NFS server also keeps running.
So the NFS server will always keep running. 0:0 for both.

If you install a broken libc, you will also install all the static
binaries, which will be broken as well in 99.9% of all cases.
So no matter whether ar, tar and gzip are static or dynamic, you can't
revert to the old libc and old static bins without rebooting or a
directory containing the old bins and libs. Again 0:0 for both.

> To be fair, another concept I demanded earlier on, which was omitted
> from the message you're responding to, is that the statics need to
> bootstrap themselves. They need to install the new statics in a
> subdirectory, run a test on them, and only replace themselves if
> the test succeeds.

I did that after I got a new drive for my m68k system. Boot with the
rescue disk and copy the system partitions (/, /usr, /var). Reboot the
normal system and mount the backup to /target (or somewhere else).
For an update you then do "cd /target && chroot ." and proceed just as
normal. If it doesn't work, exiting will revert to the old libs and
bins. You can restore the test system when something goes wrong with
"dpkg --root=/target" in most cases; otherwise you need to copy the
system partitions again.
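(As a sketch, with illustrative device and mount point names:)

    # mount the backup copy of the system partitions
    mount /dev/hda2 /target
    # work inside the copy; the running system stays untouched
    cd /target && chroot .
    apt-get update && apt-get upgrade
    # leaving the chroot drops you back into the old, known-good system
    exit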

Using that method you will only have downtime when a bug brings the
kernel down, plus the few seconds while stopping and restarting services.
Any other method can fail, no matter if you use static or dynamic binaries.

>...
> In this case statics are an extra level of redundancy that makes your
> OS a little bit more fault tolerant. Redundancy is a well known
> benefit when you are looking for reliability, and this case is a
> good illustration of why.

But you don't exactly have redundancy. You can't delete one and let
the others take over. You just have duplicate code in the binaries,
which also duplicates any errors you might have. You can use cat
instead of cp or other such things, but only manually, not in scripts
or during boot (or do you want to change the scripts as well?). Also,
those tools contain a lot of identical code if linked statically, so
it's likely that they will all fail at the same time.

> For example, you might lose the ability to copy files, or unpack 
> archives, but your mount command might be left intact. Using your
> mount command you can mount a disk from another machine and run 
> a static copy command off the mounted partition. Alternately, maybe
> you have lost mount, but you have a telnet session open on another 
> machine and you have some packing tools and cat--you paste a binary
> from the other machine over, and you're live. This is extreme, but 
> if your server is important enough, possibly worthwhile. 
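(That paste can actually be done; a rough sketch, assuming uuencode and
uudecode survive on both machines:)

    # on the healthy machine: turn the binary into pasteable text
    uuencode /bin/cp cp.rescue
    # copy the output from that terminal, then on the broken machine:
    cat > /tmp/cp.uu          # paste it, finish with Ctrl-D
    uudecode /tmp/cp.uu       # recreates cp.rescue, mode included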

If your server is important enough, you have a set of the basic tools
in a spare partition, so if something goes wrong you can use
those. And those can be dynamically linked or statically linked, just
as one likes.

> Finally, I am not certain I agree that a disk error ONLY increases with
> the number of blocks on a given medium. If the disk error is actually
> caused by a buggy kernel, or buggy memory, then it is also likely to
> increase linearly with the number of inodes. Since a typical dynamic
> depends on 10-12 inodes, and a typical static depends on only 2
> (itself plus the directory it is in), once again the probability
> of a dynamic getting wiped out is much higher.

A static bin is bigger, thereby covering more disk space and using more
metadata. Anything bigger than the direct blocks (roughly 12 KB with
1 KB ext2 blocks) needs indirect blocks: single, double and triple
indirection. Triple indirection should not happen, but you will need
more double indirection for a static bin than for a dynamic bin, and
that for each static bin. Let's say you need 1 extra indirect block for
each static bin and you have 20 static bins; that means 20 extra blocks
that can go bad. But you only save the 10-12 inodes for the libs, which
are still needed for other programs and can still cause harm.

Overall you only lose filesystem/disk safety by using static bins.

But from my experience that's irrelevant anyway. I never saw a
filesystem break without blowing up completely, or at least to a level
where formatting is easier than restoring.
Same for drives. They die, but they don't lose just one block.

> > Now consider the case of a broken libc and dynamic/static linking:
> >
> > static linking:
> >         You have a broken system with 50+ broken binaries which you
> >         must replace from the rescue disk, extract from the base or
> >         from an old .deb file. You probably won't find an old .deb
> >         file on the server anymore, and you might have to recompile a
> >         lot of packages after bugfixing the libc.
> 
> That sounds like two commands:
> 
>     cd /
>     tar -xvzf /floppy/statics.tgz
> 
> Or equivalent ar command on a .deb, so what's your point? 
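(For reference, that ar equivalent on a hypothetical statics.deb would
be roughly:)

    ar x /floppy/statics.deb     # yields debian-binary, control.tar.gz, data.tar.gz
    tar -xzf data.tar.gz -C /    # unpack the static binaries over /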

Why not

LD_PRELOAD=/floppy/libc.so cp /floppy/libc.so /lib

Just as easy, and the libc will fit on a disk, whereas statics.tgz
plus a static tar and gzip might not.

Remember, when you update the lib, any static binaries will be updated 
as well, unless you test the update package by package, so you have to 
include tar and gzip on the disk and have all static bins in the
statics.tgz.
 
> > dynamic linking:
> >         You have a broken system just as well, but only one broken
> >         lib. Copying that from the rescue disk is easy. You can just as
> >         well always keep a backup of a working libc on the hard drive
> >         and copy that back into the system.
> 
> How would you perform that copy? I think you are forgetting that you
> are arguing AGAINST a statically linked copy command, static su,
> and static root shell. So you cannot assume you are allowed to copy
> files; your copy command and your ability to become root do not exist.

Just as you copy files with a broken static cp. The problem is the
same with static and dynamic linking. You have to replace the broken
piece of software. An extra partition with working versions is needed
for static and dynamic binaries alike, so you don't gain anything with
static binaries.

> Note that this is not an opinion, it is a fact about dynamic binaries
> that they don't work when their library is unavailable.

The risk that the library is unavailable is negligible. It's just one
package. It's far more likely that one of mount, ls, fsck, sh, ... is
missing, since there are far more of them.

> You are NOT certain to have a root shell in the case of a hardware 
> failure, a hack, or a software bug. You are not even certain to have
> a root shell if you caused it by administrator error (many people 
> use things like "sudo" as they provide extra security).

That's right and might be a problem. If you work remotely, you should
use ssh to log in as root. Then you have a root shell. If sshd fails
while you are mangling the system, you're lost.
If you work on the system locally you can log in with sash as your shell.
The problem is to get a shell, and sash solves that nicely, but after
that it doesn't really matter whether you have static or dynamic
binaries.

> > >    Proposition #5-- Most servers will survive a C library failure 
> > >              if the server is already running 
> > 
> > Most servers will survive the installation of a broken static binary
> > just as well as a broken dynamic lib.
> >
> > Proof: The server is already running.
> 
> You are not making any point here. Proposition #5 is a point I made 
> to establish that there was a possibility of live recovery. You have
> just demonstrated, again, that there is a possibility of live recovery.
> 
> Thank you for making my point?

I (and you) made my point as well. My point is that having a static
cp, ls, mount, tar and so on doesn't improve reliability. It's just a
matter of the admin's taste, what he prefers and what he can cope with
best in an emergency.

> > >    Proposition #6-- If a failure of the C library occurs, and the 
> > >              servers are still running, then on a system where 
> > > 	     downtime is considered unacceptable or difficult, you
> > > 	     should fix the problem without a reboot if that is
> > > 	     possible
> > 
> > Same with broken static binaries. What if cp is broken?
> 
> Then I use cat. Or if that is broken too, I try dd. Or if that is 
> broken as well, I play around with mount and mv, or just mount 
> by itself might get me going. Or if those are both broken--now I
> start getting creative: does grep work? If so I can make a pattern
> that matches everything. If that fails and I have ed, I still might
> have a chance.
> 
> Statics are highly redundant, that's what makes them so useful during
> a system failure.

If cp is broken, something so basic is broken that all the other bins
will most likely break as well. Knowing that there might be a chance
and testing all the possibilities takes longer than typing LD_PRELOAD
to use a working library.

> > But both the dynamic and static binary depend on exactly the same
> > source, and if that's broken the bin is broken, no matter if it's
> > statically or dynamically linked. You gain or lose nothing. The chance
> > that dynamic linking fails but static linking does not is 0.
> 
> Of course I am talking about at runtime, because compile time is 
> really totally irrelevant to the discussion. I assume you wrote this 
> in a brief moment of distraction, because the rest of what you wrote
> was fairly intelligent. 

The main danger of a system failure, as I see it, is during an
update. So the argument is relevant. Static binaries don't protect you
from hardware failures any more than dynamic binaries do.

So the question is: "Does static linking of the basic tools make
updates any safer?"

The answer is no. And that's where the source comes into play. You say
that during an update, for example, the libc is broken and thus the
basic tools fail. But then the static binaries will all contain the
broken code, since they are built from the same source; or else the
broken code won't be executed, and the dynamically linked binaries
won't be harmed either. But if the bug is in all or most static bins,
you have gained nothing.

> However, just to drive the point home, try this:
> 
>     rm /lib/*

That's stupidity. Nothing protects you from stupidity.
By the way, the restore method mentioned above (the one with
LD_PRELOAD) restores the system nicely.
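(A sketch of that restore; note that with /lib emptied the dynamic
loader is gone too, so it has to be run straight from the floppy.
Paths are illustrative:)

    # invoke the loader directly and point it at the rescue copies
    /floppy/ld-linux.so.2 --library-path /floppy \
        /bin/cp /floppy/libc.so.6 /floppy/ld-linux.so.2 /lib/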
 
> now check whether your statics still work, and whether anything 
> else does. 
> 
> "The chance that dynamic linking fails but static not is 0" is an 
> absolutely ridiculous statmeent--obviously written in a moment of 
> distraction. It's not even close to the truth. 

Tell me a situation where static linking works but dynamic linking
fails during an update. Stupidity doesn't count as a reason for static
linking, and assume sash or a root shell is available.

> > The only benefit you get from static bins is that all bins are rebuilt
> > with each lib update and unresolved symbols will be detected
> > automatically.
> 
> As if my C compiler would be working during a system failure! All I 
> am talking about is enough functionality to copy a few already 
> compiled libraries onto my limping system.
> 
> You expect me to have a functioning GCC!?

No, that's to do with the possibility of unresolved symbols. I have had
those after library updates a few times (actually only with non-Debian
software :). By compiling static binaries, such unresolved symbols
might be detected by the autobuild daemons, because static linking
fails. Dynamically linked bins will not be relinked against the new
libs on each update, so an unresolved symbol bug might go unnoticed
until after an update.

Of course such a bug won't stay unnoticed for more than a few minutes
and thus never make it into stable, but you never know what strange old
software Debian users might have installed.

> > >    Proposition #12-- The presence of static recovery tools will 
> > >              not pose any difficulties to ordinary use of the system
> > 
> > You can't boot any longer on lowmem systems. I think that's a
> > difficulty.
> 
> I used to run a Linux that had all static binaries. That was a very long
> time ago; dynamic linking wasn't working very well in Linux back then.
> I don't recall being unable to boot, and I only had 4M of memory in
> my 386sx25.

Yeah, try that today. With less than 5 MB you couldn't install hamm,
and potato is worse.

> Secondly, the static recovery tools have nothing to do with the boot 
> process and would have absolutely no effect on it. The first potential
> opportunity for a static to run would be when you logged in as root,
> and since the static shell (sash) uses less memory than bash does 
> (even with dynamic linking, bash uses more per-process) this is 
> a complete non-issue.

If you only want sash and a backup set of cp, ls, tar, ... statically
linked in another location, that's fine with me, and then you're right.
That would be a good way to recover quickly from broken updates, but
having a backup set of dynamically linked bins and libs works just as
well and takes less space.

But you proposed making all the basic bins static, which is a
completely different thing.

>...
> them until their system fails. This is owing either to:
> 
>    1) Not enough experience. This shouldn't be held against them.

That's the reason why you should inform the new user of the
possibilities of statically linked packages. Why should they be forced
on the user just because the small minority of admins of
high-availability systems isn't able to read? There isn't such a thing
as not enough experience, only bad information.
 
>    2) Too much experience. Everyone knows all Unixes have static recovery.

I don't, and it's not true. What you call too much experience is false
information, so inform the users.

> I got burned by #2, it never occurred to me that Debian WOULDN'T
> install the statics until I needed them, and then I was horrified to
> discover that it didn't.
> 
> Now *I* know better, but there are many who don't, and who are going 
> to wind up in the same boat as me (whether due to too much or due to 
> too little experience).

Oh, how I (and many others) would wind up with static binaries.

> > Did you have a set of rescue bins and libs installed in a separate
> > directory, or did you use a chroot environment to test first? No?
> > Then it's your own carelessness.
> 
> The first time I didn't, because I assumed that Debian installed statics
> by default, the way every other decent Unix system I've ever used does. 
> (Yes, there are some indecent ones, I assumed Debian was not among them.)

What would you have done if it had installed a broken mount? The next
time you booted, your system would have been dead. Testing is the key;
static linking doesn't replace that.

> After that I've always had the static shell in place as the root 
> shell, and the recovery tools provided by it have been adequate 
> (and useful).

I agree with you that you need to become root somehow, so a static
shell is a good thing, but apart from that it's all taste and not a
requirement.

> Your second suggestion is a truly useful one, and that may just be
> the solution: Some informative question during the installation could 
> ask the user whether they want statics or not, accompanied by a 
> paragraph explaining why they would, and why they wouldn't.
> 
> In that case, there would be no users who didn't know they needed 
> it, because they would read that question and know that they did.
> 
> I would accept that as a solution, providing the default answer to
> the question was "YES, I want static recovery tools", since as I 
> said, the default should be the safer and more reliable system.

The recovery tools could just as well be dynamic, so the question
should be "Use a recovery tool and/or a rescue disk?"

> > Any bugs in any library will also show up in the static bins, so you
> > gain nothing by static linking. Any update to a broken lib will also
> > install broken static bins.
> 
> Which is why I keep harping so much about bootstrapping. Your static 
> tools should test themselves out before replacing themselves. So 
> should your package manager. And if these things are all static, 
> this is so much easier to do (don't have to test again just 
> because you install a C library, and fewer dependencies to watch).

That's a good idea, and it was brought forward by Falk Hueffner some
time ago, on debian-admintool I think. He proposed to include a test
script for use during build and after/during installation. The test
script would be placed in the control.tgz and could be executed during
installation. dpkg could then be set up to test packages in a chroot
environment first, or it could otherwise be made sure that packages
can be reverted to the old state safely.
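(A sketch of what such a test script might look like; the DEBIAN/test
name and the chroot invocation are assumptions, not something dpkg
implements today:)

    #!/bin/sh
    # DEBIAN/test -- run by dpkg in a chroot before the real installation
    set -e
    # exercise some basic operations against the freshly unpacked files
    ls / > /dev/null
    echo ok | grep -q ok
    cp /bin/sh /tmp/sh.test && rm /tmp/sh.test
    # a non-zero exit would tell dpkg to keep the old package
    exit 0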

Dependencies aren't a problem there; computers can handle that quite
well. :)

> > A rescue disk is provided; more is not needed. Maybe more is possible,
> > but not required.

That's from the point of view of policy, not convenience.

> You see, this is what bugs me. You just read a very long message
> which presented a long and detailed argument against the statement
> you just wrote. Nevertheless, you feel you can state this opinion
> without backing it up.
> 
> A rescue disk, as I have exhaustively demonstrated, is an absolutely
> unacceptable rescue solution on two very large classes of servers:
>
> those where a reboot is a logistical problem; and those where a 
> reboot is flat out unacceptable because of critical services.

Which is a very small percentage of Debian users. A recovery tool is
an option, a very good one, and I beg you to make a package for it.
But in 99.9% of all cases using a rescue disk is a valid, clean and
quicker option. It's not always possible, but those systems are few,
and thus it should be optional.

> > Why should that ever happen? Is it more likely for dpkg to remove libc
> > than sash? Is /bin safer from errors than /lib is? Remember, when
> > you use static binaries you will have /lib in /bin several
> > times.
> 
> 1) Yes. The dynamics have far more complex and numerous runtime
>    dependencies, and that makes the probability of error much greater

dpkg handles that quite nicely, and people are working on automatic
source dependencies, which will also provide safe ways to set the
binary dependencies.

> 2) If it removed sash but not the dynamics, you can fix it! But if   
>    it removes the dynamics, and you have no sash, you have to reboot.

I agree with you on sash; that's a good idea, several people have said
so as well, and it might make it into base with the next release.

> > Downtime of an NFS server is hardly any problem. If you get it up
> > within 15 minutes it just starts working again. No problem there, even
> > with root-NFS. I have tested that several times so far.
> 
> Maybe for YOU downtime of an NFS server is not a problem!!!

It's a problem for me, but not for the NFS server. :)

May the Source be with you.
                        Goswin

