
Bug#842634: Bug#851877: fails every time



On Sat, Oct 06, 2018 at 11:38:59PM +0200, Santiago Vila wrote:
> On Mon, 15 May 2017, Adam Borowski wrote:
>
> > > Looking at /etc/hosts within the schroot, I see:
> > > 127.0.0.1       localhost
> > > 127.0.0.1       localhost ip6-localhost ip6-loopback
> > > 172.28.17.11    abel.debian.org abel
> > >
> > > Modifying /etc/hosts by replacing ::1 with 127.0.0.1 results in being able
> > > to reproduce the issue on other machines as well.
> >
> > So it's a fully _reproducible_ bug, with a well-defined immediate cause
> > (even if we haven't identified the indirect cause yet) -- unlike the
> > original report by Santiago Vila.  Thus, it looks like we have two different
> > bugs that just happen to trigger the same failure mode.
> >
> > And thus, even if we fix the schroot issue, Santiago's bug likely won't be
> > fixed.
>
> Hello everybody.
>
> I'd like to clarify that most probably there was only one bug here
> after all, namely, the one in schroot (#842634).
>
> I initially reported this as "random" because I had a mix of
> successful builds and failed builds, but most probably the
> autobuilders in which it failed were always the same, the ones in
> which the build succeeded were always the same, and I just failed to
> recognize the pattern.
>
> Fortunately you have found the real reason for the bug (while I was
> missing from the discussion :-), and I believe this was the only
> reason it failed for me last year.
>
> Now a simple question: Do you think the workaround you prepared could
> still be useful at all (I personally don't think so), or should I just
> reassign this to schroot and use "affects"?
>
> Thanks.

The root cause of the bug is glibc's implementation of gethostent. schroot
fills the chroot's hosts file by running `getent hosts`, and that command
is what prints the wrong information. The relevant part of glibc's getent
source can be extracted into the following standalone program (LGPLv2.1,
and the formatting is GNU style, so don't blame me):

> #include <arpa/inet.h>
> #include <netdb.h>
> #include <stdlib.h>
> #include <stdio.h>
>
> /* putchar_unlocked exists on BSDs, but not fputs_unlocked */
> #define fputs_unlocked fputs
>
> /* This is for hosts */
> static void
> print_hosts (struct hostent *host)
> {
>   unsigned int cnt;
>
>   for (cnt = 0; host->h_addr_list[cnt] != NULL; ++cnt)
>     {
>       char buf[INET6_ADDRSTRLEN];
>       const char *ip = inet_ntop (host->h_addrtype, host->h_addr_list[cnt],
>                                   buf, sizeof (buf));
>
>       printf ("%-15s %s", ip, host->h_name);
>
>       unsigned int i;
>       for (i = 0; host->h_aliases[i] != NULL; ++i)
>         {
>           putchar_unlocked (' ');
>           fputs_unlocked (host->h_aliases[i], stdout);
>         }
>       putchar_unlocked ('\n');
>     }
> }
>
> int
> main (int argc, char **argv)
> {
>   struct hostent *host;
>
>   sethostent (0);
>   while ((host = gethostent ()) != NULL)
>     print_hosts (host);
>   endhostent ();
>   return 0;
> }

If you compile and run this on Linux, you will get the same output as
`getent hosts`, with `::1' being turned into `127.0.0.1'. However, on a
BSD (userland) system, you get saner output that doesn't rewrite
IPv6 addresses. So, as this demonstrates, the root problem is that
gethostent in glibc mangles its input. What I don't know is whether this
is desired behaviour; I guess someone should file a bug report upstream
and see what they say...
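
To narrow this down before reporting upstream, it may help to look at what
gethostent hands back rather than at the formatted text. The following is
only a sketch: family_name, dump_host_families and the DEMO_MAIN macro are
names made up for this mail, not anything from glibc. It prints the address
family and address length of every entry. If the diagnosis above is right,
the `::1' line should show up on glibc as AF_INET with a length of 4, but I
have not verified that on every libc.

```c
#include <netdb.h>
#include <stdio.h>
#include <sys/socket.h>

/* Map an address family to a printable name (hypothetical helper).  */
static const char *
family_name (int af)
{
  switch (af)
    {
    case AF_INET:
      return "AF_INET";
    case AF_INET6:
      return "AF_INET6";
    default:
      return "other";
    }
}

/* Walk the hosts database and print, for every entry, the address
   family and address length that gethostent reports.  Returns the
   number of entries seen.  */
static int
dump_host_families (FILE *out)
{
  struct hostent *host;
  int count = 0;

  sethostent (0);
  while ((host = gethostent ()) != NULL)
    {
      fprintf (out, "%-9s len=%-2d %s\n", family_name (host->h_addrtype),
               host->h_length, host->h_name);
      ++count;
    }
  endhostent ();
  return count;
}

#ifdef DEMO_MAIN
/* Build with -DDEMO_MAIN to get a standalone program.  */
int
main (void)
{
  dump_host_families (stdout);
  return 0;
}
#endif
```

Running the result inside and outside the chroot, or on Linux and on a BSD,
and comparing the family column should show whether the rewrite happens
inside gethostent itself or somewhere later.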

