Debian Bug report logs - #21164
Realtime lock in linuxthreads

Package: libc6; Maintainer for libc6 is GNU Libc Maintainers <debian-glibc@lists.debian.org>; Source for libc6 is src:glibc.

Reported by: erikyyy@studbox.uni-stuttgart.de

Date: Wed, 15 Apr 1998 09:03:01 UTC

Severity: normal

Found in version 2.0.7pre1-4

Done: Joel Klecker <jk@espy.org>

Bug is archived. No further changes may be made.


Report forwarded to debian-bugs-dist@lists.debian.org:
Bug#21164.


Acknowledgement sent to erikyyy@studbox.uni-stuttgart.de:
New bug report received and forwarded.

Your message didn't have a Package: line at the start (in the pseudo-header following the real mail header), or didn't have a pseudo-header at all.

This makes it much harder for us to categorise and deal with your problem report; please ensure that you say which package(s) and version(s) the problem is with next time. Some time in the future the problem reports system may start rejecting such messages.


Message #5 received at submit@bugs.debian.org:

From: erikyyy@studbox.uni-stuttgart.de
To: submit@bugs.debian.org
Subject: Realtime lock in linuxthreads
Date: Wed, 15 Apr 1998 10:53:13 +0200 (CEST)
[Message part 1 (text/plain, inline)]
Package: libc6
Version: 2.0.7pre1-4

i am using a quite updated debian hamm system.
i386, no SMP

untar the attachment.

cd iii
make

get root. start src/yyys.

you will see, that after a few seconds the system locks down (you cannot
move the mouse), then my emergency system will kill the program.

now look at this code carefully (yyys.cpp).
i think it shouldn't lock down the system ?

you may of course remove my "antirealtimelocksystem" if you believe, that
this is the bug, but be careful ! you must have a high priority realtime
console shell available, otherwise you cannot kill the program, once it
locks down the system.

(the "antirealtimelocksystem" was created, because i had to press RESET
too often. it uses fork(), NO threads!)


byebye
Erik


--
EMAIL: erikyyy@studbox.uni-stuttgart.de                  \\\\
       thieleek@tick.informatik.uni-stuttgart.de         o `QQ'_
IRC:   erikyyy                                            /   __8
WWW:   http://wwwcip.rus.uni-stuttgart.de/~inf24628/      '  `
       http://tick.informatik.uni-stuttgart.de/~thieleek/
[libc6bug.tgz (application/x-gtar, inline)]

Bug assigned to package `libc6'. Request was from jdassen@wi.leidenuniv.nl to control@bugs.debian.org.


Information forwarded to debian-bugs-dist@lists.debian.org, Juan Cespedes <cespedes@debian.org>:
Bug#21164; Package libc6.


Acknowledgement sent to erikyyy@studbox.uni-stuttgart.de:
Extra info received and forwarded to list. Copy sent to Juan Cespedes <cespedes@debian.org>.


Message #12 received at 21164@bugs.debian.org:

From: erikyyy@studbox.uni-stuttgart.de
To: 21164@bugs.debian.org
Subject: did my report work ?
Date: Thu, 16 Apr 1998 14:59:42 +0200 (CEST)
Package: libc6
Version: 2.0.7pre1-4

sorry about this email, but the automatic bug reply said that
my email had no "Pseudo Header". Another Debian person told me that this
was wrong (i gave him the email, too)

so now i do not know whether everything went right ?


byebye
Erik




Information forwarded to debian-bugs-dist@lists.debian.org, Juan Cespedes <cespedes@debian.org>:
Bug#21164; Package libc6.


Acknowledgement sent to jdassen@wi.leidenuniv.nl:
Extra info received and forwarded to list. Copy sent to Juan Cespedes <cespedes@debian.org>.


Message #17 received at 21164@bugs.debian.org:

From: jdassen@wi.leidenuniv.nl
To: 21164@bugs.debian.org
Subject: Re: Bug#21164: did my report work ?
Date: Thu, 16 Apr 1998 15:49:30 +0200
On Thu, Apr 16, 1998 at 02:59:42PM +0200, erikyyy@studbox.uni-stuttgart.de wrote:
> sorry about this email, but the automatic bug reply said that my email
> had no "Pseudo Header". Another Debian person told me that this was wrong
> (i gave him the email, too)

Your message had a Package: pseudoheader, but that wasn't recognised as
such, because the bug tracking system can't handle MIME (that's bug #5629).
This means that the bug tracking system assigned it to package "unknown".

> so now i do not know whether everything went right ?

I've instructed the bugtracking system to reassign your bug to libc6, so
it's OK now. Next time, please don't use MIME for the bug tracking system.

Ray
-- 
PATRIOTISM  A great British writer once said that if he had to choose 
between betraying his country and betraying a friend he hoped he would
have the decency to betray his country.                                      
- The Hipcrime Vocab by Chad C. Mulligan 


Information forwarded to debian-bugs-dist@lists.debian.org, Dale Scheetz <dwarf@polaris.net>:
Bug#21164; Package libc6.


Acknowledgement sent to erikyyy@studbox.uni-stuttgart.de:
Extra info received and forwarded to list. Copy sent to Dale Scheetz <dwarf@polaris.net>.


Message #22 received at 21164@bugs.debian.org:

From: erikyyy@studbox.uni-stuttgart.de
To: 21164@bugs.debian.org
Subject: Re: [Linux-Threads] realtime lock bug tracked down (fwd)
Date: Fri, 24 Apr 1998 10:19:05 +0200 (CEST)
hi. someone else knows something about the problem:
(not me)
---------- Forwarded message ----------
Date: Thu, 23 Apr 1998 14:52:25 +0200 (CEST)
From: Jos van de Ven <josvdv@josvdv.sci.kun.nl>
Reply-To: Jos van de Ven <josvdv@sci.kun.nl>
To: erikyyy@studbox.uni-stuttgart.de, linux-threads@MageNet.com
Subject: Re: [Linux-Threads] realtime lock bug tracked down

> and i am looking forward to a fix :-)))
I can say something REALLY easy. I tried your program, and...
THERE WAS NO BUG..... :-))

Ok, now the long story. I was using kernel 2.1.97, with SMP
enabled. I tried both versions of your program, and didn't
notice anything special.

I thought it might be a problem of the SMP. So I tried 2.0.29
and 2.0.33 without SMP. There the problem appeared. Next
option could be that the problem is in the kernel, and that
it has been fixed between 2.0.33 and 2.1.97. This appears to
be the case. Even on a UP-system with 2.1.97 I don't see
any problems occur. I didn't have time to look any further,
but maybe this will help.

I would like to hear whether using kernel 2.1.97, or some kernel
near that one, works on your system too.
BTW: That kernel has a minor bug: don't try to build the
lp support into the kernel. Only lp as a module
works. Earlier versions don't have THIS bug...

Good luck!
Greetings,

Jos van de Ven
(josvdv@sci.kun.nl)




Information forwarded to debian-bugs-dist@lists.debian.org, Juan Cespedes <cespedes@debian.org>:
Bug#21164; Package libc6.


Acknowledgement sent to Herbert Xu <herbert@gondor.apana.org.au>:
Extra info received and forwarded to list. Copy sent to Juan Cespedes <cespedes@debian.org>.


Message #27 received at 21164@bugs.debian.org:

From: Herbert Xu <herbert@gondor.apana.org.au>
To: erikyyy@studbox.uni-stuttgart.de
Cc: 21164@bugs.debian.org, debian-devel@lists.debian.org
Subject: Re: 21164 must be fixed before 2.0.34 comes out!
Date: Sat, 25 Apr 1998 12:22:40 +1000
In article <Pine.LNX.3.96.980424102419.472D-100000@vulcain.yyydom> you wrote:
> bug 21164 does not seem to be a bug in the libc.
> it is instead a kernel bug.

FWIW, I just tried it on my Debian 2.0 2.0.32 machine:

$ ./yyys
Killed

-- 
Debian GNU/Linux 1.3 is out! ( http://www.debian.org/ )
Email:  Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Information forwarded to debian-bugs-dist@lists.debian.org, Juan Cespedes <cespedes@debian.org>:
Bug#21164; Package libc6.


Acknowledgement sent to Herbert Xu <herbert@gondor.apana.org.au>:
Extra info received and forwarded to list. Copy sent to Juan Cespedes <cespedes@debian.org>.


Message #32 received at 21164@bugs.debian.org:

From: Herbert Xu <herbert@gondor.apana.org.au>
To: josvdv@sci.kun.nl
Cc: 21164@bugs.debian.org, erikyyy@studbox.uni-stuttgart.de
Subject: Re: Reply on bug-report 21164
Date: Sun, 26 Apr 1998 11:28:43 +1000 (EST)
josvdv@pi.net wrote:
> You should run it as root. There should appear a message on the syslog too.
> If it's not run as root, it will just get killed. ("$" seems to be a prompt
> for a normal user, not for root!)
> Please try it again, and report your experiences.

There may or may not be a bug.  I currently don't have the time to check which
is the case.  But in any case, this bug is not serious enough to delay the
release since to exploit it you need to be root and if you're root, you *are*
allowed to do anything, including locking up the system.

I will investigate this when I get some time.



Information forwarded to debian-bugs-dist@lists.debian.org, Juan Cespedes <cespedes@debian.org>:
Bug#21164; Package libc6.


Acknowledgement sent to erikyyy@studbox.uni-stuttgart.de:
Extra info received and forwarded to list. Copy sent to Juan Cespedes <cespedes@debian.org>.


Message #37 received at 21164@bugs.debian.org:

From: erikyyy@studbox.uni-stuttgart.de
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: josvdv@sci.kun.nl, 21164@bugs.debian.org
Subject: Re: Reply on bug-report 21164
Date: Mon, 27 Apr 1998 15:36:27 +0200 (CEST)
On Sun, 26 Apr 1998, Herbert Xu wrote:

> josvdv@pi.net wrote:
> > You should run it as root. There should appear a message on the syslog too.
> > If it's not run as root, it will just get killed. ("$" seems to be a prompt
> > for a normal user, not for root!)
> > Please try it again, and report your experiences.
> 
> There may or may not be a bug.  I currently don't have the time to check which
> is the case.  But in any case, this bug is not serious enough to delay the
> release since to exploit it you need to be root and if you're root, you *are*
> allowed to do anything, including locking up the system.
> 
> I will investigate this when I get some time.

of course root may lock down the system.

but this is a bug. there is a difference between a bug and a program that
locks down the system.

the problem is, that because of this bug you cannot use _realtime threads_
at all in linux !
(of course only root may do this)

btw. this obviously seems to be a kernel bug, because it happens with
libc5 and libc6 on 2.0.33, but not with libc* on 2.1.??.

byebye
Erik





Information forwarded to debian-bugs-dist@lists.debian.org, Dale Scheetz <dwarf@polaris.net>:
Bug#21164; Package libc6.


Acknowledgement sent to erikyyy@studbox.uni-stuttgart.de:
Extra info received and forwarded to list. Copy sent to Dale Scheetz <dwarf@polaris.net>.


Message #42 received at 21164@bugs.debian.org:

From: erikyyy@studbox.uni-stuttgart.de
To: 21164@bugs.debian.org
Subject: Realtime lock in linuxthreads still not fixed.
Date: Fri, 9 Oct 1998 11:05:33 +0200
hi.

my initial email which created this thread was sent
Wed, 15 Apr 1998 !

a long time has passed, and i checked the situation again.
libc6           2.0.7t-1
is running on my 100% updated debian hamm system.
(no SMP, Pentium133, 2.0.34)

the bug is still there.
i know that only processes running as root can use realtime
threads and lock down the system. but due to this bug even
root cannot use realtime threads, because they may lock the
system down without any reason.

so this is not a security problem for usual systems. but if
you watch the growing amount of applications that use realtime
threads or processes (sound daemons, players, video etc.) and
do this by running suid root (!!) the situation might look
worse.

this bug IS important and must be fixed. (unfortunately i cannot
do it...)
is anyone working on it ?

i have spread the initial bug report to many people, many
of them reproducing it, some talking of spinlocks in the kernel
others talking about SMP systems where the bug doesn't occur,
but nobody fixes it ?


byebye
Erik


PS: mail me if you want the newest bug reproducer as an email .tgz
(i am not quite sure whether the one on 21164 is up to date)



Information forwarded to debian-bugs-dist@lists.debian.org, Dale Scheetz <dwarf@polaris.net>:
Bug#21164; Package libc6.


Acknowledgement sent to Torsten Landschoff <t.landschoff@gmx.net>:
Extra info received and forwarded to list. Copy sent to Dale Scheetz <dwarf@polaris.net>.


Message #47 received at 21164@bugs.debian.org:

From: Torsten Landschoff <t.landschoff@gmx.net>
To: 21164@bugs.debian.org
Cc: linux-threads@MageNet.com
Subject: Lock in linuxthreads tracked
Date: Wed, 21 Oct 1998 00:24:49 +0200
Hi Folks, 

I think I tracked down a problem with realtime-threads as reported to
bugs.debian.org under reference number #21164 (I write this because
this is crossposted to the linux-threads mailing list).

The problem shows up when two threads do the following (I stripped the
non-interesting things as, for example, the sched_param structure for
setschedparam - the priorities for both threads are equal. I also
stripped the "pthread_"-Prefix to make it look a bit more lean :-)):


   Thread 1 (realtime):		Thread 2:

pthread_create -> Thread 2	waiting to become eligible
				
/* wait for child to signal	
   termination... */		
				
mutex_lock(&mutex);		
cond_wait(&cond, &mutex);	setschedparam(self(), SCHED_FIFO);
		 		do_something(tm)...
				.
				.
				mutex_lock(&mutex);
				cond_signal(&cond);
mutex_unlock(&mutex);		mutex_unlock(&mutex);
				
/* join with child... */  
				setschedparam(self(), SCHED_OTHER);
pthread_join(com_thread, 0);	<<--- the program locks up here
				return 0;
return 0;



This example locks your computer hard (I will soon explain, why), so
do not do this :) If you start this program on a console you can kill
it with ^C, and everything will be fine...

After looking into the sources I discovered what happened. The
linuxthreads-Library uses spinlocks to protect the integrity of the
internal data structures - for example the accounting information of
the running threads. So, before changing the priority back to
SCHED_OTHER, the spinlock belonging to Thread 2 is acquired in
pthread.c (I removed error handling):

int pthread_setschedparam(pthread_t thread, int policy,
                          const struct sched_param *param)
{
  pthread_handle handle = thread_handle(thread);
  pthread_descr th;

  acquire(&handle->h_spinlock);

  th = handle->h_descr;
  __sched_setscheduler(th->p_pid, policy, param);  /* <-- (indirect) problem */
  th->p_priority = policy == SCHED_OTHER ? 0 : param->sched_priority;
  release(&handle->h_spinlock);
  return 0;
}

At the time when __sched_setscheduler is called the thread is
suspended at once because there is an eligible realtime thread in the
runqueue - Thread 1.

Thread 1 continues to run until it hits	pthread_join, which tries to
acquire the lock for Thread 2 (which is still locked by Thread 2):

In join.c:

int pthread_join(pthread_t thread_id, void ** thread_return)
{
  volatile pthread_descr self = thread_self();
  pthread_handle handle = thread_handle(thread_id);
  pthread_descr th;

  acquire(&handle->h_spinlock);  /* <-- LOCK! PROBLEM! */

  /* remainder of pthread_join deleted... */
}

acquire does exactly the following (from spinlock.h):

static inline void acquire(int * spinlock)
{
  while (testandset(spinlock)) __sched_yield();
}

where sched_yield runs the next available realtime task (if any) or the
next "normal" process...

Problem is: Thread 1 is still runnable and is the only realtime
thread. So the scheduler picks Thread 1 again, which calls sched_yield
again, and Thread 2 never gets the CPU to release the lock.
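
The starvation argument above can be checked with a toy model of a
strict-priority scheduler on one CPU. This is only a sketch: the task
struct, pick_next and yields_until_release are hypothetical names invented
for illustration, not kernel or LinuxThreads code, and the model assumes
exactly two tasks, the spinner (Thread 1) and the lock holder (Thread 2).

```c
#include <stdbool.h>

/* Toy model only: these types and functions are hypothetical, not kernel code. */
enum policy { POLICY_OTHER = 0, POLICY_FIFO = 1 };

struct task {
    enum policy policy;  /* scheduling class */
    int priority;        /* only compared between POLICY_FIFO tasks */
    bool runnable;
};

/* True if task a must be preferred over task b under strict priorities. */
static bool beats(const struct task *a, const struct task *b)
{
    if (a->policy != b->policy)
        return a->policy == POLICY_FIFO;   /* realtime always wins */
    return a->policy == POLICY_FIFO && a->priority > b->priority;
}

/* Choose who runs after `current` calls sched_yield(). The scan starts just
   behind `current`, so the yielding task loses every tie but still wins if
   it is strictly more important than everything else that is runnable. */
static int pick_next(const struct task *tasks, int n, int current)
{
    int best = -1;
    for (int step = 1; step <= n; step++) {
        int i = (current + step) % n;
        if (!tasks[i].runnable)
            continue;
        if (best < 0 || beats(&tasks[i], &tasks[best]))
            best = i;
    }
    return best;
}

/* The spinner loops on "lock held? then yield". Returns how many yields it
   took until the holder got the CPU (and could release the lock), or -1 if
   that never happened within max_yields. */
static int yields_until_release(const struct task *tasks, int n,
                                int spinner, int holder, int max_yields)
{
    for (int yields = 1; yields <= max_yields; yields++) {
        if (pick_next(tasks, n, spinner) == holder)
            return yields;
    }
    return -1;
}
```

In this model, a SCHED_FIFO spinner paired with a holder that has already
dropped to SCHED_OTHER never lets the holder run, so the function returns
-1. If both tasks have the same policy and priority, the holder runs after
the first yield and the function returns 1.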


I would like to comment some messages from #21164 of the Debian bug
tracking system:

From: Jos van de Ven <josvdv@josvdv.sci.kun.nl>
Subject: Re: [Linux-Threads] realtime lock bug tracked down

>> and i am looking forward to a fix :-)))
>I can say something REALLY easy. I tried your program, and...
>THERE WAS NO BUG..... :-))
>
>Ok, now the long story. I was using kernel 2.1.97, with SMP
>enabled. I tried both versions of your program, and didn't
>notice anything special.

>I thought it might be a problem of the SMP. So I tried 2.0.29
>and 2.0.33 without SMP. There the problem appeared. Next
>option could be that the problem is in the kernel, and that
>it has been fixed between 2.0.33 and 2.1.97. This appears to
>be the case. Even on a UP-system with 2.1.97 I don't see
>any problems occur. I didn't have time to look any further,
>but maybe this will help.

I do not know why it works with Kernel 2.1.97. Perhaps the kernel
suspends processes which issue a sched_yield for some HZ - I am not
sure. But I am not surprised that there is no problem with a
SMP-System: The second thread will just be assigned to the other CPU
and will release the lock. Everything will be fine.

Okay, I guess I have laid out the problem sufficiently. Question: What
can we do about it? The following alternatives come to mind:

1. make pthread_join balance out the priority of both threads.

   I am not sure if this will suffice because there might be some
   other races. But it will be a start :)

2. avoid the problem by balancing out the priorities by hand

   I removed the second setschedparam in Thread 2 and the problem
   disappeared. This will work but it remains a bug.

3. patch the kernel to suspend a process on sched_yield. Or at least
   skip the thread on the next dispatcher run.

   This might be the best solution.

4. find some other way for locking internal structures in
   linuxthreads. 

   This will work too, but there will be a major performance penalty.

Okay, this will be enough for now...

cu
	Torsten

PS: Please CC me any answers because I am not on this list...


Information forwarded to debian-bugs-dist@lists.debian.org, Dale Scheetz <dwarf@polaris.net>:
Bug#21164; Package libc6.


Acknowledgement sent to Xavier Leroy <Xavier.Leroy@inria.fr>:
Extra info received and forwarded to list. Copy sent to Dale Scheetz <dwarf@polaris.net>.


Message #52 received at 21164@bugs.debian.org:

From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: linux-threads@MageNet.com, 21164@bugs.debian.org
Subject: Re: [Linux-Threads] Lock in linuxthreads tracked
Date: Wed, 21 Oct 1998 10:07:25 +0200
> I think I tracked down a problem with realtime-threads as reported to
> bugs.debian.org under reference number #21164 (I write this because
> this is crossposted to the linux-threads mailing list).

> [High-priority thread spinning on a spinlock and preventing a low-priority
>  thread from releasing the spinlock]

Yes, this is a known problem with LinuxThreads.

It is being fixed in the development versions of LinuxThreads (the
ones available with development versions of glibc 2.1) by replacing
spinlocks with non-busy-waiting locks based on compare-and-swap.

Another way to avoid the problem is to keep spinlocks, but after
spending some time spinning on a lock, put the spinning thread to sleep
for a short time using nanosleep().  It works OK, but hurts realtime
performance quite a lot in case of heavy contention on a spinlock.

> Okay, I guess I have laid out the problem sufficiently. Question: What
> can we do about it? The following alternatives come to mind:
> 
> 1. make pthread_join balance out the priority of both threads.
> 
>    I am not sure if this will suffice because there might be some
>    other races. But it will be a start :)

There will be some other races.  Spinlocks are used in many different
places.

> 2. avoid the problem by balancing out the priorities by hand
> 
>    I removed the second setschedparam in Thread 2 and the problem
>    disappeared. This will work but it remains a bug.

Right.  I consider priority scheduling essentially unusable in the
current LinuxThreads releases.

> 3. patch the kernel to suspend a process on sched_yield. Or at least
>    skip the thread on the next dispatcher run.

That would certainly cure the problem, however you lose some realtime-ness
(i.e. there is no guarantee that the highest-priority thread waiting
for a spinlock gets it).

> 4. find some other way for locking internal structures in
>    linuxthreads. 
>    This will work too, but there will be a major performance penalty.

The approach based on compare-and-swap seems to be just as fast as
spinlocks when there is low contention.  It might be a little more
expensive in case of heavy contention, though.

Regards,

- Xavier Leroy


Reply sent to Joel Klecker <jk@espy.org>:
You have taken responsibility.


Notification sent to erikyyy@studbox.uni-stuttgart.de:
Bug acknowledged by developer.


Message #57 received at 21164-done@bugs.debian.org:

From: Joel Klecker <jk@espy.org>
To: 21164-done@bugs.debian.org
Subject: Re: [Linux-Threads] Lock in linuxthreads tracked
Date: Tue, 21 Sep 1999 16:38:31 -0700
Since upstream claims this bug was fixed in glibc 2.1, and glibc 2.1 
is out, I am closing it.
-- 
Joel Klecker (aka Espy)                    Debian GNU/Linux Developer
<URL:mailto:jk@espy.org>                 <URL:mailto:espy@debian.org>
<URL:http://web.espy.org/>               <URL:http://www.debian.org/>



Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Thu Apr 25 09:41:00 2024; Machine Name: buxtehude
