
Bug#584314: base: System freezes at random time after Resume from Suspend (Regression)



Andreas Berger wrote:

> ok, i narrowed it down, but it is:
>
> found:     linux-image-2.6.36-trunk-686, version 2.6.36-1~experimental.1
> not found: linux-image-2.6.37-rc4-686,   version 2.6.37~rc4-1~experimental.1
>
> and this time i think i got a complete call trace, is attached

Nice.  Alas, after looking at the Debian changelog and "git shortlog
v2.6.36..v2.6.37-rc4" output, no particular change jumps out as likely
to have fixed this corruption (and the places the kernel panicked
don't give any obvious clue).

Some ideas for narrowing it down:

 - could you try suspending in single-user mode (i.e., kernel
   parameters "single debug"), to rule out a problem in the i915
   driver?

 - likewise, does unloading other modules before suspend help?

 - if nothing else gives a hint: can you bisect to find the fix?  It
   works like this:

1. Reproduce the bug with the unpatched kernel.

	# apt-get install git-core build-essential
	$ mkdir -p ~/src && cd ~/src; # later steps assume ~/src/linux
	$ git clone git://github.com/torvalds/linux.git; # kernel.org is down
	$ cd linux
	$ git checkout v2.6.36
	$ make localmodconfig; # minimal configuration
	$ make deb-pkg;	# with -j<n> for parallel build if wanted
	# dpkg -i ../<linux-image package name>
	# reboot
	... test test test ...

Hopefully it reproduces the bug.  Otherwise, declare victory and we
can figure out how Debian-specific changes screwed it up.

2. Reproduce the fix.

	$ cd ~/src/linux
	$ git checkout v2.6.37-rc4
	$ yes "" | make silentoldconfig; # reuse configuration
	$ make deb-pkg
	# dpkg -i ../<linux-image package name>
	# reboot
	... test test test ...

Hopefully it does _not_ reproduce the bug.  If the bug still shows up,
copy Debian's config-2.6.37-rc4-686 to ~/src/linux/.config, rebuild,
and try again.  If that fixes it, declare victory and we can figure
out which configuration change was responsible; if it doesn't, we can
look for a relevant Debian-specific patch.
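
If it does come down to a configuration change, comparing the two
.config files is the first step.  Here is a minimal sketch of one way
to do that; the function name config_diff is made up for illustration,
and it assumes both files use the usual "CONFIG_FOO=y" and
"# CONFIG_FOO is not set" line formats:

```shell
# config_diff OLD NEW  (hypothetical helper, not part of the kernel tree)
# Print the option lines that appear in only one of two kernel config
# files: "<" marks lines only in OLD, ">" marks lines only in NEW.
config_diff() {
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || return 1
    # Keep only option lines, including the "# CONFIG_FOO is not set" form.
    grep -E '^(# )?CONFIG_' "$1" | LC_ALL=C sort > "$old_sorted"
    grep -E '^(# )?CONFIG_' "$2" | LC_ALL=C sort > "$new_sorted"
    # Drop diff's hunk headers; finding no differences is not an error.
    diff "$old_sorted" "$new_sorted" | grep '^[<>]' || true
    rm -f "$old_sorted" "$new_sorted"
}
```

Call it as "config_diff <old config> <new config>".  The list can be
long between two kernel versions, since new options show up as
differences too.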

3. Great --- so v2.6.36 reproduces the bug and v2.6.37-rc4 reproduces
the fix.  Tell git:

	$ cd ~/src/linux
	$ git bisect start v2.6.37-rc4 v2.6.36

Git checks out a revision halfway between to test.

	$ yes "" | make silentoldconfig; # reuse configuration
	$ make deb-pkg
	# dpkg -i ../<linux-image package name>
	# reboot
	... test test test ...
	$ cd ~/src/linux
	$ git bisect good; # if it crashes
	$ git bisect bad; # if it is stable
	$ git bisect skip; # if some other bug makes it hard to test

Yes, "good" means "successfully demonstrates the bug".  The naming is
a little confusing because git bisect is usually used to find changes
introducing bugs rather than changes fixing them.
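
If the inverted naming is easy to get wrong at 2am, a tiny wrapper can
do the translation.  This is only a sketch; bisect_verdict and the
words crashed/stable/untestable are names invented here, not anything
git provides:

```shell
# Hypothetical helper: translate what happened during testing into the
# git bisect subcommand used in this (inverted) search for a fix.
bisect_verdict() {
    case "$1" in
        crashed)    echo good ;;  # bug still present: before the fix
        stable)     echo bad ;;   # bug gone: at or after the fix
        untestable) echo skip ;;  # some other problem got in the way
        *)          echo "usage: bisect_verdict crashed|stable|untestable" >&2
                    return 1 ;;
    esac
}
```

After a test run, something like 'git bisect "$(bisect_verdict crashed)"'
then marks the revision without having to think about which word means
what.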

4. Repeat until bored:

	$ yes "" | make silentoldconfig; # reuse configuration
	$ make deb-pkg
	# dpkg -i ../<linux-image package name>
	# reboot
	... test test test ...
	$ cd ~/src/linux
	$ git bisect good; # or "bad" or "skip", as in step 3

Eventually it will report the "first bad commit" (i.e., the fix), which
is what we are after.  If you get bored before then, that's still
useful: "git bisect log" will show the results so far.  (Even a
few rounds can narrow things down a lot.)  If the gitk package is
installed, you can run "git bisect visualize" at any time to watch the
range of commits that might contain the fix shrink.
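
To get a feel for how many rounds to expect: each round roughly halves
the number of candidate commits, so the count grows only with the
logarithm of the range.  A rough sketch of the arithmetic follows
(bisect_rounds is a made-up name, and real history is not a straight
line, so git's own estimate can differ by a step or two):

```shell
# Hypothetical helper: how many halvings it takes to get from N
# candidate commits down to a single one.
bisect_rounds() {
    n=$1
    rounds=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))      # keep the larger half, rounding up
        rounds=$(( rounds + 1 ))
    done
    echo "$rounds"
}
```

For example, "bisect_rounds 10000" prints 14.  Assuming a git that
supports "git rev-list --count", the real number of candidates can be
fed in with
'bisect_rounds "$(git rev-list --count v2.6.36..v2.6.37-rc4)"'.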

"man git-bisect" and /usr/share/doc/git-doc/git-bisect-lk2009.html
from the git-doc package have details.

Thanks much for your help so far!
Jonathan


