Key: FL-2789
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Blocker
Assignee: António Meireles [aka doniphon]
Reporter: Tomas Forsman
Votes: 0
Watchers: 0
Foresight Linux

Kernel 2.6.40.** won't boot, but 2.6.38.8-3 works fine.

Created: 28/Aug/11 05:00 PM   Updated: 11/Jan/13 10:25 PM
Component/s: Base Operating System
Affects Version/s: 2.5.x
Fix Version/s: None
Security Level: Public (Everyone can see this issue)

Time Tracking:
Not Specified

File Attachments: 1. File cpuinfo (0.6 kB)
2. File dmidecode (11 kB)
3. File embee-1-cpuinfo (0.6 kB)
4. File embee-1-dmidecode (12 kB)
5. File embee-1-lspci-vv (6 kB)
6. File embee-2-cpuinfo (2 kB)
7. File embee-2-diff-2.6.35.11-2.6.35.12 (1 kB)
8. File embee-2-dmidecode (12 kB)
9. File embee-2-lspci-vv (7 kB)
10. File initrd.diff (58 kB)
11. Text File log.log (51 kB)
12. File lspci (11 kB)

Environment: group-gnome-dist=2.5.1+2011.08.25-0.1-4[~!gcc.core]


Description
lsusb
http://pastebin.com/vkS7Hrfn

lspci
http://pastebin.com/Me2kJkXe

sudo dmidecode
http://pastebin.com/fNsCd7YE

sudo lspci -v
http://pastebin.com/RMyB9Jat

sudo cat /proc/cpuinfo
http://pastebin.com/mhTcjtdb

lsusb -v
http://pastebin.com/138BUG6v

lshal
http://pastebin.com/S9w5Scnk

All info above is from the 2.6.38.8 kernel, as the 2.6.40.** kernels won't boot at all: boot gets stuck at around 50% of the green progress bar that goes from left to right, and I can't get any info about what goes wrong.

We're hoping that you might spot a piece of hardware that could be causing the issue with 2.6.40 kernels.

So embee and Jaget have this issue today with 2.6.40 kernels.



Comments
Tomas Forsman added a comment - 28/Aug/11 05:02 PM
Added file log.log that contains "dmesg"

Mark Trompell added a comment - 02/Sep/11 02:02 AM
I have a similar issue, remarkably on an AMD CPU family 15 too. In my case it's a Turion, but the issue is probably with AMD K8 CPUs.

eMBee added a comment - 02/Sep/11 06:04 AM
correction: i have not tested 2.6.40*
i have the issue with 2.6.38.8-3 and -4
the last successfully booting kernel is 2.6.35.11-2,
versions 2.6.35.12-2-1 to 2.6.38.8-2-1 were not tested.

greetings, eMBee.


Mark Trompell added a comment - 02/Sep/11 06:19 AM
attached cpuinfo/dmidecode and lspci -vv from my box

eMBee added a comment - 02/Sep/11 08:41 AM
i have been looking for relevant reports and found one case where someone's machine hangs for about 145 seconds and then continues.
is that the case with any of you? (it is not the case for me, it hangs permanently)

Tomas Forsman added a comment - 02/Sep/11 10:50 AM - edited
Jaget has now tested without the splash and quiet options at boot.

First it hangs at: joystick (USB gamepad)
unplugging it....

it now hangs at: input3 keyboard
unplugging keyboard (not a laptop computer)

now it hangs at: nash-hotplug (75)

So it seems it hangs when trying to get stuff going......

without any usb and keyboard, he gets as last message:

[0.602609] nash-hotplug (75): /proc/75/oom_adj is deprecated, please use /proc/75/oom_score_adj instead

and stops there


eMBee added a comment - 03/Sep/11 05:01 AM - edited
i now tested the following kernels on embee-1:
2.6.40.4-2-foresight.i686
2.6.38.8-4-fl.smp.gcc4.4.x86.i686
2.6.38.8-3-fl.smp.gcc4.4.x86.i686
2.6.38.8-2-fl.smp.gcc4.4.x86.i686
2.6.38.8-1-fl.smp.gcc4.4.x86.i686
2.6.38.7-1-fl.smp.gcc4.4.x86.i686
2.6.38.6-1-fl.smp.gcc4.4.x86.i686
2.6.35.13-1-fl.smp.gcc4.4.x86.i686
2.6.35.12-3-fl.smp.gcc4.4.x86.i686
2.6.35.12-2-fl.smp.gcc4.4.x86.i686
2.6.35.11-2-fl.smp.gcc4.4.x86.i686

they ALL fail except for the last one: linux 2.6.35.11-2-fl.smp.gcc4.4.x86.i686

linux 2.6.35.{12,13} hang at, i think, the same place, but the clocksource line doesn't show.
also, unlike for jaget, it hangs after the nash-hotplug line.

from searching for the clocksource message i found references explaining that the clocksource thing was added in 2.6.38:
http://www.gossamer-threads.com/lists/engine?do=post_view_flat;post=1373553;page=1;mh=-1;list=linux;sb=post_latest_reply;so=ASC

one person who had problems hanging (for 145 seconds only, then it continued) at clocksource found it due to a patch introduced in 2.6.38.
i tried all the suggestions there and they didn't apply to me.

i think that mark and i have a different problem from jaget and different from the person in the link.

one question for jaget: does the machine hang permanently? did he try waiting for 3 or 5 minutes?

greetings, eMBee.


eMBee added a comment - 03/Sep/11 05:15 AM
i also don't have any troubles booting with a usb keyboard

Marcus Bruér added a comment - 03/Sep/11 12:45 PM
There, finally got my account back. I don't use a USB keyboard on that computer. Yes, I've been waiting for a long time, but nothing happens.

António Meireles [aka doniphon] added a comment - 03/Sep/11 04:58 PM - edited
ok...
so the common issue here is x86 arch (32 bits) and AMD CPU family 15, right?
(or is anyone having AMD related issues on x86_64 arches too?)

Marcus Bruér added a comment - 04/Sep/11 03:28 PM
Yes.

eMBee added a comment - 05/Sep/11 12:21 AM
yes to CPU family 15, but my machines are Intel

also, marcus is running 2.6.38.8-3-fl.smp.gcc4.4.x86_64

my own AMD x86_64 machine is CPU family 16, and is not affected.

(also i tested
2.6.40.4-2-foresight.i686
2.6.38.8-4-fl.smp.gcc4.4.x86.i686
2.6.38.7-1-fl.smp.gcc4.4.x86.i686
2.6.38.6-1-fl.smp.gcc4.4.x86.i686
2.6.35.13-1-fl.smp.gcc4.4.x86.i686
2.6.35.12-2-fl.smp.gcc4.4.x86.i686
2.6.35.11-2-fl.smp.gcc4.4.x86.i686
on embee-2 now.

all except 2.6.35.11-2-fl.smp.gcc4.4.x86.i686 fail with the same symptoms, except that i see some additional USB messages after the clocksource message before the machine hangs)

greetings, eMBee.


Mark Trompell added a comment - 06/Sep/11 04:22 AM
I'm using x86_64 exclusively,
the box didn't recover for at least 1h, maybe longer, I shut it down at one point.
Notably, (un)plugging usb devices is still working and printed to screen. ctrl-alt-del still works.
Fails with vanilla 3.1rc4, next stop for me will be 2.6.39 from Linus' tree, if that works I'll try a bisect. Will take a while though as I think I can't do more than 2 builds/day

Mark Trompell added a comment - 06/Sep/11 03:12 PM
I had a working 2.6.35.12 and installed 2.6.38.8 and 2.6.35.11 which both didn't work, that led me to the conclusion that the kernel isn't the issue at all but the initrd is.
So I created an initrd with sudo dracut -H --force /boot/initrd-2.6.40.4-2-foresight.x86_64.img 2.6.40.4-2-foresight.x86_64 (dracut built locally from dracut:source=devtools.rpath.org@fl:2-devel) and was able to boot into that kernel. Splash doesn't work though.
So from those tests I would suspect mkinitrd is the issue here.

António Meireles [aka doniphon] added a comment - 06/Sep/11 03:36 PM
ok. it seems something previously static in working kernels turned into a .ko and is not being added to the initrd by mkinitrd but is by dracut.

MarkT is gonna compare initrd contents of both initrds to spot root issue.
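
A minimal Python sketch of the comparison described above, not the tooling actually used in this issue. It assumes the file list of each initrd has already been saved as text, one path per line (for a gzip-compressed cpio initrd, something like `zcat initrd.img | cpio -it` should produce such a list). The function names and the sample paths are hypothetical and only illustrate the idea.

```python
import posixpath


def module_names(listing):
    """Return the set of kernel module file names (*.ko) found in a file listing."""
    names = set()
    for line in listing:
        path = line.strip()
        if path.endswith(".ko"):
            # Compare by file name only: the lib/modules/<version>/ prefix
            # differs between initrds built for different kernel versions.
            names.add(posixpath.basename(path))
    return names


def missing_modules(good_listing, bad_listing):
    """Modules present in the working initrd but absent from the broken one."""
    return sorted(module_names(good_listing) - module_names(bad_listing))


# Hypothetical listings, NOT taken from the attachments of this issue.
good_listing = [
    "bin/nash",
    "lib/modules/2.6.35.11/kernel/drivers/example_a.ko",
    "lib/modules/2.6.35.11/kernel/drivers/example_b.ko",
]
bad_listing = [
    "bin/nash",
    "lib/modules/2.6.35.12/kernel/drivers/example_a.ko",
]

print(missing_modules(good_listing, bad_listing))
```

Any module reported this way would be a candidate for the ".ko that is not being added by mkinitrd" theory; it does not prove that the module is the cause of the hang.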


eMBee added a comment - 06/Sep/11 11:12 PM
i can confirm mark's observation
on one machine (call it embee-3) i upgraded the kernel first from 2.6.35.11 to 2.6.38.4. the kernel worked, but then i ran updateall, and after that the new kernel failed. updateall caused the initrd to be regenerated (as can be seen by the date of the initrd.img vs the symlink created for the vmlinux file.)

unfortunately the remaking of the initrd does not create a backup.

greetings, eMBee.
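
Since regenerating the initrd keeps no backup, a copy can be made by hand before running an update. The following is a small Python sketch of that idea under stated assumptions: the helper name `backup_initrd` and the `.bak` suffix are hypothetical, and nothing here is part of mkinitrd, dracut or updateall.

```python
import shutil
from pathlib import Path


def backup_initrd(initrd_path, suffix=".bak"):
    """Copy an initrd image next to itself so a known-good one survives regeneration.

    Refuses to overwrite an existing backup, so the first (working) copy is kept.
    """
    src = Path(initrd_path)
    dst = src.with_name(src.name + suffix)
    if dst.exists():
        raise FileExistsError(f"backup already exists: {dst}")
    # copy2 also preserves the modification time, which helps to tell
    # afterwards when the original initrd was generated.
    shutil.copy2(src, dst)
    return dst
```

For example, `backup_initrd("/boot/initrd-2.6.35.11-2-fl.smp.gcc4.4.x86.i686.img")` (run with sufficient permissions) would leave a `.bak` copy in /boot that can later be diffed against the regenerated image.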


Marcus Bruér added a comment - 09/Sep/11 04:34 PM
My diff -u on good and bad initrd: http://pastebin.com/DNYNeCY6

eMBee added a comment - 12/Sep/11 10:54 PM
sorry for the wait, here is one diff attached: initrd-2.6.35.11-2-fl.smp.gcc4.4.x86.i686.img is working and initrd-2.6.35.12-2-fl.smp.gcc4.4.x86.i686.img is the broken one

Mark Trompell added a comment - 16/Sep/11 07:36 AM
The latest move to dracut in fl:2-devel fixed it for me.

António Meireles [aka doniphon] added a comment - 16/Sep/11 07:49 AM
guys...

ANYONE still having issues ?


Tomas Forsman added a comment - 16/Sep/11 03:26 PM
Can confirm that it works for jaget aka Marcus Bruér too, no more issues with this one

eMBee added a comment - 23/Sep/11 12:57 PM
i moved to dracut on embee-1 and retested:
3.0.4-2-foresight.i686
2.6.38.8-4-fl.smp.gcc4.4.x86.i686
2.6.38.8-3-fl.smp.gcc4.4.x86.i686
2.6.38.8-2-fl.smp.gcc4.4.x86.i686
2.6.38.8-1-fl.smp.gcc4.4.x86.i686
2.6.38.7-1-fl.smp.gcc4.4.x86.i686
2.6.38.6-1-fl.smp.gcc4.4.x86.i686
2.6.35.13-1-fl.smp.gcc4.4.x86.i686
2.6.35.12-3-fl.smp.gcc4.4.x86.i686
2.6.35.12-2-fl.smp.gcc4.4.x86.i686
2.6.35.11-2-fl.smp.gcc4.4.x86.i686

they all work now!
the only 'cosmetic' issue is that for 2.6.38 and 3.0.4 the splash bar stops when entering runlevel 5.
the bar continues correctly on 2.6.35

greetings, eMBee.