Thread: The kernel related to acpi hangs on nehalem

  1. #1
    Join Date
    Jul 2010
    Posts
    125

    The kernel related to acpi hangs on nehalem

    I have seen a number of really strange kernel freezes with kernels from 2.6.32 to 2.6.34 on Westmere CPUs. The exact cause (CPU, kernel or hardware) is not known, so I am going to list the facts. When the crash occurs, there is __nothing__ on the console. Not on the video console and not on the serial console. The machine has two Westmere CPUs, each with 4 cores. With hyperthreading, "cat /proc/cpuinfo" shows 16 cores. I can often reproduce the crash by copying a couple of terabytes of data to this machine. I tried booting with only the minimum ACPI support needed to get hyperthreading, but that does not seem to be good enough. On this particular board, lm-sensors detection is broken, so I used "Supero Doctor" from SuperMicro to monitor the CPU temperatures. They were normal right up until the crash, so I ruled out the overheating theory, but that does not solve the problem. The machine still crashes.
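For anyone comparing hardware, the "16 cores" figure above can be split into sockets, physical cores and hyperthreads. This is only a minimal sketch: the function name `summarize_cpuinfo` is made up for illustration, and it assumes the x86 `/proc/cpuinfo` layout where each `processor` entry is followed by `physical id` and `core id` fields.

```python
def summarize_cpuinfo(text):
    """Count sockets, physical cores and logical CPUs in /proc/cpuinfo text."""
    sockets = set()
    cores = set()
    logical = 0
    physical_id = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between processor entries
        key = key.strip()
        value = value.strip()
        if key == "processor":
            logical += 1
            physical_id = None  # start of a new logical CPU entry
        elif key == "physical id":
            physical_id = value
            sockets.add(value)
        elif key == "core id":
            # A physical core is identified by (socket, core id).
            cores.add((physical_id, value))
    return {"sockets": len(sockets), "cores": len(cores), "logical": logical}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(summarize_cpuinfo(f.read()))
    except OSError:
        print("no /proc/cpuinfo on this system")
```

On the machine described above one would expect 2 sockets, 8 physical cores and 16 logical CPUs; if `logical` equals `cores`, hyperthreading is not active.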

  2. #2
    Join Date
    Nov 2009
    Posts
    1,416

    Re: The kernel related to acpi hangs on nehalem

    Keep in mind that most of the people who had ACPI problems had trouble booting up. In my case, the boot went very well and the crash is very random. It feels like it happens when the system is down-scaling, and I suspect that the voltage down-scaling shuts down the whole system. This is the first time I have seen a Linux crash without a trace. To be honest, I am still 50-50 on whether it's a hardware problem (Westmere, SuperMicro MB) or a kernel problem. Next I am going to turn on the debugging logic in the ACPI subsystem in the kernel. I will update you on everything I find. In the meantime, I would be grateful for any help or ideas to try out.

  3. #3
    Join Date
    Nov 2009
    Posts
    1,292

    Re: The kernel related to acpi hangs on nehalem

    I am also seeing similar problems on an HP Z800. If I run some heavy numerical computation, the system reboots within an hour or so. I am trying acpi=off and it seems to run for now. Please let me know if you figure out the source of the problems. I have tried a lot of approaches lately. Since my box often dies during data transfer from the network, I used acpi.power_nocheck as a boot parameter and even modified the kernel code to skip that check, to make sure the network driver is not put into the wrong power mode. Even so, I managed to crash all the 2.6.34 kernels with ACPI on. I am really starting to think my problem is at least partly hardware related, and that turning ACPI off just makes it much less likely to occur.

  4. #4
    Join Date
    Nov 2009
    Posts
    1,269

    Re: The kernel related to acpi hangs on nehalem

    If absolutely nothing works, it might be a BIOS bug; check for a newer BIOS with something about ACPI in the change log. Since the crash occurs in the kernel (not user space, not X), setter would not help. I spent some time researching all the BIOS options. There are a lot! My box (X8DTN+) is running AMI v02.68, and the latest one I can get from SuperMicro is 2.0b; I guess that refers to the series. I could not find any change log. You say you get the crash 'copying terabytes' over the network. I am looking for information like the following: How fast is your network? 1Gb? 100Mb? Are you using rsync over ssh, nfs, or something else?

  5. #5
    Join Date
    Mar 2010
    Posts
    515

    Re: The kernel related to acpi hangs on nehalem

    I found an issue with the module 'preloadtrace' that causes Westmere to crash regularly. If you have this module loaded, I'd suggest that you remove the package that installs it. The heavy numerical calculation consisted of a continuous evaluation of a multi-threaded FFT. The size of the data set does not really matter, and the average load over all 8 cores was ~90%. I also ran the same calculation on Windows 7, and it did not produce any errors. The BIOS I am using (the latest version available) does not have a setting for "C State Package Limit Setting", but according to HP it should already contain a fix for the Westmere C6 state transition bug. Maybe that BIOS fix does not work with the Linux 2.6.32 kernel. I also tried 2.6.35 with the same results.

  6. #6
    Join Date
    Apr 2010
    Posts
    237

    Re: The kernel related to acpi hangs on nehalem

    In my test, I can usually crash the box within a day by using "rsync -e ssh" to copy data from another box to this Westmere box. It's over a Gigabit network. I tried the heavy load approach earlier by running 20-30 "openssl speed" processes to load up all 16 cores (from 2 CPUs) all the time, but the problem did not occur. I think the problem does not appear under heavy load, but rather during C-state transitions. Right, "processor.max_cstate=0" alone would not do it. I looked at the kernel source and used the above together with "intel_idle.max_cstate=0", which completely disables the intel_idle driver. After that, your powertop output will look crippled. Perhaps ACPI_PROCESSOR_COMPONENT would be helpful too.
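Since several boot parameters have come up in this thread (acpi=off, acpi.power_nocheck, processor.max_cstate=0, intel_idle.max_cstate=0), it can help to confirm what the running kernel was actually booted with before drawing conclusions from a test run. A rough sketch, assuming the parameters are visible in `/proc/cmdline`; the function names `cstate_params` and `cstates_fully_limited` are invented for this example and are not part of any tool.

```python
# Parameters mentioned in this thread; extend the tuple as needed.
WANTED = (
    "acpi",
    "acpi.power_nocheck",
    "processor.max_cstate",
    "intel_idle.max_cstate",
)


def cstate_params(cmdline):
    """Return the ACPI/idle parameters found on a kernel command line."""
    found = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        if name in WANTED:
            # Flags given without "=value" are recorded as None.
            found[name] = value if sep else None
    return found


def cstates_fully_limited(cmdline):
    """True only if both max_cstate parameters are set to 0."""
    params = cstate_params(cmdline)
    return (params.get("processor.max_cstate") == "0"
            and params.get("intel_idle.max_cstate") == "0")


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            line = f.read()
    except OSError:
        line = ""
    print(cstate_params(line))
    print("both C-state limits set:", cstates_fully_limited(line))
```

The second helper reflects the point made above: setting processor.max_cstate=0 alone was not enough, so the check only passes when both parameters are present.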
