SMP boot problems with RHEL AS3

Rick Stevens rstevens at vitalstream.com
Fri Jul 16 18:20:35 UTC 2004


Benjamin Hornberger wrote:
> Thanks a lot for your help! But what is apic?? Sorry for my lack of 
> knowledge...

That's OK.  It's geek stuff, and I certainly don't expect you to know
what it is.  Allow me to slightly nerdify you...

(tapping Benjamin gently on the head with the Nerd Wand)

APIC = Advanced Programmable Interrupt Controller.  It's the hardware
that routes device interrupts to the CPUs on an SMP box, ideally handing
each one to the least-loaded CPU for faster interrupt response.  If it's
disabled, the kernel falls back to old-style PIC routing and the
interrupt work generally lands on a single CPU.
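
If you want to see how interrupts are actually being spread across the
CPUs, /proc/interrupts shows per-CPU counts for each IRQ.  Here is a rough
Python sketch that totals them up.  It assumes the usual layout (a header
row of CPU names, then one row per IRQ with a count per CPU followed by
the controller type); the function name and the sample numbers are made
up for illustration and are not from Benjamin's machine.

```python
# Invented sample in the usual /proc/interrupts layout (not real data).
SAMPLE = """\
           CPU0       CPU1
  0:     123456     654321    IO-APIC-edge  timer
  1:         10          5    IO-APIC-edge  keyboard
 18:       2000       3000   IO-APIC-level  aic7xxx
NMI:          0          0
"""


def summarize_interrupts(text):
    """Return (per-CPU totals, set of controller types) for numbered IRQs."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return {}, set()
    cpus = lines[0].split()                  # e.g. ['CPU0', 'CPU1']
    totals = dict((cpu, 0) for cpu in cpus)
    controllers = set()
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if not label.strip().isdigit():
            continue                         # skip NMI/LOC/ERR style rows
        fields = rest.split()
        counts = fields[:len(cpus)]
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue                         # row doesn't match the layout
        for cpu, count in zip(cpus, counts):
            totals[cpu] += int(count)
        if len(fields) > len(cpus):
            controllers.add(fields[len(cpus)])   # column after the counts
    return totals, controllers


totals, controllers = summarize_interrupts(SAMPLE)
print(totals, sorted(controllers))
```

On a live box you would pass in open("/proc/interrupts").read() instead
of SAMPLE.  With "noapic" in effect I'd expect the controller column to
read XT-PIC rather than IO-APIC-*, but treat that as something to check
on your own kernel rather than a guarantee.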

On 99% of the systems I've seen, the performance penalty with the APIC
disabled is SLIGHT.  We have about 20 Athlon SMP webservers that are
doing about 90Mbps each with "noapic" set (yes, the net cables glow
cherry red).  Not too shabby.
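
To double-check that "noapic" really made it onto the kernel command
line, you can look at /proc/cmdline after the machine is up.  A tiny
Python sketch of that check follows; the helper name is hypothetical and
the example strings are illustrative, not copied from any machine in
this thread.

```python
def has_boot_option(cmdline, option="noapic"):
    """True if `option` appears as a whole word on the kernel command line."""
    return option in cmdline.split()


# Illustrative command lines only.
print(has_boot_option("ro root=LABEL=/ noapic"))
print(has_boot_option("ro root=LABEL=/"))
```

On a running system, feed it open("/proc/cmdline").read().  Matching on
whole words keeps a different option such as "nolapic" from counting as
a hit.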

BTW, Benjamin, we prefer bottom posting here (post your replies AFTER
what you're replying to).  It keeps the flow of the message clearer
and allows you to comment in multiple places as I did in the first
message.

> At 10:56 AM 7/16/2004 -0700, you wrote:
> 
>> Benjamin Hornberger wrote:
>>
>>> Hi all,
>>> I am having a strange problem with RHEL AS3 on a dual-processor
>>> machine. When I boot the SMP kernel, the machine hangs during boot,
>>> but, if I am patient, it finally comes up after ca. 30 minutes. When
>>> I boot the non-SMP kernel, it comes up OK (though it also waits ca.
>>> 15 sec at the same point in the boot sequence).
>>> Below is an excerpt of the boot messages for the SMP kernel before it
>>> hangs, and the corresponding part of the non-SMP boot (don't laugh --
>>> I took snapshots with a digicam and typed them up. There might be
>>> some typos. It's just too fast to catch everything.).
>>> It's a Monarch ULB 2000 workstation with two AMD Athlon MP 2000+ 
>>> processors, 3 GB RAM, two Seagate SCSI hard drives on an on-board 
>>> controller (we boot from sda), a CD-RW and a DVD-ROM as primary and 
>>> secondary master, and two Maxtor IDE drives as primary and secondary 
>>> slaves (to be RAIDed) on the IDE bus. The machine ran fine under 
>>> Redhat 7.2 and Redhat 9 before. It has been up2dated and now runs
>>> kernel 2.4.21-15.0.3.EL(smp). The problem was already there right
>>> after the installation with kernel 2.4.21-4.EL(smp).
>>
>>
>> There is a common problem with the way some SMP Athlon motherboards set
>> up the APIC and the SMP kernels.  Uniprocessor kernels don't enable the
>> APIC so they won't hit the problem.  The fix is to disable the APICs
>> at boot time on the SMP kernels by adding the "noapic" boot option.
>>
>> If you use lilo to boot, hit "CTRL-X" and type in "linux noapic" at the
>> "boot:" command line.
>>
>> If you use grub, hit "e", scroll down to the "kernel" line, add
>> " noapic" (don't forget the leading space) to the end of the line, hit
>> "ENTER" and press "b" to boot.
>>
>> If the "noapic" option works for you, update your boot loader config
>> file to use that option.  If you use lilo, don't forget to run lilo
>> after tweaking the file.
>>
>> NOTE: "noapic" will slow the average system down a very tiny amount.  I
>> doubt that you'll even be able to notice it without benchmarking.  It
>> will have a greater effect if the machine gets hammered with interrupts,
>> but that's pretty rare except with a real-time kernel.
>>
>>> --------------------------------
>>> Boot messages:
>>> SMP kernel:
>>> ...
>>> Initializing Cryptographic API
>>> NET4: Linux TCP/IP 1.0 for NET4.0
>>> IP: routing cache hash table of 32768 buckets, 256Kbytes
>>> TCP: Hash tables configured (established 524288 bind 65536)
>>> Linux IP multicast router 0.06 plus PIM-SM
>>> Initializing IPsec netlink socket
>>> NET4: Unix domain sockets 1.0/SMP for Linux NET4.0
>>> RAMDISK: Compressed image found at block 0
>>> Freeing initrd memory: 311k freed
>>> VFS: Mounted root (ext2 filesystem).
>>> Red Hat nash version 3.5.13 starting
>>> Loading scsi_mod.o module
>>> SCSI subsystem driver Revision: 1.00
>>> Loading sd_mod.o module
>>> Loading aic7xxx.o module
>>> scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
>>>         <Adaptec aic7899 Ultra160 SCSI adapter>
>>>         aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
>>> scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
>>>         <Adaptec aic7899 Ultra160 SCSI adapter>
>>>         aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
>>> blk: queue f7fd6414, I/O limit 4095Mb (mask 0xffffffff)
>>> (scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
>>> (scsi0:A:1): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
>>>   Vendor: SEAGATE   Model: ST336752LW        Rev: 0004
>>>   Type:   Direct-Access                      ANSI SCSI revision: 03
>>> blk: queue f7fd7214, I/O limit 4095Mb (mask 0xffffffff)
>>>   Vendor: SEAGATE   Model: ST336752LW        Rev: 0004
>>>   Type:   Direct-Access                      ANSI SCSI revision: 03
>>> blk: queue f7fd7414, I/O limit 4095Mb (mask 0xffffffff)
>>> scsi0:A:0:0: Tagged Queuing enabled.   Depth 32
>>> scsi0:A:1:0: Tagged Queuing enabled.   Depth 32
>>> **** hangs ca. 30 minutes here, but then comes up *****
>>
>>
>> Classic APIC hang there.  "noapic" should solve this issue.
>>
>>> non-SMP kernel:
>>> ...
>>> NET4: Unix domain sockets 1.0/SMP for Linux NET4.0
>>> RAMDISK: Compressed image found at block 0
>>> Freeing initrd memory: 304k freed
>>> VFS: Mounted root (ext2 filesystem).
>>> Red Hat nash version 3.5.13 starting
>>> Loading scsi_mod.o module
>>> SCSI subsystem driver Revision: 1.00
>>> Loading sd_mod.o module
>>> Loading aic7xxx.o module
>>> AMD756: dev 9005:00cf, router pirq : 1 get irq : 10
>>> PCI: Found IRQ 10 for device 00:0a.0
>>> PCI: Sharing IRQ 10 with 02:04.0
>>> AMD756: dev 9005:00cf, router pirq : 2 get irq : 11
>>> PCI: Found IRQ 11 for device 00:0a.1
>>> PCI: Sharing IRQ 11 with 01:05.0
>>> scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
>>>         <Adaptec aic7899 Ultra160 SCSI adapter>
>>>         aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
>>> scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
>>>         <Adaptec aic7899 Ultra160 SCSI adapter>
>>>         aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
>>> blk: queue f7fd6414, I/O limit 4095Mb (mask 0xffffffff)
>>> ***** waits ca. 15 sec here *****
>>
>>
>> That's the firmware download and/or SCSI controller chip self checks.
>> It's normal and there's no way to disable it.  If it waits more than
>> 30 seconds, you may have other issues.
>>
>>> (scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
>>> (scsi0:A:1): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
>>>   Vendor: SEAGATE   Model: ST336752LW        Rev: 0004
>>>   Type:   Direct-Access                      ANSI SCSI revision: 03
>>> blk: queue f7fd7214, I/O limit 4095Mb (mask 0xffffffff)
>>>   Vendor: SEAGATE   Model: ST336752LW        Rev: 0004
>>>   Type:   Direct-Access                      ANSI SCSI revision: 03
>>> blk: queue f7fd7414, I/O limit 4095Mb (mask 0xffffffff)
>>> scsi0:A:0:0: Tagged Queuing enabled.   Depth 32
>>> scsi0:A:1:0: Tagged Queuing enabled.   Depth 32
>>> ...
>>> ***** comes up fine *****
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Benjamin Hornberger
> mailto:bho at gmx.net
> http://www.hornberger.info/
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
----------------------------------------------------------------------
- Rick Stevens, Senior Systems Engineer     rstevens at vitalstream.com -
- VitalStream, Inc.                       http://www.vitalstream.com -
-                                                                    -
- Treat each day as if it's your last...a lot of crying and whining  -
-      usually gets you what you want!              -- Sam Sledge    -
----------------------------------------------------------------------




