SMP boot problems with RHEL AS3

Rick Stevens rstevens at vitalstream.com
Fri Jul 16 17:56:13 UTC 2004


Benjamin Hornberger wrote:
> Hi all,
> 
> I am having a strange problem with RHEL AS3 on a dual-processor machine. 
> When I boot the SMP kernel, the machine hangs during boot but, if I am 
> patient, finally comes up after ca. 30 minutes. When I boot the non-SMP 
> kernel, it comes up OK (though it also waits ca. 15 sec at the same point 
> in the boot sequence).
> 
> Below is an excerpt of the boot messages for the SMP kernel before it 
> hangs, and the corresponding part of the non-SMP boot (don't laugh -- I 
> took snapshots with a digicam and typed them in. There might be some 
> typos; it's just too fast to catch everything).
> 
> It's a Monarch ULB 2000 workstation with two AMD Athlon MP 2000+ 
> processors, 3 GB RAM, two Seagate SCSI hard drives on an on-board 
> controller (we boot from sda), a CD-RW and a DVD-ROM as primary and 
> secondary master, and two Maxtor IDE drives as primary and secondary 
> slaves (to be RAIDed) on the IDE bus. The machine ran fine under Red Hat 
> 7.2 and Red Hat 9 before. It has been up2dated and now runs kernel 
> 2.4.21-15.0.3.EL(smp). The problem was already there right after 
> the installation with kernel 2.4.21-4.EL(smp).

There is a common problem with the way some SMP Athlon motherboards set
up the APIC and the way the SMP kernels use it.  Uniprocessor kernels
don't enable the APIC, so they don't hit the problem.  The fix is to
disable the APIC at boot time on the SMP kernels by adding the "noapic"
boot option.

If you use lilo to boot, hit "CTRL-X" to get to the text-mode "boot:"
command line and type in "linux noapic" (substitute your kernel's label
if it isn't "linux").

If you use grub, highlight your kernel entry and hit "e", scroll down to
the "kernel" line, hit "e" again to edit it, add " noapic" (don't forget
the leading space) to the end of the line, hit "ENTER" and press "b" to
boot.

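If the "noapic" option works for you, make it permanent in your boot
loader config file: append it to the "kernel" line in grub.conf, or add
an append="noapic" line to the image section in lilo.conf.  If you use
lilo, don't forget to run lilo after editing the file.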
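
For grub, the permanent change amounts to appending the option to each
"kernel" line. Here is a minimal Python sketch of that edit. The function
name and the sample entry are made up for illustration (the title, paths
and root= value are assumptions, not taken from your machine), and it
only transforms text; it does not touch any file on disk.

```python
def add_boot_option(config_text, option="noapic"):
    """Append `option` to every active grub 'kernel' line that lacks it."""
    out = []
    for line in config_text.splitlines():
        words = line.split()
        # Only touch lines whose first word is "kernel"; commented-out
        # lines start with '#' and are left alone.
        if words and words[0] == "kernel" and option not in words[1:]:
            line = line.rstrip() + " " + option
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical grub.conf entry, for illustration only.
SAMPLE = """\
title Red Hat Enterprise Linux AS (2.4.21-15.0.3.ELsmp)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-15.0.3.ELsmp ro root=LABEL=/
        initrd /initrd-2.4.21-15.0.3.ELsmp.img
"""

if __name__ == "__main__":
    print(add_boot_option(SAMPLE), end="")
```

Running it twice on the same text changes nothing the second time, so
it is safe to reapply. Back up the real config file before editing it
by any method.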

NOTE: "noapic" will slow the average system down a very tiny amount.  I
doubt that you'll even be able to notice it without benchmarking.  It
will have a greater effect if the machine gets hammered with interrupts,
but that's pretty rare except with a real-time kernel.

> --------------------------------
> Boot messages:
> 
> SMP kernel:
> 
> ...
> Initializing Cryptographic API
> NET4: Linux TCP/IP 1.0 for NET4.0
> IP: routing cache hash table of 32768 buckets, 256Kbytes
> TCP: Hash tables configured (established 524288 bind 65536)
> Linux IP multicast router 0.06 plus PIM-SM
> Initializing IPsec netlink socket
> NET4: Unix domain sockets 1.0/SMP for Linux NET4.0
> RAMDISK: Compressed image found at block 0
> Freeing initrd memory: 311k freed
> VFS: Mounted root (ext2 filesystem).
> Red Hat nash version 3.5.13 starting
> Loading scsi_mod.o module
> SCSI subsystem driver Revision: 1.00
> Loading sd_mod.o module
> Loading aic7xxx.o module
> scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
>         <Adaptec aic7899 Ultra160 SCSI adapter>
>         aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
> 
> scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
>         <Adaptec aic7899 Ultra160 SCSI adapter>
>         aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
> 
> blk: queue f7fd6414, I/O limit 4095Mb (mask 0xffffffff)
> (scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
> (scsi0:A:1): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
>   Vendor: SEAGATE   Model: ST336752LW        Rev: 0004
>   Type:   Direct-Access                      ANSI SCSI revision: 03
> blk: queue f7fd7214, I/O limit 4095Mb (mask 0xffffffff)
>   Vendor: SEAGATE   Model: ST336752LW        Rev: 0004
>   Type:   Direct-Access                      ANSI SCSI revision: 03
> blk: queue f7fd7414, I/O limit 4095Mb (mask 0xffffffff)
> scsi0:A:0:0: Tagged Queuing enabled.   Depth 32
> scsi0:A:1:0: Tagged Queuing enabled.   Depth 32
> 
> **** hangs ca. 30 minutes here, but then comes up *****

Classic APIC hang there.  "noapic" should solve this issue.
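
Once the machine is up, you can confirm the option actually took
effect. A rough Python sketch follows, assuming the standard
/proc/cmdline and /proc/interrupts files. The function names are
hypothetical, and the "IO-APIC" marker is an assumption about how the
kernel labels APIC-routed interrupts there, so check it against your
own output.

```python
def has_boot_option(cmdline, option="noapic"):
    """True if `option` appears as a whole word on the kernel command line."""
    return option in cmdline.split()


def ioapic_irq_lines(interrupts_text):
    """Return the /proc/interrupts lines that mention the IO-APIC."""
    return [line for line in interrupts_text.splitlines() if "IO-APIC" in line]


def report(cmdline_path="/proc/cmdline", interrupts_path="/proc/interrupts"):
    """Print whether noapic was passed and how many IRQs still use the IO-APIC."""
    with open(cmdline_path) as f:
        print("noapic on command line:", has_boot_option(f.read()))
    with open(interrupts_path) as f:
        print("IRQ lines routed via IO-APIC:", len(ioapic_irq_lines(f.read())))
```

Call report() on the booted system. With "noapic" in effect, the
expectation is True for the first line and a count of zero for the
second.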

> non-SMP kernel:
> 
> ...
> NET4: Unix domain sockets 1.0/SMP for Linux NET4.0
> RAMDISK: Compressed image found at block 0
> Freeing initrd memory: 304k freed
> VFS: Mounted root (ext2 filesystem).
> Red Hat nash version 3.5.13 starting
> Loading scsi_mod.o module
> SCSI subsystem driver Revision: 1.00
> Loading sd_mod.o module
> Loading aic7xxx.o module
> AMD756: dev 9005:00cf, router pirq : 1 get irq : 10
> PCI: Found IRQ 10 for device 00:0a.0
> PCI: Sharing IRQ 10 with 02:04.0
> AMD756: dev 9005:00cf, router pirq : 2 get irq : 11
> PCI: Found IRQ 11 for device 00:0a.1
> PCI: Sharing IRQ 11 with 01:05.0
> scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
>         <Adaptec aic7899 Ultra160 SCSI adapter>
>         aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
> 
> scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
>         <Adaptec aic7899 Ultra160 SCSI adapter>
>         aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
> 
> blk: queue f7fd6414, I/O limit 4095Mb (mask 0xffffffff)
> 
> ***** waits ca. 15 sec here *****

That's the firmware download and/or the SCSI controller chip's self-checks.
It's normal and there's no way to disable it.  If it waits more than
30 seconds, you may have other issues.

> (scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
> (scsi0:A:1): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
>   Vendor: SEAGATE   Model: ST336752LW        Rev: 0004
>   Type:   Direct-Access                      ANSI SCSI revision: 03
> blk: queue f7fd7214, I/O limit 4095Mb (mask 0xffffffff)
>   Vendor: SEAGATE   Model: ST336752LW        Rev: 0004
>   Type:   Direct-Access                      ANSI SCSI revision: 03
> blk: queue f7fd7414, I/O limit 4095Mb (mask 0xffffffff)
> scsi0:A:0:0: Tagged Queuing enabled.   Depth 32
> scsi0:A:1:0: Tagged Queuing enabled.   Depth 32
> ...
> ***** comes up fine *****
----------------------------------------------------------------------
- Rick Stevens, Senior Systems Engineer     rstevens at vitalstream.com -
- VitalStream, Inc.                       http://www.vitalstream.com -
-                                                                    -
-      "Doctor!  My brain hurts!"  "It will have to come out!"       -
----------------------------------------------------------------------




