
[Q]: weird spinlock messages & kernel selection




Hello everybody,

I'm about to deploy a number of dual-cpu dp264's to do some production
work. The systems came from the vendor loaded with RedHat 5.2 with
some tweaks:

- packages from "kernel-2.2" collection upgraded
- glibc 2.1.1-7 installed
- kernel 2.2.7 SMP (I don't know whether any patches were applied; I
  have yet to figure that out)

Currently, I see quite a number of messages in the syslog like:

flush_tlb_page: called on 1 from fffffc000033b358 but lock freed on 1

(fffffc000033b358 is inside the do_wp_page() routine)

and

spinlock stuck at fffffc000032a6cc(0) owner swapper at fffffc000032a6cc
b>spinlock grabbed at fffffc00003_2a6cc(0) 0 ticks<4>spinlock stuck at fffffc000032a6cc(0) owner swapper at fffffc000032a6cc
 spinlock grabbed3 at ff<4>spinlock stuck at fffffc000032a6cc(0) owner swapper at fffffc000032a6cc
o>spinlock grabbed at ffffbfc000032a<3>swap_duplicate at fffffc00003489e8: entry 3de0000000000, unused page
spinlock stuck at fffffc000032a6cc(0) owner swapper at fffffc000032a6cc
spinlock grabbed a0t fffffc000032a6cc(0) 0 ticks<<4>spinlock stuck at fffffc000032a6cc(0) owner swapper at fffffc000032a6cc
spinlock grabbed at offfffc000032a6cc(0<4>spinlock stuck at fffffc000032a6cc(0) owner swapper at fffffc000032a6cc
spinlock grabbed at 0fffffc000<4>spinlock stuck at fffffc000032a6cc(0) owner swapper at fffffc000032a6cc
spinlock grabbned at mfffffc<4>spinlock stuck at fffffc000032a6cc(0) owner swapper at fffffc000032a6cc


The latter address (fffffc000032a6cc) is inside the schedule() routine.
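
For reference, here is a minimal sketch of how addresses like the ones above can be mapped to a routine, assuming a symbol table in the usual System.map layout ("address type name", one symbol per line). The table in the sketch is hypothetical: its addresses are made up for illustration and are not taken from the real System.map of this kernel. The helper names load_symbols and resolve are likewise just illustrative.

```python
import bisect


def load_symbols(lines):
    """Parse System.map-style lines ('address type name') into a sorted list."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) for the nearest symbol at or below address."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        return None  # address lies below the first known symbol
    base, name = symbols[i]
    return name, address - base


# HYPOTHETICAL symbol table: these addresses are invented for the example.
SAMPLE_MAP = """\
fffffc000032a000 T schedule
fffffc000032b000 T some_other_routine
fffffc000033b000 T do_wp_page
fffffc000033c000 T yet_another_routine
""".splitlines()

symbols = load_symbols(SAMPLE_MAP)

for addr in (0xfffffc000032a6cc, 0xfffffc000033b358):
    name, offset = resolve(symbols, addr)
    print(f"{addr:016x} is {name}+{offset:#x}")
```

With a real System.map, the lines would be read from the file that matches the running kernel; a map from a different build would give misleading results.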

The questions are:

- how bad are these messages? If they are not critical, do they
  indicate some performance loss? What can I do about them?

- what kernel would you suggest for a dual dp264? What set of patches?
  This is going to be a production machine and it will have a really
  high load on all components: CPUs, network, and the disk, so I'm
  looking for something that not only "kinda works," but is really
  able to handle a high load without being rebooted on a monthly basis
  (shudder, recalling our lx164's running 2.0.35)

As a byproduct, I promise to document my experience to save those poor
souls who will try to do the same in the future...


-- 
Alexander L. Belikoff
Bloomberg L.P.
abel@vallinor4.com, abel@bloomberg.net


