
Re: Unable to load interpreter // VM thrashing or hardware ???



I hope I am not getting on everyone's nerves, but the sudden
instabilities of my two Alpha LXs frustrate me. I thought that I
had yet another hardware problem, but now I am starting to doubt this.
And from some posts to the initial thread, it seems that I am
not the only one having these problems.

First, thanks for the replies received so far; at least I understand
now that the error message indicates insufficient
memory.  However, I don't understand how the VM system gets into
this state, so I'd like to distinguish between possible hardware and
software causes.  Some observations:

(1) The problem (reproducible for me with the code given at the end
of this mail) shows up with both the RH 5.1 provided milo and 2.0.35
source, as well as with the Oct 10, 1998 milo, pristine 2.0.35 sources
plus the alpha-2.0.35-0.2 patches by Jay.

(2) The problem appears with the default /proc/sys/kernel/{file-max,
inode-max}, as well as with file-max=2048 and inode-max=8192.  Further,
why would that affect anything with respect to 

(3) The old panacea of setenv MEMORY_SIZE 252 doesn't help either.

(4) Watching with free shows an interesting pattern:  For the
first several iterations of the test (4-5), the amount of free virtual
memory is where I expect it to be, well over 100MB (for my machine;
details follow with the source code). Suddenly, in the fifth or sixth
iteration, the free virtual memory falls rapidly to 0; obviously, this
is when the problem occurs.

(5) The same testcase (adjusted for memory sizes) runs for hours on
both a UX(!) (kernel compiled from pristine 2.0.35 plus alpha-patches
plus the 2.0.35-0.1 alpha patches by Jay, plus a fairly recent ldmilo
+ pal-codes by Jay, plus a larger value of SHMMAX in shmparam.h, some
black magic at least at some point necessary for the UX) and an Intel
machine after a default RH 5.2 installation.

(6) As I said earlier, I initially suspected hardware problems, since one
of the afflicted machines had a number of problems in the past.  However, the
second machine has been running perfectly for several weeks, except that we
now have an application that apparently leads to heavy paging.

It would be great if someone were willing to try whether he/she can
reproduce the problem on a similar configuration.  For me, the problem
shows up within an hour; provided one kills the test-executable by
Ctrl-C fairly quickly, the systems recover or can at least be shut
down cleanly.

My systems are both LX 533MHz, 256MB RAM, 2 x 136512k swap space (I
thought on an Alpha with a page size of 8k this was fine?), Adaptec
2940U/UW, respectively.  Both run RH 5.1, and normally a kernel 
compiled from the RH provided 2.0.35 sources.  I remember hearing
rumours that there are definitely problems on large memory machines
>= 512M (both LX/UX); could anyone more knowledgeable comment on
those issues?

Well, finally, here is the test case (identical, apart from slight
modifications, to code from D.Gilbert posted here several months ago).
The SIZE variable is chosen for my RAM size, i.e. I force
swapping; the same test runs happily with SIZE set to 240*256*..
which on an otherwise idle machine seems to fit in RAM.

---------------------------------------------------------------------

#include <stdio.h>
#include <stdlib.h>

/* Number of bytes to allocate; chosen to be larger than RAM so that
   the machine is forced to swap. */
#define SIZE (260*256*1024*sizeof(int))

int main(void) {
  volatile int * volatile pl=(int *)malloc(SIZE);
  int a;
  int s;
  unsigned int sum1,sum2;  /* unsigned, so that overflow wraps around */

  if (pl==NULL) {
    fprintf(stderr,"malloc of %lu bytes failed\n",(unsigned long)SIZE);
    return 1;
  }

  printf("Using %lu bytes.\n",(unsigned long)SIZE);
  for(s=1056;;s++) {
    /* Fill the array with pseudo-random numbers from seed s. */
    printf("Filling with seed of s=0x%x \n",s);
    srandom(s);
    sum1=0;
    for(a=0;a<(SIZE/sizeof(int));a++) {
      int tmp1=random();
      sum1+=tmp1;
      pl[a]=tmp1;
    }

    /* Replay the same sequence and compare it with the array. */
    srandom(s);

    printf("Testing with seed of s=0x%x \n",s);
    for(a=0,sum2=0;a<(SIZE/sizeof(int));a++) {
      int tmp1=random();
      int tmp2=pl[a];

      sum2+=tmp1;
      if (tmp1!=tmp2) {
        fprintf(stderr,"MISMATCH ERROR! at a=0x%x (random=0x%x array=0x%x "
                "reread=0x%x seed=0x%x)\n",a,tmp1,tmp2,pl[a],s);
      }
    }
    printf("Completed pass with seed of s=0x%x sum1=0x%x sum2=0x%x\n",
           s,sum1,sum2);
    if (sum1!=sum2) {
      printf("Internal inconsistency! Sums differ! sum1=0x%x sum2=0x%x\n",
             sum1,sum2);
      exit(2);
    }
  }
}



