
[linux-lvm] Problems with LVM snapshots



For the past two nights, my Scalix machine running RHEL 4 (Update 2) has
failed during its backup process.  That process creates an LVM snapshot,
tars up the data from it, and then removes the snapshot.  I can attach
the script if needed.
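
For readers unfamiliar with this kind of backup, the snapshot/tar/remove
cycle described above can be sketched roughly as follows.  This is not
the actual script: the function names, volume group, logical volume,
snapshot size, mount point, and archive path are all hypothetical
placeholders, and the only assumption is that the standard lvcreate,
mount, tar, umount, and lvremove commands are used.

```python
import subprocess


def snapshot_backup_commands(vg, lv, snap_name, snap_size, mount_point, archive):
    """Build the command lists for one snapshot backup cycle.

    All arguments are illustrative placeholders, not values from the
    real backup script.
    """
    origin = f"/dev/{vg}/{lv}"
    snap = f"/dev/{vg}/{snap_name}"
    # Step 1: create the snapshot of the origin volume.
    create = ["lvcreate", "--snapshot", "--size", snap_size,
              "--name", snap_name, origin]
    # Step 2: mount the snapshot read-only and archive its contents.
    work = [
        ["mount", "-o", "ro", snap, mount_point],
        ["tar", "-czf", archive, "-C", mount_point, "."],
    ]
    # Step 3: always unmount and remove the snapshot afterwards.
    cleanup = [
        ["umount", mount_point],
        ["lvremove", "-f", snap],
    ]
    return create, work, cleanup


def run_backup(vg, lv, snap_name, snap_size, mount_point, archive):
    """Run the cycle; cleanup is attempted even if mount or tar fails."""
    create, work, cleanup = snapshot_backup_commands(
        vg, lv, snap_name, snap_size, mount_point, archive)
    # If snapshot creation fails, stop here with an exception.
    subprocess.run(create, check=True)
    try:
        for cmd in work:
            subprocess.run(cmd, check=True)
    finally:
        # Do not raise during cleanup, so lvremove still runs if umount fails.
        for cmd in cleanup:
            subprocess.run(cmd, check=False)
```

Keeping the command construction separate from execution makes it
possible to log or review the exact commands before anything touches
the volume group.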

The important part is that on both nights the failure occurred as the
backup script tried to create the LVM snapshot.  Shortly afterwards,
Scalix stopped responding, which for all practical purposes made the
server useless.  In both cases I had to reboot the server, enter
maintenance mode, remove the snapshot, and then reboot again to get the
machine back in service.  The good news so far is that the machine has
not experienced any corruption or file system failures when the system
"freezes", but I don't like the situation.  On top of this, the backups
haven't worked.  Needless to say, I need to get this fixed ASAP.

Outside of what I've described, I have this snippet
from /var/log/messages which shows the event:

========================================================================
Oct 27 18:00:01 concorde crond(pam_unix)[21037]: session opened for user
root by (uid=0)
Oct 27 18:00:02 concorde kernel: lvcreate: page allocation failure.
order:1, mode:0xd0
Oct 27 18:00:02 concorde kernel:  [<c013fa77>] __alloc_pages+0x28b/0x29d
Oct 27 18:00:02 concorde kernel:  [<c013faa1>] __get_free_pages
+0x18/0x24
Oct 27 18:00:02 concorde kernel:  [<c01423f8>] kmem_getpages+0x1c/0xbb
Oct 27 18:00:02 concorde kernel:  [<c0142f46>] cache_grow+0xab/0x138
Oct 27 18:00:02 concorde kernel:  [<c0143138>] cache_alloc_refill
+0x165/0x19d
Oct 27 18:00:02 concorde kernel:  [<c014350c>] __kmalloc+0x76/0x88
Oct 27 18:00:02 concorde kernel:  [<c013e709>] mempool_resize+0x86/0x13f
Oct 27 18:00:02 concorde kernel:  [<f8bb1322>] resize_pool+0x3a/0xa2
[dm_mod]
Oct 27 18:00:02 concorde kernel:  [<f8bb24c3>] kcopyd_client_create
+0x71/0x9f [dm_mod]
Oct 27 18:00:02 concorde kernel:  [<f8c73697>] snapshot_ctr+0x231/0x2b8
[dm_snapshot]
Oct 27 18:00:02 concorde kernel:  [<f8bae185>] dm_table_add_target
+0xfc/0x169 [dm_mod]
Oct 27 18:00:02 concorde kernel:  [<f8bb020c>] populate_table+0x8a/0xaf
[dm_mod]
Oct 27 18:00:02 concorde kernel:  [<f8bb0268>] table_load+0x37/0x123
[dm_mod]
Oct 27 18:00:02 concorde kernel:  [<f8bb0ce3>] ctl_ioctl+0xd1/0x144
[dm_mod]
Oct 27 18:00:02 concorde kernel:  [<f8bb0231>] table_load+0x0/0x123
[dm_mod]
Oct 27 18:00:02 concorde kernel:  [<c0165b5e>] sys_ioctl+0x227/0x269
Oct 27 18:00:02 concorde kernel:  [<c02c7377>] syscall_call+0x7/0xb
Oct 27 18:00:02 concorde kernel: Mem-info:
Oct 27 18:00:02 concorde kernel: DMA per-cpu:
Oct 27 18:00:02 concorde kernel: cpu 0 hot: low 2, high 6, batch 1
Oct 27 18:00:02 concorde kernel: cpu 0 cold: low 0, high 2, batch 1
Oct 27 18:00:03 concorde kernel: cpu 1 hot: low 2, high 6, batch 1
Oct 27 18:00:03 concorde kernel: cpu 1 cold: low 0, high 2, batch 1
Oct 27 18:00:03 concorde kernel: Normal per-cpu:
Oct 27 18:00:03 concorde kernel: cpu 0 hot: low 32, high 96, batch 16
Oct 27 18:00:03 concorde kernel: cpu 0 cold: low 0, high 32, batch 16
Oct 27 18:00:03 concorde kernel: cpu 1 hot: low 32, high 96, batch 16
Oct 27 18:00:03 concorde kernel: cpu 1 cold: low 0, high 32, batch 16
Oct 27 18:00:03 concorde kernel: HighMem per-cpu:
Oct 27 18:00:03 concorde kernel: cpu 0 hot: low 14, high 42, batch 7
Oct 27 18:00:03 concorde kernel: cpu 0 cold: low 0, high 14, batch 7
Oct 27 18:00:03 concorde kernel: cpu 1 hot: low 14, high 42, batch 7
Oct 27 18:00:03 concorde kernel: cpu 1 cold: low 0, high 14, batch 7
Oct 27 18:00:03 concorde kernel: 
Oct 27 18:00:03 concorde kernel: Free pages:       16132kB (364kB
HighMem)
Oct 27 18:00:03 concorde kernel: Active:163817 inactive:73927 dirty:26
writeback:0 unstable:0 free:4033 slab:12358 mapped:63944 pagetables:1818
Oct 27 18:00:03 concorde kernel: DMA free:12632kB min:16kB low:32kB
high:48kB active:0kB inactive:0kB present:16384kB pages_scanned:56
all_unreclaimable? yes
Oct 27 18:00:03 concorde kernel: protections[]: 0 0 0
Oct 27 18:00:03 concorde kernel: Normal free:3136kB min:928kB low:1856kB
high:2784kB active:578536kB inactive:245936kB present:901120kB
pages_scanned:0 all_unreclaimable? no
Oct 27 18:00:03 concorde kernel: protections[]: 0 0 0
Oct 27 18:00:03 concorde kernel: HighMem free:364kB min:128kB low:256kB
high:384kB active:76732kB inactive:49772kB present:131008kB
pages_scanned:0 all_unreclaimable? no
Oct 27 18:00:03 concorde kernel: protections[]: 0 0 0
Oct 27 18:00:03 concorde kernel: DMA: 2*4kB 4*8kB 3*16kB 4*32kB 4*64kB
1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 2*4096kB = 12632kB
Oct 27 18:00:03 concorde kernel: Normal: 636*4kB 50*8kB 4*16kB 4*32kB
0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3136kB
Oct 27 18:00:03 concorde kernel: HighMem: 1*4kB 1*8kB 0*16kB 1*32kB
1*64kB 2*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 364kB
Oct 27 18:00:03 concorde kernel: Swap cache: add 2, delete 0, find 0/0,
race 0+0
Oct 27 18:00:03 concorde kernel: Free swap:       2096432kB
Oct 27 18:00:03 concorde kernel: 262128 pages of RAM
Oct 27 18:00:03 concorde kernel: 32752 pages of HIGHMEM
Oct 27 18:00:04 concorde kernel: 3458 reserved pages
Oct 27 18:00:04 concorde kernel: 176132 pages shared
Oct 27 18:00:04 concorde kernel: 2 pages swap cached
Oct 27 18:00:04 concorde kernel: device-mapper: Could not create kcopyd
client
Oct 27 18:00:04 concorde kernel: device-mapper: error adding target to
table
========================================================================
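
As a reading aid for the per-zone free lists in the log above, here is
a small sketch that parses one of those lines and sums the free memory
by block size.  It assumes 4 kB pages (so "order:1" means a contiguous
8 kB block); the function names are made up for illustration, and the
input is the "Normal:" line from the log with the mail line wrap
removed.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, as on i386


def parse_buddy_line(text):
    """Map block size in kB to the number of free blocks of that size."""
    pairs = re.findall(r"(\d+)\*(\d+)kB", text)
    return {int(size): int(count) for count, size in pairs}


def total_free_kb(blocks):
    """Total free memory in kB across all block sizes."""
    return sum(size * count for size, count in blocks.items())


def free_kb_at_or_above(blocks, order):
    """Free memory in kB held in blocks of at least 2**order pages."""
    min_size = PAGE_KB << order
    return sum(size * count for size, count in blocks.items()
               if size >= min_size)


normal = parse_buddy_line(
    "Normal: 636*4kB 50*8kB 4*16kB 4*32kB 0*64kB 0*128kB 0*256kB "
    "0*512kB 0*1024kB 0*2048kB 0*4096kB = 3136kB")

print(total_free_kb(normal))           # 3136
print(free_kb_at_or_above(normal, 1))  # 592
```

The total matches the 3136kB the kernel reports for the Normal zone,
and 2544kB of it sits in single 4 kB pages.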

As of right now, that's all the info I have except for the backup script
and maybe some hardware information.  If anything else is needed please
let me know.

-- 
Kevin L. Collins, MCSE
Systems Manager
Nesbitt Engineering, Inc.


