Kernel oops+crash on repeated auditd restarts

Mon Apr 23 16:05:27 UTC 2012

This works for me. Thanks, Marcelo!

Cheers,
peter

On Fri, Apr 20, 2012 at 7:14 PM, Marcelo Cerri
<mhcerri at linux.vnet.ibm.com> wrote:
>
> I took a look at the source code and made some tests. It seems to be a
> problem with the reference count of the fsnotify_mark structure.
>
> This error occurs because the fsnotify_mark_destroy function
> (which runs in a separate kthread) is trying to iterate through a mark
> that is already freed.
>
> Looking at the fsnotify_destroy_mark function (not to be confused with
> fsnotify_mark_destroy), which adds a mark to destroy_list to be freed
> later by fsnotify_mark_destroy, I noticed that it does not increment
> the reference count for the reference added to the destroy_list, and
> the callers usually drop the references they hold after calling
> fsnotify_destroy_mark.
>
> The patch below increments the reference count of a mark when it is
> added to the destroy list. It seems to solve the issue and it doesn't
> seem to cause any memory leak. Could you please run some tests in your
> environments and let me know if there is any problem with this patch?
>
> Regarding the synchronize_srcu call, I don't think it's causing this
> error. It probably just makes it more frequent because it forces all
> the CPUs to schedule, giving someone else the chance to free the
> mark.
>
> ---
>  fs/notify/mark.c |    1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> index f104d56..2985fff 100644
> --- a/fs/notify/mark.c
> +++ b/fs/notify/mark.c
> @@ -150,6 +150,7 @@ void fsnotify_destroy_mark(struct fsnotify_mark *mark)
>    spin_unlock(&group->mark_lock);
>    spin_unlock(&mark->lock);
>
> +   fsnotify_get_mark(mark);
>    spin_lock(&destroy_lock);
>    list_add(&mark->destroy_list, &destroy_list);
>    spin_unlock(&destroy_lock);
> --
> 1.7.9.4
>
>
> On Tue, 17 Apr 2012 14:54:29 -0700
> Peter Moody <pmoody at google.com> wrote:
>
>> Last thing. moving synchronize_srcu(&fsnotify_mark_srcu) out of the
>> for(;;) loop in fs/notify/mark.c appears to solve the stability issues
>> for me. I don't know enough about kernel internals to determine if
>> this is doing lots of other bad things to my system or not.
>>
>> Cheers,
>> peter
>>
>> On Tue, Apr 17, 2012 at 11:24 AM, Peter Moody <pmoody at google.com>
>> wrote:
>> > and my config.gz
>> >
>> > On Tue, Apr 17, 2012 at 10:56 AM, Peter Moody <pmoody at google.com>
>> > wrote:
>> >> Here's a trace with debugging turned way up plus a few extra
>> >> printk's added to fs/notify/mark.c. I'm looping through
>> >> private_destroy_list before and after the call to synchronize_srcu.
>> >>
>> >> I can reproduce this reliably with kvm with 2 virtual processors:
>> >> Linux desktop 3.4.0-rc3-oops1+ #1 SMP Tue Apr 17 09:59:44 PDT 2012
>> >> x86_64 GNU/Linux
>> >>
>> >> Cheers,
>> >> peter
>> >>
>> >> On Thu, Apr 5, 2012 at 2:07 PM, Eric Paris <eparis at redhat.com>
>> >> wrote:
>> >>> please please please keep on list.  Everything you say might help
>> >>> track it down!
>> >>>
>> >>> On Thu, 2012-04-05 at 14:03 -0700, Peter Moody wrote:
>> >>>> (please let me know if I should take this off-list)
>> >>>>
>> >>>> One other thing (again, maybe already known), but this seems to
>> >>>> be exacerbated by SMP. On my machine, I can't reproduce the
>> >>>> crash if I boot with maxcpus=1.
>> >>>>
>> >>>> Still hunting.
>> >>>>
>> >>>> Cheers,
>> >>>> peter
>> >>>>
>> >>>> On Tue, Apr 3, 2012 at 9:15 AM, Peter Moody <pmoody at google.com>
>> >>>> wrote:
>> >>>> > This may already be known, but the issue seems to be limited
>> >>>> > to watch rules. With any watch rules, I can reliably crash my
>> >>>> > machine while freeing a watch rule after only
>> >>>> > starting/stopping auditd a few times. With no watch rules, I
>> >>>> > have no issues.
>> >>>> >
>> >>>> > Cheers,
>> >>>> > peter
>> >>>> >
>> >>>> > On Wed, Mar 28, 2012 at 11:44 PM, Valentin Avram
>> >>>> > <aval13 at gmail.com> wrote:
>> >>>> >> Yes, i know that patch. It made it into kernel 3.2.2. I
>> >>>> >> tested it successfully (oops in 3.2.1, no oops in 3.2.9), but
>> >>>> >> this oops i'm seeing is also in 3.2.9.
>> >>>> >>
>> >>>> >> I monitored changelogs from 3.2.1 to 3.2.12 but there were
>> >>>> >> no fixes in either the audit subsystem or fsnotify. I'll try
>> >>>> >> to reproduce in latest 3.2.13 and repost the oops, but i'm
>> >>>> >> 99% confident it will be the same.
>> >>>> >>
>> >>>> >> Sadly nobody except you seems to pay attention to this
>> >>>> >> problem, probably because it requires special conditions to
>> >>>> >> reproduce (really, who starts and stops auditd every 5
>> >>>> >> seconds on a production server?). We only ran into it because
>> >>>> >> one of our servers would randomly oops and then freeze about
>> >>>> >> once a month after stopping and then starting auditd every
>> >>>> >> morning (and the stop-start sequence was needed to work around
>> >>>> >> a bug somewhere that would hang a gzip running on a file
>> >>>> >> outside a watched folder).
>> >>>> >>
>> >>>> >> Anyway, as a last note, i have a feeling that the oops is not
>> >>>> >> exactly random, there is a pattern, just that i haven't
>> >>>> >> figured it out completely yet.
>> >>>> >>
>> >>>> >> Will keep you up to date with the things i find out.
>> >>>> >>
>> >>>> >> V.
>> >>>> >>
>> >>>> >> On Mar 29, 2012 4:14 AM, "Eric Paris" <eparis at redhat.com>
>> >>>> >> wrote:
>> >>>> >>>
>> >>>> >>> That patch fixes a BUG() .  The report has a NULL ptr deref
>> >>>> >>> and some apparent list corruption....  Sadly they aren't
>> >>>> >>> the same....
>> >>>> >>>
>> >>>> >>> On Wed, 2012-03-28 at 15:42 -0700, Peter Moody wrote:
>> >>>> >>> > fyi: this patch [1] seems to fix the issue for me. The
>> >>>> >>> > explanation in the subject would reliably oops my machine.
>> >>>> >>> >
>> >>>> >>> > [1]
>> >>>> >>> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fed474857efbed79cd390d0aee224231ca718f63
>> >>>> >>> >
>> >>>> >>> > On Wed, Mar 28, 2012 at 1:51 PM, Peter Moody
>> >>>> >>> > <pmoody at google.com> wrote:
>> >>>> >>> > > Are you still able to reliably reproduce this oops? I'm
>> >>>> >>> > > trying to track this down because this bug (or a very
>> >>>> >>> > > similar bug) is causing some significant headaches here
>> >>>> >>> > > at work, but I haven't had a lot of luck. I'm using
>> >>>> >>> > > usermode linux, though, so that might be interfering
>> >>>> >>> > > with things.
>> >>>> >>> > >
>> >>>> >>> > > On Mon, Mar 5, 2012 at 12:35 AM, Valentin Avram
>> >>>> >>> > > <aval13 at gmail.com> wrote:
>> >>>> >>> > >> Finally i found some time and spare server to retest
>> >>>> >>> > >> the oops and list_add
>> >>>> >>> > >> corruptions i was getting with the 3.x kernels and
>> >>>> >>> > >> auditd 2.1.3.
>> >>>> >>> > >>
>> >>>> >>> > >> I tested now with gentoo's latest stable
>> >>>> >>> > >> 3.2.1-gentoo-r2 and kernel.org's
>> >>>> >>> > >> 3.2.9.
>> >>>> >>> > >>
>> >>>> >>> > >> Both get the oops/BUG in the same way and after that,
>> >>>> >>> > >> they keep pouring
>> >>>> >>> > >> list_add corruptions with audit_prune_tre(truncated?)
>> >>>> >>> > >> and auditctl as comms.
>> >>>> >>> > >>
>> >>>> >>> > >> Since this is not about Gentoo's kernel only, i'll post
>> >>>> >>> > >> here the oops in
>> >>>> >>> > >> 3.2.9 and also attach some list_add corruptions.
>> >>>> >>> > >>
>> >>>> >>> > >> 3.2.9 BUG:
>> >>>> >>> > >>
>> >>>> >>> > >> kernel: [  301.240011] BUG: unable to handle kernel NULL pointer dereference at   (null)
>> >>>> >>> > >> kernel: [  301.240305] IP: [<c1238dd0>] __list_del_entry+0x20/0xe0
>> >>>> >>> > >> kernel: [  301.240481] *pdpt = 0000000000000000 *pde = f000ddc8f000ddc8
>> >>>> >>> > >> kernel: [  301.240698] Oops: 0000 [#1] SMP
>> >>>> >>> > >> kernel: [  301.240910]
>> >>>> >>> > >> kernel: [  301.241030] Pid: 642, comm: fsnotify_mark Not tainted 3.2.9-drbd-version3 #1 Dell Inc. PowerEdge 2950/0CX396
>> >>>> >>> > >> kernel: [  301.241370] EIP: 0060:[<c1238dd0>] EFLAGS: 00010287 CPU: 6
>> >>>> >>> > >> kernel: [  301.241498] EIP is at __list_del_entry+0x20/0xe0
>> >>>> >>> > >> kernel: [  301.241623] EAX: f4fae544 EBX: f47cffa4 ECX: ffffffff EDX: 00000000
>> >>>> >>> > >> kernel: [  301.241751] ESI: f4fae544 EDI: f4fae508 EBP: f47cff7c ESP: f47cff64
>> >>>> >>> > >> kernel: [  301.241879]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>> >>>> >>> > >> kernel: [  301.242005] Process fsnotify_mark (pid: 642, ti=f47ce000 task=f4f47c00 task.ti=f47ce000)
>> >>>> >>> > >> kernel: [  301.242207] Stack:
>> >>>> >>> > >> kernel: [  301.242327]  c10813c0 f47cffa4 f4f47c00 f4e70888 f47cff7c f47cffa4 f47cffb8 c10f6976
>> >>>> >>> > >> kernel: [  301.242882]  ffffffc3 f4f47c00 f4f47c00 00000000 f4f47c00 c10530c0 f47cff9c f47cff9c
>> >>>> >>> > >> kernel: [  301.243438]  f4fae544 f4fae544 f4c47f58 00000000 c10f68f0 f47cffe4 c1052834 00000000
>> >>>> >>> > >> kernel: [  301.243995] Call Trace:
>> >>>> >>> > >> kernel: [  301.244119]  [<c10813c0>] ? rcu_check_callbacks+0x110/0x110
>> >>>> >>> > >> kernel: [  301.244248]  [<c10f6976>] fsnotify_mark_destroy+0x86/0x120
>> >>>> >>> > >> kernel: [  301.244377]  [<c10530c0>] ? abort_exclusive_wait+0x80/0x80
>> >>>> >>> > >> kernel: [  301.244504]  [<c10f68f0>] ? fsnotify_put_mark+0x30/0x30
>> >>>> >>> > >> kernel: [  301.244631]  [<c1052834>] kthread+0x74/0x80
>> >>>> >>> > >> kernel: [  301.244756]  [<c10527c0>] ? kthread_flush_work_fn+0x10/0x10
>> >>>> >>> > >> kernel: [  301.244885]  [<c1582ab6>] kernel_thread_helper+0x6/0xd
>> >>>> >>> > >> kernel: [  301.245011] Code: 55 f4 8b 45 f8 e9 75 ff ff ff 90 55 89 e5 53 83 ec 14 8b 08 8b 50 04 81 f9 00 01 10 00 74 24 81 fa 00 02 20 00 0f 84 8e 00 00 00 <8b> 1a 39 d8 75 62 8b 59 04 39 d8 75 35 89 51 04 89 0a 83 c4 14
>> >>>> >>> > >> kernel: [  301.248195] EIP: [<c1238dd0>] __list_del_entry+0x20/0xe0 SS:ESP 0068:f47cff64
>> >>>> >>> > >> kernel: [  301.248414] CR2: 0000000000000000
>> >>>> >>> > >> kernel: [  301.248538] ---[ end trace 15082dbfb353f84c ]---
>> >>>> >>> > >>
>> >>>> >>> > >> The kernel was compiled with the following DEBUG
>> >>>> >>> > >> support (the bolded ones were requested by Gentoo's
>> >>>> >>> > >> dev):
>> >>>> >>> > >> CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
>> >>>> >>> > >> CONFIG_SLUB_DEBUG=y
>> >>>> >>> > >> CONFIG_HAVE_DMA_API_DEBUG=y
>> >>>> >>> > >> CONFIG_X86_DEBUGCTLMSR=y
>> >>>> >>> > >> CONFIG_PNP_DEBUG_MESSAGES=y
>> >>>> >>> > >> CONFIG_AIC94XX_DEBUG=y
>> >>>> >>> > >> CONFIG_USB_DEBUG=y
>> >>>> >>> > >> CONFIG_DEBUG_KERNEL=y
>> >>>> >>> > >> CONFIG_SCHED_DEBUG=y
>> >>>> >>> > >> CONFIG_DEBUG_RT_MUTEXES=y
>> >>>> >>> > >> CONFIG_DEBUG_PI_LIST=y
>> >>>> >>> > >> CONFIG_DEBUG_BUGVERBOSE=y
>> >>>> >>> > >> CONFIG_DEBUG_INFO=y
>> >>>> >>> > >> CONFIG_DEBUG_MEMORY_INIT=y
>> >>>> >>> > >> CONFIG_DEBUG_LIST=y
>> >>>> >>> > >> CONFIG_DEBUG_STACKOVERFLOW=y
>> >>>> >>> > >> CONFIG_DEBUG_RODATA=y
>> >>>> >>> > >> CONFIG_DEBUG_RODATA_TEST=y
>> >>>> >>> > >>
>> >>>> >>> > >> I attached the kernel config i used for 3.2.9 to
>> >>>> >>> > >> generate this oops and
>> >>>> >>> > >> warnings.
>> >>>> >>> > >>
>> >>>> >>> > >> From the list_add warnings that come after, out of 805
>> >>>> >>> > >> warnings i processed,
>> >>>> >>> > >> after masking with XXXXX the PID and next= values that
>> >>>> >>> > >> kept changing in
>> >>>> >>> > >> every one, i got 26 distinct MD5 sums. I also attached
>> >>>> >>> > >> the relevant files as an archive to this email.
>> >>>> >>> > >>
>> >>>> >>> > >> The Gentoo bug i opened is sleeping, it seems nobody
>> >>>> >>> > >> has the time to at
>> >>>> >>> > >> least test to confirm or not the problems i'm seeing
>> >>>> >>> > >> (or everybody's thinking that nobody would restart
>> >>>> >>> > >> auditd so often, so the bug is not that serious).
>> >>>> >>> > >>
>> >>>> >>> > >>
>> >>>> >>> > >> Thank you for your time.
>> >>>> >>> > >>
>> >>>> >>> > >> On Wed, Feb 8, 2012 at 6:11 PM, Valentin Avram
>> >>>> >>> > >> <aval13 at gmail.com> wrote:
>> >>>> >>> > >>
>> >>>> >>> > >>
>> >>>> >>> > >> --
>> >>>> >>> > >> Linux-audit mailing list
>> >>>> >>> > >> Linux-audit at redhat.com
>> >>>> >>> > >> https://www.redhat.com/mailman/listinfo/linux-audit
>> >>>> >>> > >
>> >>>> >>> > >
>> >>>> >>> > >
>> >>>> >>> > > --
>> >>>> >>> > > Peter Moody      Google    1.650.253.7306
>> >>>> >>> > > Security Engineer  pgp:0xC3410038

-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038