
Re: [rhelv5-list] missing automount entries under rhel5.1 with updates



On Fri, 2007-12-28 at 17:58 -0800, Deke Clinger wrote:
> On Wed, 26 Dec 2007, Ian Kent wrote:
> > 
> > > 
> > > My problem now is that the direct map mount point (/prj) is 'leaking'
> > > entries: after a reboot I'll see my 940-odd directories. But within a
> > > few hours the subdirectories begin to fall out, sometimes leaving only
> > > 50-100 entries. Trying to cd into a missing directory gets a 'no such
> > > directory'. A reboot will put all the entries back for a few hours.
> > 
> > Or you could just "service autofs restart", which should work OK and
> > allow existing mounts to continue being used.
> 
> restart/reload isn't working exactly as I'd hope.
> 
> On the release described above, 'service autofs reload' (which sends a
> hup to the automounter) says 'Reloading maps' and starts unmounting
> all the idle direct mounts (~6500 entries).

This sounds a bit like a bug that has been fixed.
The idle direct mount triggers shouldn't be umounted across a HUP signal
if they are still present in the map.

> 
> 'service autofs restart' says:
> 
> Stopping automount:                                        [FAILED]
> Starting automount: automount: program is already running.
>                                                            [FAILED]

But this could well be a completely separate issue.

> 
> This takes a long time and unmounts all the idle direct mounts. In
> both cases there's a long stream of 'umounted direct mount' log
> entries in the syslog.

Yes, it does take a long time but it's possible it is taking longer than
it should. Hopefully that will come out during our work here.

> 
> On 2.6.18-8.el5xen with autofs-5.0.1-0.rc2.42 'service autofs reload'
> says 'Reloading maps' and the syslog is quiet. 'service autofs
> restart' says:
> 
> Stopping automount:                                        [  OK  ]
> Starting automount:                                        [  OK  ]
> 
> and the syslog shows a lot of umount/mount events. As far as I can
> tell the automount configs between these hosts are identical.

With a largish number of entries in the map, even small changes, possibly
made when fixing another issue, could cause this change in behavior.

Shutdown for large maps is a problem that has been around for a long
time. The problem exists because, in the init script, we can't just wait
forever for autofs to exit, as it might never do so. We have no way of
knowing whether autofs is carrying on after being unable to shut down or
is still in the middle of shutting down. Because of this, there is a
delay loop in the init script stop() function that checks whether
automount has gone and exits if so; otherwise it keeps checking for
about 90 seconds before giving up. As I say, the shutdown is fairly slow
for a large number of mounts (perhaps too slow, and there's a new
problem I'm not aware of yet), but you may get some joy from tweaking
the number of iterations or the sleep delay in the init script function
stop().
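
To give a rough idea of the kind of loop I mean, here is a minimal
sketch. It is not the code from the shipped autofs init script: the
function name wait_for_exit and the variables MAX_TRIES and DELAY are
made up for illustration, and the 45 x 2 second split is only an
assumption chosen to add up to roughly 90 seconds.

```shell
#!/bin/sh
# Illustrative sketch only, NOT the shipped autofs init script.
# MAX_TRIES and DELAY are hypothetical tunables; the total wait is
# roughly MAX_TRIES * DELAY seconds (45 * 2 = 90 by default here).
MAX_TRIES=${MAX_TRIES:-45}
DELAY=${DELAY:-2}

# wait_for_exit PID
# Returns 0 as soon as the process has gone, or 1 if it is still
# present after MAX_TRIES checks.
wait_for_exit() {
    pid=$1
    tries=0
    while [ "$tries" -lt "$MAX_TRIES" ]; do
        # kill -0 delivers no signal; it only tests whether the pid exists.
        if ! kill -0 "$pid" 2>/dev/null; then
            return 0
        fi
        sleep "$DELAY"
        tries=$((tries + 1))
    done
    return 1
}
```

In a loop of this shape, raising the iteration count or the delay gives
the daemon longer to finish umounting before the script reports a
failure. It still can't tell a slow shutdown from a stuck one.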

The real solution to this problem is coming, although I don't have a
time frame. The way we intend to fix it is to use a command utility that
talks to the daemon, issues action requests and gets a return status, so
we then know what the daemon is doing. The first step toward this has
been to implement the ability to dynamically change the logging level
for automounts by communicating through named pipes set up when autofs
starts. This was recently done by Jeff Moyer and is present in the
coming RHEL 5.2 update. However, quite a bit more work needs to be done
to define how this command channel facility will function for other
requests, such as shutdown, before we can go further with this.

So, bottom line is we need to live with this a while longer.

> 
> > > Previous releases (RHEL5 GA, kernel 2.6.18-8.el5xen and
> > > autofs-5.0.1-0.rc2.42, for example) do not have this problem. I have a
> > > guest with 120 days uptime that still sees all the subdirs under /prj.
> > > 
> > > Has anyone else seen problems like this? I'm not seeing any new
> > > Bugzilla entries that look similar.
> > 
> > I haven't heard anything about this problem.
> > 
> > How about a debug log?
> 
> I'm happy to send you complete logs off-list if you need to see them.
> 
> > Do you send HUP signals to the daemon?
> 
> Yes. The automounter is hupped (autofs reload) automatically whenever
> we update any of the maps. That appears to be the root cause of this.
> 
> > If you do send the daemon a HUP signal does it refresh the tree or
> > completely break autofs?
> 
> On previous releases, the former. On the current release, the latter.

Yes, that sounds very much like the problem I mentioned above, which I
think (hope) I've resolved in the RHEL 5.2 update.

> 
> > What is the source of your maps?
> 
> Maps are in flat files. 
> 
> > Maybe we should try the latest revision of autofs in RHEL CVS, at least
> > I could find out if I have introduced a regression in 5.1 and not
> > indirectly fixed it already. Clearly it hasn't been through the needed
> > QA but I'm nearly done with my testing and there are many bug fixes
> > included.
> 
> I'd be glad to try it out.

Great, but we probably need to have a bug logged for tracking purposes
in order for me to make the updated autofs available for you to test. If
you can't log a bug against 5.1 yourself I can log it with you as the
reporter (if that's OK with you).

Ian


