Fedora Goals -- LSB-compliant/ideal init for FC5+

Tue Jun 7 18:04:17 UTC 2005

Bryan J. Smith <b.j.smith at ieee.org> (thebs413 at earthlink.net) said: 
> Two follow-ups then:  
> 
> 1)  Is there any information, documentation, ideas or any mailing
> archives on this?

Here ya go.

Luke Macken (lmacken at redhat.com) may be looking at some of this
in the near future.

> I've also been looking for an excuse to learn Python/Newt (that's what
> the fedora/redhat-config-* programs are written in so they can target
> both slang and GTK+, correct?), so if there is some opportunity to
> develop there, I'm interested.

At first, I expect there to be very little UI code here; what's
needed is the framework.

Bill
-------------- next part --------------
Some comments/proposals for init work
-------------------------------------

Please refer to any of the varied references on the net for System V
init to get a general idea of how the current init system works.

The current initialization and bootup system has generated various
complaints:

- it's not very fast, or at least appears that way
- it does not have full LSB support, or dependency support
- there's no good global view of which services configured for a
  particular runlevel are actually running
- there's no mechanism for respawning of services except by init
- there's no structured logging of services, whether they failed, etc.

Over time, various alternative frameworks have popped up, including,
but not limited to:

- minit
- runit
- initng
- serel

Looking at how these stack up:

1) Bootup speed

A lot of these claim "look how much faster we boot!". However, in most
cases they miss the point. For example:

A) The new-fangled init system has:
  1) exec nscd

B) The current init system has:
  1) make sure nscd directories are in place
  2) check to make sure it's not already running
  3) set ulimits correctly
  4) set nice level
  5) setuid to a specified user, if necessary
  6) exec nscd
  7) log whether or not it succeeded

Telling me that A) is faster is meaningless, and, frankly, not useful.
What's needed is to make B) faster. In some cases, this can mean
eliminating some of those steps. But blindly eliminating all of them
is silly.
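
To make the B) list concrete, here is a rough sketch of those seven
steps in Python. It is an illustration only, not what the current
initscripts do: the function names, the pidfile convention and every
parameter are assumptions made up for this example, and it assumes a
daemon that backgrounds itself so its launcher exits promptly.

```python
import logging
import os
import resource

log = logging.getLogger("initsketch")  # hypothetical logger name

def already_running(pidfile):
    """Return True if pidfile exists and names a live process."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False            # no pidfile, or garbage in it
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)         # signal 0: existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True             # alive, but owned by another user
    return True

def start_daemon(argv, run_dir, pidfile, max_files=None, niceness=0, uid=None):
    """Start argv[0] the careful way; return True on success."""
    os.makedirs(run_dir, exist_ok=True)                 # step 1
    if already_running(pidfile):                        # step 2
        log.info("%s already running", argv[0])
        return True
    child = os.fork()
    if child == 0:
        try:
            if max_files is not None:                   # step 3
                resource.setrlimit(resource.RLIMIT_NOFILE,
                                   (max_files, max_files))
            if niceness:                                # step 4
                os.nice(niceness)
            if uid is not None:                         # step 5
                os.setuid(uid)
            os.execv(argv[0], argv)                     # step 6
        except (OSError, ValueError):
            pass
        os._exit(127)           # only reached if setup or exec failed
    _, status = os.waitpid(child, 0)
    ok = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    log.log(logging.INFO if ok else logging.ERROR,      # step 7
            "%s %s", argv[0], "succeeded" if ok else "failed")
    return ok
```

Steps 1 through 5 and 7 are a handful of cheap system calls each; the
cost in the shell version comes mostly from forking a helper program
for every one of them, which is the part worth making faster.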

The other aspect of bootup speed is parallelization. Initially, this
seems like a big win. However, testing of simple, naive implementations
shows that, at least initially, parallelization isn't a huge benefit.
Generally, this is because disk seek time and other I/O limitations
can dominate. Moreover, a not insignificant portion of boot time is
the parts handled by /etc/rc.sysinit; this is almost entirely linear
in nature (need to load modules first, then check filesystems, then
clean out /tmp, etc.)

2) LSB support

http://www.linuxbase.org/spec//booksets/LSB-Core-generic/LSB-Core-generic.html#TOCSYSINIT

Currently, LSB support is wedged in via chkconfig. chkconfig can
parse LSB standard headers for start and stop levels. For LSB
dependencies, a conversion to the current Sxx/Kxx priorities is
done at script installation time by computing a valid start priority
relative to the dependencies specified.
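
As a rough illustration of that conversion (not chkconfig's actual
algorithm), the sketch below parses the LSB comment block and picks a
start priority one higher than the latest-starting dependency. The
function names, the default of 50 and the priorities in the usage
example are assumptions for this example.

```python
import re

def parse_lsb_header(script_text):
    """Return LSB header fields found between the INIT INFO markers.

    Each value is the list of whitespace-separated words on that line.
    """
    fields = {}
    in_block = False
    for line in script_text.splitlines():
        if line.startswith("### BEGIN INIT INFO"):
            in_block = True
            continue
        if line.startswith("### END INIT INFO"):
            break
        if in_block:
            m = re.match(r"#\s*([A-Za-z-]+):\s*(.*)$", line)
            if m:
                fields[m.group(1)] = m.group(2).split()
    return fields

def start_priority(required, installed_priorities, default=50):
    """Pick an Sxx number that sorts after every known dependency."""
    known = [installed_priorities[name]
             for name in required if name in installed_priorities]
    if not known:
        return default          # nothing to order against
    return min(max(known) + 1, 99)

# Usage, with made-up priorities for the already-installed services:
script = """#!/bin/sh
### BEGIN INIT INFO
# Provides: foo
# Required-Start: network syslog
# Default-Start: 3 5
### END INIT INFO
"""
header = parse_lsb_header(script)
priority = start_priority(header["Required-Start"],
                          {"network": 10, "syslog": 12})
```

Because the number is computed once, from whatever happens to be
installed at that moment, it goes stale as soon as a dependency is
added or moved.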

This does have some problems:
- priorities aren't recomputed if other dependencies are added/changed
- adoption of other LSB features is slow

Moreover, very few of the current initscripts conform to LSB
standards, such as those for exit codes - in fact, they're very much
ad-hoc in that regard.

3) No global view of state

Services can be queried individually for status, and there's a global
view of what services are configured to start via
system-config-services. There's no combination of this, however.
(And, the implementation of system-config-services leaves much to
be desired.)
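
A combined view could be as simple as the sketch below. It assumes the
usual conventions that start links are named SNNname in an rcN.d
directory and that a running service leaves a file named after itself
in a lock directory; the directory arguments, function names and state
strings are all assumptions for this example.

```python
import os
import re

def configured_services(rc_dir):
    """Names of services with a start link (SNNname) in one rcN.d dir."""
    names = set()
    for entry in os.listdir(rc_dir):
        m = re.match(r"S(\d\d)(.+)$", entry)
        if m:
            names.add(m.group(2))
    return names

def running_services(lock_dir):
    """Names of services that have left a lock file behind."""
    return set(os.listdir(lock_dir))

def overview(configured, running):
    """Map each service name to a combined configured/running state."""
    report = {}
    for name in sorted(configured | running):
        if name in configured and name in running:
            report[name] = "configured, running"
        elif name in configured:
            report[name] = "configured, NOT running"
        else:
            report[name] = "running, not configured"
    return report
```

A lock file only says that a start was attempted and no stop has
happened since, so this still can't tell a live service from one that
died quietly; that needs the monitoring discussed further down.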

4) No respawn mechanism

Some of the newer implementations actually do handle this.

5) No structured logging

Earlier (prior to FC4) initlog was used. This was an inefficient and
badly coded (by me!) program that basically dumped a program's output
to syslog if it failed, and otherwise logged a simple '<foo> succeeded'.
However, this only happened on startup; there was no monitoring of
services to see if they failed later, or respawning of services if
they did fail. Moreover, since it was just logging program output, it
didn't log in a way that could be easily monitored or picked out.

So, how to solve these problems?
--------------------------------

Obviously, rewrite everything! This would include, at a minimum,
/etc/rc. It may also include /sbin/init, and could extend to
crond, atd, xinetd, and other service frameworks as well.

Features required by this implementation:

- Proper *runtime* dependency support. Starting a service should
  start its dependencies, in-order. Changes to dependent services
  should be immediately handled.

  Out of this work, parallelization can easily fall out. Without this
  work, parallelization is pointless. :)

- Full backwards compatibility. Old-style init scripts aren't going
  to go away anytime soon.

- Full LSB support for LSB init scripts. Makes the spec groups happy,
  even though the number of LSB scripts is basically nil.

- D-BUS support. Services should be exposed via D-BUS, and should be
  available for querying to:

  - check what's configured
  - check what's actually running
  - start them
  - stop them
  - etc.

  The services should also post notifications via D-BUS when they've
  started successfully, and (more importantly) when they haven't,
  or when they've unexpectedly exited.

- Support for respawning services, if necessary. Complete with rate
  limits and other fun stuff.
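
One possible shape for the respawn rate limit, as a sketch only: allow
at most a fixed number of restarts inside a sliding time window, and
give up once that is exceeded. The class name, the defaults and the
supervise() helper are assumptions for this example.

```python
import subprocess
import time
from collections import deque

class RespawnLimiter:
    """Allow at most max_restarts restarts per `window` seconds."""

    def __init__(self, max_restarts=5, window=60.0, clock=time.monotonic):
        self.max_restarts = max_restarts
        self.window = window
        self.clock = clock          # injectable, so it can be tested
        self.restarts = deque()     # timestamps of recent restarts

    def allow(self):
        """Record a restart and return True, or return False if over limit."""
        now = self.clock()
        # Forget restarts that have aged out of the window.
        while self.restarts and now - self.restarts[0] >= self.window:
            self.restarts.popleft()
        if len(self.restarts) >= self.max_restarts:
            return False
        self.restarts.append(now)
        return True

def supervise(argv, limiter):
    """Run argv in the foreground, restarting it each time it exits.

    Returns the number of runs once the limiter refuses another restart.
    """
    runs = 0
    while True:
        subprocess.call(argv)
        runs += 1
        if not limiter.allow():
            return runs
```

This only works for services that stay in the foreground; a daemon
that forks into the background looks like an immediate exit to its
supervisor.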

This should handle most of the concerns above. More speed hopefully
will fall out of this, but if it doesn't, oh well.
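
For the runtime dependency requirement in the list above, a minimal
sketch: a depth-first walk gives the in-order start sequence for one
service, and the same dependency table grouped into "waves" shows how
parallelization falls out of it. The function names and the shape of
the table (service name mapped to a list of the names it requires) are
assumptions for this example.

```python
def start_order(service, requires, running=()):
    """Return the services to start, dependencies first, `service` last.

    Anything listed in `running` is treated as already started.
    """
    order = []
    done = set(running)
    in_progress = set()

    def visit(name):
        if name in done:
            return
        if name in in_progress:
            raise ValueError("dependency cycle at %s" % name)
        in_progress.add(name)
        for dep in requires.get(name, ()):
            visit(dep)
        in_progress.discard(name)
        done.add(name)
        order.append(name)

    visit(service)
    return order

def parallel_batches(requires):
    """Group all services into waves; each wave can start in parallel."""
    names = set(requires) | {d for deps in requires.values() for d in deps}
    remaining = {n: set(requires.get(n, ())) for n in names}
    batches = []
    while remaining:
        ready = sorted(n for n, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle")
        batches.append(ready)
        for n in ready:
            del remaining[n]
        for deps in remaining.values():
            deps.difference_update(ready)
    return batches

# Usage, with a made-up dependency table:
requires = {
    "nfs": ["portmap", "network"],
    "portmap": ["network"],
    "network": [],
    "crond": [],
}
order = start_order("nfs", requires)
waves = parallel_batches(requires)
```

Since the order is computed when a start is requested rather than at
package installation time, a changed dependency is picked up on the
next request.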

Looking at these features, the best way to do this is almost
certainly to add the D-BUS, etc support into the services themselves,
and provide a wrapper for legacy LSB and other initscripts that
provides the D-BUS interface to them.

Things to look at when implementing this:

- initng
  http://jw.dyndns.org/initng/

  The latest new-init proposal; it's starting to get more traction.
  Segfaulted immediately when I tried it, though.

- SystemServices
  http://www.gnome.org/~seth/blog/2003/Sep/27

  Beat code out of Seth if necessary. :)  He's got the right idea, even
  if I quibble with some of his implementation. (Not sure that python
  is a good idea here, at least for parts of it.)

Things we'd like to look at, but can't:

- launchd
  http://www.macgeekery.com/tips/all_about_launchd_items_and_how_to_make_one_yourself
  http://developer.apple.com/documentation/Darwin/Reference/ManPages/man8/launchd.8.html
  http://arstechnica.com/reviews/os/macosx-10.4.ars/5

  You can look at the docs. You can probably even play with an
  OS X implementation.

  You can not look at the code.

  launchd is APSL licensed. Looking at the code will contaminate you,
  and you won't be able to write code that implements its features.

  It may be possible to have someone else, who never touches our code,
  look at the code and write specs.

  The best solution is to beat Apple into releasing launchd under a
  sane license. Don't hold your breath, though.

Required languages of implementation: C, potentially a little python

  Not much flexibility here. Obviously, the lower level in the bootup
  process you get, the more you're confined to C.