[augeas-devel] Httpd strategy

Francis Giraldeau francis.giraldeau at usherbrooke.ca
Tue Aug 3 16:14:17 UTC 2010


Hi, 

> Given that we can't really list all possible directive or section names,
> the 'generic' approach seems the only one that won't choke on too many
> existing, valid httpd configurations; the downside is that we'll allow
> some illegal constructs, like providing too many or too few arguments to
> directives, or illegal nesting of sections.

Right, I'm fine with this, let's do it in a generic way. 

> 
> What I envision is a lens that knows about three things: sections,
> directives and arguments, and turns a file
> 
>   <Sec1 arg11 arg12>
>      <Sec2 arg21 arg22>
>        Directive1 arg31 arg32
>        Directive2 arg41
>      </Sec2>
>   </Sec1>
> 
> into the tree
> 
>         { "<Sec1" }
>           { "param" = "arg11" } { "param" = "arg12" }
>           { "<Sec2" }
>                 { "param" = "arg21" } { "param" = "arg22" }
>                 { "Directive1"
>                   { "param" = "arg31" } { "param" = "arg32" } }
>                 { "Directive2"
>                   { "param" = "arg41" } }
> 
> This is also pretty much what the httpd lens for RHQ does (attached to
> bug #100 or in their git repo at [1]) As far as I can tell, that will
> avoid all typechecking headaches, and still lead to a reasonable tree.

There is a gotcha with the "<" in front of the section name. We need
it to resolve the put ambiguity: directive nodes don't have "<" and
are therefore distinguishable from section nodes.

We can't have the "<" in the key of a square lens, otherwise it would
also be written into the closing tag, like this:

    <Directive>
        ...
    </<Directive>

We could get a tree like this: 

{ "section" = "Sec1"
  { "param" = "arg11" }
  { "param" = "arg12" }
  { "section" = "Sec2" 
    { "param" = "arg21" }
    { "param" = "arg22" }
    { "directive" = "Directive1" 
      { "param" = "arg31" }
      { "param" = "arg32" }
    }
    { "directive" = "Directive2" 
      { "param" = "arg41" }
    }
  }
}

But, you know, it's much less sexy... 
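
To make the "section"/"directive" tree above concrete, here is a minimal
Python sketch (not an Augeas lens) of the get and put directions for that
shape. The names parse, render and SAMPLE are made up for illustration, and
the sketch assumes one directive or tag per line with whitespace-separated
arguments, so quoted arguments are not handled.

```python
SAMPLE = """\
<Sec1 arg11 arg12>
  <Sec2 arg21 arg22>
    Directive1 arg31 arg32
    Directive2 arg41
  </Sec2>
</Sec1>
"""


def parse(text):
    """'get' direction: turn the text into a list of nested nodes."""
    root = []
    stack = [(None, root)]  # (name of the open section, its child list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("</") and line.endswith(">"):
            name = line[2:-1].strip()
            open_name, _ = stack.pop()
            # Section names are compared without regard for case.
            if open_name is None or open_name.lower() != name.lower():
                raise ValueError("unexpected closing tag: " + line)
        elif line.startswith("<") and line.endswith(">"):
            name, *params = line[1:-1].split()
            node = {"label": "section", "value": name,
                    "params": params, "children": []}
            stack[-1][1].append(node)
            stack.append((name, node["children"]))
        else:
            name, *params = line.split()
            stack[-1][1].append({"label": "directive", "value": name,
                                 "params": params, "children": []})
    if len(stack) != 1:
        raise ValueError("unclosed section: " + stack[-1][0])
    return root


def render(nodes, depth=0):
    """'put' direction: the label alone says whether to emit a tag pair."""
    pad = "  " * depth
    lines = []
    for node in nodes:
        words = " ".join([node["value"], *node["params"]])
        if node["label"] == "section":
            lines.append(pad + "<" + words + ">")
            lines.extend(render(node["children"], depth + 1))
            lines.append(pad + "</" + node["value"] + ">")
        else:
            lines.append(pad + words)
    return lines
```

Since the node label carries the section/directive distinction, the name
itself never needs a "<" prefix, and the closing tag can be rebuilt from the
same value as the opening one.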

> The RHQ lens needs some work though, since it doesn't accept any section
> name, just a fixed list of names - no wonder, because you need the
> square lens for that.
> 
> > Here are the results parsing benchmark with a representative apache
> > configuration for the two lenses. (average of 10 runs on intel duo
> > 1,8GHz) First test is real time to process the test and the second is
> > total memory allocation reported by valgrind. 
> > 
> >                | time w check  | time wo check 
> > Httpd_exact    | 5,31 s        | 0,34 s
> > Httpd_generic  | 0,09 s        | 0,05 s
> > 
> >                | mem w check   | mem wo check 
> > Httpd_exact    | 1536 Mb       | 61 Mb
> > Httpd_generic  |    3 Mb       |  1 Mb
> 
> That's a very strong argument for using a generic lens; I am actually
> surprised you got the exact lens to typecheck. The ones I've written in
> the past all ran out of memory on a 4GB machine.

An earlier version did something like this: 

let directives_regexp = /[a-zA-Z0-9]+/ - /Directory|.../

This is what it produces for one section name: 

/Director((y[0-9A-Za-z]|[0-9A-Za-xz])[0-9A-Za-z]*|())|
Directo([0-9A-Za-qs-z][0-9A-Za-z]*|())|
Direct([0-9A-Za-np-z][0-9A-Za-z]*|())|
Direc([0-9A-Za-su-z][0-9A-Za-z]*|())|
Dire([0-9A-Zabd-z][0-9A-Za-z]*|())|
Dir([0-9A-Za-df-z][0-9A-Za-z]*|())|
Di([0-9A-Za-qs-z][0-9A-Za-z]*|())|
(D[0-9A-Za-hj-z]|[0-9A-CE-Za-z][0-9A-Za-z])[0-9A-Za-z]*
|D|[0-9A-CE-Za-z]/

That created a huge, memory-intensive automaton. In fact, the automata
are smaller when listing every directive than when subtracting from a
general regexp. 
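
As a sanity check that the expanded expression above really is
"[a-zA-Z0-9]+ minus Directory", here is a small Python sketch. It uses
Python's re module only for illustration (Augeas has its own regexp
engine), and the names EXPANDED, REFERENCE, SAMPLES and accepted are
made up for this example. The reference pattern uses a negative
lookahead, which is a Python feature and not what the lens did.

```python
import re

# The expanded regexp from above, joined into one pattern.
EXPANDED = re.compile(
    r"Director((y[0-9A-Za-z]|[0-9A-Za-xz])[0-9A-Za-z]*|())|"
    r"Directo([0-9A-Za-qs-z][0-9A-Za-z]*|())|"
    r"Direct([0-9A-Za-np-z][0-9A-Za-z]*|())|"
    r"Direc([0-9A-Za-su-z][0-9A-Za-z]*|())|"
    r"Dire([0-9A-Zabd-z][0-9A-Za-z]*|())|"
    r"Dir([0-9A-Za-df-z][0-9A-Za-z]*|())|"
    r"Di([0-9A-Za-qs-z][0-9A-Za-z]*|())|"
    r"(D[0-9A-Za-hj-z]|[0-9A-CE-Za-z][0-9A-Za-z])[0-9A-Za-z]*"
    r"|D|[0-9A-CE-Za-z]"
)

# Reference: any alphanumeric word except exactly "Directory".
REFERENCE = re.compile(r"(?!Directory\Z)[0-9A-Za-z]+")

SAMPLES = [
    "Directory", "Director", "Directorys", "DirectoryIndex", "Directo",
    "Dir", "Di", "D", "Dx", "x", "directory", "ServerName", "",
]


def accepted(pattern, word):
    """True if the whole word is matched by the pattern."""
    return pattern.fullmatch(word) is not None


for word in SAMPLES:
    # Both formulations must agree on every sample.
    assert accepted(EXPANDED, word) == accepted(REFERENCE, word), word
```

The only non-empty alphanumeric word rejected is "Directory" itself, yet
excluding that single word already needs one alternative per prefix.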

The 1,5 GB measured for typechecking is the total memory allocated over
the run, not the peak memory the process occupied, so I never ran out
of memory. 

But anyway, we will do something generic...

> 
> > also another point to take into consideration. Httpd directives are case
> > insensitive, the generic lens handle this and the exact one doesn't.
> 
> There's an additional wrinkle that I didn't think of before: besides
> making sure that all regexps used in the lens match case insensitively,
> we also need to make sure that path expressions match case
> insensitively, i.e. the path expression
> 
>         /files/etc/httpd/conf/httpd.conf/<sec1/<Sec2/directive1
>         
> should match in the tree above. 
> 
> The most elegant way to achieve this would be to add two new flags to
> tree nodes 'label_nocase' and 'value_nocase'; when they are set, the
> interpreter for path expressions performs comparisons against the node
> label/value without regard for case. They get initialized in get.c: when
> a key or store lens is based on a case-insensitive regexp, we set these
> flags on the tree node that is constructed from them.

Ok, ticket created.
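
For reference, the matching rule proposed above could be sketched in
Python as follows. This is only an illustration of the idea, not the
Augeas implementation: the Node class and the find and label_matches
helpers are hypothetical, only the label_nocase flag is modelled, and
path predicates are ignored.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    label_nocase: bool = False  # hypothetical flag from the proposal
    children: List["Node"] = field(default_factory=list)


def label_matches(node: Node, step: str) -> bool:
    """Compare one path step against a node label, honouring the flag."""
    if node.label_nocase:
        return node.label.lower() == step.lower()
    return node.label == step


def find(root: Node, path: str) -> List[Node]:
    """Return all descendants of root reached by the '/'-separated path."""
    nodes = [root]
    for step in filter(None, path.split("/")):
        nodes = [c for n in nodes for c in n.children
                 if label_matches(c, step)]
    return nodes


# The tree from the example; every node built from a case-insensitive
# key lens gets the flag set.
conf = Node("httpd.conf", children=[
    Node("<Sec1", label_nocase=True, children=[
        Node("<Sec2", label_nocase=True, children=[
            Node("Directive1", label_nocase=True),
            Node("Directive2", label_nocase=True),
        ]),
    ]),
])

matches = find(conf, "<sec1/<Sec2/directive1")
```

Nodes without the flag keep exact matching, so lenses for case-sensitive
formats would behave as before.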

Francis



