[augeas-devel] [PATCH 0/6] Escaping in regular expressions

Michael Chapman mike at very.puzzling.org
Sat Oct 8 11:08:00 UTC 2011


Hi all,

I've been looking into how Augeas handles escapes within regexps. I think
I've come across a significant set of problems, both with the bundled
lenses and the way Augeas does its escaping and unescaping.

I first noticed the problem when attempting to use a slash within a bracket
expression in a regexp:

  /[\/]/

This should match only a slash, but when used in Augeas it also matches a
backslash.

POSIX regular expressions give backslashes no special meaning within
bracket expressions. Since Augeas does not understand the \/ escape at all,
it leaves the sequence as it is, and both the backslash and the slash are
added to the character class.

Further investigation showed that many of the bundled lenses assume that
escape sequences like \. and \- would be unescaped. For instance, in Rx we
have:

  let email_addr = /[A-Za-z0-9_\+\.-]+@[A-Za-z0-9_\.-]+/

This allows backslashes in email addresses, when it is clear the intent was
only to escape the regexp metacharacters.

I have split fixes for all this into 6 patches. Patches 1 to 3 fix the
Cgconfig, Cron and FAI_DiskConfig lenses respectively. Here the use of
backslashes was actually producing incorrect ranges in character classes.
For example, in Cgconfig:

  let id = /[a-zA-Z0-9_\-\/\.]+/

contains a range from backslash to backslash, and the hyphen never even
made it into the character class.
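
To make the effect concrete, here is a minimal sketch (Python, not Augeas
code) that expands the body of a bracket expression under simplified POSIX
rules: backslash is an ordinary character and only x-y ranges are
interpreted. The helper name bracket_members is hypothetical, and negation,
[:alpha:]-style classes and collating elements are deliberately left out.

```python
def bracket_members(body):
    """Return the set of characters in a simplified POSIX bracket expression.

    `body` is the text between '[' and ']'. Backslash is treated as an
    ordinary character; only 'x-y' ranges are interpreted. This is an
    illustration only, not how Augeas or any regex library is implemented.
    """
    members = set()
    i = 0
    while i < len(body):
        lo = body[i]
        # A '-' with a character on both sides forms a range.
        if i + 2 < len(body) and body[i + 1] == "-":
            hi = body[i + 2]
            members.update(chr(c) for c in range(ord(lo), ord(hi) + 1))
            i += 3
        else:
            # Everything else, including '\' and a trailing '-', is literal.
            members.add(lo)
            i += 1
    return members


# Bracket bodies copied from the examples in this message.
cgconfig_id = bracket_members(r"a-zA-Z0-9_\-\/\.")
email_local = bracket_members(r"A-Za-z0-9_\+\.-")
slash_only = bracket_members(r"\/")

print("-" in cgconfig_id)    # False: '\-\' was read as a range from \ to \
print("\\" in cgconfig_id)   # True
print("\\" in email_local)   # True: backslash is accepted in addresses
print(sorted(slash_only))    # ['/', '\\']
```

The Cgconfig class ends up without a hyphen, while the Rx class keeps its
trailing hyphen but gains a backslash it was never meant to have.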

Patch 4 goes through all the remaining cases of escaping inside bracket
expressions, with the exception of \/ and \\. This fixes things like that
email_addr regexp above.

Patch 5 fixes the escape() and unescape() functions in internal.c. The key
here is that the C-style escapes:

  \a \b \t \n \v \f \r

are common to both strings and regexps, but other escapes are not. These
"extra" escapes are passed through to these functions via an extra
parameter. For strings we allow the extra escapes:

  \" \\

as before. For regexps we use:

  \/ \\

Patch 6 removes \\ from this list of "extra" escapes for regexps. The idea
here is that it removes the need to use quadruple-escape in certain cases.
To match "backslash followed by any character", for instance, previously
one had to use:

  /\\\\./

Now it is sufficient to use:

  /\\./

However, this is a backward-incompatible change -- lenses that use the
quadruple-escape need to be updated. For this reason I have kept this
patch separate, since I am not sure if such a change would be acceptable.

I believe this set of patches greatly simplifies the way escapes work in
Augeas. The rules can be summarized as follows:

* \a, \b, \t, \n, \v, \f, \r are always treated as C-style escapes, and are
  replaced with their respective control characters.
* In strings, \" and \\ can be used to represent " and \ respectively.
* In regexps, \/ can be used to represent /. If patch 6 is omitted, \\ can
  be used to represent \ as well.
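
The rules above can be sketched as a single unescape routine that takes the
"extra" escapes as a parameter. This is a Python illustration only, not the
C code in internal.c; the names C_ESCAPES, STRING_EXTRA, REGEXP_EXTRA_PATCH5
and REGEXP_EXTRA_PATCH6 are made up for the example, and it assumes that an
unrecognized escape is left untouched, backslash included.

```python
# C-style escapes shared by strings and regexps.
C_ESCAPES = {"a": "\a", "b": "\b", "t": "\t", "n": "\n",
             "v": "\v", "f": "\f", "r": "\r"}

# Hypothetical names for the per-context "extra" escapes.
STRING_EXTRA = '"\\'          # \" and \\
REGEXP_EXTRA_PATCH5 = "/\\"   # \/ and \\  (patches 1 to 5 applied)
REGEXP_EXTRA_PATCH6 = "/"     # \/ only    (patch 6 applied as well)


def unescape(text, extra):
    """Replace C-style escapes and the escapes listed in `extra`.

    Any other backslash sequence is copied through unchanged
    (an assumption made for this sketch).
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt in C_ESCAPES:
                out.append(C_ESCAPES[nxt])
                i += 2
                continue
            if nxt in extra:
                out.append(nxt)
                i += 2
                continue
        out.append(ch)
        i += 1
    return "".join(out)


# \/ now becomes a plain slash before the regexp is compiled.
print(unescape(r"[\/]", REGEXP_EXTRA_PATCH5))    # [/]

# Without patch 6, four backslashes are needed to hand \\. to the matcher.
print(unescape(r"\\\\.", REGEXP_EXTRA_PATCH5))   # \\.

# With patch 6, \\ is passed through as written, so two are enough.
print(unescape(r"\\.", REGEXP_EXTRA_PATCH6))     # \\.
```

In both of the last two cases the regexp matcher receives \\. which it
reads as "backslash followed by any character".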

Questions or comments regarding these patches would be greatly appreciated.

- Michael

Michael Chapman (6):
  Cgconfig: Fix parsing of group names
  Cron: Fix parsing of numeric fields
  FAI_DiskConfig: Fix invalid escape sequence \s
  Fix escape sequences in bracket expressions
  Fix regular expression escaping
  Don't require backslashes to be escaped in regexps

 lenses/aliases.aug               |    2 +-
 lenses/cgconfig.aug              |    2 +-
 lenses/cgrules.aug               |    2 +-
 lenses/cron.aug                  |    2 +-
 lenses/darkice.aug               |    2 +-
 lenses/debctrl.aug               |    6 +++---
 lenses/dhclient.aug              |    2 +-
 lenses/dhcpd.aug                 |    2 +-
 lenses/dnsmasq.aug               |    2 +-
 lenses/exports.aug               |    2 +-
 lenses/fai_diskconfig.aug        |    6 +++---
 lenses/gdm.aug                   |    2 +-
 lenses/grub.aug                  |    2 +-
 lenses/httpd.aug                 |   14 +++++++-------
 lenses/inetd.aug                 |    6 +++---
 lenses/inifile.aug               |    2 +-
 lenses/interfaces.aug            |    2 +-
 lenses/iptables.aug              |    2 +-
 lenses/keepalived.aug            |    2 +-
 lenses/modprobe.aug              |    6 +++---
 lenses/openvpn.aug               |    2 +-
 lenses/pg_hba.aug                |    2 +-
 lenses/phpvars.aug               |    2 +-
 lenses/properties.aug            |    2 +-
 lenses/rx.aug                    |    2 +-
 lenses/shellvars.aug             |    4 ++--
 lenses/shellvars_list.aug        |    4 ++--
 lenses/solaris_system.aug        |    2 +-
 lenses/spacevars.aug             |    2 +-
 lenses/sudoers.aug               |   22 +++++++++++-----------
 lenses/sysconfig.aug             |    2 +-
 lenses/syslog.aug                |    4 ++--
 lenses/tests/test_cgconfig.aug   |    4 ++--
 lenses/wine.aug                  |   10 +++++-----
 lenses/xml.aug                   |    4 ++--
 src/augeas.c                     |    2 +-
 src/get.c                        |    2 +-
 src/internal.c                   |   26 ++++++++++++++++++--------
 src/internal.h                   |    8 ++++++--
 src/lens.c                       |   14 +++++++-------
 src/lexer.l                      |    4 ++--
 src/regexp.c                     |    4 ++--
 tests/modules/pass_cont_line.aug |    2 +-
 43 files changed, 106 insertions(+), 92 deletions(-)

-- 
1.7.6.4



