Re: dist-git help wanted: write me a regex!

On Mon, 2009-12-21 at 11:07 -0500, James Cassell wrote:
> On Mon, 21 Dec 2009 01:53:42 -0500, Bruno Wolff III <bruno wolff to> wrote:
> >
> >> /((Mon|Tues?|Wed|Thu(rs?)?|Fri|Sat|Sun)\s+(Jan|Feb|Mar|Apr|May|June?|July?|Aug|Sep|Oct|Nov|Dec)\s+[0-3]?[0-9]\s+(19|20)[0-9][0-9]\s+[A-Za-z0-9\s]+<[^\s ]+@[^\s@>]+>\s+2.[4-6].[0-9.-]+\s*)/
> > I don't think this will catch a period in the comment part of the email
> > address (as people often do after initials). Also if anyone is using
> > hyphenated names, I don't think those will get picked up. Since those
> > entries are utf-8, you need to worry about nonascii letters in the name.
> > I am not sure how those collate compared to ascii letters, but it might
> > be safer to use [^<]+ (instead of [A-Za-z0-9\s]+)
> You are correct.  Here's the improved version:
> /((Mon|Tues?|Wed|Thu(rs?)?|Fri|Sat|Sun)\s+(Jan|Feb|Mar|Apr|May|June?|July?|Aug|Sep|Oct|Nov|Dec)\s+[0-3]?[0-9]\s+(19|20)[0-9][0-9]\s+[^<]+<[^\s ]+@[^\s@>]+>\s+2.[4-6].[0-9.-]+\s*)/
I'm having some difficulty applying this.  It's going into a Perl file as:

$logmsg =~ s|/((Mon|Tues?|Wed|Thu(rs?)?|Fri|Sat|Sun)\s+(Jan|Feb|Mar|Apr|
+[^<]+<[^\s ]+@[^\s@>]+>\s+2.[4-6].[0-9.-]+\s*)/|mg

But that gives me "Unmatched ( in regex; marked by <-- HERE in m/(( <-- HERE",
or something along those lines.

The added "s|...|mg" is coming from other lines in this script which
look like:

$logmsg =~ s|^\s*\d\d*-\d\d*-\d\d*\s*[^<\n]*<[^>\n]*>\s*$|* \n|mg;

so I'm sure I'm screwing something up when putting it in the script.
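
One likely culprit, offered as a sketch rather than a confirmed fix: with
s|...|...| the first "|" inside the alternation (Mon|Tues?|...) ends the
pattern early, which leaves an unmatched "(". The leading and trailing "/"
would also be matched as literal slashes. The example below sidesteps both by
using brace delimiters. The sample log entry and the "* \n" replacement are
assumptions (the replacement is copied from the neighbouring line in the
script), not something the thread specifies.

```perl
use strict;
use warnings;

# Pattern from earlier in the thread, compiled with qr{} so that its '|'
# alternations cannot be mistaken for a delimiter. The surrounding '/'
# characters are dropped, and '@' is escaped only to rule out any array
# interpolation.
my $header = qr{((Mon|Tues?|Wed|Thu(rs?)?|Fri|Sat|Sun)\s+(Jan|Feb|Mar|Apr|May|June?|July?|Aug|Sep|Oct|Nov|Dec)\s+[0-3]?[0-9]\s+(19|20)[0-9][0-9]\s+[^<]+<[^\s ]+\@[^\s\@>]+>\s+2.[4-6].[0-9.-]+\s*)};

# Hypothetical sample entry: the name, address and version are made up.
my $logmsg = "Mon Dec 21 2009 Jane Doe <jane\@example.com> 2.6.31-1\n- fix build\n";

# s{pattern}{replacement}mg: the "* \n" replacement is an assumption taken
# from the other substitution in the script.
$logmsg =~ s{$header}{* \n}mg;

print $logmsg;
```

Any delimiter that does not occur in the pattern would do; braces are just one
choice that happens to be free here.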

