> I need a way to search through old email messages quickly and efficiently.
> I would use this for listserv archives and, I hope, for personal email.

Perl is very fast at matching arbitrary regular expressions.

> For example, if I want to find every message where
> "jones" (case insensitive) is found in the cc field and "linux" is found
> in the message body, will they allow for that?

Let's assume every message contains a "Subject:" line terminated by a newline (\n) and that "CC:", when present, also ends with a newline. Anything following that is message body until the next message, which starts with the string "Date:". Something like

    =~ m/Subject:[^\n]*\n.*?CC:[^\n]*jones[^\n]*\n.*?linux.*?Date:/gis

might work. The [^\n]* pieces keep "jones" on the CC line itself; with the s option a plain .*? would happily run past the end of that line. The remaining .*? can still cross a "Date:" line into the next message, so it is safer to split the archive into individual messages first and test each one separately.

Other, more interesting, patterns might be "jones" and "linux" within 250 characters of each other.

    =~ m/jones.{0,250}linux/gis

would get half of them - we need another match for linux preceding jones.

> Suppose you are searching for "Mike Jones" and your message happens to
> look like this:
>   You really ought to talk to Mike
>   Jones about that issue.
> Well, with Mike and Jones on two different lines, it won't match.
> We need something that allows us to handle the newline appropriately.

I've handled this in perl by reading a multi-line string into a variable, then using the s option to match as a single line (so . matches \n) - see http://perldoc.perl.org/perlreref.html

    =~ m/Mike\s+Jones/gi

should do the trick. \s already matches a newline, so \s+ covers a line break as well as ordinary spaces, and it will not match "MikeJones" run together.
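
For what it's worth, here is a rough sketch of how those pieces could fit together in one script. It is only an illustration, not a finished tool: the sub names (split_messages, cc_and_body_match, near_each_other, phrase_match) and the sample archive are made up, and it relies on the same assumptions as above, namely that every message starts with a "Date:" line and that everything after the "CC:" line is message body.

```perl
use strict;
use warnings;

# Split an archive into messages. Assumption: each message begins with a
# line that starts with "Date:" (and no body line starts that way).
sub split_messages {
    my ($archive) = @_;
    return grep { /\S/ } split /^(?=Date:)/m, $archive;
}

# True if $cc_word is on the CC line and $body_word is in the text after it.
# Assumption: everything following the CC line is message body.
sub cc_and_body_match {
    my ($msg, $cc_word, $body_word) = @_;
    return 0 unless $msg =~ /^CC:([^\n]*)\n(.*)/msi;
    my ($cc, $body) = ($1, $2);
    return ($cc =~ /\Q$cc_word\E/i && $body =~ /\Q$body_word\E/i) ? 1 : 0;
}

# True if the two words occur within $max characters of each other,
# in either order (the s option lets . cross line breaks).
sub near_each_other {
    my ($text, $first, $second, $max) = @_;
    my ($x, $y) = (quotemeta $first, quotemeta $second);
    return $text =~ /$x.{0,$max}$y|$y.{0,$max}$x/is ? 1 : 0;
}

# True if the words appear in sequence, separated by any whitespace,
# including a line break.
sub phrase_match {
    my ($text, @words) = @_;
    my $pattern = join '\s+', map { quotemeta($_) } @words;
    return $text =~ /$pattern/i ? 1 : 0;
}

# Made-up sample data, for illustration only.
my $archive = <<'END';
Date: Mon, 1 Jan 2007 10:00:00
Subject: kernel question
CC: Bob Jones <bjones@example.com>
Has anyone tried this on Linux? You really ought to talk to Mike
Jones about that issue.
Date: Tue, 2 Jan 2007 11:00:00
Subject: lunch
Nothing about operating systems here.
END

my @messages = split_messages($archive);
my @hits     = grep { cc_and_body_match($_, 'jones', 'linux') } @messages;
print scalar(@messages), " messages, ", scalar(@hits), " with jones in CC and linux in body\n";
print "phrase found\n" if phrase_match($messages[0], 'Mike', 'Jones');
print "words are close\n" if near_each_other($messages[0], 'linux', 'jones', 250);
```

Splitting first means each pattern only ever sees one message, so a lazy .*? cannot wander into the next one. For a real mbox file you would want to split on the "From " separator lines instead, and for large archives read the file message by message rather than slurping the whole thing into one string.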