On Tue, Nov 16, 2010 at 1:50 AM, Mike Miller
<mbmiller+l at gmail.com> wrote:

> I thought some of you would be interested in this.  See the last two
> paragraphs for the take-home messages.
>
>
>
> So the finding here that might be useful in many situations is that when
> searching for a regexp in a big file, you might do much better to filter
> lines of the big file with a simpler, more inclusive grep, then do the
> regexp search on the stdout from the simple grep.
>
> Mike
>
>
I did enjoy reading that.  I think the real takeaway is that the more
complicated your regular expression, the longer it takes to match.  And by
complicated I mean the more wildcards and operators you have that induce
backtracking in the regular expression engine... Big logfiles make those
performance hits obvious.
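To make the trick concrete, here's a minimal sketch.  The logfile contents
and patterns are invented for the example; the point is that a cheap
fixed-string grep (-F, no regexp engine at all) discards most lines first,
so the expensive regexp only has to scan the survivors:

```shell
# A tiny stand-in for a big logfile (contents invented for the example).
printf '%s\n' \
  'Nov 16 01:50:01 host sshd[123]: Failed password for root' \
  'Nov 16 01:50:02 host cron[456]: session opened' \
  'Nov 16 01:50:03 host sshd[789]: Accepted password for mike' > big.log

# Slow way: the complex regexp scans every line of the file.
grep -E 'sshd\[[0-9]+\]: (Failed|Accepted) password for [[:alnum:]]+' big.log

# Often much faster on huge files: cheap fixed-string filter first,
# then the complex regexp on the (much smaller) surviving stream.
grep -F 'sshd' big.log | \
  grep -E '\[[0-9]+\]: (Failed|Accepted) password for [[:alnum:]]+'
```

Both commands print the same two sshd lines; on a real multi-gigabyte log
the second form can be dramatically faster when the -F pattern is rare.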

Another trick I've learned for big logfiles is to load them into an SQL
database so that searches can be written as SQL queries.  Depending on the DB
you're using, you may have a nice little GUI that makes writing queries and
manipulating results very easy.  I can't take credit for that one - but the
first time I saw someone do it I thought "holy cow... why didn't I think
of that?"  Of course, it helps if your log files are in CSV format or
something similar so you can slam everything into the right column easily.
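As a concrete sketch of that workflow (the file names, column names, and the
choice of SQLite are my assumptions, not from Rob's post):

```shell
# Hypothetical CSV-formatted log; the columns are invented for the example.
printf 'time,level,msg\n01:50,ERROR,disk full\n01:51,INFO,job done\n01:52,ERROR,timeout\n' > app.csv

# Import it into SQLite and search with SQL.  In csv mode, .import uses the
# first row as column names when the target table does not exist yet.
sqlite3 logs.db <<'EOF'
.mode csv
.import app.csv log
SELECT count(*) FROM log WHERE level = 'ERROR';
EOF
```

That query prints 2, and from there any SQL - GROUP BY, time ranges, joins -
replaces what would otherwise be an increasingly hairy grep pipeline.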

-Rob