On Tue, Nov 16, 2010 at 1:50 AM, Mike Miller <mbmiller+l at gmail.com> wrote:

> I thought some of you would be interested in this. See the last two
> paragraphs for the take-home messages.
>
> So the finding here that might be useful in many situations is that when
> searching for a regexp in a big file, you might do much better to filter
> lines of the big file with a simpler, more inclusive grep, then do the
> regexp search on the stdout from the simple grep.
>
> Mike

I did enjoy reading that. I think the real takeaway is that the more
complicated your regular expression, the longer it takes. By "complicated"
I mean the more wildcards and operators you have that induce backtracking
in the regular expression engine. Big logfiles make those performance hits
obvious.

Another trick I've learned with big logfiles is to load them into an SQL
database; then I can write searches as SQL queries. Depending on the DB
you're using, you may have a nice little GUI that makes writing queries and
manipulating results very easy. I can't take credit for that one, but the
first time I saw someone do it I thought "holy cow... why didn't I think of
that?" Of course, it helps if your log files are in CSV format or something
similar, so you can slam everything into the right column easily.

-Rob
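The prefilter trick Mike describes is, in shell, something like `grep -F 'session=' big.log | grep -E 'expensive-regex'`. Here is a Python sketch of the same idea; the log lines and the pattern are invented for illustration:

```python
import re

# Invented sample lines; in practice these would stream from a big logfile.
lines = [
    "1.2.3.4 GET /foo/42?session=0123456789abcdef0123456789abcdef",
    "5.6.7.8 GET /baz/static.css",
    "9.9.9.9 GET /bar/7?session=deadbeefdeadbeefdeadbeefdeadbeef",
]

# The expensive pattern (hypothetical), roughly equivalent to
#   grep -E 'GET /(foo|bar)/[0-9]+\?session=[a-f0-9]{32}'
pattern = re.compile(r"GET /(foo|bar)/[0-9]+\?session=[a-f0-9]{32}")

# Two-stage search: a cheap substring test plays the role of the simple,
# more inclusive grep, so the regex engine only sees candidate lines.
hits = [line for line in lines if "session=" in line and pattern.search(line)]

for hit in hits:
    print(hit)
```

The cheap test discards most of the file before the expensive pattern ever runs, which is exactly why the two-stage grep pipeline wins on big files.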
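Rob's point about backtracking can be demonstrated concretely. This sketch (the patterns are contrived for illustration) times a nested-quantifier pattern against an equivalent single-quantifier one on the same failing input:

```python
import re
import time

# A line that almost matches but fails at the end forces a backtracking
# engine to try a huge number of ways to split the input.
line = "a" * 22 + "!"

# Nested quantifiers like (a+)+ can cause catastrophic backtracking.
slow = re.compile(r"^(a+)+$")
# An equivalent pattern with one quantifier fails after a linear scan.
fast = re.compile(r"^a+$")

t0 = time.perf_counter()
fast.match(line)                 # fails almost instantly
t_fast = time.perf_counter() - t0

t0 = time.perf_counter()
slow.match(line)                 # same answer, vastly more work
t_slow = time.perf_counter() - t0

print(f"fast: {t_fast:.6f}s  slow: {t_slow:.6f}s")
```

Both patterns reject the line; the only difference is how many paths the engine explores before giving up, which is the "wildcards and operators that induce backtracking" cost.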
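The logs-into-SQL trick might look like this with Python's built-in SQLite module; the column names and sample rows are invented, and a real logfile would be read from disk rather than a string:

```python
import csv
import io
import sqlite3

# Invented CSV-formatted log, standing in for a file on disk.
log_csv = io.StringIO(
    "timestamp,level,message\n"
    "2010-11-16 01:50:00,ERROR,disk full\n"
    "2010-11-16 01:51:00,INFO,rotation complete\n"
    "2010-11-16 01:52:00,ERROR,disk full\n"
)

conn = sqlite3.connect(":memory:")   # a file path gives a persistent DB
conn.execute("CREATE TABLE log (timestamp TEXT, level TEXT, message TEXT)")

# CSV columns map straight onto table columns, which is why CSV-ish
# logs "slam into the right column easily".
reader = csv.DictReader(log_csv)
conn.executemany(
    "INSERT INTO log VALUES (:timestamp, :level, :message)",
    reader,
)

# Searching is now just SQL.
rows = conn.execute(
    "SELECT timestamp, message FROM log WHERE level = 'ERROR' "
    "ORDER BY timestamp"
).fetchall()
for ts, msg in rows:
    print(ts, msg)
```

From here, any SQL client or GUI pointed at the database file gives the query-and-browse workflow Rob describes.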