On Fri, 19 Aug 2005, Wakefield, Thad M. wrote:

>>>>
>>>> here's Mike Miller's one-liner:
>>>> [chrome at mailhost:~]$egrep '^From ' ~/Mail/procmail.log | gawk '{print $4" "$5", "$7}' | uniq -c
>
> In my continuing endeavor to understand pipes, redirects and xargs, is
> there a reason why you used egrep instead of:
> gawk '/^From /{print $4,$5", "$7}' ~/Mail/procmail.log ...?

Yes.  I didn't know you could do that with gawk!! ;-)

Now that I know, I compared the two in a speed test on my clunky old
Solaris SPARC box:

# time ; gawk '/^From /{print $4,$5", "$7}' .procmail/log > /dev/null ; time
13.54u 4.91s 47:09.89 0.6%
47.36u 12.61s 48:13.93 2.0%

# time ; egrep '^From ' .procmail/log | gawk '{print $4,$5", "$7}' > /dev/null ; time
47.36u 12.63s 48:49.93 2.0%
73.84u 18.61s 49:19.48 3.1%
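
Each bare "time" above prints the shell's running totals, so the cost of
a command is the difference between the stamp after it and the stamp
before it.  Here is a rough sketch of that subtraction, assuming the third
column is elapsed time written as mm:ss.ss.  The function name "elapsed"
is made up for illustration, and plain awk stands in for gawk:

```shell
# elapsed START END: difference between two mm:ss.ss stamps, in seconds.
elapsed() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, s, ":")   # s[1] = minutes, s[2] = seconds of the start stamp
        split(b, e, ":")   # e[1] = minutes, e[2] = seconds of the end stamp
        printf "%.2f\n", (e[1] * 60 + e[2]) - (s[1] * 60 + s[2])
    }'
}

elapsed 47:09.89 48:13.93   # gawk-only run, prints 64.04
elapsed 48:49.93 49:19.48   # egrep | gawk run, prints 29.55
```

The same subtraction on the first two columns gives the user and system
CPU time for each run.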

The egrep version took 29.55 s of elapsed time and the gawk-only version
took 64.04 s.  So I think egrep is about twice as fast as gawk at this
kind of simple line matching.  Thus, this...

egrep 'regexp' file

...appears to be faster than this...

gawk '/regexp/{print $0}' file

...though I think they do the same thing.
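
A quick way to check that claim on a small file is to run both and
compare the results.  This is only a sketch: the three sample lines are
invented, "grep -E" is the POSIX spelling of egrep, and plain awk stands
in for gawk so it runs where gawk is not installed:

```shell
# Hypothetical three-line sample standing in for the real log.
sample=$(mktemp)
printf '%s\n' 'From alice Fri Aug 19' 'Subject: hi' 'From bob Sat Aug 20' > "$sample"

grep_out=$(grep -E '^From ' "$sample")
awk_out=$(awk '/^From /' "$sample")    # a bare pattern implies {print $0}
rm -f "$sample"

if [ "$grep_out" = "$awk_out" ]; then
    echo "same output"
else
    echo "outputs differ"
fi
```

Agreement on one pattern does not prove the two regexp dialects match
everywhere, but for a simple anchored pattern like this one they should
select the same lines.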

By the way, thanks for reminding me that the default OFS in gawk is a
space.  I had forgotten that, and it is definitely easier to type a comma
than " " repeatedly (I often have 5 or more space-delimited output fields).
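
For the record, here is a small sketch of the comma at work.  The
seven-field input line is made up, and plain awk is used in place of gawk:

```shell
line='a b c d e f g'   # hypothetical seven-field input

# Explicit " " concatenation versus the comma, which inserts OFS.
with_quotes=$(echo "$line" | awk '{print $4" "$5", "$7}')
with_comma=$(echo "$line" | awk '{print $4,$5", "$7}')
echo "$with_quotes"    # prints: d e, g
echo "$with_comma"     # prints: d e, g

# Changing OFS changes what each comma emits.
with_dash=$(echo "$line" | awk -v OFS='-' '{print $4,$5,$7}')
echo "$with_dash"      # prints: d-e-g
```

Only the commas follow OFS; the literal ", " inside the quotes is left
alone.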

Mike