On Wed, 23 Dec 2009, Scott Raun wrote:

> I've got a file. The file has one line of interest. Alpha characters 
> represent fixed strings, numbers represent variable strings.
>
> 	  ab1cfd2egb3chd4eib5cjd6ek
>
> I'm pretty sure I can write myself a regular expression that will 
> correspond to the numbered strings.  What I need is a tool that will 
> pass the matched string along to standard out.
>
> Suggestions?
>
> To put it in slightly less obscure terms - the line represents a string 
> of html code.  The numbers represent hypertext links and their matching 
> text.  I want to watch for a specific text string, and pass the 
> corresponding link to wget.  I think I've got all the pieces figured out 
> _except_ passing the link to wget.


It's only one line per file?  Usually for this kind of thing I'd write a 
bash script and use perl and grep to find the strings to pass to wget. 
It's probably better to just use perl, but I'm not expert enough on perl 
to do the wget call from perl.

Do you have pcregrep?  It's fantastic and you need it.  Use of "grep -E" 
(formerly known as "egrep") is just not as good as pcregrep where you can 
use perl regexes and multiline pattern matching.

For this kind of problem I would usually have perl search for the href and 
make that the beginning of the line so that there are never two hrefs per 
line.  Then I'd use grep or pcregrep to retain only the lines I want 
(those with the right anchor text), then perl again to strip out 
everything except for the URLs.  Then I'd set up something like this:

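# Split the command output on newlines only, so each line is one URL.
IFS="
"

# 'foo', 'baz' and 'bar' stand in for the real perl/pcregrep expressions.
for URL in $( perl 'foo' filename | pcregrep 'baz' | perl 'bar' )
   do wget "$URL"
done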


Hope that helps.

Mike