On Wed, 10 Jan 2007, G. Scott Walters wrote:

> I've got a couple hundred PDF files that have been malformed with some 
> extra lines AFTER the EOF. This keeps them from doing important 
> things like printing, or displaying properly on some versions of 
> Acrobat. Not all PDFs are necessarily affected by this issue...
>
> Since these files are hosted on a linux server, I figured the proper 
> tool to solve this problem would be PERL. The question is, how....if I 
> open the file with a standard open function, won't it read the file til 
> the EOF and not beyond?
>
> I understand that SED might be helpful, but I'm sed-impaired, but I'm 
> working on that.


This should do it:

perl -pi -e 'BEGIN{undef $/} ; s/\A(.+?%%EOF).*\z/$1\n/gs' *.pdf

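That slurps each .pdf file in the current directory, cuts it off right 
after the first %%EOF, and ends it with a single newline.  I tested it on 
some files and it worked.  It can be run on files that are not corrupted 
as well: a file that already ends at its first %%EOF comes out the same, 
apart from the date stamp and possibly the final line ending.  Two 
caveats: the files are edited in place with no backup, and a PDF saved 
with incremental updates has more than one %%EOF, so cutting at the first 
one would drop the later revisions.  It is pretty fast.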

Best,
Mike