> 
> I don't know, perhaps the simplest answer is that it is the web host I am using (GoDaddy) sells their web access log data....
> 

(Loren is smart.)

This is precisely why I asked the first question I asked. This looks like an
inside job by the provider, not some random outside entity. It adds up. Here is
a company that wants to crawl deep and catalogue URLs and the web (for either
money-making or nefarious reasons). Here is a company that can make a buck (or
more than a buck) by selling its customers and their traffic as a product.
(Google operates on that model; we, the users, are its product and have been.)
Given the modern use of the internet for moneymaking under this model, and that
our very own government has gone to the length of capturing web traffic, this
is no real surprise.

Here is the problem with your web logs being kept by your web server provider:
it defeats HTTPS. Looking at logs from HTTPS is just like looking at logs
of HTTP, because the server writes the log entry after it has decrypted the
request. Also, suppose you encrypted on the client side and deciphered on the
server side; that can be defeated too, but somebody would have to break laws
(digging in your server-side files) to do it. However, there is nothing but
a terms-and-conditions agreement with the provider to stop them from doing
anything with your web logs. Do me a favour and put a print statement in your
Perl script that states that unauthorized access to this URL is disallowed.
It will not stop the problem, but it lays a basis for making the other
party's access to your logs illegal.
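
The notice I am asking for could be as simple as the sketch below. I am
assuming a plain CGI script that prints its own headers and plain-text output;
the notice wording and the access_notice name are made up for illustration, so
adapt them to whatever your script already does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical notice wording (an assumption, not legal language).
sub access_notice {
    return "NOTICE: Unauthorized access to, or use of, this URL is not permitted.";
}

# Assumes a CGI script with plain-text output: header first, then the notice,
# then whatever the script normally prints.
my $notice = access_notice();
print "Content-Type: text/plain; charset=utf-8\r\n\r\n";
print "$notice\n";
```

If your script emits HTML, the same string can go into an HTML comment or a
visible footer instead.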


Some general comments.

This is our fault... for wanting convenience. Our fault for making technology
so accessible to everyone. Anyone can have a website now, but that convenience
comes with a lack of security/privacy, depending on how paranoid one is. We are
just not victimized as often.

I have been running all my websites from a virtual server for about 10 years
now, starting from Slicehost before they were purchased by Rackspace. Back in
the Slicehost days, and to my knowledge still today, they have been very
careful about preventing both internal and external compromise of user data.
To some extent I am still vulnerable to loss of privacy. But at the very least
I have my web logs and SQL database on my "own" filesystem. There are questions
about that as well: from the inside (Rackspace), even encrypted containers can
be accessed because the system is virtualized. Anyway, the short story is that
if you are getting serviced by somebody else at the software level, you are
giving up a lot of security.

Please update us if you find anything else.