How I killed 13 500 000 pages in the Google search engine
Posted on 2014-04-04 08:00:00
Talk about a loaded title, on par with the quality (or lack thereof) of the various clickbait titles on the postings I see on Facebook and friends...
I was told by my hosting provider that my index of the FreeBSD mailing lists at http://www.mavetju.org/mail/ was using more bandwidth on its own than all of their public websites together. Now this is not much of a record, since they have only low-bandwidth websites, but still...
Looking through the logs, I saw that the Googlebot and the Bingbot and some bot from China were happily fighting over CPU and bandwidth to index all of the files, going at it at a rate of about 50 requests per second, 24 hours a day.
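For the curious, here is a minimal sketch of that kind of log check. It assumes an Apache-style combined access log, which is a guess since the format the server wrote is not stated here. The sample lines, the marker table and the function names are all made up for illustration. The second function just turns the 50 requests per second into a daily total.

```python
from collections import Counter

# Hypothetical substrings to look for in the User-Agent field (lowercased).
BOT_MARKERS = {"Googlebot": "googlebot", "Bingbot": "bingbot"}


def count_bot_requests(lines):
    """Count log lines per bot by matching on the User-Agent text."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for name, marker in BOT_MARKERS.items():
            if marker in lowered:
                counts[name] += 1
                break
        else:
            # No known bot marker found on this line.
            counts["other"] += 1
    return counts


def requests_per_day(rate_per_second):
    """Convert a sustained per-second request rate into requests per day."""
    return rate_per_second * 24 * 60 * 60


# Made-up sample lines in combined log format (addresses and paths are invented).
sample = [
    '192.0.2.1 - - [01/Apr/2014:10:00:00 +0000] "GET /mail/a.html HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.2 - - [01/Apr/2014:10:00:00 +0000] "GET /mail/b.html HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '192.0.2.1 - - [01/Apr/2014:10:00:01 +0000] "GET /mail/c.html HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.3 - - [01/Apr/2014:10:00:01 +0000] "GET /mail/d.html HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; FreeBSD amd64) Firefox/28.0"',
]

bot_counts = count_bot_requests(sample)
daily_total = requests_per_day(50)
```

At a sustained 50 requests per second, `requests_per_day(50)` comes to 4 320 000 requests a day.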
So what could I do? Checking in Google for site:mavetju.org/mail/, I saw that there were about 13 500 000 pages indexed. For what goal? Not much anymore: I stopped following all except the FreeBSD Announcement mailing lists a couple of years ago. I still use FreeBSD on my laptops, but that is all.
So... That mailing list archive has been shut down. You can still find the cached version of it in Google by using the above search terms, but that will disappear too.
And that is the story of how I killed 13 500 000 pages in Google. I wonder how many computers in their data center that frees up for other things. Probably none...