Hi,

You can't do this with the robots.txt file alone. A well-behaved spider downloads robots.txt first and checks which pages on the site it is allowed to index, but a bad spider simply ignores the file.

You can block visitors by IP address or by other criteria (for example, the User-Agent string), but the IP-based method is the most reliable, since a User-Agent string is trivial to fake. Even so, a bad spider can rotate through many IP addresses, so blocking individual addresses won't help you very much.

Why do you want to block them?

-- Octavian

----- Original Message -----
From: "Dave" <dave.mehler@gmail.com>
To: "'Blind sysadmins list'" <blind-sysadmins@lists.hodgsonfamily.org>
Sent: Monday, August 31, 2009 1:32 AM
Subject: [Blind-sysadmins] robots.txt or harvester blocks
Hello,

This one is for webmasters or those who administer web servers. I'm looking for a robots.txt file, or perhaps entries to add to an httpd.conf or .htaccess file, to block or keep out bad spiders and crawlers that proliferate spam, or any type of marketing thingy; in short, anything other than a search engine. If anyone has anything on this, I'm interested.

Thanks,
Dave
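[As a starting point for the .htaccess approach asked about above, here is a minimal sketch using Apache 2.2's mod_setenvif and mod_authz_host. The bot names and the IP range below are placeholders for illustration; substitute whatever actually shows up in your own access logs.]

```apache
# Flag requests whose User-Agent matches a known-bad pattern
# (example names only -- replace with agents seen in your logs)
SetEnvIfNoCase User-Agent "EmailCollector" bad_bot
SetEnvIfNoCase User-Agent "WebCopier"      bad_bot
SetEnvIfNoCase User-Agent "HTTrack"        bad_bot

# Allow everyone except flagged agents and a misbehaving range
# (192.0.2.0/24 is a documentation-only example network)
Order Allow,Deny
Allow from all
Deny from env=bad_bot
Deny from 192.0.2.0/24
```

Keep in mind this only stops bots that send an honest User-Agent or reuse the same addresses; as noted in the reply, a determined harvester can fake both.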
_______________________________________________ Blind-sysadmins mailing list Blind-sysadmins@lists.hodgsonfamily.org http://lists.hodgsonfamily.org/mailman/listinfo/blind-sysadmins