ACAP - Automated Content Access Protocol - http://www.the-acap.org/
Standard being developed on behalf of content publishers to communicate permissions information more extensively than robots.txt allows. Project documents, implementation and background information.
All About Search Indexing Robots and Spiders - http://www.searchtools.com/robots/
Search Tools Consulting explains how the search engine programs called "robots" or "spiders" work, and reviews related sites.
Bots vs Browsers - http://www.botsvsbrowsers.com
This large database lists user agents in categories and distinguishes between robots and browsers.
HTTP User Agent Index - http://www.siteware.ch/webresources/useragents/db.html
An alphabetical list of user agents and the deployers behind them, compiled by Christoph Rüegg.
List of Robot Agent Strings - http://www.pgts.com.au/pgtsj/pgtsj0208d.html
A list from PGTS of Web robots with the identifying data they leave in Web site logs.
Robot IP Address - http://www.briandunning.com/seo/
Brian Dunning provides a list of all the major search engine robot IP addresses, by full class C only.
Robotstxt.org - http://www.robotstxt.org/
Information on the robots.txt Robots Exclusion Standard and other articles about writing well-behaved Web robots.
Search Engine IP Addresses - http://www.iplists.com/
Lists IP addresses of search engine spiders. Can be searched by IP address. Also links to resources on spiders.
Search Engine Robots and Other User Agents - http://www.jafsoft.com/searchengines/webbots.html
John A. Fotheringham presents data in tabular form on the robots sent by search engines and other sites to read and index Web pages: their origins, names and IP addresses.
User Agent String - http://user-agent-string.info/
Tool from ASAP Consulting s.r.o. for detailed user agent string analysis using an online form. Includes databases of browsers and robots.