Quote: Originally Posted by rickhlwong
Dear all,
I want to use robots.txt to control the "spider". Can I specify an IP address to ALLOW the website to be accessed by the "spider"?
thank you.
Rick
We block unwanted spiders by IP address using ipchains. The robots.txt file can't do that: it matches spiders by the User-Agent field in the HTTP request, not by IP address.
Many aggressive spiders do not follow robots.txt and will have to be blocked using something like ipchains or "insert your favorite" firewall tool.
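
To illustrate the User-Agent point, here is a rough Python sketch using the standard library's urllib.robotparser. The robots.txt contents, the bot names "GoodBot" and "OtherBot", and the example.com URL are all made up for illustration. It only shows how a well-behaved spider would interpret the rules; nothing here is enforced on the server side.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: rules are keyed on the User-agent name only.
# There is no field for an IP address.
ROBOTS_TXT = """\
User-agent: GoodBot
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if a polite spider with this User-agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # An empty Disallow line means "nothing is disallowed" for GoodBot.
    print(is_allowed("GoodBot", "http://example.com/page.html"))
    # Every other spider falls under the "*" entry and is asked to stay out.
    print(is_allowed("OtherBot", "http://example.com/page.html"))
```

A spider that ignores robotstxt, or simply claims to be "GoodBot" in its User-Agent, gets through anyway, which is why the IP-level block with a firewall is still needed.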