What is a Web Spider?


Web servers usually do not expect spiders to scrape their data. Some websites, however, want their content scraped so that it spreads more widely (bank websites' real estate listings, business listings, online advertisers).

Some websites contain content that is free to take: public information (census data, tax rolls, court documents, government documents).

Some websites display uncopyrightable information yet do not want people to spider them. These could be your competitors, or large search engines and other big websites that block spiders. They want to spend their bandwidth only on live people viewing content, not on data-collecting spiders. Some also do not want their data saved into a database for other websites to use. Dealing with this situation takes us to a whole new area of spidering: Stealth Spidering.
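
Many sites state which of these camps they fall into in a robots.txt file, so a spider can check it before fetching anything. The sketch below is only an illustration under stated assumptions: the rules in SAMPLE_ROBOTS, the spider name "ExampleSpider", and the helper allowed_paths are all hypothetical and do not come from any real site. It uses Python's standard urllib.robotparser module and parses the rules from a list of lines, so it runs without touching the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for this example.
SAMPLE_ROBOTS = [
    "User-agent: *",
    "Disallow: /private/",
    "Allow: /listings/",
]


def allowed_paths(robots_lines, user_agent, paths):
    """Return a dict mapping each path to True if the rules permit fetching it."""
    parser = RobotFileParser()
    # parse() accepts the robots.txt file as a list of lines.
    parser.parse(robots_lines)
    return {path: parser.can_fetch(user_agent, path) for path in paths}


# "ExampleSpider" is a made-up user-agent name.
result = allowed_paths(
    SAMPLE_ROBOTS,
    "ExampleSpider",
    ["/listings/123", "/private/data"],
)

for path, ok in result.items():
    print(path, "allowed" if ok else "blocked")
```

Against a live site, the same parser could be pointed at the site's robots.txt with set_url() and read() instead of parse(). A robots.txt file only states the site's wishes; it does not enforce them.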
