Thursday, March 13, 2008

Accessing 'Must Sign Up to View' sites

  More and more sites try to turn casual visitors into repeat visitors by forcing you to create an account before you can view or download their content. Here are a few techniques and services that help you circumvent this annoying process.
  There are many more services and techniques than we list here. Feel free to post your favorite in the comments; no account or sign-up is required to comment. The best technique varies by site and by the content you are after, but in general the following suggestions should be tried in order.

BugMeNot is a site that allows people to share login accounts for accessing sites.


<BugMeNot Firefox Add On>

Web caches are saved copies of web pages. They let you view pages that are no longer available, and they sometimes hold copies of pages that are not normally accessible.

Google automatically shows a link to cached versions of search results as shown above.

<More about Google caches>

The Wayback Machine (referring to the time machine in 'The Rocky and Bullwinkle Show' cartoon) is another web caching service.

<Internet Archive: Wayback Machine>
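If you want to jump straight to an archived copy, the address can be built by hand. The sketch below assumes the Wayback Machine's `web.archive.org/web/<timestamp>/<url>` address pattern, where a partial timestamp such as a year is resolved to the nearest saved snapshot; the function name `wayback_url` and the example address are made up for illustration.

```python
def wayback_url(page_url, timestamp="2008"):
    """Build a Wayback Machine address for page_url.

    Assumption: the archive accepts a partial timestamp (for example
    just a year) and redirects to the closest snapshot it holds.
    """
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


if __name__ == "__main__":
    # Hypothetical page; paste the printed address into your browser.
    print(wayback_url("http://example.com/members-only-article"))
```

Whether a copy exists depends on whether the archive ever crawled that page, so this will not work for every site.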

  User agent spoofing is another possible technique. When your browser requests a web page, it sends some information along with the request. Part of that information is your <user agent>, which identifies your browser and possibly your operating system, along with their versions. This helps websites display properly across many software clients and platforms.
  Web search engines have 'spiders' that crawl the web and index sites. To comply with standards and avoid low search engine rankings, web sites typically give spiders more access to their content than they give ordinary visitors. Spiders are typically identified by their user agent string. So, by spoofing a spider's user agent string, you may get less restricted access to a site's content. The following tools will help you spoof your user agent.

<Firefox add on: User Agent Switcher>
(This may require a quick <Google search of 'user agent list'> to find a list of common user agents to load into the tool.)

<Be The Bot> A web-based proxy that requests pages using a Google or Yahoo spider's user agent string.
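
If you would rather use a script than a browser add-on, the same user agent trick can be sketched in a few lines of Python using only the standard library. The user agent string below is a commonly published Googlebot string and is an assumption here, as are the helper names `build_spider_request` and `fetch_as_spider`; sites that verify a spider by its IP address will not be fooled by the header alone.

```python
from urllib.request import Request, urlopen

# Assumption: a commonly published Googlebot user agent string.
SPIDER_USER_AGENT = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def build_spider_request(url, user_agent=SPIDER_USER_AGENT):
    """Return a request object whose User-Agent header is the spider's."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_as_spider(url, timeout=10):
    """Download the page body while presenting the spoofed user agent."""
    with urlopen(build_spider_request(url), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Hypothetical address; replace it with the page you are after.
    page = fetch_as_spider("http://example.com/")
    print(len(page), "bytes received")
```

The request is built separately from the download so you can inspect or change the headers before anything is sent.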

