A colleague ran into a very interesting Google protection mechanism. She searched for “soapExtensionTypes” and got a 403 page saying “We’re sorry, but your query looks similar to automated requests from a computer virus or spyware application. To protect our users, we can’t process your request right now.” along with a captcha that lets you continue (try it). It gets even weirder:

  • It takes 3 correct captcha responses before it proceeds to the search (making really sure you’re a person!).
  • Even if you change your mind, ignore the captcha and search for something else (something safe), it still won’t let you through until you give a correct captcha response. They really do block your access.
  • It’s session-specific. If I respond to the captcha correctly enough times to unblock the search results and then do the search from a new browser window, I get it again.

This is interesting. As the 403 page says, Google does this to “protect our users”. This implies that they’re worried about people gaming the results; otherwise, how could a search on anything harm anyone besides me? If that’s the reason, though, the search strings on which they decide to enforce this seem peculiar. I won’t rant about “soapExtensionTypes”; it’s reasonable that whatever method they use to determine which searches to block may get a few wrong. But if this is primarily to prevent gaming the search engine, why do searches like “football tickets” not trigger it? I imagine that’s the type of search people would be most interested in gaming.

Oh well. It appears that I am now on Google’s provisional blacklist, as any search I do is blocked by a captcha (although if it’s a safe search string, it only asks for one correct captcha response). I hope it goes back to “just working” soon.