This is all over the web, so sorry if you’ve read it a thousand times already today. However:
Remember Google’s victory over the Department of Justice, when the search company refused to hand our search details over to the US government? That was part principled stand and part protection of their business: search query data, correctly analysed, can be used to game Google AdSense and get the most bang for your advertising buck.
AOL’s search page is a rebranded Google Search. So isn’t it kind of them to release three months of search data for roughly 650,000 of their users? Exactly the kind of data Google went to court to protect? They claimed to be doing so for the research community, but now that AOL is dropping subscriptions and going the ad-supported route, you can bet that more was afoot.
AOL has since pulled the data, but it shouldn’t surprise you to learn that copies are still all over the place. This is the web, after all.
The problem is that although user data is obfuscated (each username is replaced with a random number), every query by the same person still carries the same number, so you can gather everything one anonymous user searched for. Given people’s propensity to search for things relevant to their lives, this obfuscation becomes more or less meaningless. I’m not too proud to admit that I occasionally search for my own name to find out what people are saying about me, for example, and I know I’m not alone. I’ve also searched for takeaways near me, Elgg, e-learning software, travel insurance and outdoor theatre. It’s easy to build up a picture of someone’s life from a list like that, and I’m not sure search companies should have this data to begin with.
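To make that concrete, here is a minimal Python sketch of how little work the linking takes. It assumes a log of (anonymous ID, query) pairs; the IDs, the placeholder name "jane example" and the helper build_profiles are all invented for illustration and are not taken from the real AOL files.

```python
from collections import defaultdict

# Hypothetical log rows: (anonymous_id, query). All values are made up.
log = [
    (1001, "jane example"),          # a vanity search for the user's own name
    (2002, "cheap flights"),
    (1001, "takeaways near me"),
    (1001, "elgg"),
    (2002, "car insurance quotes"),
    (1001, "travel insurance"),
    (1001, "outdoor theatre"),
]

def build_profiles(rows):
    """Group queries by anonymous ID, preserving the order they were made."""
    profiles = defaultdict(list)
    for anon_id, query in rows:
        profiles[anon_id].append(query)
    return dict(profiles)

profiles = build_profiles(log)

# Everything "user 1001" searched for, in one place.
for query in profiles[1001]:
    print(query)
```

The random number hides the username, but because it stays the same across every query, one name search sitting next to a handful of everyday searches is often enough to work out who the person is.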