If you’ve ever mistyped a Google search request, you’ve probably seen the “did you mean?” prompt – a really useful touch that most of us use all the time, usually without thinking about it.
It would take Google a very long time indeed to process the billions of possible search terms and turn them into a dictionary it could check against – so how does this feature work?
It’s actually not as clever as it sounds. Google records when a user types a search request, corrects themselves and enters a similar one – typing “Coka Kola”, say, and then searching again for “Coca Cola” once they’ve realised their mistake. If enough people do this, it’s offered as a ‘did you mean?’ entry. Because Google gets such a massive number of hits – most web users find themselves searching it at least once a session – this becomes a useful list. Data from a handful of dud searches might produce weird results, but after hundreds of billions of searches, the mistakes work themselves out.
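The mechanism can be sketched in a few lines. This is a toy illustration of the idea described above, not Google’s actual system: count which follow-up query most often replaces a given query, and only suggest it once enough people agree (the data and threshold here are invented).

```python
from collections import Counter, defaultdict

# Toy log of consecutive query pairs from the same session:
# (original query, follow-up query typed shortly afterwards).
reformulations = [
    ("coka kola", "coca cola"),
    ("coka kola", "coca cola"),
    ("coka kola", "coca cola"),
    ("coka kola", "cola wars"),
    ("dostoevsky", "dostoyevsky"),
]

# For each original query, count every follow-up it led to.
corrections = defaultdict(Counter)
for original, followup in reformulations:
    corrections[original][followup] += 1

def did_you_mean(query, min_count=2):
    """Suggest the most common follow-up, but only if enough users agree."""
    if query not in corrections:
        return None
    suggestion, count = corrections[query].most_common(1)[0]
    return suggestion if count >= min_count else None

print(did_you_mean("coka kola"))   # coca cola
print(did_you_mean("dostoevsky"))  # None – only one person made this correction
```

A handful of dud pairs (like “cola wars”) never reach the threshold; at scale, the genuine corrections dominate the counts.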
Above, I’m looking for the classic book Diet for a Small Planet. Once I’ve found it, I can find related books using Amazon’s “customers who bought this item also bought” functionality:
Now, most of us don’t shop like this. If I’m buying books for personal use from Amazon, I might pick up Diet for a Small Planet, plus the latest Douglas Coupland novel, plus maybe a Camille album or an old Doctor Who DVD. Unless we’ve got something specific in mind, we don’t tend to group our purchases. So why are the results on Amazon so focused?
Again, if this system were based on a handful of purchases, it’d probably be way out, and the feature would find itself removed from Amazon’s product pages as quick as you can say, “why is Amazon recommending Yiddish comedy to me?” But because Amazon has the luxury of hundreds of millions of purchases (its annual revenue for 2005 was $8.49bn), it can refine those results through volume. People are different, and will buy all kinds of different things; the less relevant picks will be chosen less often. But the more people who have bought Book B at the same time as Book A, the more likely it is that people who are interested in Book A will also be interested in Book B. If I enjoyed Book A, that’s a boon for me – I can easily find something else I’ll probably enjoy. (If I hated it, there’s also a tool for me: LibraryThing’s Unsuggester turns the algorithm on its head very successfully.)
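At its simplest, this kind of “also bought” feature is just co-occurrence counting. Here’s a minimal sketch under invented data – the order history and book titles are made up, and Amazon’s real system is far more sophisticated – showing how volume alone pushes the relevant picks to the top:

```python
from collections import Counter
from itertools import combinations

# Toy order history: each order is the set of items bought together.
orders = [
    {"Diet for a Small Planet", "Recipes for a Small Planet"},
    {"Diet for a Small Planet", "Recipes for a Small Planet"},
    {"Diet for a Small Planet", "JPod"},
    {"JPod", "Doctor Who DVD"},
]

# Count how often each pair of items appears in the same order.
co_bought = Counter()
for order in orders:
    for a, b in combinations(sorted(order), 2):
        co_bought[(a, b)] += 1
        co_bought[(b, a)] += 1

def also_bought(item, n=3):
    """Items most often bought alongside `item`, best match first."""
    pairs = Counter({b: c for (a, b), c in co_bought.items() if a == item})
    return [b for b, _ in pairs.most_common(n)]

print(also_bought("Diet for a Small Planet"))
# ['Recipes for a Small Planet', 'JPod']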
These things don’t require complex algorithms and terabytes of computing power; they require people. We don’t need to use complicated logic and artificial intelligence to determine what people will likely find useful – we can simply use other people as the computer, store what they do and make some simple deductions.
If we transplanted this methodology to education, the benefits are obvious. One simple tool might allow users to rate the classes they belong to, and then use those ratings to suggest similar classes they might also find interesting, based on the findings of students who have come before them.
However, this depends on classes persisting year on year (or semester on semester), and it raises a question about new classes that couldn’t yet have ratings: would students be willing to take a class that hadn’t been rated if a similar, rated class already existed? What kind of system design techniques could we use to get over this? Again, we can look at sites like Amazon, which promotes new products with fewer ratings under “hot new releases”; another approach might be to promote new classes above old ones. Finally, it’s possible that classes could be ‘rated’ by another method entirely, without having to display a score. We’re going to be introducing bookmarks into Elgg; the number of times a class has been bookmarked could be used as a metric to raise its place in search results. It could be combined with tags to provide a topic-sensitive recommendation service.
Institutions might not want to promote their classes like books on Amazon, although I think the idea has merit. However, this could be used for any kind of object within a system – we’re going to be applying it to files, blog posts, etc.
Classes aren’t the end of it. What if we could securely attach actual results, as well as ratings, and recommend things like majors and careers once the user has left the institution?
This kind of functionality requires something most of these systems are missing: alumni. However, this isn’t the only benefit users could get from letting alumni into their social networking systems. Not only does an active alumni population allow current students to use previous students as a resource to recommend classes, content and careers, but a service like this provides direct motivation for alumni to keep their details up-to-date, which in turn benefits the institution and allows them to retain much better data about what happens to people once they leave.
In general, all the applications I’ve outlined above improve as more and more people use them – another reason to keep alumni registered. There are other funding routes, including allowing companies to receive listings of graduating students who are interested in particular areas in exchange for a fee, and allowing private companies to pilot content within a university community and see how it does. This could be used to justify social networking systems to the powers that be within an institution, while providing more useful functionality to teachers and students alike. Institutions often have many thousands of people enrolled; connecting them together is revolutionary enough, but using that community as a computer for its own benefit, harnessing the wisdom of crowds, is a whole new level of social interaction and functionality.