Activity Streams and OAuth: a social web architecture

March 12, 2010 | 3 comments

My previous post was a response to Gartner’s prediction last month that social networking would replace email as the “primary vehicle for interpersonal communications for 20 percent of business users.” In it, I named some properties that would need to be held by any social networking system that would successfully replace email.

  • Ease of use
  • Ubiquity across devices
  • Platform, service and infrastructure independence

My argument boiled down to the following statement:

Email has succeeded because it’s open, standard and decentralized; for social networks to replace it, they must also be open, standard and decentralized.

Email is useful because just about everybody has an email address. I can get in touch with my clients in London, my friends here in Oxford or my grandfather in Austin, Texas, with equal ease, even though all of them are using different infrastructure and software provided by different companies. I use Gmail, but there doesn’t need to be any kind of formal agreement between Google and whoever’s providing my grandfather’s email, say. It just works; nobody owns email as a communications method, and anyone can set up an email server. The same is true with websites: anyone can set one up, and nobody owns the web.

For social communications to be as popular and ubiquitous as email, there must be one social web, and it must be owned by nobody. That means that each socially-aware site or application must implement the same social communication standards.

The best standards aren’t dictated: they evolve through common usage. If you look at HTTP (the protocol that the web relies on), SMTP (one of the protocols behind email) and file formats like RSS and HTML, the common thread behind them is that they’re simple. It turns out that through excellent work at companies like Google, Plaxo, SixApart, Twitter, JanRain and – perhaps incredibly – JPMorgan Chase & co, we already have a number of technologies that collectively embody the properties I listed above.

Notes and server architecture for one possible social web

These are my ideas about how these standards might be used. These aren’t intended as replacements for existing social networking platforms or services; rather, they could easily be added as additional features both to those and to many other types of application. The ability to share isn’t a uniquely required feature of social networking software – think about its usefulness in applications like Word or Google Docs, for example.

With email, you use a software client (Outlook, say, or the Gmail web interface) that speaks to an email server which does the hard business of sending and receiving messages to and from the wider Internet. Here, I will be describing a system where everyone has their own node on the social web, which effectively acts as a client and server. Mine might be here at, for example. It’s my website – my profile on the social web – and it’s where I send social communications. That’s the server side. However, it also acts as the client when I’m accessing resources stored on other peoples’ servers.

Establishing connections and granting permissions

Let’s say I want to make a resource available to my clients. With email, I’d send them each a separate copy. This is both insecure and inefficient: I have no control over what happens to that copy, and each time I send it I create a new version. With some back-and-forth, there could easily be ten or twenty individual copies of a document floating around. (I often bounce software specifications – typically Word documents – around with my clients, and this is something that happens to me regularly. Google Docs is probably a better solution, but not everybody has a Google account.)

With the social web, only one version needs to exist, which I own. If my clients have established a connection with me, I can restrict that resource so that only they may see it. The tricky bit is that in order to know if it’s really them, they must be authenticated in some way.

In monolithic systems like Facebook, where everyone uses the same website, that’s easy: my client must be logged in, and we must have established a friend connection. In a decentralized system, that’s a much harder problem, but not insurmountable. Two technologies will help us:

  • OpenID: the open, decentralized authentication standard, which currently uses a website address as a kind of universal username
  • OAuth: an open protocol that “allows users to share their private resources (e.g. photos, videos, contact lists) stored on one site with another site without having to hand out their username and password.” OAuth provides a secret token to applications that they can use to access authenticated services and resources behind the scenes

Specifically, we’ll need OpenID Connect (or, until that’s up and running, the OpenID / OAuth hybrid protocol), because we’ll be using OpenID to authenticate, OAuth to power our decentralized access permissions, and a number of other protocols and endpoints along the way. It’s much neater if these are all established at once.

Making friends and getting updates

The process would work in the following way. Let’s say I want to make a connection with my friend Marcus Povey.

  1. I visit his site, and see that he is displaying a “connect to me” icon, indicating that it is a node on the social web. Later on, perhaps my browser would detect that this was a social web node in the same way that most browsers detect RSS feeds today, and light up an icon. Chris Messina has started a five part series on the browser as a social agent, which is worth a read.
  2. Either way, I click on “connect to me”. Marcus’s site prompts me for the address of my profile, which I enter. (Later on, my browser does this bit for me.)
  3. My profile address is an OpenID, and through the authentication process my social web node receives an OAuth token from him. No further authentication is required.
  4. On his social web node dashboard, Marcus sees that I’ve established a connection with him. He can ignore it, in which case nothing happens, or he can mark me as a friend (or any other arbitrary designation, which could be unique to the software he’s using).
  5. My social web node periodically checks for activity updates from Marcus’s, signing each request with that OAuth token so it knows who I am. This may be at my direct request; through repeated polling, RSS-style; or the update may be pushed to me through a PubSubHubbub ping.
  6. Depending on the assignation he’s given me, Marcus’s node either responds with just a feed of public activity (if he’s ignored the request), or with additional activity he’s allowed me to see, in Activity Streams format.
  7. Marcus can change my assignation or withdraw my OAuth token at any time from his dashboard. (Of course, throughout all this, the OAuth token mechanism is invisible to both users: it’s simply presented as a social connection.)

Embedded content and interacting directly on other social web nodes

Activity Streams is based on Atom, so content for items like blog posts (and resources like photos, using Atom Media) can be embedded directly in the activity feed. (Rob Dolin from Windows Live has some great examples.)

However, not all content is standard enough to be embeddable. In those cases, I can simply click through from Marcus’s activity update to his site, possibly log in again using OpenID, and interact with the content there. Additionally, by allowing users to log directly into his site via OpenID, Marcus can show selected people restricted content even if they don’t have the full range of social web software.

Friends lists and commenting

Further standards help us add extra functionality. If Marcus gives me permission, I might be able to download his contacts via Portable Contacts. Salmon is a protocol for commenting on distributed resources and allowing those comments to find their way upstream to the original, which is compatible with Activity Streams. Using this, I might be able to comment on Marcus’s activity items from within my dashboard and have them show up in his. Through this mechanism, all his friends could have a conversation on his activity stream items.


So far, so good: we have a simple technological basis for permissive social communications. But if the social web is really going to replace email, we have to address one of the most important features for enterprise users: reliability. Businesses will not accept their critical communications being subject to fail whales.

In my next posts in the series, then, I’ll discuss person-to-person messaging and the thorny issue of guaranteed delivery.

The Open Stack and truly open APIs

May 28, 2009 | Leave a comment

The following is an expanded version of the short talk I gave in Oxford yesterday. My original slides follow the transcript.

There’s a new kind of web development afoot, which marries old-school object-orientated programming techniques with the distributed power of the web.

Application Programming Interfaces (APIs) are published sets of instructions for programmatically querying or extending a service. For example, the Google Maps API allows anyone with some development skills to build applications based around Google Maps; the Twitter API allows anyone to build an application that lets you interact with Twitter (like Tweetie or Tweetdeck). The act of allowing anyone to do this is generally thought of as being “open” – as in, open to anybody. While that’s true, in another sense they’re very closed.

The trouble is, if you write a microblogging application using the Twitter API, you’re locked into Twitter. If you want to write an application for another microblogging service, you have to use their API and start from scratch again. The formats produced by each service’s API are proprietary, as are the methods to query them. They’re incompatible with each other by design, because those services don’t really want you moving users’ data around between them. They want users to sign up with them and then stay there – a great proposition for the service, but a lousy one for the users, who have no real way of importing or exporting the content they’ve created.

Furthermore, there are some situations where this one-API-per-service model breaks down completely. Let’s say you’ve joined a social network and you want to check to see which of your friends is already there. The service provider could use a Gmail API to check your address book, but some users will use Hotmail for their email, so they’ll need to use the Hotmail API as well. Repeat for Yahoo, and every single email provider under the sun – there’s no way anyone could possibly write this feature without a generic API that works across all services.

Enter the Open Stack, which is a set of generic APIs designed to provide common tasks:

  • eXtensible Resource Descriptor (XRD): allows an application to discover resources and APIs relating to a particular page (for example, an application could check a user’s profile page and discover that they have an OpenID). There’s no point in having open APIs if an application can’t find them, and XRD fills this gap.
  • OpenID: allows anyone to log into any service that supports it using the same digital identity (potentially linking that user’s accounts on those services together). A ID is an OpenID, for example; you can log into any OpenID-compatible site with it.
  • OAuth: a way to authenticate access to API-based services without forcing the user to present their username and password for that service. Previously usernames and passwords were passed in order to authenticate APIs, which led to serious security issues. Here a user can easily and securely grant or deny an application’s access to their data. OAuth can also be used to apply granular access controls to a piece of data.
  • Activity Streams: a way to broadcast your activity on each service (for example ‘Ben has published a blog post’, a la Facebook’s river) so that it can be aggregated at a central point.
  • Portable Contacts: a simple way to transmit collections of contacts, like email address books and lists of friends, between services.
  • OpenSocial: provides a framework for hosting remote application widgets on a service.

In programming jargon, a stack is a specific data or platform structure; actually, this is more of a pile of useful technologies that can (in part) be used together.

These generic APIs allow for a more distributed kind of web application. Suddenly, instead of writing support for a particular function from scratch, you can pick up an existing code library and slot it in. You can interact with any service that supports them without writing any further code.

However, there are some unresolved issues that need to be discussed. One major headache is the privacy implications. If I sign up to, I agree to the terms of service and privacy policy there. However, behind the scenes it might be accessing the APIs for, which has a very different privacy policy and terms and conditions. How does that second set of terms and conditions apply, and what’s the legality of passing a user’s data across two services with two different sets of terms, when the user only knows about one of them?

A second issue is a user’s control over their data. When sends data to, it may get cached and duplicated. What happens if I delete the original data on So far in the open stack there is nothing to handle deletion of content. It’s a fair argument that when something is published publicly you lose control over its distribution; however, if access has been limited using OAuth or another method, there’s no way of reaching out an ensuring its removal from every system it’s been sent to. That would be fine if the data had been sent to another person or application with the user’s express permission, because they’ll be aware of the implications. However, if it’s been sent completely behind the scenes, they have no way of knowing that it was sent to begin with.

These are issues with any API-based service, but generalized APIs are likely to be used more frequently. Solutions to all of these problems will be found, but it’s important to note that they’re not there yet – which serves as a warning to applications developers and an opportunity for anyone who wants to step up and provide them. As this kind of open API becomes commonplace, new kinds of web applications will begin to emerge as developers spend less time reinventing the wheel and more time innovating. It’s just one of the things that makes the web as a platform so exciting to work with.