Direct messaging in a social web architecture

Ben Werdmuller — March 31, 2010

This post is the third segment in my series on an architecture for the social web. Previously: How social networks can replace email, which is a non-technical approach to the issues, and my follow-up describing how to build a social web architecture using available technology today.

So what about direct messaging?

In my previous post, I described content notifications in the social web as being Activity Streams updates in response to requests signed with an OAuth key. Each individual contact would have his or her own OAuth key, and the system would adjust delivered content depending on access permissions I had assigned to them.

A private message in this architecture could just be represented as an item of content restricted to a small set of recipients (in the email use case, this is typically just one), with replies delivered using Salmon. The advantage of this approach is that the message doesn’t have to be text; it can be audio, video, a link to live software, or something else entirely.

However, while this is technically feasible, it may not always be desirable. We know from Google Wave, which also pushes the boundaries of person-to-person messaging, that an open definition of what a message contains can get very messy very quickly. Although I was one of the first people to have one, I no longer check my Wave account regularly. I believe this is mostly a user interface issue: Wave is an awesome collaborative document editor (what I’ve heard described as “a massively multiplayer whiteboard”), but not in any way the evolution of email that its development team claimed.

Therefore, I think it’s useful to think about the difference between a document and a message:

  • A message is the body of a communication.
  • A document is a bounded representation of some kind of information.

While in many ways they’re the same, I think it makes sense to make a separation on the UI level. As we’re discussing a decentralized architecture here, some kind of semantic marker in our activity stream feed to mark something as a message would be a useful feature.
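
As a minimal sketch of what such a marker could look like, assume each activity item is a simple record with a hypothetical `object_type` field and a `recipients` list. Neither name comes from the Activity Streams specification; they are placeholders for illustration. A client could then route messages to an inbox view and everything else to a document view:

```python
from dataclasses import dataclass, field


@dataclass
class ActivityItem:
    """A simplified stand-in for one entry in an activity feed."""
    author: str
    content: str
    # Hypothetical semantic marker: "message" or "document".
    object_type: str = "document"
    # Profile addresses allowed to see this item; empty means public.
    recipients: list = field(default_factory=list)


def split_feed(items):
    """Separate messages from documents so the UI can present them differently."""
    inbox = [item for item in items if item.object_type == "message"]
    documents = [item for item in items if item.object_type != "message"]
    return inbox, documents


feed = [
    ActivityItem("marcus.example", "New blog post"),
    ActivityItem("marcus.example", "Lunch on Friday?", "message", ["benwerd.com"]),
]
inbox, documents = split_feed(feed)
```

The underlying item is the same in both cases; only the marker decides which part of the interface it appears in.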

Messaging “out of the blue”

You know where you are with an email address. Mine is ben@benwerd.com. Anyone who encounters that string of characters, whether on a website like this one, a business card or a scribbled note on a piece of paper, is able to send me a message from anywhere in the world. In the 17 years I’ve had an email address, I’ve made countless friendships and business connections, and received and developed countless opportunities, through this simple mechanism. That is also likely to continue far into the future.

Compared to this, visiting someone’s social web profile and sending them a message from their web presence is a hassle. Compare these steps:

  1. Receive the address of someone’s profile
  2. Click the “follow” button either on the profile itself or on the toolbar of your social web compatible browser
  3. Wait for the contact to follow you back
  4. Send your message

To:

  1. Receive someone’s email address
  2. Send a message to that address

Email is simple, ubiquitous, decentralized and universally compatible. In fact, it seems hard to improve on, doesn’t it?

However, as this is a thought experiment about how social networking can replace email, let’s see if we can simplify this process somewhat. In my previous post, I discussed how a connection could be established with OpenID and OAuth through a web-based interface on a social web profile. How can we make this as simple as emailing someone, and cut out most of the steps I’ve listed above?

Connecting programmatically

I propose two additions to my previously discussed mechanism. The first is to expand the connection protocol to include a message. If someone connects to me on LinkedIn or Facebook, I receive some explanatory text from them, so it makes sense to include this feature in our decentralized social web architecture. It is likely that this would be an added parameter to the OAuth request token procedure.
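
To make the first addition concrete, here is a rough sketch of a request token call carrying an introductory message. The `oauth_*` names are standard OAuth 1.0 parameters, but `x_connection_message` is a hypothetical extension parameter I have made up for illustration, and signing is left out entirely:

```python
import secrets
import time
from urllib.parse import urlencode


def build_connection_request(consumer_key, message):
    """Assemble request-token parameters plus a hypothetical message field.

    Signing is deliberately omitted; a real OAuth request would also carry
    an oauth_signature computed over these parameters.
    """
    return {
        "oauth_consumer_key": consumer_key,
        "oauth_nonce": secrets.token_hex(8),
        "oauth_timestamp": str(int(time.time())),
        "oauth_signature_method": "HMAC-SHA1",
        # Hypothetical extension parameter, not part of the OAuth specification.
        "x_connection_message": message,
    }


params = build_connection_request("benwerd.com", "Hi Marcus, we met at the conference.")
body = urlencode(params)  # form-encoded body that would be sent to the node
```

The receiving node could show the message alongside the pending connection, much as LinkedIn or Facebook show a note with a friend request.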

The second is to allow connections to be made programmatically through a custom application. Just as we use email clients now, a social web client could automatically send a connection request. In keeping with our principle of using existing technology where possible, this is a simple OAuth connection request from the application, which includes a user message as described above. The application knows our details because we’ve set our preferences, so we’re never visibly redirected to a web browser to complete authentication. (In fact, this could take place using xAuth, a version of the OAuth protocol being developed for just these sorts of browser-free use cases.)

Whether we can send a follow-up message now depends on the receiving party. We have our OAuth token, and while it remains valid, the receiving social web node may choose to ignore any follow-up requests.

Our procedure has become:

  1. Obtain address of someone’s social web node (you could even infer it using WebFinger)
  2. Send a message to that node, bundled with a connection request

This is significantly better, and is comparable to the simplicity of email.
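
The first step, inferring a node from an email-style address, can be sketched as below. This assumes the lookup form that was later standardized for WebFinger in RFC 7033 rather than the draft mechanism available at the time of writing, and `webfinger_url` is a hypothetical helper name:

```python
from urllib.parse import urlencode


def webfinger_url(address):
    """Build a lookup URL for an email-style address such as ben@benwerd.com."""
    user, sep, host = address.partition("@")
    if not sep or not user or not host:
        raise ValueError(f"not an email-style address: {address!r}")
    # The "acct:" prefix marks the address as an account identifier.
    query = urlencode({"resource": f"acct:{address}"})
    return f"https://{host}/.well-known/webfinger?{query}"


lookup = webfinger_url("ben@benwerd.com")
```

A client would fetch that URL, read the profile links in the response, and then send its message and connection request to the node it finds there.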

You may be wondering about the wisdom of adding everyone you contact as a connection. In fact, there’s some precedent for this already in applications like Gmail. It’s important to note that not every connection need be a friend: in some ways, you can think of your total list of connections as your contact book. Some are important, some can be safely squirreled away until you need to contact them again. In this context (or any context where people you have a relationship with and people you’ve contacted are merged into one set), an adequate person management interface – or CRM to you and me – becomes important.

Next, and finally: let’s make our distributed social web architecture reliable enough to use in enterprise environments, using message queue protocols like ZeroMQ and AMQP.

Activity Streams and OAuth: a social web architecture

Ben Werdmuller — March 12, 2010

My previous post was a response to Gartner’s prediction last month that social networking would replace email as the “primary vehicle for interpersonal communications for 20 percent of business users.” In it, I named some properties that would need to be held by any social networking system that would successfully replace email.

  • Ease of use
  • Ubiquity across devices
  • Platform, service and infrastructure independence

My argument boiled down to the following statement:

Email has succeeded because it’s open, standard and decentralized; for social networks to replace it, they must also be open, standard and decentralized.

Email is useful because just about everybody has an email address. I can get in touch with my clients in London, my friends here in Oxford or my grandfather in Austin, Texas, with equal ease, even though all of them are using different infrastructure and software provided by different companies. I use Gmail, but there doesn’t need to be any kind of formal agreement between Google and whoever’s providing my grandfather’s email, say. It just works; nobody owns email as a communications method, and anyone can set up an email server. The same is true with websites: anyone can set one up, and nobody owns the web.

For social communications to be as popular and ubiquitous as email, there must be one social web, and it must be owned by nobody. That means that each socially-aware site or application must implement the same social communication standards.

The best standards aren’t dictated: they evolve through common usage. If you look at HTTP (the protocol that the web relies on), SMTP (one of the protocols behind email) and file formats like RSS and HTML, the common thread behind them is that they’re simple. It turns out that through excellent work at companies like Google, Plaxo, Six Apart, Twitter, JanRain and – perhaps incredibly – JPMorgan Chase & Co., we already have a number of technologies that collectively embody the properties I listed above.

Notes and server architecture for one possible social web

These are my ideas about how these standards might be used. These aren’t intended as replacements for existing social networking platforms or services; rather, they could easily be added as additional features both to those and to many other types of application. The ability to share isn’t a uniquely required feature of social networking software – think about its usefulness in applications like Word or Google Docs, for example.

With email, you use a software client (Outlook, say, or the Gmail web interface) that speaks to an email server which does the hard business of sending and receiving messages to and from the wider Internet. Here, I will be describing a system where everyone has their own node on the social web, which effectively acts as a client and server. Mine might be here at benwerd.com, for example. It’s my website – my profile on the social web – and it’s where I send social communications. That’s the server side. However, it also acts as the client when I’m accessing resources stored on other people’s servers.

Establishing connections and granting permissions

Let’s say I want to make a resource available to my clients. With email, I’d send them each a separate copy. This is both insecure and inefficient: I have no control over what happens to that copy, and each time I send it I create a new version. With some back-and-forth, there could easily be ten or twenty individual copies of a document floating around. (I often bounce software specifications – typically Word documents – around with my clients, and this is something that happens to me regularly. Google Docs is probably a better solution, but not everybody has a Google account.)

With the social web, only one version needs to exist, which I own. If my clients have established a connection with me, I can restrict that resource so that only they may see it. The tricky bit is that in order to know if it’s really them, they must be authenticated in some way.

In monolithic systems like Facebook, where everyone uses the same website, that’s easy: my client must be logged in, and we must have established a friend connection. In a decentralized system, that’s a much harder problem, but not insurmountable. Two technologies will help us:

  • OpenID: the open, decentralized authentication standard, which currently uses a website address as a kind of universal username
  • OAuth: an open protocol that “allows users to share their private resources (e.g. photos, videos, contact lists) stored on one site with another site without having to hand out their username and password.” OAuth provides a secret token to applications that they can use to access authenticated services and resources behind the scenes

Specifically, we’ll need OpenID Connect (or, until that’s up and running, the OpenID / OAuth hybrid protocol), because we’ll be using OpenID to authenticate, OAuth to power our decentralized access permissions, and a number of other protocols and endpoints along the way. It’s much neater if these are all established at once.

Making friends and getting updates

The process would work in the following way. Let’s say I want to make a connection with my friend Marcus Povey.

  1. I visit his site, and see that he is displaying a “connect to me” icon, indicating that it is a node on the social web. Later on, perhaps my browser would detect that this was a social web node in the same way that most browsers detect RSS feeds today, and light up an icon. Chris Messina has started a five-part series on the browser as a social agent, which is worth a read.
  2. Either way, I click on “connect to me”. Marcus’s site prompts me for the address of my profile, which I enter. (Later on, my browser does this bit for me.)
  3. My profile address is an OpenID, and through the authentication process my social web node receives an OAuth token from him. No further authentication is required.
  4. On his social web node dashboard, Marcus sees that I’ve established a connection with him. He can ignore it, in which case nothing happens, or he can mark me as a friend (or any other arbitrary designation, which could be unique to the software he’s using).
  5. My social web node periodically checks for activity updates from Marcus’s, signing each request with that OAuth token so it knows who I am. This may be at my direct request; through repeated polling, RSS-style; or the update may be pushed to me through a PubSubHubbub ping.
  6. Depending on the designation he’s given me, Marcus’s node either responds with just a feed of public activity (if he’s ignored the request), or with additional activity he’s allowed me to see, in Activity Streams format.
  7. Marcus can change my designation or withdraw my OAuth token at any time from his dashboard. (Of course, throughout all this, the OAuth token mechanism is invisible to both users: it’s simply presented as a social connection.)
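
The steps above can be sketched as a toy in-memory model. There is no networking, OpenID or real OAuth here: a random string stands in for the token, and `SocialNode` and all of its method names are hypothetical.

```python
import secrets


class SocialNode:
    """A toy in-memory model of one person's node; no networking or real OAuth."""

    def __init__(self, owner):
        self.owner = owner
        self.tokens = {}        # token -> profile address of the connecting contact
        self.designations = {}  # profile address -> label such as "friend"
        self.activity = []      # (text, set of designations allowed, or None) pairs

    def connect(self, profile_address):
        """Steps 2-3: issue a token standing in for the OAuth token."""
        token = secrets.token_hex(16)
        self.tokens[token] = profile_address
        return token

    def designate(self, profile_address, label):
        """Step 4: the owner marks a contact, for example as a friend."""
        self.designations[profile_address] = label

    def post(self, text, visible_to=None):
        """Publish an item; visible_to=None means public."""
        self.activity.append((text, set(visible_to) if visible_to else None))

    def feed(self, token=None):
        """Steps 5-6: return the items this caller is allowed to see."""
        contact = self.tokens.get(token)
        label = self.designations.get(contact)
        return [text for text, allowed in self.activity
                if allowed is None or label in allowed]

    def revoke(self, token):
        """Step 7: withdraw a token, leaving the caller with public items only."""
        self.tokens.pop(token, None)


marcus = SocialNode("marcus.example")
marcus.post("Public blog post")
marcus.post("Photos from the weekend", visible_to={"friend"})

token = marcus.connect("benwerd.com")
before = marcus.feed(token)   # connection not yet acknowledged: public items only
marcus.designate("benwerd.com", "friend")
after = marcus.feed(token)    # now includes friend-only items
marcus.revoke(token)
revoked = marcus.feed(token)  # back to public items only
```

The point of the design is that the feed address never changes; only the token presented with the request, and the designation behind it, decide what comes back.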

Embedded content and interacting directly on other social web nodes

Activity Streams is based on Atom, so content for items like blog posts (and resources like photos, using Atom Media) can be embedded directly in the activity feed. (Rob Dolin from Windows Live has some great examples.)

However, not all content is standard enough to be embeddable. In those cases, I can simply click through from Marcus’s activity update to his site, possibly log in again using OpenID, and interact with the content there. Additionally, by allowing users to log directly into his site via OpenID, Marcus can show selected people restricted content even if they don’t have the full range of social web software.

Friends lists and commenting

Further standards help us add extra functionality. If Marcus gives me permission, I might be able to download his contacts via Portable Contacts. Salmon is a protocol for commenting on distributed resources and allowing those comments to find their way upstream to the original, which is compatible with Activity Streams. Using this, I might be able to comment on Marcus’s activity items from within my dashboard and have them show up in his. Through this mechanism, all his friends could have a conversation on his activity stream items.
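
As a rough illustration of comments flowing upstream, here is a toy model. It is not the Salmon protocol itself, which specifies signed envelopes sent over HTTP; the class names are hypothetical and signature checking is reduced to a comment.

```python
class OriginNode:
    """Holds the original items and the comments that have flowed back to them."""

    def __init__(self):
        self.comments = {}  # item id -> list of (author, text)

    def receive_comment(self, item_id, author, text):
        # A real node would verify the sender's signature before accepting.
        self.comments.setdefault(item_id, []).append((author, text))


class Dashboard:
    """A reader's view of someone else's feed, with a route back to the origin."""

    def __init__(self, reader, origin):
        self.reader = reader
        self.origin = origin

    def comment(self, item_id, text):
        # The comment is not kept locally; it travels upstream to the original.
        self.origin.receive_comment(item_id, self.reader, text)

    def thread(self, item_id):
        """Return the conversation as stored on the originating node."""
        return list(self.origin.comments.get(item_id, []))


marcus_node = OriginNode()
ben = Dashboard("benwerd.com", marcus_node)
alice = Dashboard("alice.example", marcus_node)

ben.comment("post-1", "Great write-up.")
alice.comment("post-1", "Agreed!")
```

Because every comment ends up on the original item, both dashboards see the same thread without talking to each other directly.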

Reliability

So far, so good: we have a simple technological basis for permissive social communications. But if the social web is really going to replace email, we have to address one of the most important features for enterprise users: reliability. Businesses will not accept their critical communications being subject to fail whales.

In my next posts in the series, then, I’ll discuss person-to-person messaging and the thorny issue of guaranteed delivery.

How social networks can replace email

Ben Werdmuller — February 3, 2010

The analysis firm Gartner just released five key predictions for social software:

  1. By 2014, social networking services will replace e-mail as the primary vehicle for interpersonal communications for 20 percent of business users.
  2. By 2012, over 50 percent of enterprises will use activity streams that include microblogging, but stand-alone enterprise microblogging will have less than 5 percent penetration.
  3. Through 2012, over 70 percent of IT-dominated social media initiatives will fail.
  4. Within five years, 70 percent of collaboration and communications applications designed on PCs will be modeled after user experience lessons from smartphone collaboration applications.
  5. Through 2015, only 25 percent of enterprises will routinely utilize social network analysis to improve performance and productivity.

Social networks replacing email. Really?

I broadly agree with all of these, but that first prediction needs a little more analysis. Let’s think about why email has succeeded:

  • Ease of use
  • Ubiquity across devices
  • Platform, service and infrastructure independence

I access email from my Dell PC and my iPhone, and have in the past used BlackBerry phones, Macs, Linux boxes, etc, all the way down to Windows 3.1, using a combination of software that’s included Eudora, Thunderbird, Phoenix, Turnpike, and many more. Right now I use a combination of Gmail, Google Apps and self-hosted email addresses; in the past I’ve used Microsoft Exchange in various guises, Yahoo Mail, and so on. No matter which provider or hardware I used, I could email anyone else with an email address, no matter which provider or hardware they used. Email is a completely open, interoperable standard.

Social networking is anything but an open, interoperable standard. If you use Facebook, you can communicate with other people on Facebook, full stop. Even networks based on open source solutions like Elgg are essentially social islands.

What needs to be done?

I strongly believe that social messaging can be significantly more useful to both enterprises and individuals than standard email. Proof-of-concept applications like Google Wave are beginning to show the way: you can make resources available to whoever needs to see them, rather than the current, inherently insecure practice of making copies and sending them out. Whereas email takes inspiration from letters and faxes, the social messaging paradigm is based more closely around conference calls and conversations.

Nonetheless, in a business situation, you need to be reasonably certain your message is going to reach the recipient, and the current platform constraints – only being able to message someone using the same site as you – are untenable. Let’s look again at those email success factors:

  • Ease of use
  • Ubiquity across devices
  • Platform, service and infrastructure independence

Social networks do currently have ease of use. They may approach near-ubiquity across devices only if they create a developer ecosystem around their proprietary APIs, as Twitter has done, but this requires a lot of faith in a single third-party service.

No, I think it comes down to one principle:

Email has succeeded because it’s open, standard and decentralized; for social networks to replace it, they must also be open, standard and decentralized.

Next: real world, technical approaches to this that can be implemented today.

Microsoft may rule the open web

Ben Werdmuller — November 18, 2009

Yesterday, I posted some commentary on Tim O’Reilly’s take on the web as an application platform, and agreed that Microsoft championing the open web would be a very smart strategy for them.

Previously, I’d talked about the issues with cloud computing at the moment, and how an iPhone App Store approach to web applications would dramatically increase security and ease-of-use, and therefore the whole experience:

What if we could fix all of these things at once? Enterprises, organizations and individuals could have their own, more secure environment that would allow them to use the cloud applications they needed with fewer security risks, while enjoying the ease-of-use and immediacy that the cloud provides.

[…] Imagine if you could get your own server environment that was as easy to use as the iPhone.

Windows Azure is that product, built on Microsoft’s web platform infrastructure. Jorge Escobar took a look:

It picked my interest. A Web Platform Installer? Microsoft doing PHP?

I went to the URL provided and I was blown away with the concept behind this application. Basically Windows has introduced point-and-click cloud computing for the masses and it’s doing it in a way that resembles the iPhone application directory but for web applications.

The app gallery is available to browse today, and includes well-known applications like WordPress, Moodle and SugarCRM. Microsoft also has a product, the Web Platform Installer, available right now, which allows you to use these apps and easily set up a web environment on your own computer or server. Windows Azure will use the same model, but without the need for your own server: the applications will install seamlessly into the cloud. Personal users get their own cloud application space; enterprise users get to use their own infrastructure for extra security. This is where Microsoft’s going, and it’s very clever indeed.

The war for the Web

Ben Werdmuller — November 17, 2009

Tim O’Reilly has a great piece up on Radar:

If you’ve followed my thinking about Web 2.0 from the beginning, you know that I believe we are engaged in a long term project to build an internet operating system. (Check out the program for the first O’Reilly Emerging Technology Conference in 2002 (pdf).) In my talks over the years, I’ve argued that there are two models of operating system, which I have characterized as "One Ring to Rule Them All" and "Small Pieces Loosely Joined," with the latter represented by a routing map of the Internet.

This is exactly it (although for technical accuracy, I prefer the term “application platform” to “operating system”). The “one ring to rule them all” approach is the game being played by companies like Facebook and Google. “Small pieces loosely joined” is the open approach, which seeks to create an Internet application platform that isn’t reliant on any one service provider – much like most of the rest of the Internet works today. (Anyone can run an email server, for example, without having to hook up to a central email provider.) I strongly believe that this second approach is the only one that can ensure a secure future for the web.

The full article is worth a read. Most intriguing, for me, is Tim’s postscript:

P.S. One prediction: Microsoft will emerge as a champion of the open web platform, supporting interoperable web services from many independent players, much as IBM emerged as the leading enterprise backer of Linux.

I had a conversation yesterday with someone related to Microsoft which suggests that this isn’t the case. Nonetheless, it’s a genius strategy, and I hope someone up there in MicrosoftLand is listening. (And hey, Microsoft, if that’s what you’re up to – I want in.)

Beyond the echo chamber

Ben Werdmuller — June 22, 2009

It’s exciting to see some of the big names in the Silicon Valley web scene shift gears from evangelizing about the power of the social web to explaining to the outside world how it can be used. For example, Robert Scoble, sometime Microsoft videoblogger and latter-day net celeb, has started Building43:

A few people here and there are trying. I watch what Chris Messina, David Recordon, Marc Canter, Joseph Smarr, Kaliya Hamlin, and a group of others are trying to do by pushing a more open web. Those are the kinds of efforts that inspire me and are inspiring Building43. Can we build on what they are trying to do and take it to main street?

Marc Canter is taking it a step further and moving to Cleveland, Ohio, in order to start a new company that helps create Digital Cities:

Where workforce development, content production and local foods meet in the valley of health care, medical digitizing and the history polymers.  Add to that some Seniors interviews, green jobs knowledge bases and authorized venues, community services and common constructs – and you have our project!  Oh yah – and a business directory of……

In both cases, they’re taking the ideas that the web community has created – open, democratic platforms for content agnostic collaboration – and bringing them to communities and people who might not have been exposed to them but could benefit in real, tangible ways. The message I’m getting is that the theory has gained momentum and is rolling into something great; now it’s time to bring it to the world.

And me? I’m in Washington DC this morning, talking with the AAC&U about how these ideas can be used in education.

Social networking: beyond the silo

Ben Werdmuller — June 8, 2009
  1. The rise of social networking
  2. Monetization vs. collaboration
  3. The open web
  4. Fluid collaboration

The rise of social networking

Social forces have been the driving force behind application innovation on the web. Whereas previously we might have looked to advances in computer science for new directions, now some of the most dramatically impactful applications are lightweight, simple, and technologically unimpressive. The best new web applications have centered around collaboration, sharing and discovery with other people.

Correspondingly, enterprises have been relatively quick to pick up on this trend, and software vendors have been quick to grab the market. In an Intranet Journal article earlier this year, Kara Pernice, managing director at the Nielsen Norman Group, had this to say about the rise of social technology on the intranet:

"In the 9 years [the Intranet Design Annual, which highlights the ten best-designed intranets of the year] has been coming out (since 2001), I’ve never seen a change quite as great as this one."

On the Internet at large, social network use is growing at ten times the rate of other activities, now accounts for 10% of all online time, and is now more popular than email, according to Nielsen Online in this March 2009 report (PDF). Jeremiah Owyang has a list of more relevant statistics over on this digest blog post. Executive summary: social networks are big, transformative in terms of how we communicate and share information, and growing at an enormous rate.

Monetization vs. collaboration

Wikipedia defines a “walled garden”, in software terms, as being:

[..] A closed set or exclusive set of information services provided for users (a method of creating a monopoly or securing an information system).

In other words, a walled garden is a system where the data cannot easily be imported or exported. These are often also called data silos, after the solid buildings used for secure storage.

Facebook, the #1 social networking site in most western countries, has over 200 million users, including over 30 million who update their profiles at least once a day. The network is free to use, yet their revenue for 2008 has been estimated at around $265 million, despite a decidedly “in progress” revenue strategy.

This has traditionally required a walled garden strategy: the content that users put into Facebook has not been easily removed for export or viewing in other interfaces, in order to preserve revenue from advertising (and – although this is a hunch – revenue from statistical analysis of users’ data). It’s only been in the light of some extremely negative publicity (for example this February 2008 New York Times article) that they have begun to relax this policy and embrace the open direction that much of the rest of the web is heading in.

Speaking personally, I get more enquiries from people wanting to build something “Facebook-like” than anything else, presumably because of its phenomenal popularity. However, this kind of walled garden approach is not conducive to true collaboration; generally people who ask for this are lacking a full understanding of the processes involved in social networking.

According to Nielsen, there are almost 1.6 billion people online. While Facebook’s 200 million sounds like a lot, it’s actually a drop in the digital ocean – so what happens if I want to share a Facebook conversation with someone who hasn’t signed up? Currently, the only way is to email them a link and force them to register for the service. Facebook would love me to do this, of course, because they get more eyeballs to view their ads and more people to fill in profiles. But what’s the point of even being on the web if you can’t make use of the decentralized communication features that form its backbone?

If I want to collaborate effectively online centering around a resource (which could be a file, a discussion or a pointer to something external), I need to be able to:

  • Share that resource with the people who need to see it
  • Grant access for them to edit it if required
  • Notify them that it’s been shared with them
  • Restrict access from everyone else

Furthermore, I need to do this with the lowest possible barrier to entry. My aim is to collaborate, not to get people to use a particular piece of software. By restricting this process, the Facebook model hinders collaboration.
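
The four requirements listed above can be sketched as a single shared resource with an access list. This is a minimal illustration rather than a design for any real service: `SharedResource` and its method names are hypothetical, and notifications are only queued in memory.

```python
class SharedResource:
    """One copy of a resource, with per-person access held by its owner."""

    def __init__(self, owner, content):
        self.owner = owner
        self.content = content
        self.access = {owner: "edit"}  # profile address -> "view" or "edit"
        self.notifications = []        # (recipient, note) pairs queued for delivery

    def share(self, person, can_edit=False):
        """Share with one person, optionally granting edit rights, and notify them."""
        self.access[person] = "edit" if can_edit else "view"
        self.notifications.append((person, f"{self.owner} shared a resource with you"))

    def read(self, person):
        """Return the content, refusing anyone it has not been shared with."""
        if person not in self.access:
            raise PermissionError(f"{person} may not view this resource")
        return self.content

    def edit(self, person, new_content):
        """Replace the content, but only for people with edit rights."""
        if self.access.get(person) != "edit":
            raise PermissionError(f"{person} may not edit this resource")
        self.content = new_content


spec = SharedResource("benwerd.com", "Draft specification v1")
spec.share("client.example", can_edit=True)
spec.share("reviewer.example")
spec.edit("client.example", "Draft specification v2")
```

Nothing here requires the people involved to use the same service; all it needs is a way to know who is asking, which is the part a walled garden keeps to itself.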

The open web

The web was designed to be an open system, and adheres to principles (notably “every object addressable”, ensuring that every resource on the web has a unique reference address) set out by Doug Engelbart for open hypertext systems generally. Because web pages are interoperable, and all use the same basic standards, any page on the web is allowed to link to any other page on the web, no matter who wrote it or where it is hosted. In many ways that’s the key to why the platform is successful: despite being fragmented across millions of computers throughout the world, it navigates like a cohesive whole and can be viewed using a single piece of browsing software. (The downside to this is that the whole platform lives or dies depending on the capabilities of the browser you use: the sad fact is that Internet Explorer users, who often don’t have a choice because of policy decisions in their working environment, are at a disadvantage.)

While the original web was content-based, the social web is collaborative and centered around live data. However, because web applications are each developed separately using different sets of back-end infrastructure, their data does not adhere to the principle of interoperability – their user interfaces all use the same basic standards and can be viewed in a browser, but the underlying applications and data models tend not to work with each other. When social networks emerged, for example, there was no way to get LiveJournal and Friendster, two of the pioneers in the space, to speak the same language; you still can’t add someone as a friend on one social network from another. More recently, this has become apparent in the walled garden approaches of Facebook and others.

Not only does this situation create a bottleneck for application design, and run contrary to the underlying principles that made the web a success, but it’s also a bottleneck to better collaboration. As Tim Berners-Lee, the web’s inventor, put it recently in this essential TED talk, data needs to be linked and interoperable in the same way pages are now. Beyond that, because walled garden services are making money out of the private information we’re loading onto them, there’s a human issue regarding the overall control of that data. Marc Canter, Joseph Smarr and others codified this into a Bill of Rights for users of the social web back in 2007. Though the issue has moved on since then, the underlying principles set out there are essential for open, collaborative, social tools on the web.

While the World Wide Web Consortium works on academically-developed standards for linked data in the form of the semantic web, developers have been getting their game on trying to solve the problems of interoperability between their applications and user control over their data. Application Programming Interfaces (APIs) – published sets of instructions for programmatically querying and extending web applications – have become popular, but in a very walled garden kind of way. Arguably the most successful has been Twitter’s API, which has led to a number of high profile third-party applications like TweetDeck and Tweetie that collectively eclipse Twitter’s own website interface in volume of usage. But these APIs are their own form of walled garden: an application written for Twitter will only work with Twitter, for example. The APIs are not generalized between applications, and as such are not truly open; in many ways they’re a way for services to get more functionality and reach for free.

One of the first attempts to publicize the benefits of truly open data was Marc Canter’s Data Sharing Summit, which I wrote about at the time for ZDNet. Chris Saad’s DataPortability.org attempted (largely successfully) to brand it, and latterly the Open Web Foundation has attracted some of the web’s leading lights in order to create a single organization to handle the creation of a set of open web application standards. Many of these comprise the Open Stack, which I’ve written about before; more generally, Chris Messina has written a very thoughtful overview on the topic.

Fluid collaboration

It used to be that to use the web, you would need to sit down at your computer and log on. Those days are over; the web is becoming more and more ubiquitous, thanks to devices like the iPhone. It’s also being integrated into software that wasn’t previously connected – it’s as easy, for example, to paste the URL of an image into the ‘Insert Image’ dialog box in most word processors as it is to pick an image from your own hard disk. The open, generalized API standards being created by groups like the Open Web Foundation bring us closer to enjoying that level of integration with collaborative social technologies.

The Internet is people, not technology: tools on the web (or anywhere else) facilitate social networks, but are not the networks themselves. Currently these tools consist of destination sites, like Facebook, LinkedIn or Twitter – places that you explicitly have to visit in order to collaborate or share. This is the currently-fashionable model, but it’s a necessarily limited view of how collaboration can take place: all of these sites thrive on the walled garden model and are designed around keeping participation within their walls.

Not everything on the Internet works this way. Email, and increasingly Instant Messaging, are two technologies that generally do not: messages on email, Jabber and to a much lesser extent Skype pass between independent providers or peers rather than through a single central service. An exchange works like this:

  1. You select the people you wish to collaborate (in this case, email or chat) with. Nobody but the listed recipients will be able to see the content you share with them, and it doesn’t matter if they’re using the same service as you; you don’t have to invite them to join email in the same way you have to invite people to join Facebook.
  2. You write your content.
  3. You send it.
  4. They (hopefully) send content back.
  5. The collaborative exchange lasts only as long as it’s useful, and then disappears (but is archived for reference).

Recently, Google announced Wave, a decentralized pairing of protocol and open source web application that took email and IM as its inspirations to redefine how collaborative social technologies could work. Questions have been raised about how a decentralized tool like this can work with corporate data policies present in most large enterprises and public sector organizations, but in some ways they miss the point: Google Wave is best thought of as a proof of concept for how decentralized, transient communities can work in a standard way on the web. In short, websites are a kind of walled garden in themselves: what we will return to is the idea of the web as an open patchwork of people, data and information that links together to form a whole, much stronger than the sum of its parts.

Predicting the future of social networking on the web is hard. However, I believe that as general open social technologies develop and become more commonplace, the “social networking site” will shrink in importance – instead, social network facilitators will become more and more ingrained in all the software you use. This will dramatically increase the types of content and communication that can be used, and present opportunities for much wider, more fluid and – most importantly – more productive collaboration as a whole.

User control on the open web

Ben Werdmuller — February 21, 2009

Data portability and the open data movement (“the open web” for simplicity’s sake) revolve around the idea that you should be able to take your data from one service to another without restriction, as well as control who gets to see it and how. Very simply, it’s your data, so you should have the ability to do what you like with it. That means that, for example, if you want to take your WordPress blog posts and import them into MovableType (WordPress’s competitor), you should be able to. Or you should be able to take your activity from Facebook and include it in your personal website, or export your Gmail contacts for backup or transfer to a rival email service.

You can do this on your desktop: for example, you can open a Word document in hundreds of word processors, and Macs will happily talk to Windows machines on a network. Allowing this sort of data transport is good for the web in the same way it’s good for offline software: it forces companies to compete on features rather than the number of people they can lock into their services. It also ensures that if a service provider goes out of business, a user’s data on that service doesn’t have to disappear with it.

In 2007, before the open web hit most people’s radars, Marc Canter organised the first Data Sharing Summit, which was a communal discussion between all the major Silicon Valley players, as well as many outside companies who flew in specially to participate (I attended, representing Elgg). One of the major outcomes was the importance of central control: the user owns their data. Marc, Joseph Smarr, Robert Scoble and Michael Arrington co-signed a Bill of Rights for the Social Web which laid these principles out. It wasn’t all roses: most of the large companies present took issue with the Bill of Rights, and as I noted in my write-up for ZDNet at the time, preferred the term “data control” to “data ownership”. The implication was simple: users didn’t own the data they added to those services.

Since then, the open web has been accelerating as both an idea and a practical reality. Initiatives like Chris Saad’s DataPortability.org and Marc Canter’s Open Mesh treatise, as well as useful blunders like Facebook’s recent Terms of Service mis-step, have drawn public attention to its importance. Facebook in particular forces you to license your content to it indefinitely, and disables (rather than deletes) your account details when you choose to leave the site. Once you enter something into Facebook, you should assume it’s there forever, no matter what you do. This has been in place for some time to little complaint, but when Facebook overreached with its licensing terms, it made international headlines across the mainstream press: control over your data is now a mainstream issue.

Meanwhile, technology has been improving, and approaches have been consolidated. The Open Stack is a collection of real-world technologies that can be applied to web services in order to provide a base level of openness today, and developments are rapidly emerging. Chris Messina is leading development around activity streams portability, which will allow you to subscribe to friends on other services and see what they’re up to. The data portability aspect of the open web is rapidly becoming a reality: you will be able to share and copy your data.

Your data will be out there. So, what happens next?

The same emerging open web technologies which allow you to explicitly share your data from one service to another will also allow tools to be constructed cheaply out of functionality provided by more than one provider. Even today, a web tool might have a front end that connects behind the scenes to Google (perhaps for search or positioning information), Amazon (for storage or database facilities), and maybe three other services. This is going to drive innovation over the next few years, but let’s say a user on that conglomerated service wants to delete their account. Can they reliably assume that all the component services will respect their wishes and remove the data as requested?
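To make the deletion question concrete, here is a minimal Python sketch of a front end asking each of its component services to remove a user's data. Everything in it is an assumption for illustration: `delete_everywhere`, the service names and the stand-in functions are hypothetical and do not correspond to any real provider's API.

```python
def delete_everywhere(user_id, services):
    """Ask each component service to delete a user's data.

    `services` maps a service name to a callable that returns True when
    that service confirms deletion. Returns the names of the services
    that did not confirm, in the order they were asked.
    """
    unconfirmed = []
    for name, delete in services.items():
        try:
            confirmed = bool(delete(user_id))
        except Exception:
            # A failed request is treated the same as no confirmation.
            confirmed = False
        if not confirmed:
            unconfirmed.append(name)
    return unconfirmed


# Hypothetical stand-ins for real component services.
def search_delete(user_id):
    return True   # this service confirms deletion


def storage_delete(user_id):
    return False  # this service accepts the request but does not confirm


services = {"search": search_delete, "storage": storage_delete}
print(delete_everywhere("user-42", services))
```

Even in this toy version, the most the front end can do is report which services failed to confirm. Whether a confirmation can be trusted is exactly the open question.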

As web tools become more sophisticated, access control also becomes an issue. When you publish on the web, you might not want the entire world to read your content; you could be uploading a document that you’d like to restrict to your company or some other group. How do these access restrictions persist on component services?

One solution could be some kind of licensing, but this veers dangerously close to Digital Rights Management, the hated technology that has crippled most online music services and players for so long and inhibited innovation in the sector. Dare Obasanjo, who works for Microsoft and is usually a good source for intelligent analysis, recently had this to say:

[..] I’ve finally switched over to agreeing that once you’ve shared something it’s out there. The problem with [allowing content to be deleted] is that it is disrespectful of the person(s) you’ve shared the content with. Looking back at the Outlook email recall feature, it actually doesn’t delete a mail if the person has already read it. This is probably for technical reasons but it also has the side effect of not deleting a message from someone’s inbox that they have read and filed away. [..] Outlook has respected an important boundary by not allowing a sender to arbitrarily delete content from a recipient’s inbox with no recourse on the part of the recipient.

The trouble is that many services make money by selling data about you, either directly or indirectly, and these are unlikely to relinquish your data (or information derived from it) without some kind of pressure. I agree with Dare completely on the social level, with content that has been shared explicitly. Certainly, this model has worked very well for email, and people like Plaxo’s John McCrea are hailing the fall of ‘social DRM’. However, content that is shared behind the scenes via APIs, and content that is shared inadvertently when agreeing to perform an action over something like OAuth or OpenID, need to obey a different model.

The only real difference between data shared as a deliberate act and data shared behind the scenes is user interface. Everyone wants the user to have control over data sharing via a clear user interface. Should they also be able to enforce what’s done with that data once it transfers to a third-party service, or should they trust that the service is going to do the right thing?

The open web isn’t just for trivial information. It’s one thing to control what happens to my Dopplr information, or my blog posts, or my Flickr photographs. I really don’t mind too much about where those things go, and I’d imagine that most people would agree (although some won’t). Those aren’t, however, the only things the web is being used for: there are support communities for medical disorders, academic resources, bill management services, managed intranets and more out there on the web, and these will begin to also harness the benefits of the open web. All of them need to be careful of their data. Some of them need to do so for legal reasons; some of them need to do so for ethical reasons. Nonetheless, they could all benefit from securely being able to share data in a controlled way.

To aid discussion, I propose the following two categories of shared data:

  • Explicit shares – information that a user asks specifically to share with another person or service.

    Examples:

    • Atomic objects like blog posts, contacts or messages
    • Collections like activity streams
  • Implicit shares – information that is shared behind the scenes as a result of an explicit share, or to provide some kind of federated functionality.

    Examples:

    • User information or shadow accounts transferred or created as a result of an OpenID or OAuth login
    • User settings
    • User contact details, friend lists, or identifiers
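The two categories above could be modelled as a small data structure, which would let a service show users the implicit shares they might otherwise never see. This is only a sketch: `ShareKind`, `Share` and `implicit_shares` are hypothetical names, not part of OpenID, OAuth or any existing standard, and the recipients are made-up examples.

```python
from dataclasses import dataclass
from enum import Enum


class ShareKind(Enum):
    EXPLICIT = "explicit"  # the user asked specifically to share this
    IMPLICIT = "implicit"  # shared behind the scenes as a side effect


@dataclass(frozen=True)
class Share:
    what: str        # e.g. "blog post", "friend list"
    recipient: str   # person or service receiving the data
    kind: ShareKind


def implicit_shares(shares):
    """Return only the shares that happened behind the scenes."""
    return [s for s in shares if s.kind is ShareKind.IMPLICIT]


shares = [
    Share("blog post", "alice@example.org", ShareKind.EXPLICIT),
    Share("activity stream", "reader.example.com", ShareKind.EXPLICIT),
    Share("friend list", "reader.example.com", ShareKind.IMPLICIT),
    Share("user settings", "reader.example.com", ShareKind.IMPLICIT),
]

# Surface implicit shares in plain, non-technical language.
for share in implicit_shares(shares):
    print(f"{share.recipient} also received your {share.what}")
```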

For the open web to work, both clearly need to be allowed. At a very base level, though, I think that users need to be made aware of implicit shares in a clear, non-technical way. (OpenID and OAuth both allow the user to grant and revoke access to functionality, but they don’t control what happens to data that was handed over while access was granted, which the receiving service is likely to keep.) Services also need to provide a facility for reliably controlling this data. Just as I can Creative Commons license a photograph and allow it to be shared while restricting anyone’s ability to use it for commercial gain, I need to be able to say that services can only use my data for a limited time, or for limited purposes. I’m not calling for DRM, but rather for a published best practice that services would adhere to and publicly declare their allegiance to.
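As a rough illustration of "a limited time, or for limited purposes", a usage declaration attached to shared data might look like the Python sketch below. The `UsagePolicy` class, its fields and the purpose labels are all assumptions made for this example. Nothing here enforces anything: it is the kind of check a cooperating service would run voluntarily under a published best practice, not DRM.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class UsagePolicy:
    granted_at: datetime   # when the user shared the data
    max_age: timedelta     # how long the service may keep using it
    purposes: frozenset    # allowed uses; labels here are hypothetical

    def permits(self, purpose, now):
        """True if `purpose` is allowed and the grant has not expired."""
        not_expired = now < self.granted_at + self.max_age
        return not_expired and purpose in self.purposes


granted = datetime(2009, 2, 21, tzinfo=timezone.utc)
policy = UsagePolicy(granted, timedelta(days=30), frozenset({"display"}))

# An allowed purpose, within the time limit.
print(policy.permits("display", granted + timedelta(days=1)))
# A purpose the user never agreed to.
print(policy.permits("advertising", granted + timedelta(days=1)))
# An allowed purpose, but after the grant has lapsed.
print(policy.permits("display", granted + timedelta(days=31)))
```

A service honouring the declaration would call `permits` before each use of the data and stop using it once the check fails.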

Without this, the usefulness of the open web will be limited to certain kinds of use cases – which is a shame, because if it’s allowed to reach its full potential, it could provide a new kind of social computing that will almost certainly change the world.

Except where stated otherwise, all posts in this weblog are licenced under a Creative Commons Licence.