Hey Dude, Where’s My Data?

For the past couple of days I’ve been in Barcelona in order to attend the Bazaar Project’s seminar, Hey Dude, Where’s My Data?

Flippant name, important concept (one that I’ve been banging on about for ages in this blog): the free tools out there on the web aren’t really free. They present an important and exciting opportunity for learners and educators, but there’s a trade-off in terms of data ownership and intention; the tools exist to make their owners a profit. How do we advise institutions and individuals alike, and provide a more data-safe environment without losing freedom and functionality?

I recorded a video about the event, including a post-event reaction from Graham Attwell. (This is my first attempt at editing / overdubs, and I realise it’s dodgy as hell. Will do better next time.) Below follows the position I submitted.

{{video:http://video.google.com/googleplayer.swf?docId=412570950581173830&hl=en-GB}}

The consumer web has dramatically changed the layout of the Internet and the underlying purpose behind the network. No longer is it simply a method for academics and professionals to exchange information; it’s that, but it’s also a video delivery mechanism, and a way to chat to your friends, and a way to play videogames with people all over the world. When Tim Berners-Lee invented the World Wide Web in 1989, he originally envisioned it as a collaborative read-write medium, and we are closer to his original vision than ever before.

Advances in web technologies have meant that more interactive services can be built than were previously possible. HTML has been streamlined, and the introduction and further development of CSS allows for a range of different styles to be dynamically imposed upon the same content. Meanwhile, the server and database software required to store user data in dynamic web applications has risen in quality and fallen in price, often to zero. Correspondingly, easy web scripting languages have been developed by enthusiasts in order to make associated programming easier. This new ease of programming and the rise of free, open source software have combined to dramatically lower the barrier to entry for people wanting to build a web application. Apache, MySQL and PHP, in large part, have made web 2.0 possible, and are largely responsible for the massive number of web applications available.

Since the first dotcom boom at the turn of the century, investors have been interested in making a profit through web services. Some web 2.0 sites have the potential to be very useful to a large number of people, and as a result have received funding from venture capitalists and angel investors. Unfortunately, most don’t have a business model, and the result is that organisations used to creating cool features and developing tools have suddenly had to find a way to repay millions of dollars. The only major asset these tools have to assist with monetisation is their userbases.

By their nature, web 2.0 sites obtain information about their users. Del.icio.us knows what kinds of sites you’re interested in; MySpace knows your musical preferences, whether you’re single, where you live and the kinds of people you’re interested in; LinkedIn knows the industry you work in and the people you’re connected to; Flickr knows what you take pictures of, what camera you use and where you took them; and so on. These sites have our information, and by necessity use it to make money and repay their debts.

Most often, this takes the form of advertising. Sometimes, this demographic information may be shared with other parties. There may be other uses that we aren’t aware of (for example, MySpace is run by News Corp, the multinational media megacorporation that owns Fox amongst other properties; they are likely using demographic information to inform their media decisions). One glance at the Techcrunch Dead List will tell you that advertising isn’t always enough, and web 2.0 applications can very easily disappear, usually taking your data with them.

However, the rise of web 2.0 has not been limited to centralised applications like Flickr or Writely that sit on a corporate server somewhere. There are, both in the open source and commercial software arenas, software products that use similar principles but allow you to host them on your own infrastructure. Although there is an obvious increase in cost for this (most centrally-hosted web 2.0 applications are free to use due to their desire to grow their userbases), there are significant benefits: if you control the server, you also control exactly what happens to the data.

Regardless of the approach with respect to application decentralisation, open standards are the only way forward, in order that we are not limited to one specific application for a particular purpose. If Writely disappears, for example, we should be able to switch to Zoho Writer, or the online wordprocessor we happen to be hosting at our institution, or the offline wordprocessor we have installed on our laptop, or the notepad on our PDA. A Writely document needs to work in Microsoft Word, and vice versa. If we move from one institution to another, we need our ePortfolio to come with us, even if the institutions use two different ePortfolio applications.

Open standards don’t have to be those that arise from public organisations. Standards that arise from business are not to be feared, and can often be better than those developed in academia or the public sector. While HTML arose from academia, for example, RSS was developed by Netscape, and even Microsoft have recently developed some interesting standards for sharing content.

Unfortunately, the tendency has been for education-related standards to be monolithic and rigid, whereas for a standard to obtain widespread adoption it must be easy to implement and flexible. This is the main reason RSS has been so successful, for example; it’s easy to program for, easy to read, and RSS 2.0 can be extended using standard XML namespaces. A good standard should be able to deal with every eventuality through being general and extensible, rather than attempting to cover every single possibility with a new facet to the specification. This generality may have its own problems, but a standard isn’t a standard if nobody else is using it: it’s simply a definition. Therefore, a good standard should also be written with the demands of users and software developers in mind, and if open standards already exist that people are using, they should be worked with where possible. In the case of ePortfolios, which are intended for use in life-long learning, we need our ePortfolio content to be usable in more generic commercial applications once we leave education.
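To make the point about namespaces concrete, here is a minimal sketch in Python of an RSS 2.0 item carrying one extra element from the Dublin Core namespace. The feed content, the example.org URLs and the read_items helper are all made up for illustration; only the RSS 2.0 element names and the Dublin Core namespace URI are real. A reader that knows nothing about the extension can still pull out the title and link, which is what makes this kind of extensibility cheap to adopt.

```python
import xml.etree.ElementTree as ET

# Dublin Core element set namespace, used here as the extension vocabulary.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical feed: plain RSS 2.0 plus one namespaced element (dc:creator).
FEED = f"""<rss version="2.0" xmlns:dc="{DC_NS}">
  <channel>
    <title>Example ePortfolio</title>
    <link>http://example.org/portfolio</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>Seminar notes</title>
      <link>http://example.org/portfolio/seminar-notes</link>
      <dc:creator>A. Learner</dc:creator>
    </item>
  </channel>
</rss>"""


def read_items(xml_text):
    """Return each item's core RSS fields plus the optional dc:creator."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # Core RSS 2.0 elements live in no namespace.
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # Extension element: looked up by its fully qualified name;
            # this is None if the feed does not use the extension.
            "creator": item.findtext(f"{{{DC_NS}}}creator"),
        })
    return items


if __name__ == "__main__":
    for entry in read_items(FEED):
        print(entry)
```

An application that only cares about titles and links can ignore the creator field entirely, and the same feed still works for it unchanged.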

When faced with a technological problem, we must look at the entire set of software available to us, regardless of its origin, and make our choices accordingly. Open source is not always the answer; commercial software is not always the answer. The traditional view of open source software is that you will not receive the same level of support, but this is usually not the case for major open source projects; commercial support services will always exist, and usually there is an additional element of community support. There are often perceived quality issues with open source, but these are irrelevant; open source is often stable and fully-featured, and commercial software is often buggy and lacking. The number one web server in the world, Apache, which powers over 70% of websites, is open source; the number one web browser in the world, Internet Explorer, is commercial.

Recommendations

We first need to determine the requirements for both applications and standards, and then build on existing developments where they exist, wherever they may arise.

Academic software, including ePortfolios, needs to be able to export its information to more generalised applications, and therefore must support generic open standards.

Open source projects in particular can often be influenced in order to meet particular criteria, or modified to better fit specific needs. For this reason, open source may be better in many cases.

Where possible, locally hosted applications should be used in preference to centrally hosted ones, and potentially developed where none exist.

It is often against the existing web industry’s interests to create decentralised applications that operate with open standards. Grants through funding bodies should be made available for open source projects with real world applications that adhere to these principles. The funding bodies should also attempt to manage developments in order to prevent overlapping projects and promote inter-institution collaboration.
