Journalism and data

How wonderful it would be if journalists published links to data, Jon Udell commented in his IT Conversation with David Stephenson. For example, NYT journalists make wonderful infographics and it would be great to share links to the source. We are so far away from that vision!
Several conventions of journalism work against this model.
Quick tasty bites. The infographics in NYT and other mainstream media are at best intended to rapidly convey the sense of a story to an intelligent but busy person. At worst, graphics are what Tufte calls “chartjunk” – visuals designed to look attractive and dramatic while muddling the underlying information, reflecting condescension to the audience and perhaps lack of understanding on the part of the designer. Whether the graphics are good or bad, the model is to be able to digest information quickly. If the viewer has to think, the designer is doing their job wrong.
What’s new? News is about what’s new, by definition. If there isn’t a news hook, there isn’t a story. Publishing data, and accumulating understanding and utility around that data, may be informative, but it isn’t “news”. Maybe it’s science, but that’s done by a handful of people and only makes the news when there is some sort of revelation or discovery for “the public” to consume.
Journalists quote experts. The news model assumes that the gathering and analysis of data is done by expert researchers. The role of the journalist is to represent the newsworthy findings, and then to interview experts with contrasting opinions about the findings.
In order for journalists to publish explorable data, all of these assumptions need to change.
Wiki model of journalism. Think of news as recent changes in a wiki. Instead of a stream of novel and quickly forgotten news bites, a wiki sees information as changes in an underlying database. “News” happens when new information is added, or when new connections are made in existing information. The “news” isn’t disconnected from the existing body of knowledge.
A continuum of expertise. If the data sources behind the news were public, the experts compiling the data would be able to explain some of the import themselves, journalists would curate, and passionate amateurs would be able to contribute at various levels. More people still consume than produce any given thing, but the population of creators and explainers at the top of the pyramid is expanded and diversified.
Dramatic moments and underlying trends. The most artful data models would reward both quick grasping of salient information and slower exploration. This is possible but really, really hard. The best in human intellectual and artistic achievement attains this quality. Perhaps this is a use for the surplus that Clay Shirky envisions: the thousands of Wikipedias that would be possible if people spent more of their time adding and less of it consuming.
The good news is that we have the tools, and collectively the time, to start building these resources today. The model will change by people doing and explaining.

Metered broadband protects the market for cable

Comcast’s 250GB transfer cap is equivalent to 2.5 hours of HD video, as observed deep in the comments of Om Malik’s post on the Comcast bandwidth cap. The internet use that would threaten Comcast’s business over the next few years is exactly what’s banned. Strategically diabolical, if you have the market power to do it.
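The 2.5 hours figure depends entirely on what bitrate you assume for HD video, and on whether it is meant as a total or as a daily allowance. Here is a rough back-of-the-envelope sketch of the two readings. The decimal gigabyte, the 30-day month, and the per-day interpretation are my assumptions; none of them come from the comment thread.

```python
# Assumptions (not from the post): decimal gigabytes and a 30-day month.
BITS_PER_GB = 8 * 10**9
DAYS_PER_MONTH = 30

def implied_mbps(cap_gb, viewing_hours):
    """Average bitrate in megabits per second that spends cap_gb in viewing_hours."""
    return cap_gb * BITS_PER_GB / (viewing_hours * 3600) / 1e6

# Reading 1: 2.5 hours is the total viewing time for the whole cap.
total_reading = implied_mbps(250, 2.5)

# Reading 2: 2.5 hours is a daily figure, spread over the month.
daily_reading = implied_mbps(250, 2.5 * DAYS_PER_MONTH)

print(f"2.5 hours in total:  {total_reading:.1f} Mbps")
print(f"2.5 hours every day: {daily_reading:.1f} Mbps")
```

Under these assumptions the first reading works out to roughly 222 Mbps and the second to roughly 7.4 Mbps, so the per-day reading looks closer to what a compressed HD stream would need. Either way, the cap lands squarely on video watching.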

Reflections on Here Comes Everybody

Clay Shirky’s Here Comes Everybody covers some similar territory to Groundswell from a different perspective. Clay isn’t writing tactical advice for a corporate customer base. He talks on a broader scale about the relations between social software and society. Unlike the Forrester book, which has a lot of helpful explanation for skeptics, Shirky assumes that the trends are fundamental and builds from there. The Forrester book is helpful and necessary; Shirky’s bigger picture take is refreshing. The future isn’t evenly distributed; he’s just looking for glimpses of it.
The book provides a concise and catchy summary of some key internet trends observed over the last 10 years. The net frees a “cognitive surplus” – time that had been spent consuming television is being displaced by time spent taking pictures, making movies, playing games. The internet helps people find kindred spirits, for good or ill, whether fellow wiccans or fellow terrorists. The net provides more venues for large scale distributed collaboration, bringing wikipedia and Linux to the world. The net provides new opportunities for powerful organizing – flash mobs can take down governments. Internet organizing is not just about “cyberspace” as a new polity, but new ways of increasing and strengthening in-person ties with tools like Facebook and Meetup.
The big weakness in Shirky’s book is his understanding of organizing. His model of organizations is binary: hierarchies like the 19th century railroad and the Catholic Church on the one hand, and self-organized, evanescent entities like flash mobs on the other. The chapter on the Catholic Church is a major red herring. Shirky tells the story of VOTF, a group of Catholic lay people who used the internet to advocate for accountability for clergy who committed sexual abuse, and for members of the hierarchy who protected abusers and covered up the scandal. A similar scandal a decade earlier was swept under the rug, but with better tools to organize, VOTF had more impact.
This story is a red herring because the Catholic Church is exceptionally hierarchical. There are many other denominations that have had congregational models and influential lay leadership for hundreds of years. Catholics adhere to the Church in part because they believe that submitting to God and the church has spiritual value. If they wanted a congregational or informal or individual-centered model for spirituality, they have many other places to go. The influence of parishioners on the hierarchy is notable, but it’s dramatic because of the unusual nature of Catholicism, not because of the internet.
Shirky overestimates traditional hierarchies, and underestimates movement organizing. Shirky looks at the loosest forms of net organizing, flash mobs and other sorts of ad hoc protest, and wonders what on earth can be coming next. Long before the internet, there were international, organized movements. Anti-slavery, women’s rights, labor: these movements changed society, and they incorporated local organizing and coordination at a distance. They did depend on technology: reliable postal service, printing, and national / international travel that was within the reach of upper middle class people and funded working class people. People could organize locally, share ideas, transmit methodology, and organize support globally.
Shirky speculates about what might happen when people organize using the new tools, providing an extended case study of the norms and processes of the wikipedia community. Shirky is right that we don’t know what all of these forms will look like, but there are a number of models to observe by now; this is early sociology, not science fiction. Books by Fogel and Weber observe the organizational structures of large, successful open source projects. There are emerging practices for political organizing, at an earlier stage than open source software.
A flash mob needs very little organization. Unconferences need a tiny bit more. But structure isn’t a thing of the past. When people do net organizing over longer periods of time than an evening or a few weeks, they start to create structures. Long-term goals always require group memory, and may require money and legal protection. Wikis, mailing lists, and conferences transmit knowledge and culture from experienced folk to newbies. Established projects develop processes and often roles for decision-making. Established open source projects create or join foundation umbrellas. Political organizers create PAC arms. Organization itself is not going to fade away, as long as humans need to take action over time; the need to organize over time is fundamental to human culture. Flash mobs alone aren’t going to be able to address global warming or California’s water supply.
My hypothesis is that the biggest discontinuity will turn out to have been mass media. Popular culture is at least as old as civilization. Mass organizing is at least as old as modernity. The energy freed by the net is freed from watching television, and will free up some time to go back to organizing and folk culture (this doesn’t mean broadcast is dead; when people remix a tv show it’s alive again). Where Shirky is deeply right, I think, is the scale effect. The net enables organizing and folk culture at speed and scale unavailable in the past, and that will add up to differences we can’t predict.


Last week, I had a meeting with a staff person at a public service organization with a traditional approach to interacting with customers. He is interested in experimenting with social media for customer service and communication. But the organization as a whole reacts with fear and anxiety at the thought of using internet tools. Groundswell by Charlene Li and Josh Bernoff came immediately to mind. The book is targeted at business people whose companies fear engaging with their customers online, but are attracted to the opportunity, or don’t have much of a choice.
Groundswell lays out step by step processes for engaging with “the groundswell” – the masses online who are talking about you and your products whether you want them to or not. The beginning of the book is a catalog of fears – exposés, PR disasters, digital mobs, displacement by internet services. The rest of the book is a how-to guide for stepping into the roiling waters and engaging the groundswell.
A few things I liked about the book.
* The authors give good counsel about starting small, experimenting, and being patient. The well-known success of the Dove “Evolution” viral video took place after the champion spent two years laying the groundwork for it. Building social media takes time, and cultural adaptation takes time.
* Your customers want what your customers want. A company imagines that its customers are interested in its products, but the customer cares about what they care about. The best example was the forum sponsored by P&G. Young teen girls don’t want to talk about menstruation, they want to talk about their lives, and the forum provides a supportive environment for them. And by the way provides information about products.
* How does this help my business? Each section has a sample business analysis to help champions cost-justify engaging the groundswell.
One core Forrester technique used in the book is simultaneously helpful and somewhat iffy. A survey segments the behavior and preferences of customers by market and company. Organizations can use these demographics to choose which social media techniques to use to engage their customers. Customers are characterized as “creators”, “critics”, “collectors”, “joiners”, “spectators” and “inactives” based on their use of tools: a blogger is a creator; someone who rates things on Amazon is a critic, someone who bookmarks things is a collector. To some extent this is basic channel analysis. A business whose customers aren’t on Facebook, or even online, shouldn’t be wasting their time on Facebook. A business whose customers are active raters has a significant opportunity to incorporate ratings into its online presence.
The flaw in the tool is that the characterizations are moving targets. This is definitely true about tools. Facebook is only four years old, and its demographic has expanded in just a few years from college students to business networkers. And it may be true about behavior as well. It is a well-established observation that large communities have only a few percent active participants. Most people lurk, a few people take small actions, and a very few are highly active. This doesn’t take into account learning. How many more people take photographs because of flickr and digital sharing services? How many people start by watching youtube videos, and eventually make and share videos? I would be surprised if there wasn’t mobility among the categories. Some people move up the engagement curve as they learn and model after their friends. Some people move down the curve as they focus their attention on other things.
While the authors do a good job telling corporate readers that it’s not about them, the structure of the book has that focus. The book is targeted at Forrester’s customer base: big consumer products organizations desiring and fearing web 2.0. Forrester identifies their fears and sells them reassurance and good advice. For organizations who are in this situation – like the staff member at the nonprofit – this book really hits the spot.

Open source science with social software

Jean-Claude Bradley talks to Jon Udell about his use of social software in research and teaching in chemistry. Bradley’s lab at Drexel uses wikis as their lab notebooks (the norm in the field is still paper). Then, they use blogs and friendfeed to share links. By sharing their work in progress, they have found collaborators in related disciplines. He’s working on synthesizing malaria drugs, and has found bioinformatics specialists to predict compounds to test, and specialists who do clinical trials to test the compounds they synthesize. Scientists have traditionally found collaborators at conferences; the magic of google and friendfeed expands the circle of potential connections.
The patterns of social software use were similar to the ones used in technology and business. It’s delightful to hear about the patterns proving valuable in the practice of science. The discovery of collaboration partners is especially useful where there is the potential for interdisciplinary collaboration among people who wouldn’t necessarily find each other, because of organization structure or discipline boundaries or geography.
As a professor, Bradley uses podcasts to completely replace lectures. He uses the saved time to spend time with students in small groups and 1:1, coaching students in areas where they need help. Lectures remain a required part of the program – students need to listen to learn the material. But there’s no need to attend lecture hall.
The most unusual aspect of Bradley’s use of social software in science is his use of second life. He holds seminars and poster conferences in second life. It’s not required for students, but is a vibrant part of his teaching. Part of the value is the 3d nature of the subject – Bradley uses a special 3d modeling tool to explain chemical structures in second life. Bradley found that the avatars and social body language added valuable dimensions unavailable to text chat: avatars reveal more about people’s personality, and the virtual presence seems to make it easier to join conversations.
I was pretty surprised — and Jon Udell was also — that second life was being woven into something useful. Uses of Second Life to complement the real world had seemed more like stunts than natural augmentation of existing communication. Apparently Second Life has a strong presence in the chemistry field, with active presence by professional associations, making second life a useful way for undergrads to network with potential employers and grad school programs.
Other than the 3d modeling tool used in Second Life, Bradley doesn’t put much time into the tools. He’s happiest that social software evolution has made simple tools available to him and his students for free (they use wikispaces and google blogspot), so they can devote their technical attention to the practice of chemistry.
p.s. it’s fun to listen to podcasts during weekend housecleaning. This podcast complemented the cleaning of three bicycles. Thanks, Jon Udell and Simple Green.

Professional women watch tv on the net – I’m a demographic

I occasionally watch television programming – by-the-drink episodes on the internet or by-the-bottle series on DVD. I thought that was just a personal idiosyncrasy – a habit that is one part geeky, one part lazy (plan to watch? remember to record?), and one part ADD (must internet multi-task while watching video).
It turns out this is a mainstream phenomenon, and I’m in the heart of the demographic. Twenty percent of US television viewers watch using the internet, according to a recent study from IMMI. The largest segment getting their TV from the net are busy working women, ages 25-44. Thanks to strange attractor for the tip.


NBC’s internet-like coverage of the Olympics doesn’t let you watch coverage from another part of the world. Apparently they use IP addresses to segregate viewers into national ghettos. If you try to say you’re in Argentina or Andorra, they bounce you. It’s annoying enough that the NBC coverage for US viewers is mostly US athletes, with human-interest patter drowning out the events. With the Olympics you have no other choice, unlike most other events of international interest, where you can dip into international coverage and get multiple perspectives.
The Olympics are able to constrain the coverage because they have a scarce resource. The Olympics happen once every 4 years. It is feasible to constrain media and presentation. But imagine if the Olympic coverage was handled very differently.
With this year’s online Olympics coverage, you can select from a variety of recorded events, with easily searchable topics. Overall, there is more footage than anyone who’s not on bed rest can watch. Then there are little informative snippets, like a champion weight lifter explaining the Olympic lifts, or a gymnast explaining the judging rules. But it’s all from one perspective. The Olympics are the tip of a large iceberg of sports that are usually obscure. The good news is that the rest of the year these sports are obscure, so college gymnastics can be found on YouTube.
So, imagine if you could watch coverage from any nation. Imagine you could watch coverage from multiple perspectives, including the knowledgeable folk who pay attention to these sports all year long. Imagine people could add links to the YouTube videos of the obscure meets throughout the year. Imagine if people could add links to the coverage of these athletes in their local papers. Imagine if coaches could post tips on running and swimming based on the performance of these world-class athletes. Imagine if there could be ways to find your local clubs for cycling, swimming, volleyball, rowing.
Without a video monopoly, a site that could link together a broader and deeper array of content and conversation would reward more engagement. It would provide more opportunities for sponsors to make money. Broadcast network coverage would probably stay popular because of the production value and brand, even if the monopoly was lifted.
It is not even that large a stretch. Recently, other publishers of popular culture artifacts have started making peace with fan communities, creating hosted, sponsored sites for fans willing to take them up on the offer, and treating independent communities with benign neglect instead of persecution.
The Olympics would benefit from this approach. The producers believe that keeping a monopoly ensures they make money. They are not seeing the large amount of money they are leaving on the table.

Taxonomy is power

There’s a saying, “a language is a dialect with an army and a navy.” Categories reflect social power. This is true even with fictional things. A friend was describing a fantasy novel series. I googled and found its web page and wiki.
Lo and behold, the categories in the left navigation of the wiki read:
[Image: Picture 64.png, a screenshot of the wiki’s category list]
A funny set of categories to characterize this fictional world! And there’s a backstory — some of the fans wanted to classify dragons as people, and organize them by nationality, like people. But the maintainer of the wiki wanted to keep the categories of beings separate. Leading to a heated dispute about human/dragon racism.
No word on whether there is a full-fledged dragons rights movement. Or at least a protest t-shirt.
Even more backstory. That quote about dialects? It was a quote by a Yiddish scholar, made famous at a presentation in a conference in 1945, while WW2 was in progress.

Why Twitter updates are better than Facebook feeds – or not

Gregor Hochmuth has a fine but overinterpreted explanation of why twitter updates are better than facebook. I think that Gregor’s article attributes to the features of tools something that belongs as much to the differing uses of the tools.
Gregor observes that Twitter messages go to a defined audience, whereas Facebook newsfeeds don’t have the same effect because the items shown in the newsfeed are selected by algorithm. This is a good insight, but misses something important – the usage patterns of Twitter and Facebook differ from community to community and from person to person. I recently went on a group mountain bike ride with a group of women who aren’t tech geeks. They weren’t millennials – ages ranged from mid-thirties to mid-forties. The conversation turned to Facebook. Turns out, they use Facebook like people I know use twitter. They post message updates for their friends to see. And, they don’t use Facebook like people I know use Facebook. They don’t have lots of apps installed: not the social, “buy-you-a-drink” apps that presumably appeal to the young and partyish, and not the movie/books/music/scrabble sharing games. It’s a lightweight way to stay in touch. They don’t use twitter, and they don’t need it, because they use Facebook like Twitter. Without the updates “so-and-so rated 12 movies”, the personal updates are visible. So it’s not just about Facebook, but how people use Facebook.
Individuals also vary in the way they use the two tools to describe their social circles. Some people use Twitter to collect friends. Others constrain their following to a degree of relationship. Some people use Facebook to collect friends. Others constrain their friending to a degree of relationship. The patterns vary by tool and by person. So for some people Twitter is more like broadcast. For some people Facebook is more like broadcast. It depends.
Gregor’s piece also misses a fun and useful attribute of the more diverse nature of Facebook feeds (for users who use Facebook that way). The diversity of applications, and the fact that notices are grouped by app not by user, results in interesting kinds of serendipity. You see movies, or books, or parties, or groups that you wouldn’t otherwise run into, because your acquaintances happen to mention them. Facebook multiplies referral serendipity. Because Twitter affords and rewards reply, it intensifies conversation and news, but has less diverse serendipity.

Context is people

As usual, an insightful post from Jon Udell about what it takes to make sense of government data – or other data – online.
Udell has been covering the emerging array of tools to expose government and legislative data online. Then he tried to use the tools to follow a bill he cared about:

What I found is that, even with power tools like GovTrack and MAPLight, it’s really hard to make those connections. That’s partly because we lack good mechanisms to track the flow of bits of legislative language through an evolving assortment of bills, and to relate those fragments to the activities and interests of their sponsors. But it’s also because a novice who tries to read and interpret this record lacks context.

In order to understand the progress of a bill, you don’t just need a bill number and a tool to show differences in document versions — you need to understand committee process, legislative calendars, procedural maneuvers; written and unwritten rules; social and political dynamics.
Udell points to tools like GovTrack which are attempting to create a substrate for communities following bills. I’m seeing a trend that is fascinating and a little bit lower tech. National blogs like FireDogLake, local blogs like TransBayBlog, social network communities like the Get Fisa Right network in, provide their communities with more detailed context on the dynamics of legislation and the process of adding ingredients to the sausage. In the context of a community, members learn more about the legislative process than they would from a civics 101 class, or from getting email from the Sierra Club.
The data-driven tools that Udell envisions, where the system allows citizens who are tracking the same bill to find each other, are cool, visionary, good right and true. A lower tech solution is here today. Bill numbers are good search terms. Google a bill number and you’ll find resources on the bill. The ability to google for a bill number, find a great blog and discussion, and engage in some informed networking and activism, is here today.