Early last year I participated in a Journalism That Matters conference in Seattle. Drawing on what I learned there, I focused on creating an information ecology of Seattle as part of work related to the FCC Future of Media enquiry. (The New America Foundation published that study in May 2010 as Seattle: A Digital Community Still in Transition.) It was this research that brought the achievements of the Washington News Council (WNC), a press and media policy group, to my attention. The WNC spent most of 2010 building an ambitious project to track all news and information outlets in the state. The mapping project grew out of an initiative generated at the Journalism That Matters conference “Re-imagining News and Community in the Pacific Northwest” in January 2010.
A year later, in January 2011, the group debuted the prototype of the Online Media Guide, a database and map of more than 800 outlets reflecting Washington’s news and information ecosystem gathered from sources ranging from traditional media to Google groups and nonprofit organizations. The WNC presented the database and maps to the public for a brainstorming session that included representatives from local legacy media, public relations firms, a cable company, and the general community.
The Online Media Guide’s creator and lead developer Jacob Caggiano, the Council’s communications strategist, answered questions about how he approached the project, the challenges presented in defining news, and what the map means to the future of media.
What was your method for locating the outlets?
I searched neighborhood groups, Google groups, Yahoo groups, mainstream media, hyperlocal sites. I typed “Washington State” search terms on Google groups and got 500 groups. I went through them one by one and picked out 50 that have to do with civic information. I found active groups, whether it was people who wanted a third party in the state or a teachers’ group.
I wanted to add non-English media, and I wanted to think of terms like ‘ethnic media.’ It’s not easy to find because I do not speak other languages. People ask me how I found 821 sites. The more you click around, the more you see that they link to each other. You get a feel for the circles, as in social cliques: you see certain blogs and blogrolls, and you see the same cast of characters.
Describe your approach and scope to coming up with a news and information taxonomy or semantic boundaries.
I came up with categories of tags and within each category there are several tags. Tags are not hierarchical so you can have multiple tags in different categories.
Each record is broken down into several subcategories based on the following criteria:
- MEDIUM - site, blog, print, radio, TV, forum, magazine, podcast
- SCOPE - hyperlocal, local, state, northwest, national, global
- SOURCE - corporate, independent (small and local, can be for-profit or non-profit), civic organization (registered non-profit group, advocacy group, or think tank), government, grassroots (unregistered group), labor union, industry group, political party, politician, or educational institute
- REGION - east, west, central, northwest, Seattle
- PARTISANSHIP - nonpartisan, left-leaning, right-leaning, centrist
I'm also experimenting with other ways to categorize the records, which will be interesting.
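As an illustration only, the non-hierarchical tagging described above could be modeled roughly as follows. This is a minimal sketch, not the Council's actual data model: the `Outlet` class, the `find` helper, and the example outlet are hypothetical names introduced here, and only the category and tag values come from the list above.

```python
from dataclasses import dataclass, field

# Tag categories and values copied from the list above.
TAXONOMY = {
    "medium": {"site", "blog", "print", "radio", "TV", "forum", "magazine", "podcast"},
    "scope": {"hyperlocal", "local", "state", "northwest", "national", "global"},
    "source": {
        "corporate", "independent", "civic organization", "government",
        "grassroots", "labor union", "industry group", "political party",
        "politician", "educational institute",
    },
    "region": {"east", "west", "central", "northwest", "Seattle"},
    "partisanship": {"nonpartisan", "left-leaning", "right-leaning", "centrist"},
}


@dataclass
class Outlet:
    """Hypothetical record: one outlet with a set of tags per category."""
    name: str
    url: str
    tags: dict = field(default_factory=dict)  # category -> set of tag values

    def add_tag(self, category, value):
        # Reject anything outside the published taxonomy.
        if value not in TAXONOMY.get(category, set()):
            raise ValueError(f"unknown tag {value!r} for category {category!r}")
        # Tags are flat, so a record may hold several values in one category.
        self.tags.setdefault(category, set()).add(value)


def find(outlets, **criteria):
    """Return the outlets that carry every requested category=value tag."""
    return [
        outlet for outlet in outlets
        if all(value in outlet.tags.get(category, set())
               for category, value in criteria.items())
    ]
```

With records tagged this way, a query such as `find(outlets, medium="blog", scope="hyperlocal")` would return only the hyperlocal blogs, which is the kind of slicing the categories are meant to support.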
How did you create the database?
I started with Firefox because I didn’t want to type out a database by hand. You can make multiple profiles so you can have several identities on Firefox. Each profile is like a different person using the browser, so I made a profile dedicated to the project. I would bookmark findings, then tagged them each time I created a bookmark. Anytime I looked at something I hit the star on Firefox.
The next step was exporting this to a spreadsheet, painfully. Firefox is not database-friendly.
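For readers attempting something similar, here is one way the export step might be scripted instead of done by hand. This is a sketch under assumptions, not the method used for the guide: it assumes the bookmarks were exported from Firefox as an HTML file in which each bookmark is an `<A>` element with an `HREF` attribute and a comma-separated `TAGS` attribute. The `BookmarkParser` class and `bookmarks_to_csv` function are hypothetical names.

```python
import csv
import io
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collects title, URL, and tags from <A> elements in a bookmarks export."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._current = None  # the bookmark currently being read, if any

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            attributes = dict(attrs)
            self._current = {
                "title": "",
                "url": attributes.get("href") or "",
                "tags": attributes.get("tags") or "",  # assumed attribute name
            }

    def handle_data(self, data):
        # Text inside the <A> element is the bookmark title.
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.rows.append(self._current)
            self._current = None


def bookmarks_to_csv(html_text):
    """Return CSV text with one row per bookmark: title, url, tags."""
    parser = BookmarkParser()
    parser.feed(html_text)
    parser.close()
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["title", "url", "tags"])
    writer.writeheader()
    writer.writerows(parser.rows)
    return out.getvalue()
```

The resulting CSV text can be saved to a file and opened in any spreadsheet program, keeping the tags assigned at bookmarking time attached to each record.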
If we can establish a single source for discovering online media, and work out who controls what, what is accurate, and when things die off, then the Online Media Guide becomes the place for the most up-to-date information on media ecology, and people can use it for research.
I want to pull in RSS feeds, do keyword searches, do a Dipity timeline. People could make a timeline of any story, and say ‘Here is the evolution of the story.’ But in order to get to the pot of gold we have to establish a database that is up-to-date and centralized. We made a good start. We think it’s the most comprehensive one that exists in the state.
Do you feel it is possible to include every news and information outlet in the state on this site? If so, how do you think that can be done?
Right now we're approaching 1,000 records, which is already substantial enough to do some really insightful analysis and provide a meaningful contribution to the understanding of our region's news ecology. Every week I keep finding new sources, so that tells me that there's still a ways to go. The best way to make this vision a reality is to establish ourselves as a go-to directory and build an easy way to keep the records self-maintained.
The idea is that this is supposed to be a starting point, a launch pad for anyone who cares about news and information in their community. The idea is just to know where you can go, and we’re hoping people will look for information and sources that do not only align with their personal beliefs, or find something they disagree with. The idea is not to push any sort of agenda. It’s basically to dig as deep as we can as to how information flows in a region. We had a discussion about how broad do we want to make this. You probably have seen arguments about what is the definition of a journalist. The more I heard about it, the more I thought about widening the scope.
I’m still figuring out where to draw the line because right now the limitations are more labor and technical than based on what I think should be in there. The line I’m trying to draw is civic information. We are including a lot of nonprofits. It would be a highly skewed map if we didn’t include nonprofit information. A lot of environmental information you get will come from groups that are very focused on things like cleaning up Puget Sound. Mainstream media should not feel insulted that other people may have better information than them. When online media started, mainstream media was sort of defensive because of what other people were publishing.
The goal is to get the best information to the right people. To me, if it’s good information -- and by good that’s a whole debate -- it matters that it gets to the right people. Our mission is to promote ethics, integrity and quality journalism. I proposed the idea of mapping media ecology because I told [Washington News Council president] John Hamer that in order to do our job of providing a forum to talk about news media, we need to reestablish our understanding of what the media are.
Were there any surprising findings?
Almost all places had just one newspaper. Of everything I could find, with the exception of two, the number of newspapers has been significantly reduced. In Seattle, I saw an increase in news start-ups online. It’s still a matter of debate, but when we can start doing reports and analyses then we will know for sure whether the hyperlocals are picking up the slack.
The goal is for the Online Media Guide news site database to become user-generated, so that it updates itself. How are you marketing the database?
The great thing about this resource is that it is valuable both to producers and consumers. Consumers are looking for ways to find relevant news, and publishers are trying to find new people to discover their work. The idea is to establish the Online Media Guide as the authoritative media database for the State of Washington. Once news consumers realize that this is the best place to go to discover who's talking about what, it will be in the best interest of publishers to ensure that their records are accurate and up to date.
All of our research has indicated that as far as quantity, we have the most listings available. We are also providing unique quality by slicing and categorizing each entity in ways that deliver useful insights about the type of news and information out there, as well as what's lacking.
What other ideas does the Washington News Council have for media mapping?
I have buckets of great ideas that I want to implement once we get the maintenance issue streamlined. I want to expand the definition of ‘maps’ to more than just geographical pinpoints, but as visualizations that can be used to navigate the flow of information from inception to appropriation. I want to make charts, share trends, and get a handle on where news comes from and where it goes. All of this can ultimately be used to help carry out the mission of the News Council, which is to promote quality journalism. But first we need to know where the news comes from and how it travels. The sky is really the limit once we get a reliable database in order.
Technically speaking, any web tools on the horizon that would make media mapping easier?
One that I'd like to mention is Open Action, which has a very impressive, simple-to-use product that can easily make maps that implement tags and categories, and also pull in outside information such as blogs/Twitter/Facebook pages all in one place. Right now we're in the midst of a test run, which I am very excited about.
What were some of your or WNC's biggest challenges to creating these maps?
It's definitely been a challenge to make the right decision on how to classify each record. It's not easy putting things into a box, especially when I'm first getting familiar with the content. There are also some interesting debates about semantics, such as what it means to be "grassroots," "independent," "right-leaning," "left-leaning," etc.
It's also going to be a challenge to keep this project financially alive, but that's nothing new, and hopefully the people reading this will take an interest and help us keep the momentum going. We are open to suggestions.
What I want to do is not just publish this guide but also make a guideline book, a wiki with a table of guidelines, because for a lot of the records it was difficult to assign tags and decide what box to put things in.
How much did this project cost?
I can't give an exact number but it's taken several hundred hours to get the project where it is today. This includes not just the data collection, formatting, and presentation, but also the time we've spent brainstorming ways to market it, as well as the actual marketing we've done so far. One thing I will say is that I've picked up a lot of knowledge on the way, and if I get to do it over again (which I would like to do for other regions if you're out there looking!), I will be able to do it much more efficiently.
Do you have any advice for database builders?
I’m trying to figure out how to take this and monetize it to keep doing this. Maybe contract it to other states. Right now it’s all proprietary. I started a master spreadsheet on Google Docs, which will go into SQL, and that way it will be more robust and can do much more.
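To make the spreadsheet-to-SQL step concrete, the sketch below loads spreadsheet rows, exported as CSV, into a small SQLite database. It is an assumption-laden illustration rather than the Council's schema: the column names (`title`, `url`, `tags`), the table names, and both functions are hypothetical, and SQLite stands in for whichever SQL database is eventually chosen.

```python
import csv
import io
import sqlite3


def load_outlets(conn, csv_text):
    """Load CSV rows (assumed columns: title, url, tags) into two tables."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS outlets (
            id    INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            url   TEXT NOT NULL UNIQUE
        );
        CREATE TABLE IF NOT EXISTS outlet_tags (
            outlet_id INTEGER NOT NULL REFERENCES outlets(id),
            tag       TEXT NOT NULL,
            PRIMARY KEY (outlet_id, tag)
        );
    """)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The UNIQUE url constraint makes re-imports skip existing outlets.
        conn.execute(
            "INSERT OR IGNORE INTO outlets (title, url) VALUES (?, ?)",
            (row["title"], row["url"]),
        )
        outlet_id = conn.execute(
            "SELECT id FROM outlets WHERE url = ?", (row["url"],)
        ).fetchone()[0]
        # One row per tag, so tags can be queried without string matching.
        for tag in (t.strip() for t in row["tags"].split(",")):
            if tag:
                conn.execute(
                    "INSERT OR IGNORE INTO outlet_tags (outlet_id, tag) VALUES (?, ?)",
                    (outlet_id, tag),
                )
    conn.commit()


def outlets_with_tag(conn, tag):
    """Return the titles of all outlets carrying the given tag, sorted."""
    rows = conn.execute(
        "SELECT o.title FROM outlets o "
        "JOIN outlet_tags t ON t.outlet_id = o.id "
        "WHERE t.tag = ? ORDER BY o.title",
        (tag,),
    )
    return [title for (title,) in rows]
```

Splitting tags into their own table is one design choice among several. It keeps the tagging non-hierarchical while letting a single query answer questions like "which outlets are tagged hyperlocal?"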