Big Data
Not long ago, journalists who sought to dive deep into government data usually started their investigation with a trip to the local library. Interested in information that government officials want to keep secret? Americans needed to pass laws to make publicly owned data public.

Two forces have combined to blow the doors open on government data: technology and the fast-growing amount of information it collects, generates, and disseminates. (The Obama administration has helped.) Data that might once have been combed through by hand is increasingly being combined into public databases, and journalists, researchers and even businesspeople are making use of it.

Here’s our guide to getting the most out of this age of transparency.

On the surface, using government databases is simple. Many are easily accessed on the web for free; others have special apps for phones and tablets. But raw data can present its own problems for the user.

Sources, Sources, Sources

For one, all data can be corrupted by shoddy collection practices or biases, so it helps to know where the numbers are coming from, how they were assembled and by whom, says Joe Germuska, a former senior news application developer for the Chicago Tribune. Germuska is also a co-creator of Census.IRE.org, a tool that won the 2012 Knight News Challenge for making it easy for journalists to use Census community data.

“One should always bring a skeptical eye, knowing that data collection is a human process, so that there may be systematic errors because of shoddy collection processes or actual devious intent,” he said.

Other shortcomings result from the nature of data collection. For instance, crime data reported by the Chicago Police Department often fails to capture the full scope of crimes, although in recent years police officials have worked to improve how crimes are categorized, says Jeremy Gorner, a crime reporter for the Chicago Tribune.

“There’s a hierarchy when it comes to categorizing crimes in Chicago. For instance, homicides are the most serious followed by robberies, then shootings. If someone is shot during a robbery, the crime will still be categorized as a robbery because that’s technically a more serious crime,” explained Gorner. “In order to get a better and more accurate number of shootings or people shot, what I am forced to rely on most of the time are good sources to help me figure that out.”
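
To see why a hierarchy rule hides shootings, it can help to model it. The following is a minimal Python sketch, not the police department's actual system: the severity ranking is an assumption taken from Gorner's description, and the incidents are invented purely for illustration.

```python
from collections import Counter

# Assumed ranking, based only on the hierarchy described in the quote above:
# homicide outranks robbery, which outranks shooting.
SEVERITY = {"homicide": 3, "robbery": 2, "shooting": 1}

def categorize(offenses):
    """Return the single most serious offense, as a hierarchy rule would record it."""
    return max(offenses, key=SEVERITY.__getitem__)

# Invented incidents for illustration; each lists every offense involved.
incidents = [
    ["robbery", "shooting"],
    ["shooting"],
    ["robbery"],
    ["homicide", "shooting"],
]

# What the published categories would show: one category per incident.
recorded = Counter(categorize(offenses) for offenses in incidents)

# What actually happened: every incident in which someone was shot.
actual_shootings = sum("shooting" in offenses for offenses in incidents)

print(recorded["shooting"])  # 1
print(actual_shootings)      # 3
```

In this toy example the published categories show one shooting, while three incidents involved gunfire. That gap is the kind a reporter would need sources, or incident-level records, to close.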

The Story Behind the Numbers

Having a basic knowledge of statistics is also essential, says Germuska. That’s because numbers can always be made to tell a story, but if you want a truthful story, the data needs to be parsed. Think of homicide numbers going up or down from year to year in a city, and the pungent headlines that often follow in the local daily.

“The best advice I can give about interpreting crime data is the same advice I’ve received from criminologists,” agreed Gorner. “Rather than merely comparing one year to the previous year, it’s more suitable to compare one year to several previous years. A drastic increase in homicides from one year to the next, for example, could merely be an anomaly.”
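
The criminologists' advice can be sketched in a few lines of Python. The counts below are invented and the function names are hypothetical; the point is only to show how the same final year looks against the single previous year versus the average of several previous years.

```python
from statistics import mean

def change_vs_previous_year(counts, year):
    """Percent change from the year before."""
    previous = counts[year - 1]
    return (counts[year] - previous) / previous * 100

def change_vs_baseline(counts, year, span=5):
    """Percent change from the average of the preceding `span` years."""
    baseline = mean(counts[y] for y in range(year - span, year))
    return (counts[year] - baseline) / baseline * 100

# Invented annual homicide counts for years 1 through 6 (illustration only).
homicides = {1: 100, 2: 110, 3: 90, 4: 105, 5: 80, 6: 100}

print(round(change_vs_previous_year(homicides, 6), 1))  # 25.0
print(round(change_vs_baseline(homicides, 6), 1))       # 3.1
```

Here year 5 was unusually low, so year 6 looks like a 25 percent jump when compared with year 5 alone, but only about a 3 percent rise against the five-year average.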

Moving Towards Truly Open Access

In some cases, the term “open government data” is actually a misnomer. That’s because the data, while easy enough to read, is often trapped in a PDF document or other format that makes it inaccessible for analysis. This situation is changing, in part because government officials have begun to recognize the need to provide data in usable formats.

For instance, the Securities and Exchange Commission began to use XBRL (eXtensible Business Reporting Language), an open standard that uses XML for reporting business and financial information, in 2004. However, it wasn’t until early 2009 that the commission required businesses to file their official documents in the language, which allows for computerized analysis by Wall Street watchers.
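
As a rough illustration of what “computerized analysis” means here, the Python sketch below reads tagged figures out of a small XML fragment using the standard library. The fragment is a simplified, made-up stand-in: the namespace, element names and values are assumptions for illustration, and real XBRL filings use standardized taxonomies and are far larger.

```python
import xml.etree.ElementTree as ET

# Simplified, invented fragment; not a real SEC filing or a real XBRL taxonomy.
SAMPLE = """
<report xmlns:ex="http://example.com/hypothetical-taxonomy">
  <ex:Revenue contextRef="FY1" unitRef="USD">1200</ex:Revenue>
  <ex:Revenue contextRef="FY2" unitRef="USD">1500</ex:Revenue>
  <ex:NetIncome contextRef="FY2" unitRef="USD">300</ex:NetIncome>
</report>
"""

# Prefix-to-namespace mapping used when searching for tagged elements.
NS = {"ex": "http://example.com/hypothetical-taxonomy"}

def facts(xml_text, tag):
    """Map each reporting period (contextRef) to the value tagged with `tag`."""
    root = ET.fromstring(xml_text.strip())
    return {
        element.get("contextRef"): float(element.text)
        for element in root.findall(f"ex:{tag}", NS)
    }

revenue = facts(SAMPLE, "Revenue")
print(revenue)  # {'FY1': 1200.0, 'FY2': 1500.0}
```

Because every number carries a machine-readable label, a program can pull the same item from many documents at once, which is not practical when the figures sit inside a PDF.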

Looking Forward

“Accessing open data has made me work harder because I’ll use that data to formulate better questions when I interview police officials,” Gorner said. “This kind of data not only makes you a more informed reporter on your beat, but it allows you to take a step back and be more critical and skeptical about what officials are telling you.”

What do you think? How does open data affect your work?

Correction: Jeremy Gorner’s last name was incorrectly spelled.