In class, we will discuss the following:
New York Times, Dec 9, 2017.
Seattle Times, July 13, 2008.
New York Times, Mar 24, 2012.
ProPublica, May 11, 2013; Methods.
Los Angeles Times, May 18 and Oct 24, 2012; Methods.
What data did the reporters use?
Some of these stories used multiple sources of data. Make sure to identify each one.
How was it obtained?
Was the data simply downloaded or scraped from the web? Did it require a public records request? Did any data come from a commercial source?
What were the challenges in working with the data?
Did the data have to be cleaned or processed? Were parts of the data missing or unreliable? How did the reporters deal with these problems?
How did the reporters interview the data?
Think about how the data was filtered, sorted, grouped, and summarized. Did the reporters use rates or make any other calculations? Did they perform any joins across two or more datasets?
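To make these operations concrete, here is a minimal sketch in plain Python. The cities, incident records, and population figures are invented for illustration and do not come from any of the stories above. It filters the records, groups and counts them, joins the counts to a second dataset, calculates a rate, and sorts the result.

```python
from collections import defaultdict

# Hypothetical incident records, one row per incident (invented for illustration).
incidents = [
    {"city": "Avon", "year": 2016, "type": "burglary"},
    {"city": "Avon", "year": 2017, "type": "burglary"},
    {"city": "Avon", "year": 2017, "type": "assault"},
    {"city": "Brook", "year": 2017, "type": "burglary"},
    {"city": "Brook", "year": 2017, "type": "burglary"},
    {"city": "Brook", "year": 2017, "type": "burglary"},
]

# Hypothetical second dataset, keyed by city, used for the join.
population = {"Avon": 20000, "Brook": 150000}


def burglary_rates(incidents, population, year):
    """Return burglaries per 100,000 residents by city, highest rate first."""
    # Filter: keep only burglaries from the requested year.
    selected = [
        r for r in incidents
        if r["year"] == year and r["type"] == "burglary"
    ]

    # Group and summarize: count the filtered records for each city.
    counts = defaultdict(int)
    for r in selected:
        counts[r["city"]] += 1

    # Join and calculate: match each city to its population, then compute a rate.
    rows = []
    for city, n in counts.items():
        if city not in population:
            continue  # no match in the second dataset, so no rate is possible
        rows.append({
            "city": city,
            "count": n,
            "rate": n * 100_000 / population[city],
        })

    # Sort: highest rate first.
    rows.sort(key=lambda row: row["rate"], reverse=True)
    return rows


rates = burglary_rates(incidents, population, 2017)
for row in rates:
    print(row["city"], row["count"], row["rate"])
```

In this made-up example, Brook has more burglaries than Avon, but Avon has the higher rate once population is taken into account. That is the kind of difference to look for when asking whether reporters used raw counts or rates.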
Compared to what?
Data usually has to be put in context to tell a story. The latest data may need to be compared to historical averages, for example, or reporters may decide to compare an agency’s performance against an accepted standard. Sometimes reporters may create categories within the data to compare a situation of particular interest to the rest of the data. What comparisons were central to each story?
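As a rough illustration of two of these comparisons, the sketch below sets the latest value in a series against the average of earlier years and against a fixed standard. The response times and the 8-minute standard are hypothetical and are not drawn from any of the stories above.

```python
from statistics import mean

# Hypothetical average response times in minutes, by year (invented for illustration).
response_minutes = {2013: 6.0, 2014: 6.5, 2015: 7.0, 2016: 6.5, 2017: 9.0}

# Hypothetical accepted standard, in minutes.
STANDARD_MINUTES = 8.0


def compare_latest(series, standard):
    """Compare the most recent year to the average of earlier years and to a standard."""
    latest_year = max(series)
    latest = series[latest_year]

    # Historical baseline: the mean of every year before the latest one.
    baseline = mean(value for year, value in series.items() if year != latest_year)

    return {
        "latest_year": latest_year,
        "latest": latest,
        "baseline": baseline,
        "pct_change": (latest - baseline) / baseline * 100,
        "exceeds_standard": latest > standard,
    }


result = compare_latest(response_minutes, STANDARD_MINUTES)
print(result)
```

The choice of baseline matters here: averaging a different span of years, or comparing against a different standard, could change the conclusion, so it is worth noting which comparison each story chose and why.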
What other reporting was required?
Data journalism is just one tool in a good reporter’s toolbox. What other reporting did the reporters behind these stories do?
Does the story have any weaknesses?
Good reporters try to “bulletproof” their stories by looking for holes and anticipating how an interested party might try to attack the analysis as flawed. Did the reporters behind these stories leave any vulnerabilities?
Complete this quiz on good practice and basic principles of data journalism.
Due: Weds Jan 31 at 8pm