Journalist describes role in helping compile, publicize national data on COVID-19

Date: 08/31/20

Betsy Ladyzhets

By Bara Vaida

Among the many challenges in covering COVID-19 has been the federal government’s lack of public standardized data on testing, hospitalizations and deaths. Several private organizations and journalists have worked to fill the void. One of the largest efforts has been the COVID-19 Tracking Project, a volunteer project started this spring through a collaboration between two journalists from the Atlantic, Alexis Madrigal and Robinson Meyer, and Related Sciences’ Jeff Hammerbacher. Hundreds of journalists, scientists, data experts, designers, developers and other data-gatherers volunteer to update the site by gathering COVID-19 data from state public health sources, standardizing it, and putting it into context for the public.

Betsy Ladyzhets, a freelance writer and New York City-based research associate at Stacker, is one of the many journalists volunteering time at the project. She recently launched the COVID-19 Data Dispatch newsletter to put data about the pandemic into a better context for friends, family, media and the public. Here she discusses why she launched the newsletter and gives advice to journalists on obtaining and using COVID-19 data.

Q: Tell me more about the COVID-19 Tracking Project and how it is getting information to the public?

A: I started volunteering for the project in April. It’s a lot of journalists, students and some scientists who are familiar with data and wanted to contribute. We do what we can to collect and standardize data. We have an API (application programming interface) that anyone can use to power a database in any way they would like. We publish weekly updates on the data, and sometimes articles on our blog. Every day, after we compile the most recent data from states’ public health departments, someone from the communication team puts out a tweet thread saying how many new [COVID-19] cases there were today, how many new tests, and what we notice in terms of trends.

Q: What time does that daily tweet go out, so that journalists know what time to watch for it?

A: We try to post it around 5 p.m. E.T. Sometimes, it’s a little later (if a state health department is late with its data), but it’s up, for sure, by 6 or 6:30 E.T.

Q: What’s your role with the COVID-19 Tracking Project?

A: The volunteers are all split into teams. I’m a shift lead for the Data Entry team, and I’m part of Race Data and Data Quality, which are subgroups of Data Entry. There’s another data entry group involved in collecting data from long-term care facilities, another group on cities, and another team on the website, another on blog posts, another on editorial, and others.

Q: Since you aren’t in a newsroom together, how do you all communicate and work together?

A: We all communicate via Slack (an online communication platform).

Q: Tell me about your work on race data. We hosted a webcast in the spring and the data was pretty spotty then. There were even a few states that didn’t report any data broken down by race.

A: We publish the COVID Racial Data Tracker, which is a collaboration between the COVID Tracking Project and the Center for Antiracist Research at Boston University. The COVID Tracking Project’s main COVID-19 dataset is a public dashboard, updated every day with data on cases, outcomes, testing, and hospitalizations. Then, the Racial Data Tracker is a supplemental dashboard, where we are looking at race and ethnicity data for cases, hospitalizations, deaths and testing, where the states make this information available. The (racial) data are a lot better now (than they were in April). All the states report some race and ethnicity data, but it’s still pretty spotty. New York state, for example, reports deaths based on race and ethnicity, but not cases. West Virginia stopped reporting on deaths based on race in late May. One of the goals of the Racial Data Tracker is trying to push state officials to do better with their data, so we are urging anyone who wants better demographic data to go on our site and use our new contact form to get information for their state’s governor. The site has a script for you to use and say, ‘Hey Governor [so and so], why aren’t you reporting this data?’

Q: Did you ever think data would be such a hot story?

A: In a way. There were a lot of experts who have been thinking about pandemics for a long time, who anticipated if there were to be a pandemic, that it would go poorly. … They predicted that a country that is as decentralized as we are would not be as well equipped to handle this (and that turned out to be true). The problem is that early on, every state was left to fend for itself. So every state has its own set of data and its own way of reporting it and its own definitions. It’s like dealing with 50 different countries. Some states are okay at it and some are terrible. There was no one from the federal government that said to every state, “you have to report these data this way.” The Centers for Disease Control and Prevention has some (data reporting) guidelines, but not every state follows them, and while there is a federal testing dataset run by the Department of Health and Human Services, it’s not comprehensive as [it only includes PCR (polymerase chain reaction) tests]. We at the COVID Tracking Project are the only national site comprehensively tracking testing and relaying it to the public.

Q: What advice would you have for journalists on how to approach COVID-19 data?

A: If you have a question and you think it can be answered by some data, don’t stop with the numbers as the answer to your question. There is probably more there. For example, the other day, my coworker asked me how many people in the U.S. have been tested for COVID-19. And I had to say that I can’t tell you how many, because tests are reported differently in each state. Some report how many samples were tested, and some report how many individuals got a test. In the first type of state, someone who got tested five times would be reported as five test results, while in the second type, they would be reported only as one. With a question like this, you have to keep poking around. You have to ask about the definition of the data, the context of how it was collected, and the source behind it. You can’t just take a number and put it in a headline.
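The counting problem described in this answer can be illustrated with a short Python sketch. The records, person labels and function names below are invented for illustration and do not come from any state's data or from the COVID Tracking Project; the point is only that the same set of tests yields two different totals depending on the reporting unit.

```python
# Hypothetical test records as (person_id, result) pairs. Not real data.
records = [
    ("A", "negative"), ("A", "negative"), ("A", "negative"),
    ("A", "negative"), ("A", "positive"),   # person A was tested five times
    ("B", "negative"),
    ("C", "positive"),
]


def count_specimens(test_records):
    """Count every test performed (the 'samples tested' convention)."""
    return len(test_records)


def count_people(test_records):
    """Count each individual once, however many tests they took."""
    return len({person for person, _result in test_records})


specimens = count_specimens(records)
people = count_people(records)
print(specimens, people)  # prints: 7 3
```

Under these assumptions, a state reporting samples would publish 7 tests and a state reporting individuals would publish 3, so adding such figures across states does not answer "how many people have been tested."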

Q: What would you say are the common mistakes that you see in how COVID-19 data is reported?

A: I think it is not contextualizing data appropriately. You have to explain what the data mean. For example, you can say a state’s positivity rate fell from one week to the next, but it is important to explain the numerator and the denominator: how many tests were positive and how many tests were completed. And you have to explain that positivity rate in the context of what is happening in the state. Is the state actually doing more testing, or did it have to shut down testing centers because of a hurricane, causing both the number of tests and the number of positives to go down? This happened in Florida a few weeks ago. And also, don’t forget there are real people behind these numbers. It’s always important to remember that.
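As a rough illustration of the numerator and denominator mentioned in this answer, the sketch below computes a positivity rate for two weeks. All figures are made up, and the function name is an assumption chosen for this example; it is not how any state or the COVID Tracking Project necessarily calculates the rate.

```python
def positivity_rate(positive_tests, total_tests):
    """Share of completed tests that came back positive."""
    if total_tests <= 0:
        raise ValueError("total_tests must be greater than zero")
    return positive_tests / total_tests


# Hypothetical weekly figures. Not real data.
week1 = positivity_rate(positive_tests=1000, total_tests=10000)
week2 = positivity_rate(positive_tests=400, total_tests=5000)

print(f"{week1:.1%}")  # prints: 10.0%
print(f"{week2:.1%}")  # prints: 8.0%
```

In this invented case the rate falls from 10% to 8%, but testing was also cut in half, so the lower rate alone does not show that the outbreak is easing. Reporting both inputs alongside the rate gives readers that context.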

Q: Why did you decide to create your newsletter? You must be so busy!

A: Well, I have been in the weeds on COVID-19 data for four or five months now, and there were more and more situations where I wanted to do a story on some topic or dataset, but my editors at Stacker would say it was too “niche.” Also, I have a lot of friends and family who don’t have the context and aren’t super familiar with how to interpret data and so I was explaining (COVID-19 data issues) to them a lot. Especially around the time that hospitalization data switched from the CDC to the HHS, I saw a lot of confusion from friends and people on social media, wondering what the change meant for them. And since I know a lot about the data and I like to talk about it, I thought I may as well explain it in a newsletter. So, it’s been a nice platform to share what I understand about the data and express my frustrations at some public sources that don’t do a great job.

Q: What was your background? How did you end up at Stacker and the COVID-19 tracking project?

A: I double majored in biology and English at Barnard College. I got interested in science communication my junior year in college, as I wanted some way to combine my academic interests. I tried out lab work and it wasn’t for me, and then Stacker, a new publication (that creates slide shows to contextualize data), took me on as an intern soon after they launched in 2017. Stacker gave me a lot of flexibility and inspiration to learn how to use more data tools and a bit of coding. And then this April, the COVID-19 Tracking Project put out a call for volunteers that was posted on the National Association of Science Writers listserv, and I signed up. I usually volunteer for them for six to eight hours a week, helping update the main dataset and the race and ethnicity dataset. It’s a good way to contribute and help keep everyone informed.

Betsy Ladyzhets is a data journalist and science writer based in Brooklyn, New York. As a research associate at Stacker, she manages the publication’s Science and Lifestyle verticals. Ladyzhets also is a volunteer at the COVID Tracking Project, where she focuses on data standards and the COVID Racial Data Tracker. She recently launched a newsletter covering data on the pandemic called the COVID-19 Data Dispatch.