Uncovering the safety flaws in IBM's Watson supercomputer

Date: 09/06/18

By Casey Ross and Ike Swetlitz

We started with a simple goal — to learn everything we could about IBM’s leading cancer treatment adviser, Watson for Oncology.

In spring 2017, most of the headlines about Watson were glowingly positive, asking whether the supercomputer could cure cancer or replace doctors. But there were a few chinks in the narrative, most notably when MD Anderson Cancer Center scrapped a $60 million project with IBM.

We started reading countless clips and IBM documents about the design of the product. We also reached out to the company directly. We made no promises other than to explore Watson for Oncology thoroughly, and IBM agreed to make company executives available to talk with us.

Over several months we interviewed IBM engineers, clinical experts and executives. We spoke with doctors who were using the product in the United States and around the world. We got the official sales pitch, observed doctors and nurses using the system, and sat in as cancer specialists trained Watson at Memorial Sloan Kettering Cancer Center in New York City.

And no, our editors were not telling us to take as long as we needed. We had to deliver frequent updates, gather for meetings, and draft memos about the status of the reporting and the next steps. We also continued to produce other stories on our beats.

In September 2017, we published our initial story under the headline, “IBM pitched its Watson supercomputer as a revolution in cancer care. It’s nowhere close.”

It reported that the product was not performing up to the expectations that IBM had set for it through a misleading marketing campaign. Our investigation found that doctors around the world were complaining that the system’s recommendations were unreliable and biased toward American methods of care, and that IBM had neither conducted prospective studies to assess the impact of the system’s advice on doctors and patients, nor publicly acknowledged the complaints about its performance.

The story generated huge traffic as well as pushback from IBM, which published a “Get the Facts” blog to tell its version of events. It never contacted us to complain of inaccuracies or request a correction. We produced a follow-up story on how hospitals were using Watson for Oncology as a marketing tool, and another on IBM’s lobbying to exempt Watson from federal regulation. This coverage helped Stat win SABEW’s Best in Business award for general excellence in 2017.

The depth of our reporting began to produce a much bigger impact than traffic and accolades: a steady stream of new sources and tips about Watson and its applications in health care.

We vetted the information as it came in and kept an eye trained on IBM. In June 2018, Watson Health reported layoffs, downplaying them as normal restructuring. More sources came forward to challenge that narrative, and we published a series of stories (on June 11, June 14 and June 15) reporting that the company was struggling with delays and technical problems in merging its data assets for its work with hospitals.

Then, after months of reporting, we got our big break: access to internal IBM documents that contained important details on problems with Watson for Oncology and its underlying programming.

The documents revealed, contrary to public statements by IBM’s top executives, that Watson for Oncology was trained on synthetic patient cases based on hypothetical scenarios, rather than data about the treatments and outcomes of actual patients. The documents also stated that the system had recommended inaccurate and unsafe treatments while the company was promoting it to doctors and hospitals around the world.

Our second major story detailing these revelations was published in late July 2018. IBM issued a statement for the story emphasizing that it had improved Watson for Oncology and that the product was generating substantial business for the company.

However, a few days later, we published a follow-up about an internal IBM meeting in which one of its top executives disclosed that the product had actually fallen short of the company’s revenue goals. In this meeting, which took place on the morning our story based on the internal documents ran, the executive also discussed plans to add real patient data to Watson for Oncology and modify its software to reflect differences in the way cancer patients are treated around the world.

The experience of reporting this story has been among the most rewarding of our careers as journalists. It reinforced old lessons and taught us some new ones. The biggest takeaways are these:

Keep digging, trust your reporting, and don’t let go of the story.

And perhaps most importantly, keep faith in the reporting process and in the insights of your editors and fellow reporters. Cooperation with colleagues always produces a better story than competition.   

Casey Ross is national correspondent and Ike Swetlitz is Washington correspondent for Stat. You can follow them on Twitter at @caseymross and @ikeswetlitz.