In the old American folk tale, steel-driving man John Henry squares off against a steam-powered hammer. Henry dies when his heart gives out after (just barely) beating the machine. In our modern world, human doctors are faring much better against their artificial intelligence (AI) counterparts. In fact, according to the latest research from Harvard Medical School, the humans are kicking the robots' butts.

Real-life doctors make a correct diagnosis more than twice as frequently as the 23 most commonly used “doctor apps.” These apps typically take symptoms as input and generate their diagnosis by cross-indexing against a database of diseases.

There is no denying the growing popularity of these AI docs, however. Hundreds of millions of people use Internet sites, smartphone apps, or desktop computer programs to self-diagnose their symptoms. The process is quick and often free. The results, however, are frequently incorrect.

In the Harvard study, 234 internal medicine physicians were asked to evaluate 45 clinical cases. The cases ran the gamut of common and uncommon conditions, all with varying degrees of severity. In each scenario, the doctors had to identify the most likely diagnosis along with two additional possible diagnoses. Each of these diagnosis scenarios was solved by a minimum of 20 physicians.

In the John Henry-style head-to-head contest, the humans delivered the correct diagnosis 72 percent of the time. The algorithms delivered the right answer only 34 percent of the time. Allowing for some gray area, the doctors listed the correct diagnosis among their top three possibilities 84 percent of the time, compared with only 51 percent for the robots.

Worse still (for the machines), the biggest discrepancy was in the area of more severe and less common illnesses. The software performed a little better in the arena of less acute and more common sicknesses.

"While the computer programs were clearly inferior to physicians in terms of diagnostic accuracy, it will be critical to study future generations of computer programs that may be more accurate," said senior investigator and graceful victor Ateev Mehrotra, an associate professor of health care policy at Harvard Medical School.

Champagne and victory laps behind us, there is no denying the sobering reality that, although man trounced the machinery, human physicians still clocked in with errors in 15 percent of the cases. So maybe the best solution is for man and machine doctors to join forces?

"Clinical diagnosis is currently as much art as it is science, but there is great promise for technology to help augment clinical diagnoses," Mehrotra said. "That is the true value proposition of these tools."

The final score was published in the Journal of the American Medical Association.

Digital Health

