The recently-announced acquisition of the oncology data company Flatiron Health by Roche for $2.1B represents a robust validation of the much-discussed but infrequently-realized hypothesis that technology entrepreneurs who can turn health data into actionable insights can capture significant value for this accomplishment.
Four questions underlying this deal (a transaction first reported, as usual, by Chrissy Farr) are: (1) What is the Flatiron business model? (2) What makes Flatiron different from other health data companies? (3) Why did Roche pay so much for this asset? (4) What are the lessons other health tech companies might learn?
The Flatiron Business Model
To a first approximation, Flatiron has a model that can be seen as similar to tech platforms like Google and Facebook – delight (or at least offer a useful service to) front-end users, and then sell the data generated to other businesses. For Flatiron, the front-end users are oncologists (mostly community, some academic), and the data customers are pharma companies. In contrast to Google (and also in contrast to the less successful Practice Fusion, recently acquired at a loss), Flatiron doesn’t sell access to front-end users themselves (e.g. through targeted ads), but rather access to de-identified, aggregated clinical information.
Success of this model requires that the Flatiron platform is attractive to oncology practices, who must feel that they’re getting distinct value from it and believe that it helps them fulfill their primary mission of taking care of cancer patients. If this is true, then the Flatiron platform will enjoy continued traction from its current base, and may more easily win over new users (including practices that use a different EMR system, like Epic, but still want access to the Flatiron network and analytics).
By all accounts, the oncologist-facing Flatiron platform (originally a mediocre oncology EHR that Flatiron acquired and serially refined) remains a work in progress; the company appears determined to continue to improve the quality of this application, even (especially) post-acquisition, as oncologists are foundational to this model, and the more delighted oncologists are with it, the more traction it stands to achieve.
On the back end, Flatiron has created a dataset that seems largely distinct in the industry – a meticulously assembled oncology dataset that pulls information from electronic health records and organizes it in a fashion that approaches the quality of clinical research, enabling investigators (and regulators) to ask questions of the data that might normally require a dedicated, stand-alone study to resolve.
What Makes Flatiron Different?
Flatiron’s key insight wasn’t so much recognizing the foundational need for a robust, clinical research-grade dataset, but rather, realizing that creating this required meticulous, artisanal data curation – largely done by hand, Mechanical Turk style.
The Mechanical Turk was a fake chess-playing machine from the late 18th century, presented as an intelligent device, but actually powered by hidden human players. Amazon, like many other tech companies, uses a version of this approach to solve problems, generally problems that seem like they should be addressable by a computer but which in fact may be most efficiently or economically addressed by actual people.
Flatiron recognized that, at least in cancer, half or more of the most important data in health records isn’t in structured data fields, but rather in unstructured data, the free text fields of pathology reports and clinical notes. While technology can in theory “read” these fields, actually pulling out the most useful aspects, at least today, requires people, and Flatiron has hired and trained an army of them, generally health professionals who painstakingly read through unstructured data and extract the relevant aspects. Technology tools assist in this process (hence “technology-enabled”), and the quality of the extraction is closely and systematically monitored, but the essential work is still done by human beings.
The key driver of this approach was Dr. Amy Abernethy, a physician-scientist who spent her career at Duke focused on the question of how to upgrade the quality of EHR data so that it could be more useful for both clinicians and researchers.
When my co-host Lisa Suennen and I interviewed Abernethy on our Tech Tonics podcast last year, she told us that when she was first introduced to the young Flatiron founders Nat Turner and Zach Weinberg, and explained to them the need to painfully extract data from unstructured fields, “they listened and weren’t scared off by [the] fact that it was a really hard problem to solve.” In the interview prep, Abernethy also told me, “what Flatiron did was not be scared off by doing the hard stuff – everyone else says ‘That is someone else’s problem to solve.’”
On the podcast, Abernethy explained,
“As we imagine the data in electronic health records, it’s easy to think about structured data and how we might make use of it. For example, using glucose or HbA1c values to monitor what’s going on in diabetes. But in cancer, many of the critical data points reside in documents that are not structured at all. For example, histology. If a cancer is an adenocarcinoma or a squamous cell cancer is something that’s in a pathology report, and sometimes it’s really distinct, and it’s pretty easy to pull that information out. But a lot of the times, it’s contextual, and includes a lot of the other information that a pathologist is seeing. And this is not just histology, but information like biomarkers, and what’s in the radiology report, and what’s in the clinical case notes. We estimate that 50% or more of the critical data points you need for research live in these PDF representations of data.” [Comments lightly edited for clarity.]
Indeed, a fascinating publication from Flatiron and Pfizer compared the composition of two theoretical cancer research cohorts, one identified by using just structured data from an EHR, and a second identified by using the combination of structured and unstructured data. The striking findings were how different the composition of these cohorts was, how much larger the second cohort was, and how surprisingly little overlap there was between the two. In other words, if you’re trying to draw conclusions on the basis of extracting data from structured fields alone, you’re going to make a lot of mistakes, and you may miss a lot of patients.
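For readers who think in code, the kind of cohort comparison described here can be sketched in a few lines of Python. This is a toy illustration only: the function name `compare_cohorts` and the patient IDs are invented for the example and are not drawn from the Flatiron and Pfizer publication, whose actual methods are not described here.

```python
# Toy sketch: compare a cohort found via structured fields alone with a
# cohort found via structured plus unstructured data.
# All patient IDs are hypothetical; they are NOT data from any real study.

def compare_cohorts(structured_only, structured_plus_unstructured):
    """Summarize the size and overlap of two cohorts of patient IDs."""
    a = set(structured_only)
    b = set(structured_plus_unstructured)
    shared = a & b   # patients both approaches agree on
    union = a | b    # every patient either approach found
    return {
        "size_structured_only": len(a),
        "size_combined": len(b),
        "shared": len(shared),
        "only_in_combined": len(b - a),    # patients structured fields alone would miss
        "only_in_structured": len(a - b),  # patients the fuller record did not confirm
        "jaccard_overlap": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical cohorts, chosen so the second is larger and overlap is small.
structured_only = {"p01", "p02", "p03", "p04"}
structured_plus_unstructured = {"p03", "p04", "p05", "p06", "p07", "p08", "p09", "p10"}

summary = compare_cohorts(structured_only, structured_plus_unstructured)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up case the two approaches agree on only two patients, which is the flavor of the mismatch the publication reported, though the real magnitudes would have to come from the publication itself.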
In some ways, the vast quantities of data in hospital EHR systems have tempted and frustrated researchers for ages. On the one hand, it’s tantalizing – with so much data in these systems, surely this information can be mined for clinical and scientific insight? While there have been discrete examples of success, the process has proved maddeningly frustrating and, it seems, stubbornly resistant to automation – especially since both pharma researchers and practicing clinicians are exquisitely sensitive to data utility, not data volume. Moreover, even some examples of apparent automation success – such as the genotype/phenotype data integration that underlies the Regeneron drug discovery engine – owe much to artisanal curation of phenotype, the so-called “phenotype whispering” capability I’ve discussed at a number of conferences (in a previous role).
Why Did Roche Pay So Much For Flatiron?
The $2.1B total acquisition price caught the attention of investors and entrepreneurs alike, raising the inevitable question: why? A thoughtful discussion of this transaction has been written by Andrew Matzkin of the consultancy Health Advances – see here.
From what I’ve been able to piece together, it appears the answer is that Roche, through Flatiron, is embracing an evolving vision of clinical trial validation, a world in which real world data, extractable in near-real time from a network of oncology practices, can be used to provide a trusted, clinical-research grade readout of drug efficacy and utility. This would offer the possibility of obtaining regulator-worthy data with unprecedented ease, potentially saving significant money both from clinical study costs and by delivering the relevant data with the speed of a database query, which (if accepted by regulators) could lead to quicker decisions, and a faster time-to-market. Given Roche’s commitment to oncology at a global level, it’s not difficult to imagine how reduced trial costs and more rapid time to market could quickly translate into billions of dollars of value for the pharmaceutical giant.
An indication of this can be found in a Pfizer and Flatiron poster presented at a breast cancer conference last year, in which the authors argue that data obtained from a cohort of Flatiron patients match data from the active control arm (representing existing standard of care) of a formal phase 3 study. The implication is that if active control arm data can be obtained reliably from a trustworthy database, then soon one might ask whether (and under what circumstances) it’s ethical to randomize patients to standard of care.
(As an aside, while some have suggested the Roche acquisition was motivated, at least in part, by the pharma company’s desire to directly access the Flatiron provider network, there is every indication by the way the transaction has been structured that this is explicitly not the case, and that Roche intends to maintain Flatiron as an independent subsidiary. For Roche, the value is likely in the clinical research-grade quality real world data generated by the Flatiron network, and they’re likely to keep their hands off, and do everything they can to keep the data flywheel spinning.)
What Can Other Health Tech Companies Learn?
Perhaps the most significant takeaway from the Flatiron story – what Flatiron figured out and what so many other health tech companies miss – is the importance of viscerally understanding what your customers want. In the case of Flatiron, it means truly understanding what practicing oncologists actually view as meaningful, and what oncology researchers actually view as meaningful. Flatiron didn’t say “here’s our amazing technology, let’s hire a sales team and see where we can jam it,” but instead, aligned around “here’s the goal, we recognize it’s hard, let’s own this challenge and see what it takes to get us there.”
As former FDA Commissioner Robert Califf pointed out on Twitter last week, “People should pay attention to the [Flatiron] strategy–relentless curation of data–an army of “data janitors” transforming EHR data into analyzable, actionable information. Congrats to the Flatiron team–this was hard work paying off–not slogans and glitz.”
Notably, Flatiron seems to have achieved a level of physician-engineer collaboration that most health tech companies fail to approach. From the outset, it seems clear that Flatiron didn’t just want to be a software vendor, delivering tech services to providers and pharma researchers, but wanted to be an empathetic partner, wanted to, on the deepest level, grok healthcare and the problems faced by those in the trenches. This aspect of the Flatiron culture was nicely captured by this Fast Company article from last year, which noted, correctly, “this is an entirely different style of work for the engineering talent.”
Flatiron also strategically and intelligently partnered closely with regulators, providing FDA with complimentary access to data, and publishing together the results of such analyses. As Abernethy discussed on Tech Tonics, this helped Flatiron refine their platform, better understanding the questions they should be addressing, while also providing referenceability for pharma companies: if Flatiron data is good enough to be used by the FDA, perhaps it’s worthy of pharma attention as well. Moreover, to the extent that the value proposition to pharma is that regulators accept Flatiron data (in some contexts) as equivalent to dedicated study data, regulator comfort with and buy-in to the platform is absolutely essential.
Finally, as some wags on Twitter and elsewhere have been pointing to the Flatiron exit as confirming the value of health tech startups targeting pharma customers, I worry it may be easy to draw the wrong conclusion here. First, most resources within pharma companies aren’t just sloshing around, but tend to be exceptionally loculated, assigned to specific business initiatives and aligned with articulated corporate goals. Second, pharma drug developers are generally highly-trained scientists conducting research that must meet strict regulatory evidentiary standards. A cutesy app, a minimally-validated technology, or access to messy data probably isn’t going to cut it, no matter how valuable you insist it is. Conversely, a rigorously-vetted approach that offers the credible possibility of moving the needle – especially in the incredibly expensive area of clinical trials – is likely to be enthusiastically received.
Parting Thought: The People
How ironic, yet also so brilliant and telling, that the key technology behind “health tech” startup Flatiron is basically human-mediated extraction of data describing human illness, to achieve a level of utility required and explicitly demanded by the human physicians caring for patients, by the human researchers developing new medicines, and by the human regulators evaluating their efforts.
It’s a lesson many other health-focused tech startups might do well to heed.
David Shaywitz is a Senior Partner with Takeda Ventures and a Visiting Scientist at Harvard Medical School.