The Phonemic Path: A Way to Measure Health That Can Lead to Health Improvement

We know what improves health–but we’re simply years away from having the tools to achieve it. We know that we can reduce the chronic conditions plaguing the world’s populations by a subtle combination of:

  • Closely monitoring the behavior of individuals
  • Linking health goals to treatments and behavior changes
  • Fixing the problems in communities that contribute to disease

Such activities call for supple and sophisticated ways to link together disparate types and sources of information–the subject of this article. Doing such linking requires a new way of approaching data that is lacking today in our health care system.

The process of developing the new data approach will have to be incremental (no “Health Data Manhattan Project” for us), will involve thousands of contributors in crowdsourced fashion, and will take unanticipated directions based on the insights of the contributors. I am not laying out a framework in this article, but just touching on the themes that the project will likely explore. I’ll also mention a few of the people working in this area, notably the Yosemite Project. I call this idea Phonemic Path, in reference to the extensive research biologists are carrying on to find genomic paths that explain disease.

A community and family focus

We’ll start at the highest level by recognizing that our health depends largely on our communities, families, and workplaces. No drug or other treatment has saved as many lives as offering communities clean running water (a historic advance that is threatened in many parts of the world, and even many parts of the U.S.).

Public health agencies and organizations such as the Centers for Disease Control carry out heroic efforts to find out what’s making us sick. But when a patient walks into the clinic, there is no way to link the information on her family, community, and workplace to her.

A doctor who has the rare opportunity to interview a patient will sometimes enter a free-text note into the record mentioning that a spouse suffering from dementia is draining the patient’s energy, or that she is unable to get exercise because she lives in a neighborhood experiencing random gunfire. A subsequent sharp-eyed clinician may even see that note and follow up. (Such discoveries are becoming less likely as the undisciplined adoption of EHRs leads to bloated clinical notes.) But so far as a comprehensive system for improving health goes, this information is invisible.

The integration of daily life and personal device information

Focusing down now on the individual, we have to commit to the shift so many have demanded from looking at disease to looking at health. We need to know whether an individual has increased or decreased his activity over the past year. It is often the elderly who lose functioning, but not they alone. I often hear friends lament that a demanding new job has demolished their exercise and healthy eating habits. All these things should be stored in a format that allows a program to draw them out and convey them in a chart or animated visualization.
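To make the idea of program-readable activity data concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the record layout (a date paired with a step count), the sample values, and the function names do not come from any device or standard. It only shows that once readings are stored in a structured form, answering "has this person's activity gone up or down over the year?" takes a few lines.

```python
from datetime import date
from statistics import mean

# Hypothetical records: (day, step count). The values and the layout
# are invented for illustration, not drawn from any real device.
readings = [
    (date(2014, 9, 1), 8200),
    (date(2014, 9, 2), 7900),
    (date(2015, 9, 1), 5100),
    (date(2015, 9, 2), 4800),
]

def yearly_average(records, year):
    """Average daily steps for the records that fall in the given year."""
    values = [steps for day, steps in records if day.year == year]
    return mean(values)

def activity_change(records, earlier_year, later_year):
    """Fractional change in average daily steps between two years."""
    before = yearly_average(records, earlier_year)
    after = yearly_average(records, later_year)
    return (after - before) / before

change = activity_change(readings, 2014, 2015)
print(f"Change in average daily steps: {change:+.1%}")
```

The same structured records could feed a chart or an animated visualization; the point is only that a free-text note cannot be queried this way.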

Low-cost fitness devices, and the miniaturization of medical devices in general, open up tremendous opportunities for monitoring patients and comparing their phonemics to healthy models. Once again, the data falls short.

We cannot wait for standards to take hold in the device space. It would be simple if a single vendor could specify how to store data (as Apple is trying to do with Apple Health) or if a single electronic health record were dominant enough to dictate how it would receive data. But we are in a many-to-many situation here. Validic does good business linking the various devices and EHRs, and such efforts will have to create the personal health record until the field settles down.

Linked data

What we’ll end up with is widely scattered repositories of data maintained by public health organizations, government agencies, clinicians, patients, and researchers. Now we have to ask how the myriad of data types can come together productively. Luckily, the computer field has numerous tools for generating and looking at graphs, which are flexible representations of relationships.

To take a trivial example, suppose a clinician determines that a patient’s BMI of 44 is contributing to diabetes with macular edema, and that in turn, left knee joint pain leading to lack of exercise contributes to the obesity (which of course contributes to the knee pain as well). A graph showing these relationships could look like Figure 1.

[See attached file health_graph.pdf for Figure 1.]

Figure 1 isn’t easy for a computer to read, but the basic idea of circles joined by arrows can be represented in numerous formats and databases that come with their own programming libraries. The Resource Description Framework (RDF) has been used in web pages for years and allows many of the conveniences you see in search engines, RSS feeds, and elsewhere. A more complex language called OWL is the darling of many ontologists, although it hasn’t caught on for general Web use. Commercial graph tools are on the market, and at least one called Neo4j includes an open source implementation.
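As a rough illustration of how the circles and arrows in the example could be made machine-readable, here is a minimal sketch in plain Python. It does not use RDF, OWL, or Neo4j; the node labels are informal strings taken from the example rather than codes from any vocabulary, and the function names are invented for this sketch. Each edge is read as "source contributes to target".

```python
from collections import defaultdict

# Each edge reads "source contributes to target". Node names are
# informal labels from the example, not codes from any standard.
EDGES = [
    ("BMI of 44", "diabetes with macular edema"),
    ("left knee joint pain", "lack of exercise"),
    ("lack of exercise", "BMI of 44"),
    ("BMI of 44", "left knee joint pain"),
]

def build_graph(edges):
    """Return an adjacency map: node -> list of nodes it contributes to."""
    graph = defaultdict(list)
    for source, target in edges:
        graph[source].append(target)
    return graph

def contributors(graph, condition):
    """Return every node that contributes, directly or indirectly, to condition."""
    # Reverse the edges so we can walk from an effect back to its causes.
    reverse = defaultdict(list)
    for source, targets in graph.items():
        for target in targets:
            reverse[target].append(source)
    found, stack = set(), [condition]
    while stack:
        node = stack.pop()
        for cause in reverse[node]:
            if cause not in found:  # guards against the obesity/knee-pain cycle
                found.add(cause)
                stack.append(cause)
    return found

graph = build_graph(EDGES)
upstream = contributors(graph, "diabetes with macular edema")
print(sorted(upstream))
```

Asking for everything upstream of the diabetes diagnosis pulls in the knee pain and the lack of exercise as well as the BMI, which is the kind of connection a flat list of diagnoses does not expose. A dedicated graph database or an RDF store would add persistence, shared vocabularies, and a query language on top of the same basic structure.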

Now a record can show in a structured manner how a patient’s medical conditions relate to each other, to his behavior, and to public health conditions. Eventually, such graphs will crowd out rigid EHR data representations and one-dimensional, deliberately circumscribed standards such as ICD (which isn’t getting the most enthusiastic reception from doctors) and CIMI.

Making it happen

An estimated twenty-four billion dollars has been spent (some now say thirty) to encourage the adoption of EHRs that do not solve our health problems and are poorly designed to evolve in the face of the needs laid out in this article. We are not going to get another expenditure on that scale, so we have to be clever.

I think that a data network will be built item by item as consumers, clinicians, and vendors make contributions. A few ubiquitous conditions such as diabetes, mood disorders, and congestive heart failure will quickly develop communities and establish terms and codes everyone recognizes. For rare diseases, the patient advocacy groups will draw on the people who know the condition the best and create a standard. Gradually, we’ll come together around how to represent behaviors, health conditions, and data.

Numerous organizations are trying to build ontologies of linked data, such as the National Center for Biomedical Ontology and the World Wide Web Consortium’s Health Care and Life Sciences group. HL7 already has a Reference Information Model that narrowly focuses on the actions taken by clinicians and their outcomes.

The problem with such large, top-down projects is that they take place outside the scruffy give-and-take of real-life treatment and tend to make slow progress or be forgotten and never adopted. Still, any work produced by standards bodies that proves useful can be incorporated into Phonemic Path.

How do we get data sources to work together? Imagine a spectrum with two poles:

  • Everyone uses a unique format, and when two sites want to talk they create a tool to convert data between formats. In theory, this could lead to conversion tools to handle all conversions between sites.
  • Everybody uses the same standard format. This raises the questions of who chooses the format, how it can evolve while all sites stay in lock step, and what to do with pre-existing data in other formats.

Today we are somewhere in between these poles. A few dimensions are well represented by robust standards, such as LOINC for lab results. HL7 defines some sorta-kinda standards that EHR vendors pledge to support and implement in incompatible ways. Each EHR uses its own format internally. Validic translates each EHR it supports into an internal format, leading to potentially 2n conversions instead of n². Validic leaders also told me they will enthusiastically switch to a standard format if an appropriate one arises.
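The arithmetic behind the hub approach can be sketched in a few lines of Python. The sketch assumes we count one-way converters: with direct translation, each of n formats needs a converter to every other format, n(n-1) in all, which grows roughly with the square of n; with a shared internal format, each format needs only one converter in and one out. The function names are invented for illustration and say nothing about how Validic actually implements its translation.

```python
def pairwise_converters(n):
    """One-way converters needed when each of n formats maps directly to every other."""
    return n * (n - 1)

def hub_converters(n):
    """One-way converters needed when each of n formats maps to and from one shared format."""
    return 2 * n

for n in (3, 10, 50):
    print(f"{n} formats: {pairwise_converters(n)} direct vs {hub_converters(n)} via a hub")
```

With only three formats the two approaches cost the same, but at ten formats the direct approach needs 90 converters against 20 for the hub, and the gap keeps widening from there.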

I’m expecting that we’ll live with a variety of solutions lying between the two poles for some time. Legacy systems will convert data and cope with the inconsistencies and errors that come up. Newer systems will voluntarily standardize, driven by the demands of consumers for data they can use. The related tasks of patient identification and record reconciliation will be carried out by the individuals themselves, newly empowered with their own data.

The founders of the Yosemite Project understand that change will come from many quarters and will combine the old with the new. Standards are great where they exist, says software architect David Booth. Where standards have not yet been established, we can ask organizations and volunteers to translate between clashing formats.

Conor Dowling, another Yosemite Project leader, told me that a “bunker mentality in health care” has led to proliferating silos of different standards, along with other practices that hold back innovation. For instance, to identify the ingredients in medications, this bunker mentality led FDA, NIH, and the VA each to create its own list. And now each of these bodies has a self-contained process built around its own ontology.

Dowling made an astute prediction with which I totally agree: standardization will arise first among ancillary disciplines such as general practitioners trying to represent their diverse populations and individuals quantifying themselves. More traditional settings such as the in-patient environment will come along later.

Let’s not depend on consortia of companies concerned primarily with jockeying for market share, who design systems that fail to interoperate. Let’s not wait for standards committees that take years to design huge superstructures that get adopted around the time they’re obsolete. Standard ontological work (such as finding synonyms) will take place organically as people join the initiative.

The computer field achieves the most when it makes complicated things simple. This is true of the great computer languages (such as Lisp, C, and Python), the classic Internet protocols (such as HTTP), and the most common data structures (stacks, lists, trees). By utilizing graphs and gathering the experience of participants, the health care field can make a simple matter of its exceedingly complicated data needs.

Andy Oram is an editor at O’Reilly Media.

3 replies »

  1. Thanks for the article, Andy, I really enjoyed reading it. At first when I read the title, I thought “surely he misspelled ‘phenomic’?” Shows you where my head is these days 🙂

    Best- Jon