The Data Entry Paradox

Everyone, including this blog writer, has been touting the virtues of the vast troves of data already or soon to be available in the electronic health record (EHR), which will usher in the learning healthcare system [1, 2]. There is sometimes unbridled enthusiasm that the data captured in clinical systems, perhaps combined with research data such as gene sequencing, will effortlessly provide us knowledge of what works in healthcare and how new treatments can be developed [3, 4]. The data is unstructured? No problem, just apply natural language processing [5].

I honestly share in this enthusiasm, but I also realize that it needs to be tempered, or at least given a dose of reality. In particular, we must remember that our great data analytics and algorithms will only get us so far. If we have poor underlying data, the analyses may end up misleading us. We must watch for problems of incomplete and incorrect data.

There are all sorts of reasons for inadequate data in EHR systems. Probably the main one is that those who enter data, i.e., physicians and other clinicians, are usually doing so for reasons other than data analysis. I have often said that clinical documentation can be what stands between a busy clinician and going home for dinner, i.e., he or she has to finish charting before ending the work day.

I also know of many clinicians whose enthusiasm for entering correct and complete data is tempered by their view of data entry as a black hole. That is, they put data in but never get any benefit out. I like to think that most clinicians would relish the opportunity to look at aggregate views of the patients in their practices and to identify patients who are outliers in one measure or another. Yet a common complaint I hear from clinicians is that data capture priorities are driven more by the hospital or clinic trying to maximize its reimbursement than by a desire to help clinicians provide better patient care.

Another challenge for clinicians is the time required for electronic data entry. There is no question that the 20th century means of clinical documentation, mostly consisting of scribbling illegible notes on paper, was much easier and faster than typing and/or clicking. While I think that few clinicians want to go back to hand-written notes, their ease of use has an appeal, at least for the person doing the entry.

Related to the time for electronic data entry is the “tension” between structured data, which makes aggregation and analysis easier, and “flexible” (or narrative) data, which allows the clinician to tell the story of the patient [6]. Many clinicians report that excess structuring of data (i.e., pointing and clicking) loses the story of the patient, although those who process the data know that structured data is easier to analyze.

An additional challenge of electronic data entry for clinicians is the shift of focus from the patient to the computer. This was exemplified in a cartoon published earlier this year in JAMA that showed a 7-year-old’s sketch of an exam room with the physician hunched over the computer, his back turned to the patient and her family [7] (the sketch is viewable at http://jama.jamanetwork.com/article.aspx?articleid=1187932).

An excellent example of both the promise and the limitations of current data entry systems was recently documented by Parsons et al. [8], who found in a wide sample of primary care EHRs in New York City that the accuracy of data for measuring breast cancer screening quality measures was highly variable due to differing practices in documentation, workflow, and related factors. While some physicians had the quality of their care measured accurately, for many others it was underestimated because of data limitations rather than the care they provided.
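To make the underestimation concrete, here is a minimal sketch in Python. It is not the method or the data of Parsons et al.; the `Patient` fields, the `screening_rates` function, and the ten-patient panel are all hypothetical, chosen only to show how a quality report that reads structured fields can understate care that was actually delivered.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    eligible: bool    # meets the (hypothetical) screening criteria
    screened: bool    # screening was actually performed
    structured: bool  # screening was recorded in a field the report can read


def screening_rates(patients):
    """Return (actual_rate, measured_rate) over eligible patients."""
    eligible = [p for p in patients if p.eligible]
    if not eligible:
        return 0.0, 0.0
    actual = sum(p.screened for p in eligible) / len(eligible)
    # The report only "sees" screenings documented in structured fields.
    measured = sum(p.screened and p.structured for p in eligible) / len(eligible)
    return actual, measured


# Invented panel: 10 eligible patients, 8 screened, but only 5 of those
# screenings were documented where the quality report looks for them.
panel = (
    [Patient(True, True, True)] * 5
    + [Patient(True, True, False)] * 3
    + [Patient(True, False, False)] * 2
)

actual_rate, measured_rate = screening_rates(panel)
print(f"actual: {actual_rate:.0%}, measured: {measured_rate:.0%}")
```

In this made-up panel the gap between the two rates comes entirely from where the screenings were documented, not from the care itself.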

I cannot claim to have easy answers to this grand challenge, but two related aspects of it sit in front of us:

  1. We need to find better and faster ways for clinicians to enter data into the EHR, ways that yield data of high enough quality to be re-used for other purposes, such as research, quality measurement and improvement, and public health.
  2. We must reward clinicians for their efforts in entering high-quality data. We must allow them to see aggregate views of patients in their practices and be able to identify outliers. We must also engage them in research, quality improvement, and other system uses of their data.
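As a rough illustration of the second point, the sketch below shows what a very simple aggregate view with outlier flagging might look like. Everything in it is an assumption made for illustration: the patient identifiers, the HbA1c values, the function names, and the choice of a z-score cutoff of 2.0. A real system would pull values from the EHR and would likely use clinically defined thresholds instead.

```python
from statistics import mean, stdev

# Hypothetical most recent HbA1c values (percent) for one clinician's panel.
panel = {
    "patient_a": 6.0, "patient_b": 6.2, "patient_c": 6.4,
    "patient_d": 6.1, "patient_e": 6.3, "patient_f": 6.2,
    "patient_g": 6.0, "patient_h": 6.5, "patient_i": 11.8,
}


def summarize(values_by_patient):
    """Aggregate view: count, mean, and sample standard deviation."""
    values = list(values_by_patient.values())
    return {"n": len(values), "mean": mean(values), "stdev": stdev(values)}


def find_outliers(values_by_patient, z_cutoff=2.0):
    """Return patient ids whose value is more than z_cutoff SDs from the mean."""
    summary = summarize(values_by_patient)
    if summary["stdev"] == 0:
        return []  # every value identical, so nothing stands out
    return sorted(
        pid
        for pid, value in values_by_patient.items()
        if abs(value - summary["mean"]) / summary["stdev"] > z_cutoff
    )


print(summarize(panel))
print(find_outliers(panel))
```

The point of the sketch is only that such a view is cheap to compute once the data are entered completely and correctly; the hard part remains the data entry itself.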

In short, “garbage in, garbage out” remains a problem for computers and information technology nearly a half-century after the phrase was coined. In healthcare, we must give clinicians the best tools and incentives to participate in the learning healthcare system. For informatics, the problem of data entry is a grand challenge every bit as important as how to make use of the growing quantity of data, since the knowledge derived from that data will only be as good as the quality of what is input.


1. Friedman, C., Wong, A., et al. (2010). Achieving a nationwide learning health system. Science Translational Medicine, 2(57): 57cm29. http://stm.sciencemag.org/content/2/57/57cm29.full.

2. Greene, S., Reid, R., et al. (2012). Implementing the learning health system: from concept to action. Annals of Internal Medicine, 157: 207-210.

3. McCarty, C., Chisholm, R., et al. (2011). The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Medical Genomics, 4(1): 13. http://www.biomedcentral.com/1755-8794/4/13.

4. Rea, S., Pathak, J., et al. (2012). Building a robust, scalable and standards-driven infrastructure for secondary use of EHR data: The SHARPn project. Journal of Biomedical Informatics, 45: 763-771.

5. Nadkarni, P., Ohno-Machado, L., et al. (2011). Natural language processing: an introduction. Journal of the American Medical Informatics Association, 18: 544-551.

6. Rosenbloom, S., Denny, J., et al. (2011). Data from clinical notes: a perspective on the tension between structure and flexible documentation. Journal of the American Medical Informatics Association, 18: 181-186.

7. Toll, E. (2012). The cost of technology. Journal of the American Medical Association, 307: 2497-2498.

8. Parsons, A., McCullough, C., et al. (2012). Validity of electronic health record-derived quality measurement for performance monitoring. Journal of the American Medical Informatics Association, 19: 604-609.

William Hersh, MD is Professor and Chair of the Department of Medical Informatics & Clinical Epidemiology at Oregon Health & Science University in Portland, OR. He is a well-known leader and innovator in biomedical and health informatics. In the last couple of years, he has played a leadership role in the ONC Workforce Development Program. He was also the originator of the 10×10 (“ten by ten”) course in partnership with AMIA. Dr. Hersh maintains the Informatics Professor blog.

16 replies »

  2. Really interesting article, as it points out two critical factors in big data management that almost no one talks about:
    – the physicians
    – the data quality of the “data” part in “big data”
    It seems that the second point is still a big issue for health information management, and I would be curious to know how it is addressed by the physicians themselves.

  3. Telling that the patient’s height and weight are the first data the guvmint wants.

    Sizing the herd, I’d say.


  4. This article hits home for me. When I have to enter 10 minutes’ worth of data while doing a 15-minute case on a 2 y/o, it matters a lot. I really do have to look at my patient, hold a mask with one hand and sometimes do something other than enter data with the other one. Give us EHRs that work and don’t require so much time. We will use them. Stop designing them for other software designers and administrators.


  5. Dumb question: What is MU?

    I think one of the main problems with medical records is handwriting. Most doctors and nurses are not adept at typing. Many handwritten notes are unreadable (nurses generally have better handwriting than docs).

    Until we are able to transform handwritten notes into typed documents, the EHR is going to be severely limited.

    As a user of voice recognition, I think it has promise.

  6. Agree that EMR without MU would be the smart way for many practices to go, but lots of times the suits won’t let them.

    Did you see the House Ways and Means Committee’s letter asking Sebelius to suspend MU incentive payments until the standards become, well, at least slightly meaningful? Is that significant or is it just politics?

  7. I’m not talking about MU documentation (I’m not a big fan of some of those criteria, either), just requisite documentation in general. Dictating? Use Dragon. Use Praxis EMR. There’s plenty of help for semi-structured narrative assessments and plans.

    My Primary has been on an EMR since 2004. I have never seen it as an obstacle to our communication. Moreover, he doesn’t have to rely on memory during post-encounter dictation amid a typically crazy busy day. My SOAP is done when he escorts me out to checkout.

    BTW, about 75% of MU documentation can (and should) be done by support staff. And 5 of the 15 core criteria are one-time check-off attestations. Yeah, we enabled our med-meds; yeah, we have a CDS flag for all of our dx 250’s in need of an a1c. Formulary enabled (Menu Set)? Check….

    Yeah we did our Core 15 “Protect PHI” 45 CFR 164.308 et seq risk assessment and mitigation (Right: they’re mostly all lying about that).

    Moreover, you can use an EMR and not participate in MU. It’s not required. Do the cost/benefit analysis going out to the “payment reduction” years (which may well pale in comparison with other painful adjustments that may be coming). Maybe you’ll net out ahead.

    At Clinic Monkey, btw, we make it easy to Grab The Gold While The Grabbin’ Remains Good.


  8. I don’t see how you can say that.

    Yes, the MU garbage is not an inherent design flaw in EMRs, but rather just another hoop you have to jump through if you decide to use them.

    But it’s just a fact that dictating a thorough text note AFTER the visit is much, much quicker than typing in the same note. If one isn’t sitting in front of the patient doing steno work, but actually listening to them, there’s no need to take notes during the visit (and it’s much more polite).

  9. Y’know, I am REALLY tired of that lame excuse. Like scribbling stuff into paper forms (including checking off boxes and circling choice options, etc) is not “data entry.”

    (And its equally fatuous corollary, “spending time looking at the computer rather than the patient.” Like looking down at and shuffling through the paper gets a pass there.)

    There are numerous genuine and serious issues with HIT. This is not one of them.

    Redesigning the workflows to clear the decks for the doc so she can focus on documenting only that which she legally can is a good place on which to focus.

    The documentation has to get done one way or another. That it is onerous and “unprofitable” is not the proximate fault of any technology, digital or analog.


  10. “Everyone, including this blog writer, has been touting the virtues of the vast troves of data already or soon to be available in the electronic health record (EHR), which will usher in the learning healthcare system”

    How do you propose to get this data and how will you get patients permission to share?

  11. I worked for years in the VA and, for all its issues, it has one of the best EMR systems around. And even with that, the problems are huge. I can’t tell you how hard it is to find relevant data in the sea of information, and that’s not getting into the hunt-and-peck typing of some providers that makes their notes close to irrelevant, or the templated notes that lead to rote information gathering and errors. When I followed that work by consulting in many different hospital systems, I learned that it is a problem everywhere and totally EMR agnostic. The people who think data will save the world are severely over-reaching.

  12. This is the main reason why Pharmacovigilance has been a huge challenge. Can we avoid another Vioxx? Yes we can, if we capture data in the right way. Two questions here. Who is “WE” and what is “DATA”?

    Those who capture the data frequently have goals and objectives at odds with its later use. Some providers capture data in an EMR just to get their incentive money, so vital structured data that could be used for clinical intelligence may be missing. Another reason for using the technology may be to get, as Dr. Hersh mentions, higher reimbursement, or simply to satisfy the needs of the ‘master’, the hospital.

    Whatever the case may be, data is, at this time, just not there yet.

    And then, there is the entirely different issue of data harmonization where there are no standards.

  13. Data entry. An unsexy topic if ever there was one. But I think the analysis here is correct. 99% of the physicians I’ve spoken with about this topic bring up the “I’m not a data entry clerk dammit” as soon as the subject comes up…

    The ones I’ve talked to who are doing the most with this have come up with workarounds that make this part of the job more manageable. Pre-entry of some parts of the record at intake seems to work well …