Did you know we are living in the Zettabyte Era? Honestly, did you even know what a zettabyte is? Kilobytes, gigabytes, maybe even terabytes, sure, but zettabytes? Well, if you ran data centers you’d know, and you’d care, because demand for data storage is skyrocketing (all those TikTok videos and Netflix shows add up). Believe it or not, pretty much all of that data is still stored on magnetic tapes, which have served us well for the past sixty-some years, but at some point there won’t be enough tapes, or enough places to store them, to keep up with our data storage needs.
That’s why people are so keen on DNA storage – including me.
A zettabyte, for the record, is one sextillion bytes. A kilobyte is 1000 bytes; a zettabyte is 1000⁷ bytes, or 10²¹. Between gigabytes and zettabytes, by powers of 1000, come terabytes, petabytes, and exabytes; after zettabytes come yottabytes. Back in 2016, Cisco announced we were in the Zettabyte Era, with global internet traffic reaching 1.2 zettabytes. We’ll be in the Yottabyte Era before the decade is out.
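For readers who want the arithmetic concrete, here is a minimal Python sketch of the decimal byte units mentioned above; the unit names and powers of 1000 are standard, and the conversion at the end simply restates Cisco’s 1.2-zettabyte figure in terabytes.

```python
# Decimal (SI) byte units as powers of 1000, kilobyte through yottabyte.
UNITS = ["kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]

for power, name in enumerate(UNITS, start=1):
    print(f"1 {name} = 1000^{power} bytes = {float(1000 ** power):.0e} bytes")

# A zettabyte is 1000^7 = 1e21 bytes (one sextillion).
# Cisco's 2016 figure of 1.2 zettabytes of annual internet traffic is
# therefore 1.2e21 bytes, or about 1.2 billion terabytes.
print(f"1.2 ZB = {1.2e21 / 1e12:,.0f} terabytes")
```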
In Partnership with the Duke-Margolis Center for Health Policy, Resolve to Save Lives, Carnegie Mellon University, and University of Maryland, Catalyst @ Health 2.0 is excited to announce the launch of The COVID-19 Symptom Data Challenge. The COVID-19 Symptom Data Challenge is looking for novel analytic approaches that use COVID-19 Symptom Survey data to enable earlier detection and improved situational awareness of the outbreak by public health and the public.
How the Challenge Works:
In Phase I, innovators submit a white paper (“digital poster”) summarizing the approach, methods, analysis, findings, and relevant figures and graphs of their analytic approach using Symptom Survey public data (see the challenge submission criteria for more). Judges will evaluate the entries based on Validity, Scientific Rigor, Impact, and User Experience, and will award five semi-finalists $5,000 each. Semi-finalists will present their analytic approaches to a judging panel, and three will be selected to advance to Phase II. In Phase II, the semi-finalists will develop a prototype (simulation or visualization) using their analytic approach and present it at a virtual unveiling event. Judges will select a grand prize winner and a runner-up: the grand prize winner will be awarded $50,000 and the runner-up $25,000. The winning analytic design will be featured on the Facebook Data for Good website, and the winning team will have the opportunity to participate in a discussion forum with representatives from public health agencies.
Phase I applications for the challenge are due Tuesday, September 29th, 2020, at 11:59:59 PM ET.
Learn more about the COVID-19 Symptom Data Challenge HERE.
Challenge participants will leverage aggregated data from the COVID-19 symptom surveys conducted by Carnegie Mellon University and the University of Maryland, in partnership with Facebook Data for Good. Approaches can integrate publicly available anonymized datasets to validate and extend the predictive utility of symptom data, and should assess the impact of integrating symptom data on identifying inflection points in state, local, or regional COVID outbreaks, as well as on guiding individual and policy decision-making.
These are the largest and most detailed surveys ever conducted during a public health emergency, with over 25M responses recorded to date, across 200+ countries and territories and 55+ languages. Challenge partners look forward to seeing participants’ proposed approaches leveraging this data, and welcome feedback on the data’s usefulness in modeling efforts.
Indu Subaiya, co-founder of Catalyst @ Health 2.0 (“Catalyst”) met with Farzad Mostashari, Challenge Chair, to discuss the launch of the COVID-19 Symptom Data Challenge. Indu and Farzad walked through the movement around open data as it relates to the COVID-19 pandemic, as well as the challenge goals, partners, evaluation criteria, and prizes.
This article originally appeared in the American Bar Association’s Health eSource here.
By KIRK NAHRA
This piece is part of the series “The Health Data Goldilocks Dilemma: Sharing? Privacy? Both?” which explores whether it’s possible to advance interoperability while maintaining privacy. Check out other pieces in the series here.
Congress is debating whether to enact a national privacy law. Such a law would upend the approach that has been taken so far in connection with privacy law in the United States, which has either been sector-specific (healthcare, financial services, education) or has addressed specific practices (telemarketing, email marketing, data gathering from children). The United States does not, today, have a national privacy law. Pressure from the European Union’s General Data Protection Regulation (GDPR) and from California, through the California Consumer Privacy Act (CCPA), is driving some of this national debate.
The conventional wisdom is that, while the United States is moving towards this legislation, there is still a long way to go. Part of this debate is a significant disagreement about many of the core provisions of what would go into this law, including (but clearly not limited to) how to treat healthcare — either as a category of data or as an industry.
So far, healthcare data may not be getting enough attention in the debate, driven (in part) by the sense of many that healthcare privacy already has been addressed. Due to the odd legislative history of the Health Insurance Portability and Accountability Act of 1996 (HIPAA), however, we are seeing the implications of a law that (1) was driven by considerations not involving privacy and security, and (2) reflected a concept of an industry that no longer matches how the healthcare system works today. Accordingly, there is a growing volume of “non-HIPAA health data” across enormous segments of the economy, and a growing challenge in figuring out how to address concerns about this data in a system that has no specific regulation of it today.
In an effort to help women make informed decisions about where to deliver their babies, we set out to collect a comprehensive, nationwide database of hospitals’ C-section rates. Knowing that the federal government mandates surveillance and reporting of vital statistics through the National Vital Statistics System, we contacted all 50 states’ (+Washington D.C.) Departments of Public Health (DPH) asking for access to de-identified birth data from all of their hospitals. What we learned might not surprise you — the lack of transparency in the United States healthcare system extends to quality information, and specifically C-section data.
Value-based healthcare is gaining popularity as an approach to increase sustainability in healthcare. It has its critics, possibly because its roots are in a health system where part of the drive for a hospital to improve outcomes is to increase market share by being the best at what you do. This is not really a solution for improving population health and does not translate well to publicly-funded healthcare systems such as the NHS. However, when we put aside dogma about how we would wish to fund healthcare, value-based healthcare provides us with a very useful set of tools with which to tackle some of the fundamental problems of sustainability in delivering high quality care.
What is value?
As defined by Professor Michael Porter at Harvard Business School, value is a function of outcomes and costs. To achieve high value, therefore, we must deliver the best possible outcomes in the most efficient way: outcomes that matter from the perspective of the individual receiving healthcare, not provider process measures or targets. Sir Muir Gray expands on the idea of technical value (outcomes/costs) to specifically describe ‘personal value’ and ‘allocative value’, encouraging us to focus also on shared decision-making, individual preferences for care, and ensuring that resources are allocated for maximum value. This article seeks to demonstrate that the role of data and informatics in supporting value-based care goes much further than the collection and remote analysis of big datasets – in fact, the true benefit sits much closer to the interaction between clinician and patient.
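As a minimal sketch of the value equation described above (value as outcomes achieved relative to costs incurred), consider the following; the outcome scores and costs are hypothetical placeholders, not figures from any real pathway.

```python
# Sketch of Porter's value equation: value = outcomes / costs.
# Outcome scores and costs below are hypothetical, for illustration only.

def technical_value(outcome_score: float, cost: float) -> float:
    """Patient-relevant outcome achieved per unit of resource spent."""
    return outcome_score / cost

# Two hypothetical care pathways for the same condition:
pathway_a = technical_value(outcome_score=0.82, cost=12_000)  # better outcome, higher cost
pathway_b = technical_value(outcome_score=0.78, cost=8_000)   # slightly worse outcome, lower cost

print(f"Pathway A: {pathway_a:.2e} outcome units per pound spent")
print(f"Pathway B: {pathway_b:.2e} outcome units per pound spent")
# Note: high value is not the same as low cost. It is the best outcome
# that matters to the patient, delivered with the resources available.
```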
Despite (some might say, because of) a raft of new biological methods, pharma R&D has struggled with its EROOM problem: the cost of successfully developing a new drug, including the cost of failures, has been relentlessly increasing, rather than decreasing, over time. (EROOM is Moore spelled backwards, as in Moore’s Law, which describes the rapid pace of technology improvement over time.)
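To make the trend concrete: the figure usually quoted for Eroom’s law is that the number of new drugs approved per billion (inflation-adjusted) dollars of R&D spending has roughly halved every nine years since 1950. The sketch below illustrates that exponential decline; the 1950 anchor value is made up, and only the halving period reflects the commonly cited approximation.

```python
# Illustrative sketch of Eroom's law: R&D productivity (new drug approvals
# per billion dollars of spending) halving roughly every nine years.
# The 1950 anchor value is hypothetical; only the trend shape is the point.

HALVING_PERIOD_YEARS = 9
productivity_1950 = 40.0  # hypothetical: new drugs per $1B of R&D spend

for year in range(1950, 2021, 10):
    halvings = (year - 1950) / HALVING_PERIOD_YEARS
    productivity = productivity_1950 * 0.5 ** halvings
    cost_per_drug_billion = 1.0 / productivity
    print(f"{year}: ~{productivity:5.2f} drugs per $1B "
          f"(~${cost_per_drug_billion:.2f}B per drug)")

# Read the other way, the same curve says the cost per approved drug
# doubles every nine years -- the inverse of the improvement described
# by Moore's law.
```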
Given the impact of technology in so many other areas, the question many are now asking is whether technology could do its thing in pharma, and make drug development faster, cheaper, and better.
Many major pharmas believe the answer has to be yes, and have invested in some version of a by-now familiar data initiative aimed at aggregating and organizing internal data, supplementing this with available public data, and overlaying this with a set of analytical tools that will help the many data scientists these pharmas are urgently hiring to extract insights and accelerate research.
Artificial intelligence requires data. Ideally that data should be clean, trustworthy and, above all, accurate. Unfortunately, medical data is far from it. In fact, medical data is sometimes so far removed from being clean that it’s positively dirty.
Consider the simple chest X-ray, the good old-fashioned posterior-anterior radiograph of the thorax. It is one of the longest-standing radiological techniques in the medical diagnostic armoury, performed across the world in the billions. So many, in fact, that radiologists struggle to keep up with the sheer volume, and sometimes forget to read the odd 23,000 of them. Oops.
Surely, such a popular, tried and tested medical test should provide great data for training AI? There’s clearly more than enough data to have a decent attempt, and the technique is so well standardised and robust that surely it’s just crying out for automation?
Data is not always the path to identifying good medicine. Quality and cost measures should not be perceived as “scores,” because the health care process is neither simplistic nor deterministic; it involves as much art and perception as science—and never is this more the case than in the first step of that process, making a diagnosis.
I share the following story to illustrate this lesson: we should stop behaving as if good quality can be delineated by data alone. Instead, we should be using that data to ask questions. We need to know more about exactly what we are measuring, how we can capture both the physician and patient inputs to care decisions, and how and why there are variations among different physicians.
A Tale of Two Doctors
“As soon as I start swimming, my chest feels heavy and I have trouble breathing. It is a dull pain. It is scary. I swim about a lap of the pool, and, thankfully, the pain goes away. This is happening every time I go to work out in the pool”.
Her primary physician listened intently. With more than 40 years of experience, the physician, a stalwart in the medical community, loved by all, who scored high on the “Physician Compare” website listing, stopped the interview after the description and announced, with concern, that she needed to have a cardiac stress test. The stress test would require walking on a treadmill to monitor her heart and would include, additionally, an echocardiogram to see whether her heart was being compromised by a lack of blood flow.
“But, I have had three echocardiogram tests in the last year as part of my treatment for breast cancer, and each was normal. Why would I need another?”
“Well, I understand your concern about more tests, but the echocardiograms were done without having your heart stressed by exercise. The echo tests may be normal under those circumstances, but be abnormal when you are on the treadmill. You still need the test, unfortunately. I want to order the test today and you should get it done in the next week”.
I don’t know why, but even as a young person I never could make sense of the saying, “seeing is believing”. Seeing, vision, is nothing more than a data collection instrument, not an arbiter of insight. I saw my wife frown at me the other day, for example, after I claimed to have washed the dishes so thoroughly that no spot of grease could be left behind. I have made this claim before and been incorrect, so the frown, the data, triggered an anticipation of being rebuffed. However, nothing of that sort followed. I asked, “Why the frown?” She responded, “I just cut my finger.” The frown was obvious, the cause unclear. I believed I was about to be reprimanded and missed the chance to notice her accident. This story suggests that a truer aphorism might instead be that “believing is seeing”.
The phrase “healthcare data” either strikes fear and loathing, or provides understanding and resolve in the minds of administration, clinicians, and nurses everywhere. Which emotion it brings out depends on how the data will be used. Data employed as a weapon for purposes of accountability generates fear. Data used as a teaching instrument for learning inspires trust and confidence.
Not all data for accountability is bad. Data used for prescriptive analytics within a security framework, for example, is necessary to reduce or eliminate fraud and abuse. And data for improvement isn’t without its own faults, such as the tendency to perfect it to the point of inefficiency. But the general culture of collecting data to hold people accountable is counterproductive, while collecting data for learning leads to continuous improvement.
This isn’t a matter of eliminating what some may consider to be bad metrics. It’s a matter of shifting the focus away from using metrics for accountability and toward using them for learning so your hospital can start to collect data for improving healthcare.