The Quants Are Coming For Healthcare
What the quant revolution in investment management previews for healthcare
The rise of quantitative and algo-driven trading strategies has been one of the most interesting trends in investment management over the past few decades. Just a couple of weeks ago, high-frequency trading found itself back in the spotlight (and on the menu) following an electric series of trading days in the public markets (🚀🚀🚀).
Today, algorithms developed by world class software engineers ingest tons of data and execute automated investment actions. And quant funds continue to capture more and more market activity:
Source: The Economist, October 2019
Source: Financial Times, January 2019
It’s striking how many parallels there are between the evolution of financial data and the evolution of healthcare data. Specifically:
Technological advancement has enabled an explosion in “alternative data,” which yields richer, higher-frequency information
More robust data unlocks algorithmic decision making, driving efficiency and better outcomes throughout the system
The quant revolution on Wall Street may well provide a glimpse into the future of healthcare.
Narrow and Periodic → Robust and Continuous
Historically, the data used to assess both a company’s financial health and a patient’s physical health has been available to decision makers only on an episodic basis.
For example, companies have been required to publish financial data in regular earnings reports since the 1934 Securities and Exchange Act. Regulated reporting itself was a step-change in access to reliable data for investors; just look at what financial reporting was like prior to its passage:
In the period immediately preceding the enactment of the 1933 and 1934 Securities Acts, the most controversial securities sold in this country were foreign government bonds, particularly Latin American bonds. Between 1923 and 1930 American investors purchased close to $6.3 billion of foreign bonds [7], an amount equal to approximately ten percent of new securities sales in the United States during those years [8]…
… At the request of the [Senate Finance] Committee, nine leading investment banks supplied record prospectuses or offering circulars employed in selling over 100 foreign government bond issues and approximately forty foreign corporate security issues during the 1920s [12]. Generally, the foreign bond prospectuses were extremely brief, many occupying less than a page in the printed hearings...
…The foreign bond prospectuses consistently omitted information that obviously would have been material to calculation of investment risk. Of the thirty-seven Latin American government bond prospectuses in the Senate Finance Committee record, none indicated the investment bankers' gross spread, only four indicated or implied the due date of other loans, only six indicated or implied the annual interest or debt service on other debt, only four indicated whether the government issuing the bond had ever defaulted on a loan, only fifteen of the prospectuses disclosed both the issuing government's total receipts and expenditures for any recent year, and in no case were receipt and expenditure figures broken down into specific items as they would be on a corporate income statement…
…Corporate financial disclosure was little better. Passage of the 1933 Securities Act occurred after brief hearings in the House and Senate during the celebrated first hundred days period of Franklin D. Roosevelt's presidency. Apparently, in reliance on the estimates of state blue sky officials, both the House and Senate reports asserted that investors had lost $25 billion during the previous decade because of what the Senate report termed "incomplete, careless or false representations"…
…independent private studies and well-informed persons in the investment community corroborated the evidence as typical and indicative of common problems in the financial community before 1934, thus underlining the need for a general mandatory corporate disclosure system. One such study was provided by Laurence Sloan, vice-president of Standard Statistics Company. On the basis of 1927 financial reports, he found it possible to compare the gross incomes and net profits of but 235 out of 545 leading industrial firms. In other words, gross income figures were not reported by fifty-seven percent of these firms. For only 219 of the 545 firms was it possible to obtain data revealing the sums that were charged to depreciation and depletion in the years 1926 and 1927 [20]. A subsequent study conducted by Sloan based on 1929 reports found that only 323 of 580 leading firms reported gross income; 257 or forty-four percent did not [21].
Source: “The SEC and Accounting: A Historical Perspective” - Joel Seligman (1985)
Still, a company’s performance in the time between formal financial reports remains opaque, and investors would be well served by insight into this “dark” period. And while quarterly earnings reports provide some visibility into a company’s performance, they are far from comprehensive. Bringing additional data into the mix would provide a more informed view of a company’s financial health.
Here’s where the advent of “alternative data” comes in. Matt Turck neatly sums it up:
Wall Street has been in the prediction game since its origins, and the idea of obtaining data not available to anyone else is not new. It used to be stock prices and fundamental information. As those became widely available, hedge funds moved on to other forms of data.
Not that many years ago, some hedge funds would send people to literally stand in front of big-box retail stores and count the number of people coming in and out, and on that basis make predictions about the retail chains themselves and the economy in general.
Alternative data now offers an opportunity to do the same thing at an entirely different scale and level of sophistication.
The trend started a few years ago with social media data. Could one not only access market moving news faster than the regular press, but also gain non-obvious insights by crunching through all tweets relating to a certain topic? Those were the days when some of the larger hedge funds and banks would start licensing the Twitter firehose.
Now hedge funds have broadened their interest to all sorts of other datasets: geo-location, credit card payments, satellite images, IoT sensor data, building permits, health data, etc. Some of this data comes from companies that are just trying to monetize their data exhaust; other data sets are coming from companies whose primary business model is to offer this data (often in the form of data products, as per the above).
A whole cottage industry has now appeared, with some key players nicely highlighted by CB Insights in this landscape:
Source: “The New Gold Rush? Wall Street Wants your Data” - Matt Turck, January 2017
By leveraging alternative data, investors can get more timely and holistic views on corporate and market performance. In short, we have seen a shift from narrow and periodic data to robust and continuous data.
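To make that shift concrete, here’s a toy sketch of how an investor might fold an alternative dataset - say, daily foot-traffic counts like the ones hedge funds once gathered by hand - into a quick revenue nowcast ahead of earnings. Every number and name below is made up for illustration; real models ingest far more inputs than this.

```python
# Illustrative sketch: nowcasting quarterly revenue from daily foot-traffic counts.
# All figures are hypothetical - the point is the shape of the workflow, not a real model.

from statistics import mean

# Historical quarters: (total foot traffic observed, reported revenue in $M)
history = [
    (1_200_000, 410.0),
    (1_150_000, 395.0),
    (1_310_000, 448.0),
    (1_280_000, 440.0),
]

# Estimate revenue per visit from past quarters (a crude single-factor model).
revenue_per_visit = mean(rev / traffic for traffic, rev in history)

# Partway through the current quarter, we only have traffic for the days so far.
traffic_so_far = 820_000
days_elapsed, days_in_quarter = 58, 91

# Extrapolate traffic to a full quarter, then convert it to a revenue estimate.
projected_traffic = traffic_so_far * days_in_quarter / days_elapsed
revenue_nowcast = projected_traffic * revenue_per_visit

print(f"Projected traffic: {projected_traffic:,.0f} visits")
print(f"Revenue nowcast:   ${revenue_nowcast:,.1f}M")
```

The nowcast itself is crude, but it arrives weeks before the earnings report - which is exactly the edge alternative data is meant to provide.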
In healthcare, patient data is also narrow and (barely) periodic.
For example, a patient’s vitals are available only as snapshots in time, typically when they present in a clinical setting (e.g. for an annual checkup or when they’re already sick). When a patient does show up in a clinical setting, beyond the patient’s own account of how they’ve been feeling, clinicians have no visibility into what that patient’s vitals looked like over the preceding days and weeks.
I thought of this recently when my father repeatedly recorded high blood pressure when he self-measured at home, only to report normal blood pressure at several subsequent doctor’s visits. Beyond my father’s possibly dubious self-reported readings, the physician had no insight into his actual blood pressure over the preceding days. As a result, the physician’s clinical guidance was based on an extremely limited set of available data.
Most of this “snapshot” data we have on patients lives in a patient’s electronic health record. EHRs are the closest thing we have to a detailed history of a patient’s clinical data today, but they are woefully insufficient to fully satisfy the healthcare needs of patients. That’s not surprising when you consider the misaligned incentives behind modern EHRs:
To understand where EHRs stand today, it helps to understand where, when, and why they originated. One of the first medical records taught surgical techniques and was written on Egyptian papyrus around 1600 BC… For the next 3,500 years, doctors wrote case histories, mostly for themselves and for their students. In other words, these histories were written for practitioners alone, and not for outside observers. The purpose of medical records changed around the 1880s, when administrators at New York Hospital, motivated by concerns about the medical record as a legal document in insurance and malpractice cases, began to supervise records’ quality and content… From that point forward, the structure and function of the medical record became increasingly influenced by third parties. Patients, however, were not consulted about the use of their medical records…
… A fundamental reason for doctors’ dissatisfaction is that today’s EHRs are designed neither for them nor for their patients. The primary beneficiaries of today’s systems are arguably financial stakeholders—insurers and administrators. Patients are even more distant from EHRs than their doctors, and the typical patient has little awareness of or access to his or her own EHR data.
Source: “From Electronic Health Records to Digital Health Biographies.” Robert F. Graboyes and Darcy N. Bryan, MD, 2018
Limited patient-centric data is a major constraint on clinicians who could make better, faster decisions with access to broader, more robust data.
And now, “alternative data” in healthcare is on the way. As Deloitte explains in a piece on the rise of remote patient monitoring and the data streams it will create:
"Even before the COVID-19 pandemic, 88 percent of health care providers had already invested in remote patient monitoring (RPM) or said they planned to do so in the future…
…We expect the future of RPM will be in simultaneous pooling and synthesis of data from continuous measurements and from interval-based assessments to generate even more meaningful, personalized insights for each patient.
1. Continuous measurements come from ongoing monitoring of physiological and environmental variables that physicians traditionally use to observe patients who are recovering from acute episodes. These include:
Traditional biometrics (biological data, such as blood-glucose levels, heart rate, and blood pressure)
Emerging physiological biometrics (new measures suitable for statistical analysis of functional integrity, such as electrophysiological tonus, sleep patterns, voice patterns, in-home movement, daily activity levels, and exercise patterns)
Environmental metrics (used to quantify external factors—such as weather conditions, pollutant levels, and the presence and potency of sunlight—that could affect the health of at-risk patients)
2. Interval-based assessments come from measurements taken during recurring evaluations. Physicians often use measures like these to manage chronic conditions. Data from interval-based assessments includes:
Patient reported outcomes (PROs) come from questionnaires about the patient’s health status, quality of life, and mental well-being. They might be used to measure satisfaction/mood, pain, and optimism as well as whether the patient feels better after a procedure.
Socio-biometrics come from personal and community data. This could include an individual’s social media activity, eating patterns, commute patterns, and community crime rates. Clinicians can use this information to understand the influence that certain drivers of health have on the patient.
Personalized treatment plan information includes patient-specific details about key events along the patient’s care journey. This might include previous interventions, medication prescriptions, physical therapy recommendations, and outcomes.
Source: Deloitte Consulting, June 2020
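As a thought experiment on what that “simultaneous pooling and synthesis” might look like in code, here’s a minimal sketch that combines a continuous stream (home blood pressure readings) with an interval-based assessment (a patient-reported outcome score) into a single outreach flag. The field names, thresholds, and rule are hypothetical, not drawn from any real RPM product.

```python
# Hypothetical sketch of pooling continuous RPM readings with interval-based
# assessments into one patient-level signal. Names, thresholds, and the alert
# rule are illustrative only - not clinical guidance.

from dataclasses import dataclass
from statistics import mean

@dataclass
class ContinuousReading:          # e.g. streamed from a wearable or home cuff
    metric: str                   # "systolic_bp", "heart_rate", ...
    value: float

@dataclass
class IntervalAssessment:         # e.g. a patient-reported outcome questionnaire
    instrument: str               # "PRO_wellbeing", ...
    score: float                  # higher = better, on this made-up scale

def flag_for_outreach(readings, assessments,
                      bp_threshold=140.0, pro_threshold=50.0):
    """Flag a patient when continuous and interval-based signals both look bad."""
    systolic = [r.value for r in readings if r.metric == "systolic_bp"]
    pro = [a.score for a in assessments if a.instrument == "PRO_wellbeing"]
    bp_elevated = bool(systolic) and mean(systolic) >= bp_threshold
    pro_declining = bool(pro) and mean(pro) <= pro_threshold
    return bp_elevated and pro_declining

readings = [ContinuousReading("systolic_bp", v) for v in (148, 152, 145, 150)]
assessments = [IntervalAssessment("PRO_wellbeing", s) for s in (42, 38)]
print(flag_for_outreach(readings, assessments))  # True -> surface to a clinician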
New technologies - e.g. wearable RPM devices - are enabling us to collect more data than ever. As more RPM devices are cleared as clinical-grade and adoption of these tools expands, the available patient-generated data will explode by orders of magnitude vs. the data that can be used to make decisions today (i.e. what’s available in a patient’s EHR or payor claims data).
These millions of new data points, which take us from narrow/periodic to robust/continuous, will help power algorithmic automation in healthcare.
Automation Unlocks Capacity
As we’ve seen in the investing industry, talented engineers have built on top of new datasets to create tools that automate (at least partially) the decision-making process to systematically drive outcomes:
In the late 1990s, an algorithm might have simply tried to ride the momentum of a stock’s price rise, buying at a certain price level and selling at a predetermined moment. Today’s algorithms can make continuous predictions based on analysis of past and present data while hundreds of real-time inputs bombard the computers with various signals.
Some investment firms are pushing into machine learning, which allows computers to analyze data and come up with their own predictive algorithms. Those machines no longer rely on humans to write the formulas.
Algorithms and quants eventually could sharply reduce the need for large investment staffs. A machine-driven algorithm might help quantitative researchers discover dozens of new algorithms in the time it used to take to create one.
Source: Wall Street Journal, May 2017
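To ground the late-1990s example in that excerpt, here’s a toy version of a momentum rule - buy on a breakout above a trailing average, sell at a predetermined exit. The prices and thresholds are invented; as the article notes, today’s systems layer hundreds of real-time inputs on top of logic like this.

```python
# Toy version of the simple momentum rule described above: buy when price breaks
# above a trailing average, sell at a predetermined exit. Prices and thresholds
# are made up for illustration only.

def momentum_signals(prices, lookback=5, take_profit=0.05):
    """Yield (day, action) pairs for a naive breakout-and-exit strategy."""
    entry = None
    for day in range(lookback, len(prices)):
        trailing_avg = sum(prices[day - lookback:day]) / lookback
        price = prices[day]
        if entry is None and price > trailing_avg:            # momentum breakout
            entry = price
            yield day, "BUY"
        elif entry is not None and price >= entry * (1 + take_profit):
            entry = None                                       # predetermined exit
            yield day, "SELL"

prices = [100, 101, 100, 102, 103, 105, 106, 104, 108, 111, 112]
for day, action in momentum_signals(prices):
    print(day, action)
```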
The automation advantage here boils down to the ability to ingest and analyze tons of inputs. Instead of having a bullpen of human analysts sift through and decipher various data inputs (which isn’t feasible to begin with), software can rapidly distill the data at scale and push out insights to investment decision-makers.
But fear not, junior investors - this doesn’t obviate the need for investment staff. Rather, it enhances existing teams by streamlining the decision-making process while allowing humans to focus their time on the more creative, strategic, and complex aspects of their craft. Automation simply unlocks decision-maker capacity.
The more fundamental funds will use the data as an input into human-driven investment decisions. For example, they’ll try to predict the sales or churn of a specific company, with the overall goal of outperforming sell side consensus. Or they’ll try to predict macro economic trends, for example through the observation of satellite images. They will also often use models, but what the quant (data scientist) predicts will be generally just one data point that the “PMs” (portfolio managers) will decide to use or ignore in their investment decisions, alongside other inputs (such as what their carefully cultivated professional network thinks).
Source: “The New Gold Rush? Wall Street Wants your Data” - Matt Turck, January 2017
That same trend of unlocked, super-powered decision-maker capacity is what we can expect in healthcare as algorithmic decision making - powered by the millions of new patient-generated data points created in the coming years - is integrated into healthcare workstreams.
In radiology, for example, AI tools can handle large volumes of diagnostic work and automate the simpler diagnostic tasks, freeing up clinicians’ time to focus on more complex diagnoses. This drives tremendous value, particularly where the supply of clinicians is constrained:
Further, the ability to build on massive patient-generated data sets will help shift care down the acuity curve by enabling proactive clinical interventions:
Machine-learning systems excel at prediction. A common approach is to train a system by showing it a vast quantity of data on, say, students and their achievements. The software chews through the examples and learns which characteristics are most helpful in predicting whether a student will drop out. Once trained, it can study a different group and accurately pick those at risk…
…In hospitals, for instance, doctors try to predict heart attacks so they can act before it is too late. Manual systems correctly predict around 30%. A machine-learning algorithm created by Sriram Somanchi of Carnegie Mellon University and colleagues, and tested on historic data, predicted 80%—four hours in advance of the event, in theory giving time to intervene.
Source: The Economist, August 2016
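For a rough sense of the workflow behind a system like the one described above - train on historical records, then score patients currently being monitored - here’s a heavily simplified sketch. It assumes scikit-learn is available, the features and labels are synthetic, and nothing about it resembles a clinically validated model.

```python
# Heavily simplified sketch of the "train on historical data, then flag risk"
# workflow described above. Assumes scikit-learn; features and labels are
# synthetic and purely illustrative - not a clinical model.

from sklearn.linear_model import LogisticRegression

# Each row: [avg_heart_rate, avg_systolic_bp, age]; label 1 = had an event.
X_train = [
    [72, 118, 45], [95, 150, 67], [68, 121, 39], [101, 160, 72],
    [75, 125, 50], [98, 155, 70], [70, 119, 44], [99, 158, 69],
]
y_train = [0, 1, 0, 1, 0, 1, 0, 1]

model = LogisticRegression().fit(X_train, y_train)

# Score patients currently being monitored and surface the riskiest ones.
current_patients = {"patient_a": [74, 120, 48], "patient_b": [97, 152, 71]}
for patient_id, features in current_patients.items():
    risk = float(model.predict_proba([features])[0][1])   # probability of an event
    if risk > 0.5:
        print(f"{patient_id}: elevated risk ({risk:.0%}) - route to care team")
```

The value, as in the heart attack example above, is lead time: a flag raised hours or days early gives a clinician room to intervene in a lower acuity setting.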
Earlier interventions, in lower acuity settings (i.e. at home vs. the hospital), will drive better patient outcomes and a lower cost of care.
//
In closing, we are still a long way from ubiquitous adoption of automation and algorithmic decision making in healthcare.
There are big hurdles around data interoperability (e.g. connecting disparate silos of patient data), regulatory support (e.g. algo-driven decisions will need to be repeatable, explainable, and sufficiently tested prior to clearance), and the broader ethics of algorithmic decision making in healthcare.
Still, the vision for how healthcare decision-makers can leverage more robust data to make better, faster decisions and drive better patient outcomes is extremely promising.
Disclaimer:
This content is being made available for educational purposes only and should not be used for any other purpose. The information contained herein does not constitute and should not be construed as an offering of advisory services or an offer to sell or solicitation to buy any securities or related financial instruments in any jurisdiction. Certain information contained herein concerning economic trends and performance is based on or derived from information provided by independent third-party sources. The author believes that the sources from which such information has been obtained are reliable; however, the author cannot guarantee the accuracy of such information and has not independently verified the accuracy or completeness of such information or the assumptions on which such information is based.