In other words, every day brings a new and seemingly meaningful story — something to which the public and leaders react and that influences decision-making about individual behaviors as well as judgments about policy.
New York is a sobering example of recent oscillation in headlines. The April 6 article “Virus Toll in N.Y. Region Shows Signs of Leveling Off” suggested that “steps to control the coronaviruses’ spread might be working,” while a mere twelve hours later an article proclaimed that New York virus deaths had reached a new high. We expect that the number of reported deaths, like any other phenomenon that we measure, will vary, with counts going up and down. Without a method to determine whether the variation in the number of deaths is merely the randomness we would expect to see in any measure, we will struggle to recognize whether things are improving or getting worse and to make informed, data-driven decisions accordingly.
To model the trajectory of new daily reported covid-19 deaths, we have created Shewhart control charts (graphical representations that include the mean, upper, and lower control limits). We designed the charts so that, within a specific geographical region, they can detect when the pandemic is entering a rapid growth phase and when exponential growth is ending. We hope that this analysis will be of use to subject matter experts, leaders of quality improvement, decision makers, care providers, and the general public.
The Shewhart (control) chart method and theory have been applied successfully to some of the world’s most pressing problems in healthcare and society. Given the current realities of a global pandemic, Shewhart’s charts are an invaluable tool in making sense of an increasingly complex data environment because they provide a basis for taking action. Such methods are needed to learn from daily reported covid-19 death counts from countries, states, and other geographies.
Because the control chart method and theory are designed to distinguish between random and non-random variation, they are ideally suited to understand if the daily death counts are stable (i.e., only reflect common causes) or unstable (i.e., include special causes). The control chart method developed here — and outlined in detail below — is an advanced application of the approach that has been automated for ease of use and interpretation, though users can input their own data and create their own charts.
To improve a situation, we look at the variation in a process. A fundamental concept of the science of improvement is that variation in a measure has two potential origins: common causes and special causes. Walter Shewhart developed this theory of variation in 1939. Common causes are inherent in the system over time, affecting everyone in the system and all system outcomes. Special causes are not part of the regular system but arise because of particular circumstances or some “special” source of variation that can be assigned to some identifiable cause. Shewhart developed the “control chart” as a tool to distinguish the random variation of common causes from the non-random variation of special causes.
In these charts, our unit of analysis is the count of new daily reported deaths. Many high-profile news media outlets and statistical models are tracking covid-19 cases. A problem arises when case count data fail to convey meaningful information because of underreporting and because testing and case detection vary wildly across countries and states. The use of death rates for understanding this epidemic is also challenging because of the lack of a stable denominator. The area of opportunity for detecting cases varies across communities because of variation in testing strategies and access to testing as well as the sensitivity and specificity of new diagnostic tests. Another factor confounding death rate data is the extent to which the health care delivery system in a region may be over capacity and/or dealing with staffing and equipment shortages. The use of cumulative deaths in modeling this epidemic is a common approach, although studying the increasing total of deaths in an area is more likely to mask variation (and thus hinder learning) than tracking new daily deaths. Our method considers day-to-day variation in the reported number of deaths to prevent us from over-responding to common cause variation in reporting errors. Our data source for these charts is updated each afternoon from The New York Times GitHub database.
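The relationship between cumulative totals and new daily deaths can be sketched in a few lines. This is only an illustration, assuming the source publishes a running total per day; the function name, the sample numbers, and the choice to clamp downward revisions to zero are hypothetical and are not taken from the authors' tool.

```python
from typing import List


def daily_from_cumulative(cumulative: List[int]) -> List[int]:
    """Convert a running total of reported deaths into new deaths per day."""
    daily = []
    previous = 0
    for total in cumulative:
        # A downward revision of the running total would yield a negative
        # difference; clamping it to zero is an assumption of this sketch.
        daily.append(max(total - previous, 0))
        previous = total
    return daily


# Made-up running totals for eight consecutive days (illustrative only).
cumulative_deaths = [0, 1, 1, 3, 6, 6, 10, 17]
print(daily_from_cumulative(cumulative_deaths))  # [0, 1, 0, 2, 3, 0, 4, 7]
```

The cumulative series can only rise or stay flat, whereas the daily series shows the day-to-day ups and downs that a control chart is meant to interpret.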
Discrepancies in reported deaths are common due to under-reporting of community covid-19 deaths and potential over-reporting in hospitals owing to codes used for insurance billing and reimbursement. An April 10 article in The New York Times describes how “paramedics are not testing those they pronounce dead for the virus, so it is almost impossible to say how many of the people were infected with it.” In spite of these and other limitations associated with data quality in the reporting of covid-19 deaths, new daily reported deaths data provide useful information to decision-makers navigating a complex pandemic landscape.
A focus on new daily reported deaths is among the distinctive contributions of our approach. Shewhart control charts offer a different way to display and learn from this type of data. Our method relies on minimal assumptions (primarily, the expected logistic increase of an epidemic) and builds on the initial data observed in an area. Our predictions consist of extending the center line (the mean) and the upper and lower control limits, and they are based on the assumption of a stable system over time. We don’t predict the end of exponential growth but can determine when that point has been reached.
The chart below provides a U.S. “system view” with a summary over time of the number of states that are in each phase of exponential growth, reinforcing the point that on any given day we are dealing with different dynamics across the nation as well as changes happening within each state.
Shewhart Control Chart Methodology
This application of the Shewhart chart method offers at least three important contributions:
- Minimizing the risk and psychological toll that can occur when people react to every data point as if it is a meaningful signal of things getting better or worse (when they may not be changing);
- Identifying the day at which the number of daily deaths in an area (county, state, country) has begun to grow exponentially (a signal things are getting worse); and
- Identifying the day at which the number of daily deaths in an area has peaked and is entering the flat part of the epidemiologic curve.
Numerous methods are now available to model growth of the epidemic and to predict the peak number of deaths in specific areas, but we are unaware of any models that detect the first day an area enters the exponential growth phase. Our approach is different in that we are not predicting a peak or growth phase based on the last data point, but rather identifying when the peak or growth phase has been reached based on all the data.
We use a C-chart for the initial pre-growth phase. After at least eight deaths have been reported and a special cause signal (a point above the upper limit or a shift of eight successive points above or below the center line) has emerged, we switch to a log-regression I-chart with a slanted center line to model the growth phase. During this growth stage, we continue plotting the deaths and update the calculation of the center line and limits each day (extending the center line and limits into the future). If a location is still in its growth phase after 20 days, we “freeze” calculations of the center line and limits. At this time, we also extend the “frozen” limits (based on regression analysis of the 20 data points) into the future as a guide to interpret subsequent daily reported deaths plotted on the chart.
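The pre-growth step can be sketched as follows. The limits use the textbook c-chart formula (mean plus or minus three times the square root of the mean), and the signal rules are the two named above. Several details are assumptions of this sketch rather than the authors' documented procedure: the limits are computed from a caller-supplied baseline, “at least eight deaths” is read as a cumulative total, and the names `c_chart_limits`, `growth_signal_day`, `run_length`, and `min_total` are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def c_chart_limits(baseline: List[int]) -> Tuple[float, float, float]:
    """Return (center, lower, upper) for a c-chart: mean +/- 3 * sqrt(mean)."""
    center = sum(baseline) / len(baseline)
    spread = 3 * math.sqrt(center)
    # Counts cannot be negative, so the lower limit is floored at zero.
    return center, max(center - spread, 0.0), center + spread


def growth_signal_day(counts: List[int], center: float, upper: float,
                      run_length: int = 8, min_total: int = 8) -> Optional[int]:
    """Index of the first day showing a special cause signal once at least
    `min_total` deaths have accumulated, or None if there is no such day."""
    total = 0
    run_side = 0    # +1 = run above the center line, -1 = below, 0 = none
    run_count = 0
    for day, count in enumerate(counts):
        total += count
        side = (count > center) - (count < center)
        if side != 0 and side == run_side:
            run_count += 1
        else:
            run_side = side
            run_count = 1 if side != 0 else 0
        signal = count > upper or run_count >= run_length
        if signal and total >= min_total:
            return day
    return None


# Made-up daily counts: eight quiet baseline days, then a rise.
baseline = [0, 1, 0, 2, 1, 0, 1, 1]
center, lower, upper = c_chart_limits(baseline)
series = baseline + [2, 3, 5, 8]
print(round(center, 2), round(upper, 2))
print(growth_signal_day(series, center, upper))
```

With these illustrative numbers the upper limit is about 3.35, so the first signal is the day with 5 deaths (index 10), which is where the switch to the slanted I-chart would occur in this sketch.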
When a special cause signal below the lower limit occurs before the growth phase has lasted for 20 days, we freeze the limits and plot the next day’s observation. If this next data point also signals special cause variation, we conclude that the growth phase has ended. We then continue to plot subsequent counts of reported deaths on the chart but do not include these values in the regression used to calculate the center line and limits, which are intended to detect signals related to the trajectory of the growth phase.
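A minimal sketch of the growth-phase chart and the end-of-growth check follows. The article does not spell out how the limits are calculated, so this sketch assumes a least-squares line fitted to the natural log of daily deaths, with limits set at 2.66 times the average moving range of the residuals (the usual individuals-chart constant) and then transformed back to the count scale. It also assumes every count is positive, and it uses ten made-up points for brevity rather than the 20 described above. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Fit = Tuple[float, float, float]  # (intercept, slope, limit half-width), log scale


def fit_log_line(deaths: List[int]) -> Fit:
    """Least-squares fit of ln(deaths) against the day index."""
    n = len(deaths)
    xs = list(range(n))
    ys = [math.log(d) for d in deaths]  # assumes every count is positive
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    moving_ranges = [abs(b - a) for a, b in zip(residuals, residuals[1:])]
    width = 2.66 * sum(moving_ranges) / len(moving_ranges)
    return intercept, slope, width


def limits_on_day(day: int, fit: Fit) -> Tuple[float, float, float]:
    """Return (lower, center, upper) on the count scale for a given day,
    extending the frozen line beyond the fitted data if needed."""
    intercept, slope, width = fit
    center_log = intercept + slope * day
    return (math.exp(center_log - width),
            math.exp(center_log),
            math.exp(center_log + width))


def growth_has_ended(fit: Fit, start_day: int, new_counts: List[int]) -> bool:
    """True when two successive new points fall below the frozen lower limit."""
    below = [count < limits_on_day(start_day + i, fit)[0]
             for i, count in enumerate(new_counts)]
    return any(a and b for a, b in zip(below, below[1:]))


# Made-up growth-phase counts for days 0..9 (illustrative only).
growth_phase = [10, 13, 15, 20, 24, 31, 38, 48, 60, 74]
frozen_fit = fit_log_line(growth_phase)
print(limits_on_day(10, frozen_fit))
print(growth_has_ended(frozen_fit, 10, [40, 35]))
```

Keeping the fit frozen while testing new points mirrors the description above: later observations are plotted against the extended limits but are not fed back into the regression.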
The Shewhart control charts below illustrate what a chart will look like on both a linear and logarithmic scale for a location (such as Italy) that has moved past the exponential growth phase.
Lloyd P. Provost
Improvement Advisor with Associates in Process Improvement who has worked worldwide to apply the science of improvement.
Rocco J. Perla
Co-Founder of The Health Initiative, a campaign catalyzing a nationwide effort to spur a new conversation about – and increased investments in – health.
Shannon Provost PhD
Lecturer at Department of Information, Risk, and Operations Management, McCombs School of Business, The University of Texas at Austin
Also available: Watch Rocco Perla and Lloyd Provost explain how we can use improvement science and data to better understand and manage the COVID19 crisis - ISQua COVID19 Webinar Series.