This dashboard presents data on two populations within the N3C Data Enclave: all patients, and COVID+ patients only.
A COVID-positive patient is defined as any patient having one of the following within their EHR:
- Laboratory confirmed positive COVID-19 PCR or Antigen test
- Laboratory confirmed positive COVID-19 Antibody test
- Medical visit in which the ICD-10 code for COVID-19 (U07.1) was recorded
- Condition diagnosis: the patient has no record of a positive PCR/Antigen or Antibody test within their EHR but was diagnosed with COVID based on the symptoms they displayed.
A Long COVID patient is defined as any patient having the ICD-10 code for PASC (U09.9) within their EHR.
- Note: The ICD-10 code for PASC (U09.9) was not created until October 1, 2021, so any data using this code is limited to after that date. Therefore, this data is not a full representation of patients diagnosed with Long COVID.
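The two cohort definitions above can be sketched as simple predicates. This is a minimal illustration, not the Enclave's actual logic: the function names and the inputs (two boolean lab flags and a set of ICD-10 codes per patient) are assumptions made for the example.

```python
COVID_DX = "U07.1"  # ICD-10 code for COVID-19
PASC_DX = "U09.9"   # ICD-10 code for PASC (Long COVID)


def is_covid_positive(positive_pcr_or_antigen, positive_antibody, icd10_codes):
    """Return True if the patient meets any one of the listed criteria.

    The arguments are hypothetical per-patient summaries, not N3C fields.
    """
    return (
        positive_pcr_or_antigen
        or positive_antibody
        or COVID_DX in icd10_codes
    )


def is_long_covid(icd10_codes):
    """Return True if the PASC code appears anywhere in the patient's codes."""
    return PASC_DX in icd10_codes
```

A patient with only a U07.1 diagnosis and no positive test still counts as COVID-positive under this sketch, which matches the "any one of the following" wording.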
Severity of COVID-19 is a calculation based on multiple events recorded in a patient's EHR during their medical visit. The severity score for each patient may be inaccurate due to missing information within the EHR. Patients will only be graded on Severity if they have a laboratory-confirmed positive PCR or Antigen test. Below are the definitions of each Severity Category.
- Mild - The patient has no record of Emergency Room visits or hospitalization for COVID-19
- ED Visit (not admitted) - The patient had an Emergency Room (ER) visit for COVID-19, but we have no record of hospitalization (Inpatient) for COVID-19
- Moderate Hospitalized - The patient was hospitalized (Inpatient visit) for COVID-19 AND did not receive ECMO OR Invasive Ventilation
- Mortality - The patient’s records show a date of death
- Severe Ventilation/ECMO/AKI - The patient was hospitalized for COVID-19 AND received extracorporeal membrane oxygenation (ECMO) OR received Invasive Ventilation
- Unavailable - Patients who did not have a lab-confirmed positive PCR or Antigen COVID test. This includes patients who were diagnosed only based on the symptoms they displayed or patients who do not have any recorded COVID-19 diagnosis within the Enclave.
Visits and hospitalizations used in the severity calculations are classified into two types, strong or weak, according to the evidence that the event was directly related to COVID-19. For these definitions, both types have been considered and combined.
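The severity categories can be read as a decision rule. The sketch below is one possible reading, not the dashboard's implementation: the `CovidEvents` fields are hypothetical names, and the precedence among categories (most severe outcome wins, with Mortality first) is an assumption, since the definitions above do not state an order.

```python
from dataclasses import dataclass


@dataclass
class CovidEvents:
    # Hypothetical per-patient flags; not actual N3C field names.
    lab_confirmed_pcr_or_antigen: bool
    ed_visit: bool = False               # strong or weak evidence, combined
    hospitalized: bool = False           # strong or weak evidence, combined
    ecmo_or_invasive_vent: bool = False
    has_death_date: bool = False


def severity_category(e: CovidEvents) -> str:
    # Patients without a lab-confirmed PCR/Antigen test are not graded.
    if not e.lab_confirmed_pcr_or_antigen:
        return "Unavailable"
    # Assumed precedence: the most severe recorded outcome wins.
    if e.has_death_date:
        return "Mortality"
    if e.hospitalized and e.ecmo_or_invasive_vent:
        return "Severe Ventilation/ECMO/AKI"
    if e.hospitalized:
        return "Moderate Hospitalized"
    if e.ed_visit:
        return "ED Visit (not admitted)"
    return "Mild"
```

For example, `severity_category(CovidEvents(True, ed_visit=True))` falls through the death and hospitalization checks and lands in the ED Visit category.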
Comorbidities for each patient are linked to EHR medical visits coded for any of the 17 different conditions defined by the Charlson Comorbidity Index. A patient may have undiagnosed conditions that would not be recorded in their EHR and, therefore, would not be represented here. Additionally, a patient may have a CCI condition for which they have not required a medical visit, which would exclude them from representation.
The age of each patient is calculated as of the date of the last data update.
- If an age exceeds 89, it will be obscured using a date shift of +/- 10 years.
- As of 7/15/22, July 1st is used as a placeholder date of birth when there are 0s or nulls in the OMOP person table to avoid biasing towards older age.
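The placeholder rule and the age calculation can be sketched as follows. The function names are hypothetical, and the code assumes the missing values are the month and day of birth (with a known year) and that either one being 0 or null triggers the July 1 placeholder; the notes above do not spell out these details. The obscuring of ages above 89 is not modeled here.

```python
from datetime import date
from typing import Optional


def placeholder_birth_date(year: int, month: Optional[int], day: Optional[int]) -> date:
    """Build a date of birth, falling back to July 1 when month or day is 0/null."""
    # Assumption: a missing month OR day replaces both with the placeholder.
    if not month or not day:
        return date(year, 7, 1)
    return date(year, month, day)


def age_on(birth: date, as_of: date) -> int:
    """Age in whole years on the given reference date (e.g. the last data update)."""
    years = as_of.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (as_of.month, as_of.day) < (birth.month, birth.day):
        years -= 1
    return years
```

A mid-year placeholder keeps the expected error in age symmetric, whereas defaulting to January 1 would make every affected patient appear as old as possible for their birth year.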
The race and ethnicity of patients are adjusted to standard categories based on self-reported fields within the EHR.
- Note that the EHRs do not always contain all of the information on race and ethnicity, and patients may not self-report a response that can be fully mapped into one of the standard categories. The patient would fall into the "Unknown" category in these cases.
The sex of patients is determined based on self-reported fields within the EHR.
- Note that the EHRs do not always contain all of the information on sex; if a patient's EHR does not contain data on their sex, they will fall into the "Unknown" category.
- If a patient reports any response other than "Female" or "Male", they are mapped into the "Other" category.
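The sex mapping above amounts to a three-way rule. A minimal sketch, assuming a hypothetical `sex_category` helper, a free-text input value, and case-insensitive matching (the notes do not say how the raw values are normalized):

```python
from typing import Optional


def sex_category(reported: Optional[str]) -> str:
    """Map a self-reported sex value to Female, Male, Other, or Unknown."""
    # No data in the EHR: the patient falls into "Unknown".
    if reported is None or not reported.strip():
        return "Unknown"
    # Assumption: matching ignores case and surrounding whitespace.
    value = reported.strip().capitalize()
    return value if value in ("Female", "Male") else "Other"
```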
Vaccination data in the N3C is sparse and represents only EHR-recorded vaccination events at our data partners. The absence of a vaccination record does not mean that a patient is unvaccinated. If a patient were vaccinated at their local pharmacy, doctor's office, or state/federal vaccination site, they would not be represented because these systems do not automatically link to a patient's EHR.
Given the known national vaccination rates, it is likely that many, if not most, vaccination events are occurring outside of the academic health systems submitting data. Therefore, patients shown here as "Unknown" may be vaccinated; however, we do not have the records to verify this.
A patient counted as vaccinated here is not necessarily fully vaccinated. We consider a vaccinated patient to have at least one dose of the Pfizer, Moderna, or Johnson & Johnson COVID-19 vaccine. Given that Pfizer and Moderna require two vaccine doses to be considered fully vaccinated, patients shown here may be partially vaccinated. This same assumption applies to booster shots, as we do not consider shots beyond the first one recorded within the patient's EHR.
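The "at least one dose" rule can be sketched as below. The function name and the input (a list of manufacturer names, one per recorded dose) are assumptions for illustration; the key point is that the absence of a record maps to "Unknown", never to "Unvaccinated".

```python
COUNTED_VACCINES = {"pfizer", "moderna", "johnson & johnson"}


def vaccination_status(dose_manufacturers):
    """Return "Vaccinated" if any recorded dose is from a counted manufacturer.

    dose_manufacturers is a hypothetical list of manufacturer names taken
    from the patient's EHR; an empty list means no vaccination record.
    """
    has_dose = any(m.strip().lower() in COUNTED_VACCINES for m in dose_manufacturers)
    # No record does not imply unvaccinated, so the fallback is "Unknown".
    return "Vaccinated" if has_dose else "Unknown"
```

A patient with one Pfizer dose and a patient with two doses plus a booster both return "Vaccinated", which is why the counts cannot distinguish partial from full vaccination.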
Mortality is defined as:
- Any patient with a date of death in the Enclave
- (or) Any patient from a mortality-linked PPRL site who exists in one of the external sources:
  - Government Mortality: Government data sourced from death certificates and person-reporting.
  - ObituaryData.com: Obituary data sourced from funeral homes, newspapers, and other online obituary sources, specifically from obituarydata.com (a private obituary aggregator).
  - Private Obituary: Obituary data from funeral homes, newspapers, and other online obituary sources, obtained through private sources other than obituarydata.com.
For external mortality sources:
- Several mortality sources do not know the exact date of death for all reported deaths. If they know only the month of death, they will provide the date of death as the first day of that month, or if they know only the year of death, they will provide the date of death as the first day of that year. This means that an increased number of deaths will appear at those intervals.
- Each source has a distinct lag associated with its reported deaths. However, on average, 90% of the deaths that will eventually be reported appear in the data within 28 days of the occurrence.
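The month-only and year-only rule for external sources explains why deaths cluster on the first day of a month or year. A minimal sketch, with a hypothetical function name and `None` standing in for an unknown component:

```python
from datetime import date
from typing import Optional


def reported_death_date(year: int, month: Optional[int] = None, day: Optional[int] = None) -> date:
    """Return the date a source would report given partial knowledge."""
    if month is None:
        # Only the year is known: first day of that year.
        return date(year, 1, 1)
    if day is None:
        # Only the month is known: first day of that month.
        return date(year, month, 1)
    return date(year, month, day)
```

Under this rule a death known only to have occurred in March 2021 is indistinguishable from one that actually occurred on March 1, 2021, so counts on those dates are inflated.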
Mortality data should not be considered representative of all deaths in the United States.
Note: This metric is distinct from the Mortality category associated with Severity, as it does not limit deaths to only those suspected to be caused by COVID-19.
General Enclave Limitations
- “Sicker” patients will likely be overrepresented within the N3C Data Enclave, as sicker patients will more often seek out and receive care at clinical centers.
- The N3C may have multiple contributors to data “missingness”. Clinical facts and events that occur in the real world may not be captured for reasons including:
  - The event was recorded at a clinical site that does not contribute data
  - Data is not yet linked across sites
  - Medical records are inherently incomplete
- Some of the external datasets that have been used for analysis cannot be fully mapped due to issues such as missing measurement units.
- All dates within the Enclave have been shifted by between -3 and 45 days to ensure that reidentification is not possible.
- N3C data may not be representative of the entire US population
- N3C does NOT have a representative sample of any state, as data is contributed from only a few providers in each region (a region includes multiple states).
- Cell sizes smaller than 20 people have been suppressed
- For COVID+ patients: A patient is only counted once in this data, even if they have multiple positive tests over time. Except in instances where dashboards focus on reinfection, only dates of first infection are utilized.
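Two of the limitations above, counting each COVID+ patient once by first infection and suppressing small cells, can be sketched together. The function names and input shapes are hypothetical, and representing a suppressed cell as `None` is an assumption; only the threshold of 20 comes from the text.

```python
from datetime import date

SUPPRESSION_THRESHOLD = 20  # cells smaller than this are suppressed


def first_infection_dates(positive_tests):
    """Map each patient id to the date of their earliest positive test.

    positive_tests is an iterable of (patient_id, test_date) pairs, so a
    patient with several positive tests still appears only once.
    """
    first = {}
    for patient_id, test_date in positive_tests:
        if patient_id not in first or test_date < first[patient_id]:
            first[patient_id] = test_date
    return first


def suppress_small_cells(counts):
    """Replace any count below the threshold with None (suppressed)."""
    return {
        group: (n if n >= SUPPRESSION_THRESHOLD else None)
        for group, n in counts.items()
    }
```

Note that the comparison is `>=`, so a cell of exactly 20 patients is shown while a cell of 19 is hidden.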