MIMIC-II vs MIMIC-III
MIMIC-III is an extension of MIMIC-II: it incorporates the data contained in MIMIC-II (collected between 2001 - 2008) and augments it with newly collected data between 2008 - 2012. In addition, many data elements have been regenerated from the raw data in a more robust manner to improve the quality of the underlying data.
One of the challenges of adding new data resulted from a change in data management software at the Beth Israel Deaconess Medical Center. The original Philips CareVue system (which archived data from 2001 - 2008) was replaced with the new Metavision data management system (which continues to be used to the present). This page aims to facilitate the transition for researchers familiar with MIMIC-II who would like to continue their research with the updated MIMIC-III.
In MIMIC-II there were multiple tables containing the same column name,
ITEMID, but referring to different concepts. In attempt to alleviate confusion, we have merged all these tables into a single table, as follows:
|Old table||Merged into|
D_CHARTITEMS, D_IOITEMS, and D_MEDITEMS were all sourced from the same data source: the ICU database (specfically Philips CareVue). In contrast, iMDSoft Metavision only has a single table to define most
ITEMID concepts. In order to simplify the schema and coalesce the databases, it was decided to merge together all the
ITEMID dictionary tables into a single table, except D_LABITEMS (see below).
ITEMID ranges in the original D_ tables overlapped, they were offset by constant values. The following table maps the old
ITEMID value ranges to the new ranges:
|MIMIC-II source||Old range||New range||Offset|
|D_CHARTITEMS||1 - 20009||1-20009||None|
|D_MEDITEMS||1 - 405||30001 - 30405||+ 30000|
|D_IOITEMS||1 - 6807||40001 - 46807||+ 40000|
For example, the charted item “Heart Rate” had an
ITEMID of 211 in MIMIC-II (in D_CHARTITEMS). The new
ITEMID for this is, again, 211, as there was no offset for D_CHARTITEMS. This means that any ITEMIDs used in the D_CHARTITEMS table will be directly portable to MIMIC-III (caveat: you will still need to extract additional
ITEMID for the new metavision patients).
Conversely, if we look at the
ITEMID = 51 in MIMIC-II D_MEDITEMS (vasopressin), we can find the same concept in MIMIC-III as 51 + 30000 = 30051.
D_CODEDITEMS contained many concepts - most of these have been unchanged, simply moved to the new table D_ITEMS. The following table provides details.
||Concept||Where is it in MIMIC-III?|
|60001 - 61018||DRG codes||Deleted - DRGs are stored with descriptions in DRGCODES|
|70001 - 70093||Microbiology specimens||D_ITEMS, these
|80001 - 80312||Microbiology organisms||D_ITEMS, these
|90001 - 90031||Microbiology antibacterium||D_ITEMS, these
|100001 - 101885||ICD-9 procedure codes||Deleted - ICD9 codes are stored with descriptions|
Lab ITEMIDs - not merged into D_ITEMS
ITEMID for laboratory measurements in the D_LABITEMS and LABEVENTS tables in MIMIC-II do not match the
ITEMID for laboratory measurements in MIMIC-III. We have provided a mapping table to facilitate the updating of queries which use this table. The mapping can be found in the MIMIC Code Repository:
Much of the data has been mapped to LOINC codes, which provide a standard ontology for recorded lab values. Careful inspection shows that the LOINC code for an
ITEMID in MIMIC-III is, in rare occasions, different from the LOINC code for the same concept in MIMIC-II. This is usually attributable to the laboratory assigning a new LOINC code, which is done for many reasons, including changing the reagents of a laboratory test, changing the technique used to acquire the result or because the previous LOINC code was discontinued.
While almost all concepts are unified into a single table (D_ITEMS), the concepts within that table are not unified. As patients admitted between 2001-2008 used a different ICU system to patients admitted between 2008-2012, the
ITEMID values for the same concept differ in these two time periods. That is, for heart rates between 2001-2008, the
ITEMID value 211 is appropriate. For heart rates for patients admitted between 2008 and onward, the
ITEMID 220045 is appropriate. Most concepts will require the use of multiple
ITEMID values in order to completely extract the data.
ADMISSIONS is now sourced from the hospital database, rather than the ICU database. Changes include:
- Admission and discharge dates now have the time component
- Discharge location is now available
- Diagnosis on admission is now available
- ED registration and exit time are now available
CENSUSEVENTS replaced by TRANSFERS
The CENSUSEVENTS table was used in MIMIC-II to track patient hospital admissions. The original source of this table was the ICU database which recorded when patients were admitted and discharged from the ICU. In MIMIC-III, this table has been removed and replaced with the TRANSFERS table. The TRANSFERS table is sourced from the hospital admission, discharge, transfer (ADT) data. This has a number of advantages:
- The ADT data tracks a patient throughout the entire hospital stay, providing greater granularity and easier tracking of a patient’s hospital course
- The ADT data provides information regarding ward location
- The ADT data has fewer erroneous admissions: frequently the ICU database contained erroneous entries corresponding to accidental admission/discharges
- The ADT data is available for all patients in the ICU database
DEMOGRAPHIC_DETAIL merged into ADMISSIONS
The DEMOGRAPHIC_DETAIL provided extra static information regarding a patient which rarely changed throughout an admission (e.g. age, ethnicity). This data was originally sourced from the ICU database. The new ADMISSIONS table is sourced entirely from the hospital database, and contained the same set of demographics. Instead of creating a new table for these demographics, it was decided to maintain the demographics within the ADMISSIONS table in the same format as it exists in the raw data.
DRGEVENTS renamed DRGCODES
The DRGEVENTS table has been renamed DRGCODES. The table still corresponds to diagnostic related groups, however the clarity of the data has been improved. First, a column corresponding to the DRG system has been provided. Second, if available, additional information regarding the DRG severity and mortality risk have been added. Finally, the DRG code has been mapped to a description based upon the underlying version. Due to frequent updates in the DRG coding system, it is non-trivial to map DRG codes to a description. To ease the use of this data, each DRG code has been mapped to a free text description which has been added to the table.
ICD9 renamed DIAGNOSES_ICD
To clarify the content of the table, the ICD9 table has been renamed to DIAGNOSES_ICD. The table contains diagnoses sourced from the hospital databased codified by ICD, usually ICD-9.
IOEVENTS, ADDITIVES, DELIVERIES and MEDEVENTS
Data in the IOEVENTS and MEDEVENTS tables is now contained in the OUTPUTEVENTS, INPUTEVENTS_CV and INPUTEVENTS_MV tables. As all the medications in the MEDEVENTS table were continuous infusions, they were all associated with an entry in IOEVENTS. MEDEVENTS would specify the drug rate, while IOEVENTS would specify the volume given. These tables have been consolidated to ease querying for drug deliveries.
POE_MED and POE_ORDER merged into PRESCRIPTIONS
The term POE, or provider order entry, references a hospital specific database which users may not be familiar with. To clarify the content of these tables, they have been merged into a single table named PRESCRIPTIONS.
SUBJECT_ID between MIMIC-II v2.6 and MIMIC-III have been kept consistent, for example,
SUBJECT_ID 788 is corresponds to the same patient in MIMIC-II v2.6 as it does in MIMIC-III.
HADM_ID have been regenerated in MIMIC-III.
HADM_ID in MIMIC-II v2.6 will not match any
HADM_ID in MIMIC-III. The newly generated
HADM_ID range from 100,000 - 199,999 to help differentiate these IDs from others.
ICUSTAY_ID have been regenerated in MIMIC-III.
ICUSTAY_ID in MIMIC-II v2.6 will not match any
ICUSTAY_ID in MIMIC-III. Note that the newly generated
ICUSTAY_ID range is between 200,000 - 299,999 to prevent confusion with other IDs.
The CALLOUT table contains information about ICU discharge planning and execution. Each patient is “called out” of the ICU: which involves alerting hospital administrative staff that a bed, usually on the floor, is required for a patient currently in the ICU. The call out event also includes any precautions that the patient may require (such as susceptibility to MRSA or respiratory support). The table provides information both on when the patient was deemed ready for discharge and when the patient actually left the ICU.
ICD-9 codes for procedures are now available in the PROCEDURES_ICD table.
INPUTEVENTS_CV and INPUTEVENTS_MV
Two different monitoring systems were operating in the hospital over the data collection period. The systems - Metavision and CareVue - recorded data in very different ways. We therefore made the decision not to merge the data, and instead provided two separate tables (INPUTEVENTS_CV and INPUTEVENTS_MV).
Data about outputs were recorded in a consistent fashion for the Metavision and CareVue databases. Therefore, we merged this data into one table: OUTPUTEVENTS.
There are many tables in MIMIC-II which are no longer present in MIMIC-III. In most cases, these tables were generated from the raw data for user convenience. We have transitioned from the approach of creating flat files of these tables to providing the source code necessary to regenerate them. This has two advantages: first it is much more efficient in terms of data transfer, and second it clarifies that these data are not “raw” in that they are not acquired directly from the databases but rather synthesized views of this data.
This table is frequently used to define the comorbid status for patients. Code for generating this table will be provided in the MIMIC Code Repository. The comorbidities will be defined using ICD-9 codes and DRG codes as proposed by Elixhauser et al.
These tables were specific to the older CareVue database. As these tables are not present in the Metavision data, and the same information has been acquired from the hospital database (see the ADMISSIONS and D_PATIENTS tables), these tables have been removed.
The care unit identifier, CUID, has been removed from the various tables as it was unavailable in the Metavision ICU database. Care unit can be determined at the patient level, rather than the observation level, using the ICUSTAYEVENTS and TRANSFERS tables.
This was a generated table to facilitate the interpretation of various coding systems, including microbiology, DRG, etc. The database has been restructured to have explicit definitions for these codes where appropriate.
This table is no longer needed as
ITEMID concepts have been consolidated in the D_ITEMS table.
A script to generate ICUSTAY_DETAILS will be provided in the MIMIC Code Repository shortly.
This table is no longer needed as
ITEMID concepts have been consolidated in the D_ITEMS table.
D_WAVEFORM_SIGNALS, WAVEFORM_METADATA, WAVEFORM_SEGMENTS, WAVEFORM_SEG_SIG, WAVEFORM_SIGNALS, WAVEFORM_TRENDS, WAVEFORM_TREND_SIGNALS have been removed. The mapping to the waveform data is no longer provided within the relative database for clarity.