MIT

FAQs

Q: What data does the MIMIC II database contain?

A: The MIMIC II database contains clinical data for 33,361 ICU stays of 26,655 patients, collected over 6 years. The information includes data pertaining to:

  • Patient events such as movement between wards.
  • Patient diagnoses using ICD-9 codes.
  • Data from bedside monitors such as ECG waveforms, arterial blood pressure and oxygen saturation levels.
  • Clinical data such as laboratory test results, medications, patient demographics, nursing progress notes, discharge summaries, etc.

Q: How can I learn how to use the database?

A: The database is queried with SQL, a widely used query language. Many free resources on relational databases and SQL are available on the web. We recommend this reasonably short online book, particularly its first four chapters: http://philip.greenspun.com/sql/.

Q: How was the SAPS-I score calculated?

A: The SAPS-I score was calculated using the method outlined in this publication: Le Gall J-R, Loirat P, et al. A simplified acute physiology score for ICU patients. Crit Care Med. 1984; 12: 975-977.

Q: Why do I keep getting 'certificate' errors?

A: All of our secure web pages are signed using SSL certificates. Install our certificate authority to remove these warnings.

Q: What if I want to perform a study that requires knowing in what year the patients were admitted?

A: Since all dates have been shifted by 10±3.12 years, it is impossible to perform any study that requires knowing which patient was admitted before another patient. For some research studies, however, it is essential to know the approximate date on which care was administered. We make this information available to researchers who require it, subject to an extended data use agreement.

Q: Is this all the data concerning a patient?

A: No. Not all data falls within a hospital or ICU stay. Patients may be transferred to other hospitals, and they are likely to have a medical history before and after the six-year period of our data collection.

Q: Did you remove any patients?

A: Yes. We removed all VIPs. We also removed all patients who turned 90 during a stay. Patients in the test set for the PhysioNet Computers in Cardiology Competition 2009 were removed while the challenge was in progress; they have since been restored.

Q: What is the difference between charttime and realtime?

A: When two times are recorded for the same observation, the earlier time generally indicates when the observation was made, and the later time indicates when information about the event was entered into the electronic medical record. For example, a patient might have had blood drawn at 7:45 for a lab test that was not completed and logged into the record until 10:05. The earlier time is important in understanding the state of the patient at that time; the later time is significant since it may represent the first time at which the observation might have been able to influence the patient's care.

When two times are recorded for a medical intervention, the earlier time generally indicates when the physician's order was given, and the later time indicates when it was carried out. For example, at 8:00, a physician may order a medication to be administered to the patient, but the time needed to prepare the medication may mean that the patient doesn't begin to receive it until 8:20. In this case, the earlier time marks the time at which the decision was made to intervene, and the later time indicates when the effects of the intervention might begin to be observable.

Unfortunately, the nomenclature for these various types of timestamps is not as consistent as one might hope. In general, the "charttime" is the time when an observation was recorded (10:05 in the first example) or an order was given (8:00 in the second example), and the "realtime" is the time when an observation was made (7:45 in the first example) or an order was carried out (8:20 in the second example). Thus, "charttime" can precede or follow "realtime", depending on whether the event in question is an observation or an intervention.

A further complication is that some of these timestamps are generated automatically while others are entered manually, and there is no guarantee that all of the clocks used are precisely synchronized, so that if two events occur within a few minutes, their timestamps do not unambiguously indicate which event occurred first.

Q: Does the database contain codes for the procedures performed on a patient?

A: No, unfortunately the database does not contain codes that indicate the procedures performed on a patient. However, the nursing notes, discharge summaries and charted parameters can be used to infer this information.

Q: How do I export entire tables from the database?

A: If you wish to export complete tables (e.g. SELECT * FROM medevents) and you use a Linux/Unix operating system (or cygwin under Microsoft Windows), it is best to use the flat files available on the main PhysioNet website. Point your web browser to:

http://physionet.org/AUTHORIZED-USERS-ONLY/[your email address]

(replacing "[your email address]" with your email address). You will be prompted to enter your PhysioNet password. Please note that this is not the same as your MIMIC password; it is the password you received when you first requested access to the restricted database. Download all the batches of patient data (n[1-6].tar.gz.gpg and t[1-6].tar.gz.gpg), then decrypt and unzip the files.
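As a rough sketch of the decrypt-and-unzip step, assuming GnuPG (gpg) and tar are installed and the twelve batch files named above sit in one directory: the function names batch_files and decrypt_and_unpack are illustrative only and not part of any PhysioNet tooling, and gpg will ask for whatever passphrase or key your files were protected with.

```shell
# List the twelve batch archive names given above: n1..n6 and t1..t6.
batch_files() {
    for prefix in n t; do
        for i in 1 2 3 4 5 6; do
            printf '%s%s.tar.gz.gpg\n' "$prefix" "$i"
        done
    done
}

# Decrypt each archive found in directory $1, then unpack it there.
# Hypothetical helper; it stops at the first archive that fails.
decrypt_and_unpack() {
    dir=$1
    for f in $(batch_files); do
        gpg --output "$dir/${f%.gpg}" --decrypt "$dir/$f" || return 1
        tar -xzf "$dir/${f%.gpg}" -C "$dir" || return 1
    done
}

# Usage (from the download directory): decrypt_and_unpack .
```

Running the two commands by hand for each file works just as well; the loop only saves typing.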

The flat files are split into directories named with the subject ID of the patients, for example:

Batch1/
00001/MEDEVENTS-00001.txt
00002/MEDEVENTS-00002.txt
...
Batch2/
05033/MEDEVENTS-05033.txt
...

To build a single file for the MEDEVENTS table containing the data for all subjects, concatenate the per-subject files using a Bash loop such as:

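# Append each per-subject file, minus its header line, to one combined file.
for file in $(find . -type f -name "MEDEVENTS-*.txt" ! -name "MEDEVENTS-full.txt"); do
    tail -n +2 "$file" >> MEDEVENTS-full.txt
done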
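The loop above discards the header line of every file, so the combined file has no column names. If a header row is wanted, one possible variant keeps the header from the first file only. This is a sketch that assumes every per-subject file starts with exactly one header line (which the tail -n +2 above implies); concat_table is an illustrative name, not an existing tool.

```shell
# concat_table TABLE ROOT OUT
# Combine all TABLE-*.txt files under ROOT into OUT, keeping one header line.
# Assumption: each per-subject file begins with exactly one header line.
concat_table() {
    table=$1
    root=$2
    out=$3
    : > "$out"          # start with an empty output file
    first=1
    # Exclude the output file itself in case it lives under ROOT.
    find "$root" -type f -name "${table}-*.txt" ! -name "$(basename "$out")" | sort |
    while IFS= read -r file; do
        if [ "$first" -eq 1 ]; then
            head -n 1 "$file" >> "$out"   # header, taken once
            first=0
        fi
        tail -n +2 "$file" >> "$out"      # data rows only
    done
}

# Usage: concat_table MEDEVENTS . MEDEVENTS-full.txt
```

Sorting the file list makes the row order reproducible between runs, which find alone does not guarantee.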

The fields in these files are separated by ' |||| ' (a 'tab' character, followed by 4 'pipe' symbols, followed by another 'tab'). The files can be converted to CSV using, e.g.:

sed 's/\t||||\t/,/g' MEDEVENTS-full.txt > MEDEVENTS-full.csv

You can repeat this process for all the tables in the database.
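The sed substitution above leaves fields unquoted, so any field that itself contains a comma or a double quote (a free-text value, for instance) would break the resulting CSV. A hedged alternative using awk quotes such fields in the usual CSV manner. It assumes one record per line and an awk that accepts a regular expression as field separator, as POSIX awk does; to_csv is an illustrative name.

```shell
# to_csv FILE
# Print FILE as CSV, splitting on tab + four pipes + tab and quoting
# any field that contains a comma or a double quote.
# Assumption: records never span more than one line.
to_csv() {
    awk '
        BEGIN { FS = "\t[|][|][|][|]\t"; OFS = "," }
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /[",]/) {
                    gsub(/"/, "\"\"", $i)     # double any embedded quotes
                    $i = "\"" $i "\""         # then wrap the field in quotes
                }
            }
            $1 = $1                           # rebuild the record with OFS
            print
        }
    ' "$1"
}

# Usage: to_csv MEDEVENTS-full.txt > MEDEVENTS-full.csv
```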

Q: How can I get access to the QueryBuilder/Explorer/Restricted Dataset?

A: First, you must register with PhysioNet (http://physionet.org/physiobank/database/mimic2cdb/restricted/) and submit your completed data use agreement (DUA). Your MIMIC II login credentials are separate from the PhysioNet login and will be sent to you once your application has been approved.

Q: I am having problems logging in to the system. What should I do?

A: Please ensure that you are using the correct username and password. Your username is generally the first part of the email address used to register with the system; for example, an address beginning with "mimicuser@" would have the username mimicuser. Your password was sent to you in an email when you first registered. If you would like to have your password reset, please use the Contact form to request a new password.