As the performance of conventional track-and-trigger systems in rapid response systems has been unsatisfactory, we developed and implemented an artificial intelligence algorithm for predicting in-hospital cardiac arrest, termed the deep learning-based early warning system. The purpose of this study was to compare the performance of this artificial intelligence-based early warning system with that of conventional methods in a real hospital setting.
Retrospective cohort study.
This study was conducted at a hospital in which the deep learning-based early warning system was implemented.
We reviewed the records of adult patients who were admitted to the general ward of our hospital from April 2018 to March 2019.
The study population included 8,039 adult patients. The outcome was events of deterioration, defined as cardiac arrest and unexpected ICU admission; a total of 83 such events occurred during the study period. We defined a true alarm as an alarm occurring within 0.5–24 hours before a deterioration event.
Measurements and Main Results:
We used the area under the receiver operating characteristic curve, area under the precision-recall curve, number needed to examine, and mean alarm count per day as comparative measures. The deep learning-based early warning system (area under the receiver operating characteristic curve, 0.865; area under the precision-recall curve, 0.066) outperformed the modified early warning score (area under the receiver operating characteristic curve, 0.682; area under the precision-recall curve, 0.010) and reduced the number needed to examine and the mean alarm count per day by 69.2% and 59.6%, respectively. At the same specificity, the deep learning-based early warning system had up to 257% higher sensitivity than conventional methods.
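To make the alarm-burden measures concrete: the number needed to examine is the reciprocal of the positive predictive value, i.e., the average number of alarms a responder must review to find one true deterioration event. A minimal sketch, using made-up alarm counts (not the study's data) chosen only to illustrate how a percentage reduction like the reported 69.2% is computed:

```python
# Hypothetical illustration of the screening-burden metrics named above.
# NNE (number needed to examine) = 1 / PPV = total alarms / true alarms.

def number_needed_to_examine(true_alarms: int, total_alarms: int) -> float:
    """Average alarms reviewed per true deterioration event."""
    return total_alarms / true_alarms

def mean_alarm_count_per_day(total_alarms: int, days: int) -> float:
    """Average daily alarm burden on the rapid response team."""
    return total_alarms / days

# Made-up counts for two systems over the same observation window:
mews_nne = number_needed_to_examine(true_alarms=10, total_alarms=900)  # 90.0
dews_nne = number_needed_to_examine(true_alarms=10, total_alarms=277)  # 27.7

reduction = (mews_nne - dews_nne) / mews_nne * 100
print(f"NNE reduction: {reduction:.1f}%")
```

With these illustrative counts the relative reduction works out to about 69.2%; the actual alarm counts underlying the study's figure are not given in the abstract.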
The developed deep learning-based early warning system accurately predicted deterioration of patients in a general ward and outperformed conventional methods. This study showed the potential and effectiveness of artificial intelligence in a rapid response system, which can be applied together with electronic health records. It will be a useful method to identify deteriorating patients and to support precise decision-making in daily practice.
To investigate the reproducibility of computer-aided detection (CAD) of pulmonary nodules and masses on consecutive chest radiographs (CXRs) of the same patient within a short-term period.

A total of 944 CXRs (chest PA) with nodules and masses, recorded between January 2010 and November 2016 at the Asan Medical Center, were obtained. In all, 1,092 regions of interest for the nodules and masses were delineated using in-house software. All CXRs were randomly split 6:2:2 into training, development, and validation sets. Furthermore, paired follow-up CXRs (n = 121) in the validation set, acquired within one week and in which expert thoracic radiologists confirmed no changes, were used to evaluate the reproducibility of CAD against two radiologists (R1 and R2). The reproducibility of four different convolutional neural network algorithms and of the two chest radiologists (with 13 and 14 years' experience) was compared. Model performance was evaluated by figure-of-merit (FOM) analysis of the jackknife free-response receiver operating characteristic curve, and reproducibility was evaluated in terms of percent positive agreement (PPA) and Chamberlain's percent positive agreement (CPPA).

Reproducibility analysis of the four CADs, R1, and R2 showed variations in PPA and CPPA. The YOLO (You Only Look Once) v2-based eDenseYOLO showed a higher FOM (0.89; 0.85–0.93) than RetinaNet (0.89; 0.85–0.93) and the atrous spatial pyramid pooling (ASPP) U-Net (0.85; 0.80–0.89). eDenseYOLO showed higher PPA (97.87%) and CPPA (95.80%) than Mask R-CNN, RetinaNet, ASPP U-Net, R1, and R2 (PPA: 96.52%, 94.23%, 95.04%, 96.55%, and 94.98%; CPPA: 93.18%, 89.09%, 90.57%, 93.33%, and 90.43%). There were moderate variations in the reproducibility of CAD with different algorithms, indicating that measurement of reproducibility is necessary when evaluating CAD performance in actual clinical environments.
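For readers unfamiliar with the two agreement measures: with a = lesions detected on both radiographs of a pair (concordant positives) and b, c = lesions detected on only one of the two (discordant), PPA = 2a / (2a + b + c) and Chamberlain's PPA = a / (a + b + c). A minimal sketch with made-up detection counts (not the study's data):

```python
# Hypothetical sketch of the agreement metrics reported above, for
# paired follow-up CXRs read by the same model or reader.
# a = detections concordant across the pair; b, c = discordant detections.

def ppa(a: int, b: int, c: int) -> float:
    """Percent positive agreement: 2a / (2a + b + c), as a percentage."""
    return 100.0 * 2 * a / (2 * a + b + c)

def cppa(a: int, b: int, c: int) -> float:
    """Chamberlain's percent positive agreement: a / (a + b + c)."""
    return 100.0 * a / (a + b + c)

# Made-up counts for illustration only:
a, b, c = 92, 2, 2
print(f"PPA  = {ppa(a, b, c):.2f}%")   # counts every concordant lesion twice
print(f"CPPA = {cppa(a, b, c):.2f}%")  # stricter: concordant pairs counted once
```

Because CPPA counts each concordant detection once while PPA credits it on both radiographs, CPPA is always the lower of the two, matching the pattern in the reported results.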