Yale University

EliScholar – A Digital Platform for Scholarly Publishing at Yale

Yale Medicine Thesis Digital Library School of Medicine

January 2019

Searching For Phenotypes Of Sepsis: An Application Of Machine Learning To Electronic Health Records

Michael Jarvis Boyle

Follow this and additional works at: https://elischolar.library.yale.edu/ymtdl

This Open Access Thesis is brought to you for free and open access by the School of Medicine at EliScholar – A Digital Platform for Scholarly Publishing at Yale. It has been accepted for inclusion in Yale Medicine Thesis Digital Library by an authorized administrator of EliScholar – A Digital Platform for Scholarly Publishing at Yale. For more information, please contact [email protected].

Recommended Citation

Boyle, Michael Jarvis, "Searching For Phenotypes Of Sepsis: An Application Of Machine Learning To Electronic Health Records" (2019). Yale Medicine Thesis Digital Library. 3477.
https://elischolar.library.yale.edu/ymtdl/3477

Searching for Phenotypes of Sepsis:

An Application of Machine Learning to Electronic Health Records

A Thesis Submitted to the

Yale University School of Medicine

In Partial Fulfillment of the Requirements for the

Degree of Doctor of Medicine

by

Michael Jarvis Boyle

2019

SEARCHING FOR PHENOTYPES OF SEPSIS: AN APPLICATION OF MACHINE LEARNING TO ELECTRONIC HEALTH RECORDS. Michael J. Boyle (Sponsored by R. Andrew Taylor). Department of Emergency Medicine, Yale University School of Medicine, New Haven, CT.

Sepsis has historically been categorized into discrete subsets based on expert consensus-driven definitions, but there is evidence to suggest it would be better described as a continuum. The goal of this study was to perform an exhaustive search for distinct phenotypes of sepsis using various unsupervised machine learning techniques applied to the electronic health record (EHR) data of 41,843 Yale New Haven Health System emergency department patients with infection between 2013 and 2016. Specifically, the aims were to develop an autoencoder to reduce the high-dimensional EHR data to a latent representation amenable to clustering, and then to search for and assess the quality of clusters within that representation using various clustering methods (partitional, hierarchical, and density-based) and standard evaluation metrics. Autoencoder training was performed by minimizing the mean squared error of the reconstruction. With this exhaustive search, no convincing consistent clusters were found. Various clustering patterns were produced by the different methods but all had poor quality metrics, while evaluation metrics meant to find the ideal number of clusters did not agree on a consistent number but seemed to suggest fewer than two clusters. Inspection of one promising arrangement with eight clusters did not reveal a statistically significant difference in admission rate. While it is impossible to prove a negative, these results suggest there are not distinct phenotypic clusters of sepsis.
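As an illustration of the dimensionality-reduction step described above, the following is a minimal sketch of an autoencoder trained by gradient descent on the mean squared reconstruction error. It is not the architecture used in this thesis, which is not specified in this section. The single tanh hidden layer, the learning rate, the number of epochs, the function names `train_autoencoder` and `encode`, and the synthetic data standing in for EHR features are all assumptions made for the example.

```python
import numpy as np


def train_autoencoder(X, latent_dim=2, lr=0.05, epochs=2000, seed=0):
    """Fit a one-hidden-layer autoencoder by minimizing reconstruction MSE.

    Hypothetical toy model: tanh encoder, linear decoder, no bias terms.
    Returns the encoder weights, decoder weights, and the loss per epoch.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W_enc = rng.normal(scale=0.1, size=(n_features, latent_dim))
    W_dec = rng.normal(scale=0.1, size=(latent_dim, n_features))
    losses = []
    for _ in range(epochs):
        Z = np.tanh(X @ W_enc)      # latent representation
        X_hat = Z @ W_dec           # reconstruction of the input
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the MSE through the decoder and the tanh encoder.
        grad_out = 2.0 * err / err.size
        grad_dec = Z.T @ grad_out
        grad_Z = grad_out @ W_dec.T
        grad_enc = X.T @ (grad_Z * (1.0 - Z ** 2))
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, losses


def encode(X, W_enc):
    """Map inputs to the latent space that a clustering method would consume."""
    return np.tanh(X @ W_enc)


# Synthetic stand-in for standardized EHR features (assumed, not real data):
# 200 "patients", 8 features driven by 2 hidden factors plus noise.
data_rng = np.random.default_rng(42)
factors = data_rng.normal(size=(200, 2))
mixing = data_rng.normal(size=(2, 8))
X = factors @ mixing + 0.1 * data_rng.normal(size=(200, 8))
X = (X - X.mean(axis=0)) / X.std(axis=0)

W_enc, W_dec, losses = train_autoencoder(X)
Z = encode(X, W_enc)
print(f"MSE: {losses[0]:.3f} -> {losses[-1]:.3f}; latent shape: {Z.shape}")
```

In a pipeline like the one outlined in the abstract, the latent matrix `Z` rather than the raw feature matrix would be passed to the clustering methods and their evaluation metrics.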

Acknowledgements

I am indebted to my thesis advisor, Dr. R. Andrew Taylor, for his constant support and insight, and to my friends and colleagues for their willingness to discuss these ideas and serve as valuable sounding boards. This work was made possible through the generous support of the Yale Summer Research Grant.

None of this would be possible, however, without the love and support of my wife, Shirin Jamshidian. This work is dedicated to her.

INTRODUCTION
    Sepsis Definitions
    Machine Learning and Electronic Health Records
AIMS
METHODS
    Study Design
    Study Setting and Population
    Study Protocol
    Data Set Creation
    Imputation
    Autoencoder Training
    Clustering
RESULTS AND DISCUSSION
    Quality of dimensionality reduction and latent representation
    Clustering
        Assessing clustering propensity
        Assessing ideal number of clusters
        Partitional Methods
            K-means
            K-medoids
