{"id":2951,"date":"2022-01-12T14:19:04","date_gmt":"2022-01-12T19:19:04","guid":{"rendered":"https:\/\/my.vanderbilt.edu\/masi\/?p=2951"},"modified":"2022-01-12T14:24:14","modified_gmt":"2022-01-12T19:24:14","slug":"pyphewas-a-phenome-disease-association-tool-for-electronic-medical-record-analysis","status":"publish","type":"post","link":"https:\/\/my.vanderbilt.edu\/masi\/2022\/01\/pyphewas-a-phenome-disease-association-tool-for-electronic-medical-record-analysis\/","title":{"rendered":"pyPheWAS: A Phenome-Disease Association Tool for Electronic Medical Record Analysis"},"content":{"rendered":"<p class=\"Citation\">Kerley, C.I., Chaganti, S., Nguyen, T.Q. <i>et al.<\/i> pyPheWAS: A Phenome-Disease Association Tool for Electronic Medical Record Analysis. <i>Neuroinform<\/i> (2022). https:\/\/doi.org\/10.1007\/s12021-021-09553-4<\/p>\n<p class=\"Citation\"><strong>Full text:\u00a0<\/strong><a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/34981404\/\">NIHMSID<\/a>, <a href=\"https:\/\/link.springer.com\/article\/10.1007\/s12021-021-09553-4\">Springer<\/a><\/p>\n<h2>Abstract<\/h2>\n<p>Along with the increasing availability of electronic medical record (EMR) data, phenome-wide association studies (PheWAS) and phenome-disease association studies (PheDAS) have become a prominent, first-line method of analysis for uncovering the secrets of EMR<i>.\u00a0<\/i>Despite this recent growth, there is a lack of approachable software tools for conducting these analyses on large-scale EMR cohorts. In this article, we introduce <i>pyPheWAS<\/i>, an open-source python package for conducting PheDAS and related analyses. This toolkit includes 1) data preparation, such as cohort censoring and age-matching; 2) traditional PheDAS analysis of ICD-9 and ICD-10 billing codes; 3) PheDAS analysis applied to a novel EMR phenotype mapping: current procedural terminology (CPT) codes; and 4) novelty analysis of significant disease-phenotype associations found through PheDAS. 
The pyPheWAS toolkit is approachable and comprehensive,\u00a0encapsulating data prep through result visualization all within a simple command-line interface. The toolkit is designed for the ever-growing scale of available EMR data, with the ability to analyze cohorts of 100,000\u2009+\u2009patients in less than 2\u00a0h. Through a case study of Down Syndrome and other intellectual developmental disabilities, we demonstrate the ability of pyPheWAS to discover both known and potentially novel disease-phenotype associations across different experiment designs and disease groups. The software and user documentation are available in open source at <a href=\"https:\/\/github.com\/MASILab\/pyPheWAS\">https:\/\/github.com\/MASILab\/pyPheWAS<\/a>.<\/p>\n<p><strong>Keywords: <\/strong><span dir=\"ltr\">PheWAS, PheDAS, Electronic Medical Records, Phenotype, ICD<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-2954\" src=\"https:\/\/my.vanderbilt.edu\/masi\/wp-content\/uploads\/sites\/2304\/2022\/01\/Fig1_crop-650x291.png\" alt=\"Fig1_crop\" width=\"650\" height=\"291\" srcset=\"https:\/\/cdn.vanderbilt.edu\/t2-my\/my-prd\/wp-content\/uploads\/sites\/2304\/2022\/01\/Fig1_crop-650x291.png 650w, https:\/\/cdn.vanderbilt.edu\/t2-my\/my-prd\/wp-content\/uploads\/sites\/2304\/2022\/01\/Fig1_crop-300x134.png 300w, https:\/\/cdn.vanderbilt.edu\/t2-my\/my-prd\/wp-content\/uploads\/sites\/2304\/2022\/01\/Fig1_crop-768x344.png 768w, https:\/\/cdn.vanderbilt.edu\/t2-my\/my-prd\/wp-content\/uploads\/sites\/2304\/2022\/01\/Fig1_crop.png 1022w\" sizes=\"auto, (max-width: 650px) 100vw, 650px\" \/><\/p>\n<p><span dir=\"ltr\"><strong>Fig. <span dir=\"ltr\">1<\/span><\/strong> <span dir=\"ltr\">Overview of PheDAS. In the background, a Manhattan plot <\/span><span dir=\"ltr\">shows the statistical significance of many phenotypes in relation to a <\/span><span dir=\"ltr\">single target variable (target). 
Phenotypes are sorted into and colored <\/span><span dir=\"ltr\">by category, and the significance threshold for multiple comparisons <\/span><span dir=\"ltr\">correction is marked with a dashed horizontal line. These relationships <\/span><span dir=\"ltr\">were estimated by individually modeling the target variable as<\/span> a function of each phenotype using a logistic regression. For a closer <\/span><span dir=\"ltr\">look, the significant phenotype Sleep Apnea is highlighted. The distribution <\/span><span dir=\"ltr\">of subjects from each target group that do (not) present the <\/span><span dir=\"ltr\">Sleep Apnea phenotype is shown, along with the ICD-9 codes that <\/span><span dir=\"ltr\">map to this phenotype.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Kerley, C.I., Chaganti, S., Nguyen, T.Q. et al. pyPheWAS: A Phenome-Disease Association Tool for Electronic Medical Record Analysis. Neuroinform (2022). https:\/\/doi.org\/10.1007\/s12021-021-09553-4 Full text:\u00a0NIHMSID, Springer Abstract Along with the increasing availability of electronic medical record (EMR) data, phenome-wide association studies (PheWAS) and phenome-disease association studies (PheDAS) have become a prominent, first-line method of analysis for 
uncovering&#8230;<\/p>\n","protected":false},"author":7554,"featured_media":2954,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[27,139,8,23,1],"tags":[174,189,188,14],"class_list":["post-2951","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-big-data","category-emr","category-informatics-big-data","category-machine-learning","category-news","tag-emr","tag-icd","tag-pyphewas","tag-regression"],"_links":{"self":[{"href":"https:\/\/my.vanderbilt.edu\/masi\/wp-json\/wp\/v2\/posts\/2951","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/my.vanderbilt.edu\/masi\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/my.vanderbilt.edu\/masi\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/my.vanderbilt.edu\/masi\/wp-json\/wp\/v2\/users\/7554"}],"replies":[{"embeddable":true,"href":"https:\/\/my.vanderbilt.edu\/masi\/wp-json\/wp\/v2\/comments?post=2951"}],"version-history":[{"count":1,"href":"https:\/\/my.vanderbilt.edu\/masi\/wp-json\/wp\/v2\/posts\/2951\/revisions"}],"predecessor-version":[{"id":2955,"href":"https:\/\/my.vanderbilt.edu\/masi\/wp-json\/wp\/v2\/posts\/2951\/revisions\/2955"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/my.vanderbilt.edu\/masi\/wp-json\/wp\/v2\/media\/2954"}],"wp:attachment":[{"href":"https:\/\/my.vanderbilt.edu\/masi\/wp-json\/wp\/v2\/media?parent=2951"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/my.vanderbilt.edu\/masi\/wp-json\/wp\/v2\/categories?post=2951"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/my.vanderbilt.edu\/masi\/wp-json\/wp\/v2\/tags?post=2951"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}