An out of sample version of the EM algorithm for imputing missing values in classification

Sergio Campos, Alejandro Veloz, Hector Allende

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

© Springer Nature Switzerland AG 2019. Real-world applications whose records contain missing values are common. Because many data analysis algorithms are not designed to work with missing data, a frequent approach is to remove all variables associated with such records from the analysis. A much better alternative is to use data imputation techniques, which estimate the missing values from statistical relationships among the variables. The Expectation Maximization (EM) algorithm is a classic method for dealing with missing data, but it is not designed for typical machine learning settings that have separate training and test sets. In this work we present an extension of the EM algorithm that addresses this problem. We test the algorithm on the ADNI (Alzheimer’s Disease Neuroimaging Initiative) data set, where about 80% of the sample has missing values. Our extension of EM achieved higher classification accuracy and robustness. It was evaluated using three different classifiers and showed a significant improvement over similar approaches proposed in the literature.
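The abstract does not spell out the extension, so the following is only a minimal sketch of the general idea of out-of-sample imputation, not the authors' algorithm. It assumes the data are roughly multivariate Gaussian: the mean and covariance are estimated by EM on the training set only, then held fixed while missing test values are filled with their conditional expectation. The function names `conditional_mean_fill` and `fit_gaussian_em`, the `ridge` regularizer, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def conditional_mean_fill(X, mu, sigma):
    """Fill NaNs in each row with E[x_missing | x_observed] under N(mu, sigma)."""
    X_filled = X.copy()
    for i in range(X.shape[0]):
        miss = np.isnan(X[i])
        if not miss.any():
            continue
        obs = ~miss
        if not obs.any():
            X_filled[i] = mu  # nothing observed: fall back to the mean
            continue
        # Regression coefficients of the missing block on the observed block
        coef = np.linalg.solve(sigma[np.ix_(obs, obs)], sigma[np.ix_(obs, miss)])
        X_filled[i, miss] = mu[miss] + (X[i, obs] - mu[obs]) @ coef
    return X_filled

def fit_gaussian_em(X_train, n_iter=50, ridge=1e-6):
    """Estimate (mu, sigma) from incomplete training data with EM.

    Assumes every column has at least one observed value.
    """
    n, d = X_train.shape
    mu = np.nanmean(X_train, axis=0)
    sigma = np.diag(np.nanvar(X_train, axis=0) + ridge)
    for _ in range(n_iter):
        # E-step: expected values of the missing entries ...
        X_hat = conditional_mean_fill(X_train, mu, sigma)
        # ... plus the conditional covariance of the missing block per row
        C = np.zeros((d, d))
        for i in range(n):
            miss = np.isnan(X_train[i])
            if not miss.any():
                continue
            obs = ~miss
            s_mm = sigma[np.ix_(miss, miss)]
            if obs.any():
                s_om = sigma[np.ix_(obs, miss)]
                s_mm = s_mm - s_om.T @ np.linalg.solve(sigma[np.ix_(obs, obs)], s_om)
            C[np.ix_(miss, miss)] += s_mm
        # M-step: update the parameters from the completed statistics
        mu = X_hat.mean(axis=0)
        diff = X_hat - mu
        sigma = (diff.T @ diff + C) / n + ridge * np.eye(d)
    return mu, sigma

# Synthetic demonstration (hypothetical data, not ADNI)
rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.8, 0.3], [0.8, 1.0, 0.5], [0.3, 0.5, 1.0]])
X = rng.multivariate_normal(np.zeros(3), true_cov, size=200)
X[rng.random(X.shape) < 0.2] = np.nan  # knock out about 20% of the entries
X_train, X_test = X[:150], X[150:]

mu_hat, sigma_hat = fit_gaussian_em(X_train)  # parameters come from training data only
X_test_imputed = conditional_mean_fill(X_test, mu_hat, sigma_hat)  # frozen at test time
```

Keeping the parameters frozen at test time means no test record influences the imputation model, which is the separation a train/test protocol requires. The imputed `X_train` and `X_test_imputed` could then be passed to any classifier.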
Original language: English
Title of host publication: An out of sample version of the EM algorithm for imputing missing values in classification
Pages: 194-202
Number of pages: 9
ISBN (Electronic): 9783030134686
DOIs: https://doi.org/10.1007/978-3-030-13469-3_23
Publication status: Published - 1 Jan 2019
Event: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Duration: 1 Jan 2019 → …

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11401 LNCS
ISSN (Print): 0302-9743

Conference

Conference: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Period: 1/01/19 → …

Cite this

Campos, S., Veloz, A., & Allende, H. (2019). An out of sample version of the EM algorithm for imputing missing values in classification. In An out of sample version of the EM algorithm for imputing missing values in classification (pp. 194-202). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11401 LNCS). https://doi.org/10.1007/978-3-030-13469-3_23