
A Comparative Study of Machine Learning Methods for Verbal Autopsy Text Classification


Samuel Danso, Eric Atwell and Owen Johnson

A Verbal Autopsy is the record of an interview about the circumstances of an uncertified death. In developing countries, if a death occurs away from health facilities, a field-worker interviews a relative of the deceased about the circumstances of the death; this Verbal Autopsy can be reviewed off-site. We report on a comparative study of the processes involved in Text Classification applied to classifying Cause of Death: feature value representation, machine learning classification algorithms, and feature reduction strategies, in order to identify the approaches best suited to the classification of Verbal Autopsy text. We demonstrate that normalised term frequency and the standard TF-IDF achieve comparable performance across a number of classifiers. The results also show that the Support Vector Machine is superior to the other classification algorithms employed in this research. Finally, we demonstrate the effectiveness of employing a locally-semi-supervised feature reduction strategy to increase accuracy.

Keywords: Text Classification, Verbal Autopsy, Machine Learning, Algorithms, Term Weighting, Feature Reduction.
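
As a rough illustration of the two feature value representations compared in the abstract, the sketch below computes normalised term frequency and one common TF-IDF variant for a few toy documents. The function names, the example texts, and the exact weighting formula (normalised TF multiplied by log(N / document frequency)) are assumptions made for illustration; the abstract does not state the formulas used in the paper.

```python
import math
from collections import Counter


def normalised_tf(tokens):
    """Term counts divided by document length, so the weights sum to 1."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: n / total for term, n in counts.items()}


def tf_idf(docs):
    """Normalised TF scaled by log(N / document frequency) for each term.

    This is one common TF-IDF variant, assumed here for illustration.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        tf = normalised_tf(doc)
        weighted.append(
            {t: w * math.log(n_docs / doc_freq[t]) for t, w in tf.items()}
        )
    return weighted


# Invented toy narratives, not taken from any Verbal Autopsy corpus.
docs = [
    "fever and cough for two weeks".split(),
    "fever after snake bite".split(),
    "cough and chest pain".split(),
]

tf_vectors = [normalised_tf(d) for d in docs]
tfidf_vectors = tf_idf(docs)
```

In this sketch a term such as "snake", which occurs in only one toy document, receives a larger TF-IDF weight than "fever", which occurs in two, even though both have the same normalised term frequency within the second document.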


ABOUT THE AUTHORS

Samuel Danso
Samuel Danso is currently pursuing a PhD at the University of Leeds in the United Kingdom. He holds a BSc (Hons) in Computing and an MSc in Advanced Software Engineering. He has over 10 years' experience in database design and implementation for large and complex epidemiological and clinical studies carried out by the London School of Hygiene and Tropical Medicine in collaboration with the Kintampo Health Research Centre, Ghana, where the studies are conducted. Samuel is a Commonwealth Scholar and has co-authored a number of peer-reviewed journal publications. His research interests lie in Text Analytics, with a particular focus on Health Informatics.

Eric Atwell
Eric Atwell is an Associate Professor at the University of Leeds in the United Kingdom. He has over 30 years' experience in conducting and supervising language research projects. His research specialty is in the area of Corpus Linguistics and Text Analytics: Machine Learning and Data Mining analysis of a corpus of text - in English, Arabic, or other languages - to detect "interesting" and "useful" features or patterns. He has led research projects supported by various funding bodies including the EPSRC, ESRC, CPNI, HEFCE, MoD, and industry.

Owen Johnson
Owen Johnson is a Senior Fellow at the University of Leeds in the United Kingdom. He has over 20 years' experience as a practitioner and has been responsible for the development, implementation and strategic management of information systems within major blue-chip organisations such as BT, Amoco and Forte. Most recently, he was the IT Manager for Gardner Merchant Leisure, a division of the world's largest catering company, Gardner Merchant Sodexho. He was until recently Vice Chair of Bradford Community Housing Trust.


About IJCSI

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...
