Lim Jia Qi, Norma Alias, Farhana Johar
Preprints (www.preprints.org) | NOT PEER-REVIEWED
Manual interpretation of these huge volumes of images is susceptible to inter-reader variability and human error. An accurate automated CAD scheme is therefore highly desirable in clinical pathological diagnosis. In this research, a range of machine learning paradigms (e.g. feature extraction, dimensionality reduction and supervised classification methods) were explored, evaluated, compared and analyzed to identify the optimal pipeline for the binary classification of brain MR images (normal vs neoplastic). An external validation dataset was used to test the generalizability of the optimal predictive models. Relevant and informative features were selected to construct a cross-validated decision tree, from which a simple rule set was eventually built. The experimental results show that almost all pattern recognition paradigms achieve high accuracy when the number of attributes is carefully selected. LDA+ELM with 55 features is the optimal pipeline: it achieves perfect classification when training and test data are from the same source, accuracy=97.5%, AUC=0.989, sensitivity=95% and specificity=100% on a balanced test dataset, and accuracy=99.5%, AUC=0.988, sensitivity=95% and specificity=100% on an imbalanced test dataset. The cross-validated decision tree model shows comparable performance: accuracy=98.8%, AUC=99.1%, sensitivity=99.6% and specificity=98.2%. Three highly relevant and robust attributes are visualized and selected for construction of the decision tree models, and finally a rule set is read directly off the decision tree. This rule set can potentially serve as a fast and accurate classification algorithm.
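
As a consistency check on the two LDA+ELM result sets, accuracy can be written as a prevalence-weighted average of sensitivity and specificity. The sketch below is illustrative only: the function name `accuracy_from_rates` is hypothetical, and the 10% positive-class fraction is an assumption inferred from the reported figures, not a value stated in the text.

```python
def accuracy_from_rates(sensitivity, specificity, positive_fraction):
    """Overall accuracy implied by per-class rates and the class mix.

    positive_fraction is the share of positive (here: neoplastic) cases
    in the test set; the remainder are negative (normal) cases.
    """
    return sensitivity * positive_fraction + specificity * (1.0 - positive_fraction)


# Balanced test set: half of the cases are positive.
balanced = accuracy_from_rates(0.95, 1.00, positive_fraction=0.5)

# ASSUMPTION: a 10% positive fraction is inferred from the reported
# accuracy of 99.5%; the source does not state the class ratio.
skewed = accuracy_from_rates(0.95, 1.00, positive_fraction=0.1)

print(f"balanced accuracy: {balanced:.3f}")
print(f"skewed accuracy:   {skewed:.3f}")
```

Under these assumptions, the same sensitivity and specificity give different accuracies purely because the class mix changes, which is why sensitivity and specificity are reported alongside accuracy.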