GENETIC ALGORITHM WITH BAGGING FOR DNA CLASSIFICATION

Authors

  • Balamurugan E
  • Jackson Akpajaro

DOI:

https://doi.org/10.29284/ijasis.7.2.2021.31-39

Keywords:

DNA classification, genetic algorithm, feature selection, ensemble method, decision tree, bagging.

Abstract

Accurate classification of cancer plays an important role in cancer treatment. Advances in microarray technology have improved the accuracy of cancer diagnosis, and recent work has focused on identifying the most informative genes among the thousands measured. In this paper, a Genetic Algorithm (GA) with bagging is developed for DeoxyriboNucleic Acid (DNA) classification. To reduce noise and preserve data integrity, the GA is applied to find the informative genes in the microarray data. It uses the Backward Selection (BS), Forward Selection (FS) and Branch and Bound Selection (BBS) algorithms to select a subset of genes. Bagging is then employed to classify samples, based on the selected genes, as normal or abnormal. The DNA classification system is evaluated on five cancer datasets: colon, Central Nervous System (CNS), ovarian, leukemia and breast. Results show that GA-BBS with bagging is more accurate than GA-BS and GA-FS with bagging. For all datasets, GA-BBS with bagging produces no misclassifications and achieves the highest performance (100%) in sensitivity, accuracy and specificity. Based on these results, it is concluded that the best prediction system is GA-BBS with the bagging classifier.
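The classification and evaluation stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gene subset is hard-coded where the paper would use the GA output, depth-1 decision trees (stumps) stand in for the decision-tree base learner, and the data, the function names (fit_stump, fit_bagging, predict_bagging, evaluate) and the parameter values are all hypothetical. Labels are assumed to be 1 for abnormal and 0 for normal.

```python
import random
from collections import Counter


def fit_stump(X, y, genes):
    """Fit a depth-1 tree: choose the gene, threshold and polarity with fewest errors."""
    best = None
    for g in genes:
        values = sorted({row[g] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
        for t in thresholds:
            for polarity in (0, 1):
                errors = sum(
                    (int(row[g] > t) ^ polarity) != label for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, g, t, polarity)
    return best[1:]  # (gene index, threshold, polarity)


def predict_stump(stump, row):
    g, t, polarity = stump
    return int(row[g] > t) ^ polarity


def fit_bagging(X, y, genes, n_estimators=15, seed=0):
    """Train one stump per bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], genes))
    return models


def predict_bagging(models, row):
    """Majority vote over the ensemble (an odd ensemble size avoids ties)."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


def evaluate(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical toy expression matrix: rows are samples, columns are genes.
# Gene 1 is noise; genes 0 and 2 separate the two classes.
X = [
    [0.8, 5.0, 2.1], [1.0, 3.0, 1.9], [1.1, 7.0, 2.0], [1.2, 4.0, 2.2],
    [8.8, 4.5, 7.9], [9.0, 6.0, 8.1], [9.1, 3.5, 8.0], [9.2, 5.5, 8.2],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]
selected_genes = [0, 2]  # assumed stand-in for the subset a GA would return

models = fit_bagging(X, y, selected_genes)
predictions = [predict_bagging(models, row) for row in X]
print(evaluate(y, predictions))
```

Scoring the ensemble on the same samples it was trained on, as done here for brevity, overstates performance; a held-out split or cross-validation would be needed for a meaningful estimate.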

Published

2021-12-31

How to Cite

[1] Balamurugan E and Jackson Akpajaro, “GENETIC ALGORITHM WITH BAGGING FOR DNA CLASSIFICATION”, IJASIS, vol. 7, no. 2, pp. 31–39, Dec. 2021, doi: 10.29284/ijasis.7.2.2021.31-39.
