• Jayesh Manohar Sonawane
• Shrihari D. Gaikwad
• Gyan Prakash
Keywords: DNA, Microarray Data, DTMBWT, KNN


Deoxyribonucleic acid (DNA) microarrays are widely used to monitor the expression levels of many genes in parallel. Human cancer can be predicted from the expression levels measured across a collection of DNA samples. Because the number of gene expression levels is very large, analyzing them manually is challenging. In this paper, a data mining approach is used to extract the dominant information from DNA microarray data with the help of a multiresolution analysis tool. The Dual Tree M-Band Wavelet Transform (DTMBWT) is employed to extract features from the given dataset at the 2nd level of decomposition, and a K-Nearest Neighbor (KNN) classifier is used for cancer classification. The results show that the KNN classifier classifies five different cancer datasets (Breast, Colon, Ovarian, CNS, and Leukemia) with over 90% accuracy.
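The two-stage pipeline described above (wavelet feature extraction followed by KNN classification) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: an ordinary two-level Haar wavelet transform is swapped in as a simple stand-in for DTMBWT, the function names `haar_step`, `wavelet_features`, and `knn_predict` are hypothetical, and the expression profiles are made-up toy values rather than real microarray data.

```python
import math
from collections import Counter


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * s for a, b in pairs]
    detail = [(a - b) * s for a, b in pairs]
    return approx, detail


def wavelet_features(expression, levels=2):
    """Stand-in feature extractor (NOT DTMBWT).

    Keeps the approximation coefficients after `levels` Haar steps.
    """
    approx = list(expression)
    for _ in range(levels):
        if len(approx) % 2:  # pad odd lengths by repeating the last value
            approx.append(approx[-1])
        approx, _detail = haar_step(approx)
    return approx


def knn_predict(train_features, train_labels, query, k=3):
    """Majority vote among the k training samples closest to `query`
    (Euclidean distance)."""
    ranked = sorted(
        zip(train_features, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up expression profiles (8 "genes" per sample); assumption for
# illustration only, not real microarray data.
train_samples = [
    [5.0, 5.2, 4.9, 5.1, 1.0, 1.1, 0.9, 1.0],
    [5.1, 4.8, 5.0, 5.3, 1.2, 0.8, 1.0, 1.1],
    [1.0, 0.9, 1.1, 1.0, 5.0, 5.1, 4.9, 5.2],
    [1.1, 1.0, 0.8, 1.2, 5.2, 4.9, 5.1, 5.0],
]
train_labels = ["tumor", "tumor", "normal", "normal"]

# Level-2 features for every training sample and for one unseen sample.
train_features = [wavelet_features(s) for s in train_samples]
query_features = wavelet_features([4.9, 5.0, 5.2, 5.0, 1.1, 1.0, 1.0, 0.9])

prediction = knn_predict(train_features, train_labels, query_features, k=3)
print(prediction)
```

In this sketch the second-level decomposition reduces each 8-value profile to 2 coefficients, which hints at why a wavelet front end helps when the number of genes far exceeds the number of samples. The value of `k` and the Euclidean distance are illustrative choices, not settings taken from the paper.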

