Finding Accuracy in Feature Selection Using Firefly Algorithm with Rough Set Theory
Keywords:
Bioinformatics, Feature, Firefly, Optimization
Abstract
Feature selection techniques play a vital role in bioinformatics applications. In addition to the large pool of techniques already developed in the machine learning and data mining fields, specific applications in bioinformatics have led to many newly proposed techniques. In this paper, a feature selection method based on Firefly Optimization (FFO) with Rough Set Theory (RST) is proposed. Data sets often contain a large number of features, many of them irrelevant or redundant, and such features reduce accuracy. The main aim of this paper is to select a subset of relevant features. A statistical metric-based feature selection technique is proposed to reduce the size of the extracted feature vector. The proposed method shows significant improvement in terms of performance metrics such as accuracy, sensitivity, specificity, and computation time. The FFO technique is applied to search for features globally according to light intensity. The selected features are then grouped into a subset, and RST is applied to find the optimized features. These optimized features are used to analyze protein information in organisms, improving feature selection accuracy and reducing computation time in protein data analysis.
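The abstract does not give the exact formulation, so the following is only a minimal sketch of how a firefly search and a rough-set criterion could be combined. It assumes discrete feature values, uses the standard rough-set dependency degree (the fraction of objects whose equivalence class under the chosen features has a single decision label) as the light intensity, and moves each firefly toward brighter ones by copying bits with a distance-dependent attractiveness. The function names, the size penalty, the binary move rule, and all parameter values are hypothetical choices made for illustration, not the authors' method.

```python
import math
import random
from collections import defaultdict


def dependency(rows, labels, features):
    """Rough-set dependency degree: |positive region| / |universe|."""
    if not rows:
        return 0.0
    # Group objects into equivalence classes by their values on `features`.
    classes = defaultdict(list)
    for row, label in zip(rows, labels):
        classes[tuple(row[f] for f in features)].append(label)
    # A class belongs to the positive region if all its decision labels agree.
    positive = sum(len(ls) for ls in classes.values() if len(set(ls)) == 1)
    return positive / len(rows)


def brightness(rows, labels, mask, penalty=0.01):
    """Light intensity (assumed): dependency minus a small subset-size penalty."""
    feats = [i for i, bit in enumerate(mask) if bit]
    return dependency(rows, labels, feats) - penalty * len(feats)


def firefly_select(rows, labels, n_fireflies=8, n_iter=30,
                   beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Binary firefly search over feature masks (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(rows[0])
    swarm = [[rng.random() < 0.5 for _ in range(n)] for _ in range(n_fireflies)]
    best = max(swarm, key=lambda m: brightness(rows, labels, m))[:]
    for _ in range(n_iter):
        light = [brightness(rows, labels, m) for m in swarm]
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] <= light[i]:
                    continue  # only move toward strictly brighter fireflies
                # Normalized Hamming distance between the two masks.
                r = sum(a != b for a, b in zip(swarm[i], swarm[j])) / n
                beta = beta0 * math.exp(-gamma * r * r)  # attractiveness
                for k in range(n):
                    if swarm[i][k] != swarm[j][k] and rng.random() < beta:
                        swarm[i][k] = swarm[j][k]        # attracted move
                    elif rng.random() < alpha / n:
                        swarm[i][k] = not swarm[i][k]    # small random flip
                light[i] = brightness(rows, labels, swarm[i])
        candidate = max(swarm, key=lambda m: brightness(rows, labels, m))
        if brightness(rows, labels, candidate) > brightness(rows, labels, best):
            best = candidate[:]
    return [i for i, bit in enumerate(best) if bit]


# Toy data (made up): the label is fully determined by feature 0.
rows = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1), (0, 1, 1), (1, 0, 1)]
labels = [0, 0, 1, 1, 0, 1]
selected = firefly_select(rows, labels)
print("selected features:", selected,
      "dependency:", dependency(rows, labels, selected))
```

On the toy table, feature 0 alone has dependency 1.0 while features 1 and 2 alone have dependency 0.0, so a reasonable search should favor subsets containing feature 0; the size penalty then discourages carrying redundant features along with it.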