Classification Performance Comparison of Deep Learning and Classical Data Mining Methods on RNA-Seq Dataset

International Journal of Data Mining and Bioinformatics

This study compares the performance of deep learning and classical classification methods on RNA-Seq data. Two datasets with different characteristics are used. The first, the Lung Cancer dataset, has two classes with a balanced distribution. The second, the Renal Cell Carcinoma dataset, has three imbalanced classes. The classification performance of random forest, support vector machines, artificial neural networks and deep learning is examined on both datasets using different gene filtering methods. In general, deep learning and support vector machines obtained the highest or second-highest values of performance measures such as accuracy, F-measure and the Kappa coefficient. On the Lung Cancer dataset, which contains more genes and has a balanced class distribution, deep learning generally outperforms the classical classification methods and is therefore recommended for use.
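As an illustration of the three performance measures named above, the following minimal sketch computes accuracy, F-measure and Cohen's Kappa coefficient from true and predicted class labels. It is not the study's code: the function name `classification_metrics`, the macro-averaging of the F-measure across classes, and the toy labels are all assumptions made for this example.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return (accuracy, macro-averaged F-measure, Cohen's kappa).

    Hypothetical helper for illustration; macro-averaging is an assumption.
    """
    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))

    # Accuracy is the observed agreement between truth and prediction.
    accuracy = sum(t == p for t, p in pairs) / n

    # Per-class F-measure: 2*TP / (2*TP + FP + FN), then averaged over classes.
    f_scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        f_scores.append(2 * tp / denom if denom else 0.0)
    macro_f = sum(f_scores) / len(labels)

    # Kappa corrects the observed agreement for agreement expected by chance.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return accuracy, macro_f, kappa


# Toy three-class labels with an imbalanced distribution (not data from the study).
y_true = ["A", "A", "A", "A", "B", "B", "C", "C"]
y_pred = ["A", "A", "A", "B", "B", "B", "C", "A"]
accuracy, macro_f, kappa = classification_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.3f}  F-measure={macro_f:.3f}  kappa={kappa:.3f}")
```

Because Kappa discounts chance agreement, it is typically lower than accuracy, and the gap tends to widen when one class dominates, which is one reason it is often reported alongside accuracy for imbalanced data.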