The data sets and the source code of our paper ‘A Biclustering-based Classification Framework for Microarray Analysis’ can be downloaded from GitHub:
In recent years, microarrays have been shown to be an effective method for studying various biological processes, e.g., to improve our understanding
of diseases such as cancer. In a typical situation, a microarray data set can be viewed as a large matrix of expression values whose rows
correspond to thousands of genes and whose columns correspond to tens of conditions, such as samples from various patients. Several statistical
techniques have been proposed in the literature to analyze such gene expression matrices. For this purpose, biclustering has been demonstrated to
be one of the most effective methods for discovering gene expression patterns under various conditions.
In this paper, we present a framework that takes advantage of the homogeneously expressed genes in biclusters to construct a classifier for predicting sample class membership. Our extensive experiments on 8 real cancer microarray datasets (4 diagnostic and 4 prognostic) show that the proposed classifier achieves superior performance in both cancer diagnosis and prognosis, the latter of which was previously regarded as quite difficult. Additionally, our results demonstrate that sample classification accuracy can serve as a good subjective quality measure for different types of biclusters, and hence as a tool to extrinsically evaluate the performance of the various biclustering algorithms that produce those biclusters.
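
To make the idea of scoring biclusters by classification accuracy concrete, here is a minimal sketch on toy data. It is not the classifier from the paper: it assumes each bicluster is given only as a list of gene (row) indices, summarizes every sample by the mean expression of those genes, and uses a simple nearest-centroid rule with leave-one-out evaluation as a stand-in. All function names, the toy matrix, and the labels are hypothetical.

```python
from statistics import mean

def bicluster_features(expr, biclusters):
    """Summarize each sample by the mean expression of each bicluster's genes.

    expr: list of gene rows, each a list of per-sample expression values.
    biclusters: list of gene-index lists (assumed output of some biclustering step).
    Returns one feature vector per sample.
    """
    n_samples = len(expr[0])
    return [
        [mean(expr[g][s] for g in genes) for genes in biclusters]
        for s in range(n_samples)
    ]

def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x (stand-in classifier)."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # sorted() makes tie-breaking deterministic
    return min(sorted(centroids), key=lambda lab: dist2(centroids[lab], x))

def loo_accuracy(features, labels):
    """Leave-one-out accuracy, used here as an extrinsic score for the biclusters."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)

# Toy matrix: 4 genes (rows) x 6 samples (columns); values are made up.
expr = [
    [5.0, 5.2, 4.9, 1.0, 1.1, 0.9],  # gene 0: differs between classes
    [4.8, 5.1, 5.0, 1.2, 0.8, 1.0],  # gene 1: differs between classes
    [2.0, 3.0, 2.5, 2.4, 2.1, 2.9],  # gene 2: uninformative
    [1.0, 1.5, 1.2, 1.1, 1.4, 1.3],  # gene 3: uninformative
]
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
biclusters = [[0, 1]]  # hypothetical bicluster containing genes 0 and 1

features = bicluster_features(expr, biclusters)
accuracy = loo_accuracy(features, labels)
print(accuracy)
```

Under these assumptions, a set of biclusters that captures class-relevant genes should yield a higher accuracy than one built from uninformative genes, which is the sense in which accuracy can rank biclustering outputs.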