
Validating clustering for gene expression data

To illustrate its application to genomics, clustering applied to genes from a set of microarray data groups together those genes whose expression levels exhibit similar behavior throughout the samples, and when applied to samples it offers the potential to discriminate pathologies based on their differential patterns of gene expression.
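As a rough illustration of what "similar behavior throughout the samples" can mean, the sketch below scores pairs of expression profiles with a correlation-based distance. The choice of Pearson correlation, the function names, and the toy profiles are all assumptions made for this example; the text does not prescribe a particular proximity measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        # A flat profile has no defined correlation.
        raise ValueError("constant profile")
    return cov / (norm_x * norm_y)


def correlation_distance(x, y):
    """0 for profiles that rise and fall together, 2 for opposite behavior."""
    return 1.0 - pearson(x, y)


# Hypothetical toy data: expression of three genes across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # same trend as geneA, different scale
    "geneC": [4.0, 3.0, 2.0, 1.0],  # opposite trend
}

d_ab = correlation_distance(profiles["geneA"], profiles["geneB"])
d_ac = correlation_distance(profiles["geneA"], profiles["geneC"])
```

Under this measure geneA and geneB would tend to land in the same cluster even though their absolute levels differ, while geneC would be kept apart. Transposing the data (one profile per sample) gives the sample-clustering case.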

Although clustering has now been used for many years in the context of gene expression microarrays, it has remained highly problematic.

Factors to consider when choosing an algorithm include the nature of the application, the characteristics of the objects to be analyzed, the expected number and shape of the clusters, and the complexity of the problem versus the computational power available.

Although we will not go over the mathematical details of [19, 20], in this section we summarize some essential points regarding clustering error, error estimation, and inference. Within a probabilistic framework, objects to be clustered are assumed to be described by vectors of numerical values. These vectors are realizations of a random labeled point process, which produces random sets in a multi-dimensional space with unknown random labels associated with each vector. Two vectors are properly in the same cluster if and only if they have the same label produced by the random process [19].

Although used for many years in the context of gene expression microarray data, clustering has remained highly problematic [2, 12, 17]. Some criticisms raise the question of whether clustering can be used for scientific knowledge [18]: how may one judge the relative worth of clustering algorithms unless the assessment is based on their inference capabilities?

One study [9] used the K-means algorithm to identify transcriptional regulatory sub-networks. Another graph-based algorithm, called CLICK, was introduced in 2000 by Sharan and Shamir [10]. Model-based clustering, in which the clusters are modeled as mixtures of Gaussian distributions, was presented in [11], which also proposed the BIC criterion for selecting the number of clusters. An algorithm presented in 2002 [12] selects the best clustering rule for a dataset, based on noise injection, replication, and cluster accuracy.

Clustering has historically been approached heuristically; there has been almost no consideration of learning or optimization, and error estimation has been handled indirectly through validation indices. Only recently has a rigorous clustering theory been developed in the context of random sets [19].
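The idea that two vectors belong together exactly when they share a label suggests a simple empirical error on a labeled sample: count the points whose cluster disagrees with their label, after matching cluster ids to labels in the most favorable way. The sketch below is one plausible reading of that idea, not the formal definition in [19]; the function name and the brute-force matching over permutations are assumptions for illustration, and the search is only practical for a handful of clusters.

```python
from itertools import permutations


def clustering_error(true_labels, cluster_ids):
    """Fraction of points mislabeled under the best cluster-to-label matching."""
    if len(true_labels) != len(cluster_ids):
        raise ValueError("inputs must have the same length")
    n = len(true_labels)
    if n == 0:
        raise ValueError("inputs must not be empty")

    labels = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    if len(clusters) > len(labels):
        # Assumption of this sketch: each cluster maps to a distinct label.
        raise ValueError("sketch assumes no more clusters than labels")

    best = n
    # Cluster ids are arbitrary names, so try every one-to-one assignment
    # of cluster ids to labels and keep the one with the fewest mistakes.
    for perm in permutations(labels, len(clusters)):
        mapping = dict(zip(clusters, perm))
        mistakes = sum(
            mapping[c] != t for c, t in zip(cluster_ids, true_labels)
        )
        best = min(best, mistakes)
    return best / n


# Hypothetical sample: six points, two true labels, one point misplaced.
truth = ["a", "a", "a", "b", "b", "b"]
found = [1, 1, 0, 0, 0, 0]
err = clustering_error(truth, found)
```

Because the matching step ignores how clusters are named, a partition that is correct up to renaming scores zero. In practice the true labels are unknown, which is why the error must be estimated, for instance on synthetic data drawn from an assumed model.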


  1. BIOINFORMATICS RESEARCH ARTICLE. Comparative analysis of clustering methods for gene expression time course data. Ivan G. Costa; Francisco de A. T. de. In the few works in which cluster validation was applied to gene expression data, the focus was on evaluating the proposed validation methodology.

  2. Aug 5, 2012. has potential to provide insight into the complex associations between genes that are involved. Functional discovery is a common goal of clustering gene expression data. In fact, the functionality of genes can be inferred if their expression patterns, or profiles, are similar to genes of known function.

  3. A number of laboratories previously specialized in statistics, informatics, artificial intelligence and data mining are now turning their focus to bioinformatics, computational biomedicine and systems biology. The field of gene expression data analysis has grown fast; early approaches that mainly involved clustering have.

  4. Jan 12, 2006. estimated, and this measurement error information is incorporated directly into the clustering algorithm. The algorithm, CORE (Clustering Of Repeat Expression data), is presented and its performance is validated using statistical measures. By using error information about gene expression measurements.

  5. Nov 1, 2004. Pablo A. Jaskowiak, Ricardo J. G. B. Campello, Ivan G. Costa Filho, "Proximity Measures for Clustering Gene Expression Microarray Data: A Validation Methodology and a Comparative Analysis," IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), vol. 10, no. 4, pp. 845-857, July 2013.
