Indexing, Mining and Modeling Spatio-Temporal Patterns of Gene Expressions
Principal Investigator:
Co-Principal Investigator:
This material
is based upon work supported by the National Science Foundation
under Grant No. DBI-0640543. Any opinions, findings, and
conclusions or recommendations expressed in this material are those
of the author(s) and do not necessarily reflect the views of the
National Science Foundation.
1. GENERAL INFORMATION
1.1. Abstract
Recent progress in image-based, genome-scale profiling of whole-body mRNA patterns via in situ hybridization (ISH) calls for accurate, automatic image analysis systems that support efficient mining of complex spatio-temporal mRNA patterns. Such mining will be essential for functional genomics and regulatory network inference in higher organisms.
This project tries to answer questions such as: (1) What are the differences between two embryo ISH images, and how can the difference be measured quantitatively and objectively? (2) Given a collection of ISH images with gene and time stamps as well as functional attributes, how can we find the images most similar to a given keyword and/or image query? (3) How can we group co-expressed genes and uncover the underlying "themes" or biological processes, and what temporal patterns can we spot?
The approach consists of three efforts: (1) development of an online interface to analyze, visualize and mine the annotated images; (2) design of scalable algorithms and models for ISH image feature extraction and co-expressed gene grouping; and (3) temporal expression pattern discovery for studying gene regulatory interactions.
The resulting tools will have broad applicability. They will be available online to answer cross-modal queries and reveal spatio-temporal patterns to help genetic studies. The project Web site (http://www.db.cs.cmu.edu/db-site/Projects/cdem) will be used for demonstration and dissemination of results.
1.2. Keywords
Data mining, gene expression, temporal-spatial patterns.
1.3. Funding agency
- NSF, Award Number: DBI-0640543, Duration: 08/15/2007-07/31/2010
2. PEOPLE INVOLVED
In addition to the PI, the following graduate students worked on
the project.
3. RESEARCH
3.1. Current Results
We developed C-DEM, an online system for mining Drosophila (fruit fly) embryo images. It is built upon more than 10k ISH images from the Berkeley Drosophila Genome Project and supports cross-modal queries among three modalities: (a) genes, (b) images of gene expression, and (c) annotation keywords of the images. A query in any one modality can return results in any of the three. Thus, C-DEM can find images that are similar to a given image, and/or related to the desired annotation keywords, and/or related to specific genes. C-DEM treats the whole database as a tri-partite graph (one node type per modality) and ranks results with a fast and flexible proximity measure, random walk with restart (RWR).
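To make the RWR idea concrete, here is a minimal Python sketch of random walk with restart on a toy gene/image/keyword graph. It is an illustration under stated assumptions, not the C-DEM implementation: the node names, the edge list, the restart probability of 0.15, and the helper names `build_graph` and `rwr` are all hypothetical, and the graph is unweighted with edges only between genes and images and between images and keywords.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def rwr(adj, query, restart=0.15, tol=1e-10, max_iter=1000):
    """Random walk with restart: at each step the walker jumps back to a
    query node with probability `restart`, otherwise moves to a uniformly
    chosen neighbor. Returns the steady-state visiting probabilities."""
    nodes = list(adj)
    start = {n: 0.0 for n in nodes}
    for q in query:
        start[q] = 1.0 / len(query)
    score = dict(start)
    for _ in range(max_iter):
        nxt = {n: restart * start[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * score[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - score[n]) for n in nodes)
        score = nxt
        if delta < tol:
            break
    return score

# Hypothetical toy data: two genes, three images, two keywords.
edges = [
    ("gene:g1", "img:1"), ("gene:g1", "img:2"), ("gene:g2", "img:3"),
    ("img:1", "kw:a"), ("img:2", "kw:a"),
    ("img:2", "kw:b"), ("img:3", "kw:b"),
]
graph = build_graph(edges)

# Image query: rank the other images by proximity to img:1.
scores = rwr(graph, ["img:1"])
ranked_images = sorted(
    (n for n in scores if n.startswith("img:") and n != "img:1"),
    key=scores.get,
    reverse=True,
)
print(ranked_images)
```

Because the restart vector can hold any mix of gene, image, and keyword nodes, the same routine answers a query in one modality with a ranking in another: pass a keyword node as `query` and sort the `gene:` nodes by score, for example.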
Last updated: May 15, 2008, by Fan Guo