Description of the Community Wide Experiment on the Assessment of Gene Prediction
for the "Drosophila melanogaster" genome: The Adh region (2.9 Mbases)
Introduction
Methods for predicting gene structures and other functional sites in
DNA sequences have been advancing rapidly. We are interested in
assessing what the various methods provide, and how reliable they are.
The goal of this experiment is to obtain an in-depth and objective
assessment of the current state of the art in gene and functional site
predictions in genomic DNA. To this end, participants will predict as
much as possible about a sample genomic region that has been studied
intensively in the past. Participants are encouraged to predict
genes, functional sites such as promoters, transcription start sites,
transcription factor binding sites, splice sites, start codons, stop
codons, and other sites of interest. Biological annotations are also
welcome, as are "consensus" annotations arrived at by any method of
combining other predictions (please describe your combination method).
Besides providing the sample
genomic sequence for participants to annotate, we will make other
relevant datasets available, including complete cDNAs for Drosophila
melanogaster.
Our tutorial presentation at ISMB-99 will summarize the annotations
submitted by participants in the experiment, describe the methods
used, and discuss how we define "success" in the annotation process.
Goals
The main goals of the experiment are to address the following
questions about the current state of the art in genome annotation:
- Are the gene predictions similar to the known
gene structures?
- Are the details of the gene predictions correct (e.g.,
splice sites)?
- What other DNA features (besides gene structure) can be reliably
identified?
- Which analysis methods are the most effective?
Participation
Participation in the experiment is open to everyone, whether or not
you will be attending ISMB-99. If you plan to participate, please send us an email. Predictors may
form teams; each team should have a designated group leader. Each team
will be issued a unique ID number, which will serve to identify its
predictions. Those interested in receiving mailings concerning
progress of the experiment may also register as
'observers'. Predictions must be emailed to us in GFF format before
June 30, 1999. Each team should submit a single GFF-format file that
includes all features predicted on the sample 2.9 Mbase sequence.
Every participant in the experiment will be invited to a free dinner in Heidelberg the evening
of the ISMB tutorial.
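For participants preparing a submission file, the following is a minimal sketch of writing and reading GFF lines in Python. It assumes the classic tab-delimited GFF layout of eight required fields (seqname, source, feature, start, end, score, strand, frame) plus an optional ninth grouping field. The sequence name "Adh_region", the source "MyPredictor", and the helper names are hypothetical, chosen only for illustration; this page does not state the exact conventions required for submissions.

```python
from typing import NamedTuple, Optional


class GffFeature(NamedTuple):
    """One predicted feature, following the classic GFF column order."""
    seqname: str            # name of the sequence the feature lies on
    source: str             # program or team that produced the prediction
    feature: str            # feature type, e.g. "exon" or "splice5"
    start: int              # 1-based start coordinate (inclusive)
    end: int                # 1-based end coordinate (inclusive)
    score: Optional[float]  # None is written as "."
    strand: str             # "+", "-", or "."
    frame: str              # "0", "1", "2", or "."
    group: str = ""         # optional field tying features together


def format_gff_line(f: GffFeature) -> str:
    """Render a feature as one tab-delimited GFF line."""
    score = "." if f.score is None else f"{f.score:.2f}"
    fields = [f.seqname, f.source, f.feature, str(f.start), str(f.end),
              score, f.strand, f.frame]
    if f.group:
        fields.append(f.group)
    return "\t".join(fields)


def parse_gff_line(line: str) -> GffFeature:
    """Parse one GFF line back into a GffFeature."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) < 8:
        raise ValueError("a GFF line needs at least 8 tab-separated fields")
    score = None if parts[5] == "." else float(parts[5])
    group = parts[8] if len(parts) > 8 else ""
    return GffFeature(parts[0], parts[1], parts[2], int(parts[3]),
                      int(parts[4]), score, parts[6], parts[7], group)


# Hypothetical example: a single predicted exon on the forward strand.
exon = GffFeature("Adh_region", "MyPredictor", "exon",
                  1200, 1450, 0.87, "+", "0", "gene_1")
line = format_gff_line(exon)
```

Parsing the line back with parse_gff_line(line) returns the same feature, which is a quick way to check a file before emailing it.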
Assessment of Predictions
The Drosophila Genome Center team will evaluate the predictions by
comparing them with each other and with the current "best" annotations
put together by our center. There are NO winners and losers--our
interest is in seeing which annotation methods are being used and
what their relative strengths and weaknesses are.
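To give a sense of what comparing predictions with a reference annotation can look like, here is a minimal sketch of an exon-level comparison in Python. It is an illustration only, not the evaluation procedure the Drosophila Genome Center will use, which this page does not specify. It assumes exons are represented as (start, end, strand) tuples and counts a predicted exon as correct only when it matches a reference exon exactly; the function name and the example coordinates are hypothetical.

```python
def exon_level_accuracy(predicted, reference):
    """Compare two collections of (start, end, strand) exon tuples.

    Returns (sensitivity, specificity) where, following the usage common
    in gene-finding evaluations:
      sensitivity = exactly matched exons / reference exons
      specificity = exactly matched exons / predicted exons
    """
    pred = set(predicted)
    ref = set(reference)
    exact = pred & ref  # exons with identical boundaries and strand
    sensitivity = len(exact) / len(ref) if ref else 0.0
    specificity = len(exact) / len(pred) if pred else 0.0
    return sensitivity, specificity


# Hypothetical coordinates, for illustration only.
reference_exons = [(100, 200, "+"), (300, 400, "+"),
                   (500, 650, "+"), (900, 1000, "-")]
predicted_exons = [(100, 200, "+"), (300, 410, "+"), (500, 650, "+")]

sn, sp = exon_level_accuracy(predicted_exons, reference_exons)
```

In this example two of the three predicted exons match the reference exactly, and two of the four reference exons are recovered. Exact matching is a deliberately strict choice: the second predicted exon overlaps a real exon but misses its 3' boundary, so it counts as wrong, which bears on the question of whether details such as splice sites are correct.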
Release of Results
All submitted annotations and their evaluations will be made available
through this web site shortly before the ISMB meeting, at which the
submissions will be discussed and compared.
Timetable
- May 1, 1999 - June 30, 1999: Distribution of the sample sequence and
associated data to predictors. Collection of predictions.
- June 30, 1999 - July 31, 1999: Evaluation of the predictions by the
Drosophila Genome Center.
- August 6, 1999: Tutorial #3 at the ISMB-99 conference in Heidelberg, Germany.
Organizing Committee
- Martin Reese, Drosophila Genome Center, University of California, Berkeley, USA
- Nomi Harris, Drosophila Genome Center, Lawrence Berkeley National Laboratory, USA
- Suzanna Lewis, Drosophila Genome Center, University of California, Berkeley, USA
- George Hartzell, Drosophila Genome Center, University of California, Berkeley, USA
- Uwe Ohler, University of Erlangen, Germany
Queries
Please address any questions to
[email protected]
[email protected]
Last modified: Mon Jun 7 17:43:41 PDT 1999