Srebrenka Robic, Associate Professor of Biology and Chair
Mellon-Funded Digital Faculty Fellow 2015-16
Tutor: Micaela Lewinson
Bioinformatics is an interdisciplinary area of study linking computational tools and databases to biology. Topics such as DNA sequence analysis, genome sequencing, gene expression, and 3D structures of proteins are all considered parts of bioinformatics. In my Bioinformatics course (BIO 260), students are introduced to this exciting new interdisciplinary area spanning computational science and biology. This is a non-traditional biology course: it does not have a “wet lab”, but students spend significant time in the computer lab using various databases (such as DNA sequence databases and protein structure databases) and software packages to solve problems in biology.
One of the class projects is the analysis and annotation of a previously unpublished, newly sequenced genome. Through this project students have an opportunity to use digital research to make a novel contribution to science! This exciting project is made possible through my participation in a multi-institution collaboration, the Genomics Education Partnership (GEP, http://gep.wustl.edu/). My last group of Bioinformatics students all became co-authors on an original research paper, which gained national attention in 2015 (https://genestogenomes.org/undergrads-power-genomics-research/).
The whole class is centered around digital tools, and my aims are: 1) to develop a digital laboratory notebook for the GEP project, and 2) to investigate possibilities for wider digital dissemination of my students’ results, through means such as blogging. The last time I taught this class, I realized that I need a more streamlined and more collaborative method of documenting all of my students’ work. I do not need any new technology (beyond tools we already have access to, such as Google Drive and free blogging platforms). Most of my effort on this project will be spent on designing an effective digital laboratory notebook in which students properly document all the steps of their computer-based genomics research. I will use some digital notebook guidelines developed by GEP partners, but will need to tailor them to my class and my students’ backgrounds.
The outcome of my project would be a digital notebook that serves three goals: 1) allow students to collaborate with each other, 2) allow me to evaluate their work, and 3) serve as a starting point for submission of our results to the GEP network. The outcome of the "wider digital dissemination" part of the project would be some form of digital communication through which students could better share their exciting work with our campus community and a wider community of undergraduates. I think this may play an important role in attracting more students to bioinformatics, science, and computational areas of research in general.
Digital Fellowship Report Spring 2016
Digital Learning in Bioinformatics (BIO 260)
Bioinformatics is an area of study that uses digital tools (such as databases, search algorithms, and sequence-comparison algorithms) to analyze complex problems in biology. Because bioinformatics is a computer-based area of study, computer-based learning and computer-based research are the best way to introduce our students to it.
The entire class was a combination of short lectures, discussions, and computer-based activities. As a part of the class, students participated in the Genomics Education Partnership (http://gep.wustl.edu/) project, which allowed them to analyze a newly sequenced, not yet published part of a fruit fly genome. This was my largest bioinformatics class ever (19 students enrolled; last time I had only 5). At the end of the class, students submitted their analyses to GEP for consideration for publication in original research articles. As a result of my last Bioinformatics class, all five students in that class became authors on the first collaborative GEP publication, which has over 1000 authors! (http://www.g3journal.org/content/5/5/719.full, https://genestogenomes.org/undergrads-power-genomics-research/). I am hoping this group of students will also make a valid contribution to the continuing investigation of fruit fly genomes. This year, in addition to teaching the students about bioinformatics tools, such as the genome browser (Fig 1), I also required students to document all their work in the form of a digital lab notebook, which they created on Google Drive and shared with me. This allowed me to provide them with timely feedback by commenting on their documents directly, without printing.
Figure 1. The genome browser is an online tool students used to analyze DNA sequences. Various experimental and computational tracks can be used to visualize sequence features. This allowed students to make inferences about gene structures.
To demonstrate some of the work we have done, I am submitting a sample gene annotation report (which we use as a form to share our results with our GEP collaborators at Washington University), as well as a sample digital lab notebook containing student work. Next year, I hope to also implement a venue to share major findings of student projects with the wider campus and general community through blogging.
GEP Annotation Report
Student name: Eliza Reese
Student email: email@example.com
Faculty advisor: Srebrenka Robic
College/university: Agnes Scott College
Project name: contig6
Project species: Drosophila ficusphila
Date of submission: 5/5/16
Size of project in base pairs: 40,000
Number of genes in project: 2
Does this report cover all of the genes or is it a partial report? partial
If this is a partial report, please indicate the region of the project covered by this report:
From base 27000 to base 37600
Instructions for project with no genes
If you believe that the project does not contain any genes, please provide the
following evidence to support your conclusions:
- Perform a BLASTX search of the entire contig sequence against the non-redundant (nr) protein database. Provide an explanation for any significant (E-value < 1e-5) hits to known genes in the nr database as to why they do not correspond to real genes in the project.
- For each Genscan prediction, perform a BLASTP search using the predicted amino acid sequence against the nr protein database using the strategy described above.
- Examine the gene expression tracks (e.g., RNA-Seq) for evidence of transcribed regions that do not correspond to alignments to known D. melanogaster proteins. Perform a BLASTX search against the nr database using these genomic regions to determine if they show sequence similarity to known or predicted proteins in the nr database.
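The significance check in the instructions above can be sketched in a few lines of Python. This is only an illustration, not part of the GEP form: it assumes the BLAST results were saved in the standard 12-column tabular format (BLAST+ `-outfmt 6`, where the eleventh column is the E-value), it assumes the cutoff is meant as 1e-5, and the function name and the two example hits are made up.

```python
import csv
import io

# Assumption: the cutoff in the form is meant as 1e-5.
EVALUE_CUTOFF = 1e-5


def significant_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return (subject_id, evalue) pairs for hits with E-value below cutoff.

    Expects BLAST tabular output (-outfmt 6): tab-separated rows whose
    2nd column is the subject ID and whose 11th column is the E-value.
    """
    hits = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip blank lines, comment lines, and rows that are too short.
        if len(row) < 11 or row[0].startswith("#"):
            continue
        evalue = float(row[10])
        if evalue < cutoff:
            hits.append((row[1], evalue))
    return hits


# Hypothetical rows for illustration only; not real search results.
example = "\n".join([
    "contig6\thypothetical_protein_A\t85.0\t200\t30\t0\t100\t699\t1\t200\t1e-50\t180",
    "contig6\thypothetical_protein_B\t30.0\t50\t35\t0\t900\t1049\t1\t50\t0.5\t25.0",
])
hits = significant_hits(example)
for subject, evalue in hits:
    print(subject, evalue)
```

Each hit that passes the filter would still need the written explanation the form asks for; the script only narrows down which hits to examine.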
Last Update: 12/17/2015
Complete the following Gene Report Form for each gene in your project. Copy and paste the sections below to create as many copies as needed within this report. Be sure to
create enough Isoform Report Forms within your Gene Report Form for all isoforms.
Gene name (e.g., D. biarmipes eyeless): CG32000
Gene symbol (e.g., dbia_ey): CG32000
Approximate location in project (from 5’ end to 3’ end): 27,193-37,484
Number of isoforms in D. melanogaster: 9
Number of isoforms in this project: 2
Complete the following table for all the isoforms in this project:
Name(s) of unique isoform(s) based on coding sequence