A tutorial on the ZeneMark database

In this tutorial, you will use the BLAST server at this website (http://zenemark.znomics.com/) to locate your gene in the zebrafish genome and then check whether any mutations in that gene are available.

Suppose that you are interested in mutations in the gene for tumor protein p53 (tumor suppressor p53), tp53. Locating tp53 in the zebrafish genome with the browser window of the ZeneMark database takes several steps:

Step 1. Obtain the mRNA sequence of zebrafish tp53. Point your web browser to the NCBI home page at "http://www.ncbi.nlm.nih.gov/" and enter the search term "tp53 AND Danio". Set the database to "Nucleotide" and then click the "Go" button as shown below:

NCBI search box


This window will show up after the search is done:




Click on the first link (NM_131327). You will be directed to the page shown below:

   

Note that the default "Display" is set to "GenBank". Using the pull-down menu, set it to "FASTA". The page will automatically refresh and the mRNA sequence for tp53 will be displayed in this window:



Copy the mRNA sequence to the clipboard. We now have the mRNA sequence for tp53, which will be used as the query sequence in the next step.
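
If you prefer to fetch the sequence from a script rather than through the browser, the same FASTA record can usually be retrieved with the NCBI E-utilities "efetch" service. The sketch below is only an illustration and is not part of the ZeneMark website; the helper names `build_efetch_url` and `fetch_fasta` are made up for this example, and the download step needs network access.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities endpoint for fetching records.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession: str) -> str:
    """Build a URL that returns one nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",      # the nucleotide database
        "id": accession,      # e.g. the tp53 mRNA accession from Step 1
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


def fetch_fasta(accession: str, timeout: float = 30.0) -> str:
    """Download the FASTA record (hypothetical helper; requires network access)."""
    with urlopen(build_efetch_url(accession), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example (needs network): print(fetch_fasta("NM_131327"))
```

The text returned by `fetch_fasta("NM_131327")` can be pasted into the BLAST form in Step 2 in the same way as the sequence copied from the browser.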

Step 2. Run a BLAST search on the ZeneMark database website. Point your web browser to "http://zenemark.znomics.com/" if you have not opened this page yet. Find the link in the left sidebar that says "Run a BLAST search" and click on it.




A BLAST search page will open as shown below:




Now paste the mRNA sequence from the clipboard into the text input window as shown below and then click the red "Run" button at the very bottom of this page:




Now the BLAST program starts running. It will take about 10 seconds (depending on the server load) to finish the search. You should see the page below after that:





The upper portion of this page shows that the best hit is on chromosome 5, as indicated by the boxed triangle. Now scroll down to see the bottom portion of this page:



The bottom panel of this page shows the details of the BLAST results, ordered here by alignment score. Your display may vary. You can customize the display by selecting or deselecting options in the nine display-option pull-down menus. The links in the first column, [A][S][G][C], let you view the BLAST alignment [A], the query sequence [S], the query sequence in its genomic context [G], or the BLAST hits in the "ContigView" window [C]. Clicking any of these opens a new window showing the corresponding content. Now click on the [C] link in the first row. You should see the page below:



This page shows only the first two exons of tp53. To view the entire gene, click the zoom button next to the blue zoom button (the fourth one from the plus sign). Now you should see this page:



This page shows the entire tp53 gene in the right half of the "Detailed view" panel. Now center the gene by clicking on a blank area in the middle of the gene, as shown below. A pop-up menu will appear. Select the "Centre" option.




The page will refresh and display the tp53 gene at the center of the window:






Now look at the purple triangles above the thick blue bar (DNA contigs). These mark the locations of retroviral insertions in the zebrafish genome. This track is labeled "ZeneMarker" on the left side of the "Detailed view" panel. We can see that there are 8 insertions in the entire tp53 gene.

Now mouse over the first purple triangle (the far left one). You should see that this zenemarker has an id of "ZM_00057254". This is the id you should use to refer to this insertion. We can tell that the first insertion is in the seventh intron of tp53. (The tp53 gene is mapped on the bottom strand of chromosome 5. By convention, genes mapped on the top strand are displayed above the thick blue contig bar and genes mapped on the bottom strand are displayed below it.) To see the locations of the insertions better, you can zoom in on the display region and recenter the region of interest using the technique described above. Let us enlarge the area that the second and third triangles point to. First, put the area of interest at the center of the "Detailed view" panel. Then click the second zoom button (counting from the plus sign on the left) to enlarge the area. You should see a display window like the one below:



This picture clearly indicates that the second insertion is in an exon (the fifth exon) of tp53 and the third insertion is in the fourth intron.
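
The reasoning used above to number exons and introns for a gene on the bottom strand can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `locate_in_gene` is made up for this example, and the exon coordinates in the comments are hypothetical values, not the real tp53 gene model.

```python
from typing import List, Tuple


def locate_in_gene(position: int, exons: List[Tuple[int, int]], strand: int) -> str:
    """Return 'exon N', 'intron N', or 'outside gene' for a genomic position.

    exons  -- (start, end) pairs in genomic coordinates, inclusive, in any order
    strand -- +1 for the top strand, -1 for the bottom strand
    """
    # Number exons in transcript direction: ascending coordinates on the
    # top strand, descending coordinates on the bottom strand.
    ordered = sorted(exons, reverse=(strand == -1))

    for number, (start, end) in enumerate(ordered, start=1):
        if start <= position <= end:
            return f"exon {number}"

    # Intron N is the gap between exon N and exon N+1 in transcript order.
    for number in range(1, len(ordered)):
        (s1, e1), (s2, e2) = ordered[number - 1], ordered[number]
        if min(e1, e2) < position < max(s1, s2):
            return f"intron {number}"

    return "outside gene"


# Hypothetical three-exon gene on the bottom strand:
# exons = [(100, 200), (300, 400), (500, 600)]
# locate_in_gene(350, exons, -1) returns "exon 2"
# locate_in_gene(450, exons, -1) returns "intron 1"
```

Note that for a bottom-strand gene such as tp53, exon 1 is the exon with the highest genomic coordinates, which is why the numbering runs from right to left in the display.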

Step 3. We have identified an insertion in an exon of the tp53 gene. You may be interested in the exact base position in the zebrafish genome where this virus landed. To find out, expand the "Basepair view" panel right below the "Detailed view" panel if there is a plus sign in front of the "Basepair view" label. You should now see a triangle in the "ZeneMarker" track and the six-frame translations of the protein sequence. The tracks directly above and below the blue bar show the nucleotide sequences. To view the sequences better, use the zoom-in button until the letters for the bases are clearly visible. Below is a snapshot of the window after zooming the small window in to display only 50 bp of sequence:

   

Now you can clearly see that the virus is inserted after the "GAACGGG" and before the "GCAAAGT" in the nucleotide sequence.
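
If you have the surrounding genomic sequence as text, the same junction can be located programmatically by searching for the two flanking strings joined together. The sketch below is an illustration only; the function name `insertion_offset` is made up, and the padding bases around the junction in the example are invented placeholders, not real zebrafish sequence.

```python
def insertion_offset(genomic: str, left_flank: str, right_flank: str) -> int:
    """Return the 0-based index of the first base after the insertion site.

    The insertion site is taken to be the junction between left_flank and
    right_flank. Returns -1 if the joined flanks are not found.
    """
    junction = (left_flank + right_flank).upper()
    index = genomic.upper().find(junction)
    if index == -1:
        return -1
    # The virus sits right after the last base of the left flank.
    return index + len(left_flank)


# Placeholder context: "TTT" and "CC" are invented padding around the
# two flanks read from the "Basepair view" panel.
example = "TTT" + "GAACGGG" + "GCAAAGT" + "CC"
offset = insertion_offset(example, "GAACGGG", "GCAAAGT")
print(example[:offset], "| virus |", example[offset:])
```

The returned offset is relative to the start of the string you pass in, so you would add the genomic start coordinate of that string to obtain a chromosome position.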

Step 4. Having seen the locations of the insertions, you might be interested in the actual sequences used to map them. Perhaps you also want to check for yourself how well a given zenemarker sequence maps to a given location, or whether it can be mapped to several locations. You can follow these steps to do so:

First, mouse over an insertion mark (a purple triangle) of interest and then click on it. For example, if you click on the second triangle, you will see a small pop-up menu as shown below:




This pop-up menu shows the id of the zenemarker you just clicked, followed by the "Details" and "Sequence" options. Now select the "Details" option. You should see a page like the one below (since this page does not fit into a single window, it is split into two):





This page shows that ZM_00101395 is mapped on chromosome 5, as indicated by the red triangle. If you scroll down, you can see a "Feature Information" table. In this case, the table contains a single row. This means that ZM_00101395 is unambiguously mapped on chromosome 5 at base position 16162422. Occasionally, you may see more than one row in the "Feature Information" table. This means the zenemarker was mapped to multiple locations with equally good matches. This can happen because the zebrafish genome contains tandem repeats, or it can be caused by genome sequence assembly errors. Next, click on the same insertion mark again and select "Sequence". You should see a page like the one below:

   


This page shows the actual viral flanking sequence used for the mapping. By convention, we put the host-virus boundary at the 3' end of the sequence. This sequence flanks the 3' end of the virus, as indicated in the FASTA defline. We tried to sequence both the 5' and the 3' sequences flanking each virus insertion site, but the sequence on the other side may not always be available.
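
As a small illustration of this convention, the sketch below splits a FASTA record into its defline and sequence and reports the last few bases at the 3' end, which are the ones that abut the virus. The helper names `parse_fasta_record` and `boundary_bases` are made up for this example, and the record shown is an invented placeholder, not a real zenemarker sequence.

```python
from typing import Tuple


def parse_fasta_record(text: str) -> Tuple[str, str]:
    """Split a single FASTA record into (defline, sequence)."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA defline starting with '>'")
    defline = lines[0][1:]
    # Sequence lines are joined in case the sequence is wrapped.
    sequence = "".join(lines[1:]).upper()
    return defline, sequence


def boundary_bases(sequence: str, n: int = 7) -> str:
    """Return the last n bases at the 3' end, next to the host-virus boundary."""
    return sequence[-n:]


# Invented placeholder record for illustration only.
record = ">ZM_example placeholder flank\nacgtacgtac\nGGATCCA\n"
defline, sequence = parse_fasta_record(record)
print(defline, len(sequence), boundary_bases(sequence))
```

Checking the sequence length this way is also a quick sanity check before pasting the record into the BLAST form.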

Now, to confirm this insertion or map it yourself, copy this sequence (both the first line and the second line in this small window). Go back to the ZeneMark database home page and click the "Run a BLAST search" link (you did this at the beginning of this tutorial). Paste the sequence into the query sequence input window and then click the "Run" button. You should see the page below after the BLAST search is done (again, this page is split into two):





The top window in the page shows that the insertion is mapped on chromosome 5. The bottom window shows the alignment summary. Note that there is just one row in the alignment summary table, indicating that this zenemarker is uniquely mapped on the zebrafish genome. To inspect the result, first click the [S] link to view the query sequence. You should see a page like this:



Note that all the bases in the query sequence are aligned: every base is highlighted in red. Also note that the query sequence is 51 bp long. Next, click the [A] link to view the actual alignment:




This "BlastView" page shows that the query sequence matches the target (the zebrafish genomic sequence) perfectly. Next, click on the [C] link to go to the "ContigView" window and view the location of the insertion on the fish genome:




In this window, you can see that the BLAST hit is displayed as a new track right below the "ZeneMarker" track. The red rectangle in the "BLAST hits" track indicates the location of the query sequence on the fish chromosome. Clearly, the 3' end of the query sequence aligns with the tip of the first purple triangle (ZM_00101395). (Try centering this region and using the zoom tool to get a better view.) This is where the virus is inserted. We have now confirmed that there is an insertion (ZM_00101395) in an exon of the tp53 gene, based on the alignment quality of the flanking sequence on the fish genome and its unique chromosomal location. At this point, users may contact us about this insertion and we will further validate it experimentally.
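
The uniqueness check used in this step (a single row in the alignment summary or "Feature Information" table, versus several rows with equally good matches) can be sketched as follows. This is a hedged illustration rather than the method used by the ZeneMark database: the `Hit` record, the function names, and the score values are all hypothetical; only the chromosome and base position come from the example above.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One row of an alignment summary (hypothetical structure)."""
    chromosome: str
    position: int
    score: float


def best_hits(hits: List[Hit]) -> List[Hit]:
    """Return every hit that shares the highest score."""
    if not hits:
        return []
    top = max(hit.score for hit in hits)
    return [hit for hit in hits if hit.score == top]


def is_uniquely_mapped(hits: List[Hit]) -> bool:
    """True when exactly one location has the best score."""
    return len(best_hits(hits)) == 1


# A single best hit, as in the ZM_00101395 example (score is a placeholder).
unique = [Hit("5", 16162422, 100.0)]
# Two equally good hits, as might occur in tandem repeats (invented values).
repeated = [Hit("5", 1000, 100.0), Hit("5", 2000, 100.0), Hit("7", 500, 40.0)]

print(is_uniquely_mapped(unique), is_uniquely_mapped(repeated))
```

An insertion whose flanking sequence has several equally good best hits cannot be assigned to one location from the sequence alone.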


Congratulations! You have just finished this tutorial and now you should feel comfortable with the use of our ZeneMark database.



Last updated: Thu July 10 06:54:24 PDT 2006
Copyright © 2002-2006 Znomics, Inc. All rights reserved.