SCDb Home The Stem Cell Database

SCDb Help

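This page describes the following features of SCDb: Keyword Search, Advanced Search, the Search Results Summary Page, the Clone Detail Page, and BLAST.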

Keyword Search

The Keyword Search on the SCDb home page lets you enter a single word (e.g., "insulin" or "notch") to search against the SCDb_id, annotations and BLAST results in the database. The search is case-insensitive, and the keyword is compared as a "begins with" match, so you do not need to add an asterisk (*) to the end of the word. For example, if you enter "insulin", the search matches the word "insulinase" but not the word "linsulin".
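The "begins with" rule can be illustrated with a short Python sketch. This is not SCDb's actual implementation; the function name and the way text is split into words are assumptions made for illustration only.

```python
import re

def matches_keyword(keyword: str, text: str) -> bool:
    """Return True if any word in `text` begins with `keyword`, ignoring case.

    Assumption: a "word" is a run of letters, digits or underscores.
    """
    prefix = keyword.lower()
    words = re.findall(r"\w+", text.lower())
    return any(word.startswith(prefix) for word in words)

print(matches_keyword("insulin", "Insulinase precursor"))  # True
print(matches_keyword("insulin", "linsulin"))              # False
```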

Advanced Search

To access the Advanced Search, click on "Search" on the blue menu bar at the top of most pages. Advanced Search lets you formulate a more complex query of SCDb. Click here to open the Search page in a separate window.

You can enter text into any of the following four fields: SCDb ID, Gene Name, Notes (manual annotation), or BLAST summary (the summary line from BLASTing the sequence against SwissProt, GenBank, UniGene and DOTS). The search is case-insensitive and word comparison is performed as a "begins with" match (see Keyword Search above). If you specify more than one word in a field, a Boolean expression is constructed that joins the words with "and". That is, "insulin growth factor" is equivalent to "insulin AND growth AND factor". You can also construct your own Boolean expression using the words "AND", "OR" and "NOT" with parentheses, for example, "(exon AND splice) AND (promoter OR enhancer)".
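The implicit "and" behaviour can be sketched in Python as follows. The function name is hypothetical, and treating only upper-case AND, OR and NOT as operators is an assumption; the page does not say how SCDb itself recognizes them.

```python
OPERATORS = {"AND", "OR", "NOT"}  # assumption: operators are written in upper case

def normalize_query(field_text: str) -> str:
    """Join plain words with AND; leave explicit Boolean expressions unchanged."""
    tokens = field_text.split()
    has_operator = any(tok.strip("()") in OPERATORS for tok in tokens)
    if has_operator or len(tokens) < 2:
        return field_text.strip()
    return " AND ".join(tokens)

print(normalize_query("insulin growth factor"))
# insulin AND growth AND factor
print(normalize_query("(exon AND splice) AND (promoter OR enhancer)"))
# (exon AND splice) AND (promoter OR enhancer)
```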

You can also specify the type of sequence match, for example, "Homolog of known protein from any species", "EST from any species", "Reverse Orientation" or "Repetitive Sequence".



The Advanced Search page also contains a set of checkboxes that restrict your search to one or more of the source libraries. The default is to search within any library. You can choose among the mouse (bone marrow and/or mouse fetal) and human libraries.


You can also narrow your search by specifying a mouse or human chromosome. Chromosome assignments are based on BLAT results against the mm4 version of the mouse genome and the NCBI30 assembly of the human genome, respectively.


Your search can also contain one or more GO classification terms. The terms are organized in three hierarchies, which describe the Molecular Function, Cellular Component and Biological Process of a molecule. A complete description of the GO ontologies and links to external browsers can be found at the Gene Ontology Consortium. We have mapped our clones to GO terms via sequence alignment with Megablast to the RIKEN FANTOM database. You can enter one or more GO terms directly into the field provided on the search page in the format "GO:####", with one term per line. You can also press the Browse button to display a popup window containing the portion of the GO hierarchy represented by SCDb clones.
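As a rough illustration of the "one term per line" format, the following Python sketch checks the contents of such a field. The function name, the error handling and the sample identifiers are assumptions for illustration; SCDb's own validation is not documented here.

```python
import re

# "GO:" followed by digits, matching the "GO:####" format described above.
GO_ID = re.compile(r"GO:\d+")

def parse_go_terms(field_text: str) -> list[str]:
    """Return the GO identifiers in `field_text`, one per non-blank line."""
    terms = []
    for line in field_text.splitlines():
        term = line.strip()
        if not term:
            continue  # skip blank lines
        if not GO_ID.fullmatch(term):
            raise ValueError(f"not a GO identifier: {term!r}")
        terms.append(term)
    return terms

print(parse_go_terms("GO:0005515\n  GO:0006915  \n"))
# ['GO:0005515', 'GO:0006915']
```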

To navigate the GO hierarchy, click on the plus sign to the left of a term name. This expands the terms below it, and the plus sign becomes a minus sign to indicate that the term is open. Clicking on the minus sign closes it. Click on a term to include it in your query; it turns yellow to indicate that it is selected. When you click on a term, all its "children" are included in your query (though they do not turn yellow). So the higher up in the hierarchy you click, the more terms you implicitly include in your query. To be more specific, select the "lowest" term in the hierarchy that satisfies your query. Not every GO term is included, only those which have been linked to one or more clones in SCDb. Also note that you cannot select the top three terms (biological process, cellular component and molecular function), as they represent too large a selection set.

When you are satisfied with the terms selected, press the select button to add the terms into the main window. The clear all button clears all the terms you selected (removing the yellow background). The cancel button closes the window and does not return your selected terms. Help displays this text.
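The way a selected term implicitly pulls in its children can be sketched in Python. The hierarchy below is a toy example and the function name is hypothetical; it only illustrates the idea of expanding a selection to all descendants, not how SCDb stores or queries the GO hierarchy.

```python
def expand_selection(selected, children):
    """Return the selected terms plus all of their descendants."""
    expanded = set()
    stack = list(selected)
    while stack:
        term = stack.pop()
        if term in expanded:
            continue  # already visited (a term can have several parents)
        expanded.add(term)
        stack.extend(children.get(term, ()))
    return expanded

# Toy hierarchy for illustration only: parent -> list of child terms.
children = {
    "binding": ["protein binding", "nucleic acid binding"],
    "nucleic acid binding": ["DNA binding", "RNA binding"],
}

print(sorted(expand_selection(["nucleic acid binding"], children)))
# ['DNA binding', 'RNA binding', 'nucleic acid binding']
```

Selecting "binding" instead would return all five terms, which is why choosing the lowest suitable term gives a more specific query.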


You can also specify whether your search results should be sorted by the E value obtained by aligning the sequences to SwissProt, GenBank, dbEST or DOTS. The default is not to sort. Another default is to show only unique clones, that is, clones with non-full-length sequences which start at the same nucleotide. You can override this default to return all the clones instead. All clones which satisfy the search parameters will be returned; you can set the number to display per page to 5, 10, 20 or 50.
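The sorting and paging options can be pictured with a small Python sketch. The record layout (dictionaries with an "e_value" key), the function name and the ascending sort order are assumptions for illustration; a lower E value indicates a more significant alignment, so the sketch lists the lowest values first.

```python
ALLOWED_PAGE_SIZES = (5, 10, 20, 50)  # page sizes offered on the search form

def paginate(clones, page, per_page, sort_by_e_value=False):
    """Return one page of results; `page` is 1-based."""
    if per_page not in ALLOWED_PAGE_SIZES:
        raise ValueError(f"per_page must be one of {ALLOWED_PAGE_SIZES}")
    if sort_by_e_value:
        # Assumption: lowest (most significant) E value first.
        clones = sorted(clones, key=lambda clone: clone["e_value"])
    start = (page - 1) * per_page
    return clones[start:start + per_page]

# Hypothetical results: 12 clones with made-up E values.
results = [{"id": f"c{i}", "e_value": 10.0 ** -i} for i in range(12)]
first_page = paginate(results, page=1, per_page=5, sort_by_e_value=True)
print([clone["id"] for clone in first_page])
# ['c11', 'c10', 'c9', 'c8', 'c7']
```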

Search Results Summary Page

The Search Results page displays a summary of each clone that satisfies the query parameters. At the top of the page, your query is displayed along with the number of clones returned. If you kept the default of unique clones only, the total number of non-unique clones is also shown. Click here to open the Search Results page showing a single result in a separate window. Each summary record includes the following:

  • The clone's SCDb identifier.
  • The length of the clone.
  • The subtracted library from which it was cloned, displayed as a clickable link to a page of statistics about that library.
  • The clone's chromosomal location with coordinates (mouse only), obtained by BLAT alignment to the UCSC mm4 version of the Mouse Genome.
  • The number of sister clones, i.e., those which derive from the same transcript. If multiple sisters are present, you can click on "Show all" to display them in a similar summary format. You can also click on "Relationships" to view a graphical representation of the homology between sister clones based on their alignment to UniGene, SwissProt and GenBank NR. Each sister is displayed with its identifier above a small clickable circle; clicking on the circle brings up the summary data for that sister clone. Clones which did not align well are displayed as "outliers", with fewer connections to their sisters. This allows us to determine when the clustering program has gone awry and grouped unrelated clones. A contig of the sisters can be generated by clicking on "Contig". The contig is generated dynamically, so it may take a little while to display.
  • Additional information about the clone displayed in the summary includes the ID category, gene name, message length, full length, and a list of its GO terms. A maximum of 5 GO terms are displayed for each clone, with the rest of them accessible by clicking on "more" at the bottom of the list of terms. This retrieves the same detail information on the clone as does clicking on its identifier. Clicking on a GO term links to the AmiGO browser where you can obtain detailed information on the term.
  • A summary of BLAST results is displayed along with their E-values. If an alignment was found, a clickable magnifying glass icon is displayed which brings up the detailed BLAST alignment information.
  • The Notes field is a manual annotation of the clone. Currently only SCDb staff can annotate the database.
  • A button ("Clone Detail") to display additional information about that clone.

You can download the results of your search in FASTA format by pressing the button on the top right of the page.
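Once downloaded, a FASTA file can be read with a few lines of Python. The sketch below relies only on the general FASTA convention (a header line starting with ">" followed by one or more sequence lines); the header text in the example is made up, and the exact header layout SCDb writes is not described here.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

# Made-up example records.
example = """>clone_A example header
ACGTACGT
TTGA
>clone_B
GGCC
""".splitlines()

for name, sequence in read_fasta(example):
    print(name, len(sequence))
# clone_A example header 12
# clone_B 4
```

To read a saved file, pass an open file handle in place of the list of lines.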

Clone Detail Page


The Clone Detail Page repeats some of the summary information and supplements it with additional data, including the sequence (viewable with or without spaces), the full set of GO terms mapped to the clone, a breakdown of the number of sister clones obtained from other subtracted libraries, and the full set of non-contiguous genomic coordinates. Links to the UCSC Genome Browser and EnsEMBL are provided, along with the "Show on UCSC Browser" option, which displays the clone in a custom track on the UCSC browser.
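For readers unfamiliar with custom tracks, the Python sketch below builds a minimal one in BED format, with one line per non-contiguous block. It is an illustration only and not the track SCDb generates: the function name, clone identifier and coordinates are hypothetical, and the coordinates are assumed to already follow the BED convention (0-based start, end not included).

```python
def custom_track(clone_id, blocks):
    """Build a minimal BED custom track with one line per genomic block.

    `blocks` is an iterable of (chrom, start, end) tuples in BED coordinates.
    """
    lines = [f'track name="{clone_id}" description="SCDb clone {clone_id}"']
    for chrom, start, end in blocks:
        # BED columns: chromosome, start, end, feature name (tab-separated).
        lines.append(f"{chrom}\t{start}\t{end}\t{clone_id}")
    return "\n".join(lines)

# Hypothetical clone with two alignment blocks on the same chromosome.
print(custom_track("clone_1", [("chr1", 100, 200), ("chr1", 300, 400)]))
```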

BLAST

The BLAST option is available on the menu bar at the top of most pages. This allows you to BLAST your own sequences against SCDb. Currently only one sequence can be BLASTed at a time.