Advanced Search
To access the Advanced Search, click on "Search" on the blue menu bar at
the top of most pages. Advanced Search lets you formulate a more complex
query of SCDb. Click here to open the
Search page in a separate window.
You can enter text into any of the following four fields:
SCDb ID, Gene Name, Notes (manual annotation), or
BLAST summary (summary line from BLASTing the sequence against SwissProt,
GenBank, UniGene and DOTS). The search is case-insensitive, and words are
compared as a "begins with" match (see Keyword Search above).
If you specify more than one word in a field, a boolean expression is
constructed to "and" them together. That is, "insulin growth factor" is
equivalent to "insulin AND growth AND factor". You can also construct your own
boolean expression using the words "AND", "OR", "NOT" with parentheses, for
example, "(exon AND splice) AND (promoter OR enhancer)".
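The two rules above (implicit "and" between words, and "begins with" word comparison) can be illustrated with a short Python sketch. This is not SCDb's actual implementation; the function names implicit_and and word_matches are hypothetical, and the sketch assumes that the operators are recognized only when written in upper case.

```python
import re

# Assumption: operators are recognized only in upper case.
OPERATORS = {"AND", "OR", "NOT"}


def implicit_and(field_text: str) -> str:
    """Join bare words with AND unless the text already contains an operator."""
    tokens = field_text.split()
    if any(token.strip("()") in OPERATORS for token in tokens):
        # The user wrote their own boolean expression; leave it untouched.
        return field_text
    return " AND ".join(tokens)


def word_matches(term: str, text: str) -> bool:
    """Case-insensitive 'begins with' test against each word in text."""
    term = term.lower()
    words = re.findall(r"\w+", text.lower())
    return any(word.startswith(term) for word in words)


print(implicit_and("insulin growth factor"))
print(word_matches("insul", "Insulin-like growth factor"))
```

Under these assumptions, "insul" matches a record containing "Insulin" because the word begins with the search term, while "sulin" does not.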
You can also specify the type of sequence match, for example, "Homolog of
known protein from any species", "EST from any species", "Reverse
Orientation", or "Repetitive Sequence".
The Advanced Search page also contains a set of checkboxes so you can refine
your search to one or more of the source libraries. The default is to search
within any library. You can refine your search by choosing among mouse (bone
marrow and/or mouse fetal) and human libraries.
You can also narrow your search by specifying a mouse or human chromosome.
Chromosome assignments are based on BLAT results against the mm4 version of
the mouse genome and the NCBI30 assembly of the human genome, respectively.
Your search can also contain one or more GO classification terms.
The terms are in three hierarchies which describe Molecular Function, Cellular
Component and Biological Process of a molecule. A complete description of the
GO ontologies and links to external browsers can be found at the
Gene Ontology Consortium. We have
mapped our clones to GO terms via sequence alignment with Megablast to the
Riken FANTOM database.
You can enter one or more GO terms directly into the field provided
on the search page in the format "GO:####", with one term per line. You can
also press the Browse button to display a popup window containing the portion
of the GO hierarchy represented by SCDb clones.
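As an illustration of the "one term per line" input format, the following Python sketch reads the contents of such a field. It is an assumption-laden example rather than SCDb's own code: the function name parse_go_terms is hypothetical, "####" is taken to mean one or more digits, and blank lines and repeated terms are assumed to be ignored.

```python
import re

# Assumption: "GO:####" means the prefix "GO:" followed by one or more digits.
GO_ID = re.compile(r"GO:\d+")


def parse_go_terms(field_text: str) -> list[str]:
    """Return the GO terms in field_text, one per line, without repeats."""
    terms = []
    for line in field_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if not GO_ID.fullmatch(line):
            raise ValueError(f"not a GO term: {line!r}")
        if line not in terms:
            terms.append(line)
    return terms


print(parse_go_terms("GO:0005634\nGO:0003677"))
```

A line that does not fit the format, such as a term name typed in place of its identifier, raises an error in this sketch.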
To navigate the GO hierarchy, click on the plus sign to the left of the term
name. This expands the terms below it, and the plus sign becomes a minus sign
to indicate that it is open. Clicking on the minus sign closes it. Click on a
term to include it
in your query. It will turn yellow to indicate that it is selected. When you
click on a term, all its "children" are included in your query (though they
do not turn yellow). So the higher up in the hierarchy you click, the more
terms you implicitly include in your query. To be more specific, select the
"lowest" term in the hierarchy that satisfies your query. Not every GO term
is included; only those which have been linked to one or more clones in SCDb.
Also note that you cannot select the top three terms (biological process,
cellular component and molecular function) as they represent too large
a selection set. When you are satisfied with the terms selected, press
the select button to add the terms into the main window.
The clear all button clears all the terms you selected (removing the
yellow background). The cancel button closes the window and does
not return your selected terms. Help displays this text.
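The rule that selecting a term also includes all of its "children" can be sketched in Python as a walk down the hierarchy. The function name expand_selection and the small hierarchy below are hypothetical and serve only as an illustration; they are not taken from the SCDb browser.

```python
# Toy hierarchy for illustration only: each term maps to its direct children.
TOY_CHILDREN = {
    "binding": ["nucleic acid binding", "protein binding"],
    "nucleic acid binding": ["DNA binding", "RNA binding"],
}


def expand_selection(selected, children):
    """Return the selected terms together with all of their descendants."""
    included = set()
    stack = list(selected)
    while stack:
        term = stack.pop()
        if term in included:
            continue  # already visited
        included.add(term)
        stack.extend(children.get(term, ()))
    return included


print(sorted(expand_selection(["nucleic acid binding"], TOY_CHILDREN)))
```

In this toy example, selecting "binding" would pull in all five terms, while selecting "DNA binding" includes only itself, which is why choosing the lowest suitable term gives the most specific query.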
You can also specify if your search results should be sorted by the E value
obtained by aligning the sequences to SwissProt, GenBank, dbEST or DOTS.
The default is not to sort. By default, only unique clones are shown; that
is, clones with non-full-length sequences that start at the same nucleotide
are treated as duplicates and shown once. You can override this default to
return all the clones instead.
All clones which satisfy the search parameters will be returned; you can
set the number to display per page at 5, 10, 20 or 50.
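The sorting and paging options can be pictured with the following Python sketch. It is not SCDb's code: the function name page_of and the record layout (dictionaries with "id" and "evalue" keys) are hypothetical, and the sketch assumes that sorting by E value places the smallest values, which are the most significant matches, first.

```python
ALLOWED_PAGE_SIZES = (5, 10, 20, 50)


def page_of(results, per_page, page=1, sort_by_evalue=False):
    """Return one page of results, optionally sorted by E value."""
    if per_page not in ALLOWED_PAGE_SIZES:
        raise ValueError(f"per_page must be one of {ALLOWED_PAGE_SIZES}")
    if sort_by_evalue:
        # Assumption: ascending order, so the smallest E values come first.
        results = sorted(results, key=lambda record: record["evalue"])
    start = (page - 1) * per_page
    return results[start:start + per_page]


# Hypothetical records for illustration.
hits = [
    {"id": "clone1", "evalue": 1e-5},
    {"id": "clone2", "evalue": 1e-50},
    {"id": "clone3", "evalue": 0.01},
]
for record in page_of(hits, per_page=5, sort_by_evalue=True):
    print(record["id"], record["evalue"])
```

Without sorting, the records are returned in their original order, which corresponds to the default described above.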