FASTSNP:
An Always Up-to-date and Extendable Tool for SNP Function Analysis and Selection

 

Last updated: February 16, 2005; February 8, 2006

 

Extended Abstract

 

Background:

Investigating whether a single nucleotide polymorphism (SNP) is functionally involved in a disease is important for disease gene mapping. For complex diseases, the problem is complicated because, unlike Mendelian diseases, their genetic causes might involve hundreds of genes and alleles. Although there are millions of SNPs deposited in public SNP databases, only a small proportion of them are functional polymorphisms that contribute to disease phenotypes. Thus, prioritizing SNPs based on their phenotypic risks is essential for association studies. Assessing the risk requires up-to-date data about the candidate SNP, which in turn requires access to a variety of heterogeneous biological databases and analytical tools.

 

Methods:

FASTSNP (Function Analysis and Selection Tool for Single Nucleotide Polymorphisms) is a web server that allows users to efficiently identify the SNPs most likely to have functional effects. It prioritizes SNPs according to twelve phenotypic risks and putative functional effects, such as changes to the transcriptional level, pre-mRNA splicing, and protein structure. A unique feature of FASTSNP is that the prediction of functional effects is always based on the most up-to-date information, which FASTSNP extracts from eleven external Web servers at query time using a team of re-configurable Web wrapper agents. These Web wrapper agents automate Web browsing and data extraction, and they can be easily configured and maintained with a tool that uses a machine learning algorithm. This allows users to configure or repair a Web wrapper agent without programming. Another benefit of using Web wrapper agents is that FASTSNP is extendable: new functions can be included simply by deploying more Web wrapper agents. In this manner, we have already built several new functionalities: including information on haplotype blocks from HapMap, checking the sequence quality of submitted SNPs by mapping them onto the UCSC Golden Path sequence, and integrating both NCBI and Ensembl annotations. In addition to SNP prioritization, FASTSNP provides project management services for registered users to store and export their candidate SNPs and to update the SNPs’ putative functional effects by re-submitting the query.
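The extendable design described above can be illustrated with a minimal sketch. It is not FASTSNP's implementation: the `AgentRegistry` class, the `prioritize` function, the agent interface, and all SNP names, effect labels, and risk scores are hypothetical, and the agents are local stubs rather than real Web wrapper agents.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical interface: an agent maps a SNP identifier to a list of
# predicted functional-effect labels. A real agent would browse a remote
# Web server; these are local stubs so the sketch runs offline.
Agent = Callable[[str], List[str]]


@dataclass
class AgentRegistry:
    agents: Dict[str, Agent] = field(default_factory=dict)

    def register(self, name: str, agent: Agent) -> None:
        # Adding a function means registering one more agent.
        self.agents[name] = agent

    def annotate(self, snp_id: str) -> List[str]:
        # Query every agent at call time, so results reflect current data.
        effects: List[str] = []
        for name in sorted(self.agents):
            effects.extend(self.agents[name](snp_id))
        return effects


def prioritize(registry: AgentRegistry, snp_ids: List[str],
               risk_of: Dict[str, int]) -> List[str]:
    # Rank SNPs by the highest-risk effect any agent reports for them.
    def top_risk(snp_id: str) -> int:
        effects = registry.annotate(snp_id)
        return max((risk_of.get(e, 0) for e in effects), default=0)

    return sorted(snp_ids, key=top_risk, reverse=True)


def demo() -> List[str]:
    # Made-up SNP names, effects, and risk scores, for illustration only.
    registry = AgentRegistry()
    registry.register(
        "splicing_stub",
        lambda s: ["splicing change"] if s == "snp_b" else [])
    registry.register(
        "structure_stub",
        lambda s: ["protein structure change"] if s == "snp_c" else [])
    risk_of = {"splicing change": 2, "protein structure change": 3}
    return prioritize(registry, ["snp_a", "snp_b", "snp_c"], risk_of)
```

In this sketch, supporting a new kind of prediction only requires one more `register` call; the ranking code is unchanged, which mirrors the extensibility argument made for the wrapper agents.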

 

Connected Web Servers:

Name/URL  Usage

NCBI dbSNP

http://www.ncbi.nlm.nih.gov/SNP

Provides the location of a SNP in a gene and its alleles, allele frequency, and context sequence.

Ensembl

http://www.ensembl.org

Provides a cross-reference/alternative data source for dbSNP. Also provides alternative transcripts and protein domain information.

TFSearch

http://www.cbrc.jp/research/db/TFSEARCH.html

Predicts if a non-coding SNP alters the transcription factor binding site of a gene.

PolyPhen

http://www.bork.embl-heidelberg.de/PolyPhen

Predicts whether the amino acid change caused by a non-synonymous SNP alters the structure of the protein, classifying the change as damaging or benign.

ESEfinder

http://rulai.cshl.edu/ESE

Predicts whether a synonymous SNP is located in an exonic splicing enhancer motif that a different allele would diminish.

Rescue-ESE

http://genes.mit.edu/burgelab/rescue-ese

Provides a cross-reference/alternative data source for ESEfinder.

NCBI GenBank

http://www.ncbi.nlm.nih.gov/Genbank

Provides all spliced mRNA forms of the gene sequence and their translated proteins.

SwissProt

http://us.expasy.org/sprot

Provides protein domain information used to determine whether a SNP causes alternative splicing that abolishes a protein domain.

UCSC Golden Path

http://genome.ucsc.edu

Provides information about the final draft assembly of the genome sequence (i.e., Golden Path) for quality control of candidate SNPs.

NCBI Blast

http://www.ncbi.nlm.nih.gov/BLAST

Provides sequence comparison and search for quality control of candidate SNPs.

HapMap

http://www.hapmap.org/

Provides information about the haplotype and linkage disequilibrium around a SNP.

FAS-ESS

http://genes.mit.edu/fas-ess/

Predicts whether a coding SNP will abolish exonic splicing silencer motifs.
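One way to read the table is as a routing rule from the class of a SNP to the prediction servers that apply to it. The sketch below encodes that reading under stated assumptions: the class labels, the `PREDICTORS_BY_CLASS` mapping, and the `predictors_for` function are hypothetical, and listing FAS-ESS for both coding classes is an interpretation of "coding SNP" in the table, not a documented FASTSNP rule.

```python
from typing import Dict, List

# Hypothetical class labels; the server names come from the table above.
PREDICTORS_BY_CLASS: Dict[str, List[str]] = {
    "non-coding": ["TFSearch"],
    "non-synonymous": ["PolyPhen", "FAS-ESS"],
    "synonymous": ["ESEfinder", "Rescue-ESE", "FAS-ESS"],
}


def predictors_for(snp_class: str) -> List[str]:
    # Return a copy so callers cannot mutate the routing table.
    try:
        return list(PREDICTORS_BY_CLASS[snp_class])
    except KeyError:
        raise ValueError(f"unknown SNP class: {snp_class!r}") from None
```

For example, `predictors_for("synonymous")` returns the two splicing-enhancer servers followed by FAS-ESS, while an unrecognized class raises `ValueError` instead of silently returning an empty list.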

 

Results:

FASTSNP allows users to select functional polymorphisms for association studies in a convenient way. Currently, our collaborating institute, the National Genotyping Center (NGC), Academia Sinica, Taiwan, is using this system to manage 5,000 candidate SNPs and has already obtained their genotyping results using the MALDI-TOF high-throughput genotyping system.

 

Availability: FASTSNP is freely available at http://fastsnp.ibms.sinica.edu.tw/. Registration is required for project management services.

 

Acknowledgements: This project was supported in part by the National Research Program in Genomic Medicine (NRPGM), National Science Council, Taiwan, under Grant No. NSC93-3112-B-001-008-Y (National Genotyping Center) and Grant No. NSC93-3112-B-001-018-Y (Bioinformatics Core Service).


Copyright © 2006 Institute of Biomedical Sciences and Institute of Information Science, Academia Sinica.
128, Sec. 2, Academia Rd., Nankang, Taipei 115, Taiwan, R.O.C.
Phone: +886-2-26523925 FAX: +886-2-2782-4066
Email: chunnan@iis.sinica.edu.tw