Franssen 2011 P. sativum unigene

Overview


Analysis Name	Franssen 2011 P. sativum unigene
Method	mira/tgicl (na)
Source	Franssen 2011 P. sativum unigene
Date performed	2011-05-11

81,449 unigenes (Franssen et al. 2011) were provided from an assembly of 2,209,735 transcripts generated by 454 sequencing of Pisum sativum libraries from flowers, leaves, cotyledons, epi- and hypocotyl, and etiolated and light-treated etiolated seedlings. For the CSFL version of the data, sequences shorter than 100 bp were removed from the assembly, which left 42,000 contigs and 26,622 singlets. The unigenes (contigs and singlets) were then analyzed using the annotation pipeline described in the 'Functional Analysis' section.
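The length filter described above can be illustrated with a minimal sketch. It assumes the unigenes are available as FASTA text; the function names `read_fasta` and `filter_by_length` are hypothetical and are not part of the original pipeline.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_length=100):
    """Keep records whose sequence is at least min_length bases long.

    Assumption: "less than 100 bp were removed" means sequences of
    exactly 100 bp are kept.
    """
    return [(h, s) for h, s in records if len(s) >= min_length]
```

With a file on disk, this could be used as `filter_by_length(read_fasta(open(path)))`, where `path` is whatever FASTA file holds the assembly.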

Properties

Additional information about this analysis:


Property Name	Value
Analysis unigene num reads	2,209,735
Analysis unigene avg length	487
Analysis unigene num singlets	26,622
Analysis unigene num clusters
Analysis unigene name	Franssen 2011 P. sativum unigene
Analysis unigene num contigs	42,000
Analysis Type	tripal_analysis_unigene

Download

Unigene Files:

Franssen 2011 P. sativum unigene [Contigs Only] [Singlets only]

Homology Analysis Reports:

Microsatellite Analysis:

Functional Analysis

Homology Analysis:
The unigenes were compared to the UniProt Swiss-Prot and UniProt TrEMBL databases using the NCBI blastx program with an E-value threshold of 1E-6. The results were generated in XML format and parsed into the database for online display. An in-house script was used to generate best-hit reports from the XML output, which can be downloaded in Excel format.
InterProScan:
The unigenes were scanned for protein signatures and domains using the EBI InterProScan software (version 4.7) installed on-site. IPR terms and GO terms were generated for the analyzed sequences. Parameters used: -iprlookup -goterms -format html -nocrc.
KEGG/KAAS:
The unigenes were submitted to KAAS (the KEGG Automatic Annotation Server) for ortholog assignment and pathway mapping. The gene sets of five plant species (Arabidopsis thaliana, Oryza sativa japonica, Ostreococcus lucimarinus, Ostreococcus tauri, Cyanidioschyzon merolae) were used as references with the BBH (bi-directional best hit) method.
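
The in-house best-hit script mentioned under Homology Analysis is not available here, so the following is only a sketch of how such a report could be derived from BLAST XML output. It assumes the standard NCBI BLAST XML element names (`Iteration`, `Hit`, `Hsp_evalue`, and so on), treats the 1E-6 threshold as inclusive, and picks the hit with the lowest E-value per query; the function name `best_hits` is hypothetical.

```python
import xml.etree.ElementTree as ET


def best_hits(xml_text, max_evalue=1e-6):
    """Map each query to its lowest-E-value hit at or below max_evalue.

    Returns {query_definition: (hit_definition, evalue)}. Queries with
    no hit passing the threshold are omitted.
    """
    root = ET.fromstring(xml_text)
    best = {}
    for iteration in root.iter("Iteration"):
        query = iteration.findtext("Iteration_query-def")
        for hit in iteration.iter("Hit"):
            hit_def = hit.findtext("Hit_def")
            for hsp in hit.iter("Hsp"):
                evalue = float(hsp.findtext("Hsp_evalue"))
                if evalue > max_evalue:
                    continue
                # Keep only the strongest (smallest E-value) hit per query.
                if query not in best or evalue < best[query][1]:
                    best[query] = (hit_def, evalue)
    return best
```

The resulting dictionary could then be written out as a tab-separated file, which spreadsheet software opens directly.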

Microsatellite Analysis

The unigene contigs were scanned for microsatellites (SSRs). SSRs are defined as dinucleotides repeated at least 5 times, trinucleotides repeated at least 4 times, tetranucleotides repeated at least 3 times, or pentanucleotides repeated at least 3 times.
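
The SSR definition above can be expressed as a small regular-expression scan. This is a sketch, not the tool actually used for the analysis: the name `find_ssrs` is hypothetical, and the rule that skips motifs which are themselves repeats of a shorter unit (for example `AAA` or `ATAT`) is an added assumption that the source does not state.

```python
import re

# Minimum repeat counts per motif length, taken from the definition above.
MIN_REPEATS = {2: 5, 3: 4, 4: 3, 5: 3}


def is_primitive(motif):
    """True if the motif is not a repeat of a shorter unit (e.g. 'ATAT', 'AAA')."""
    n = len(motif)
    for k in range(1, n):
        if n % k == 0 and motif[:k] * (n // k) == motif:
            return False
    return True


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples with 0-based starts."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One motif of length k followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        pos = 0
        while pos < len(seq):
            m = pattern.match(seq, pos)
            if m and is_primitive(m.group(1)):
                hits.append((m.start(), m.group(1), len(m.group(0)) // k))
                pos = m.end()
            else:
                pos += 1
    return sorted(hits)
```

For example, `find_ssrs("GG" + "AT" * 5 + "CC")` reports a single dinucleotide SSR starting at position 2, while a homopolymer run such as twelve `A` bases is not reported under the primitive-motif assumption.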

Sequence information
Number of Sequences	42,000
Number of Sequences Having One Or More SSRs	5,095
Percentage of Sequences Having One Or More SSRs	12.1%
Total Number of SSRs Found	10,286
Number of Motifs	474

Frequency of Motif Type

Motif Length	Frequency	Percentage Frequency
2bp	1,140	17.8%
3bp	3,362	52.5%
4bp	1,461	22.8%
5bp	445	6.9%
