July 7, 2020   |   by admin

Introduction. Blast2GO is a comprehensive bioinformatics tool for the functional annotation and analysis of genome-scale sequence datasets. Annotation is the process of assigning functional categories to genes or gene products. In Blast2GO this assignment is done for each sequence. Blast2GO allows the functional annotation of (novel) sequences.


Functional annotation of novel sequence data is a primary requirement for the utilization of functional genomics approaches in plant research.

In this paper, we describe the Blast2GO suite as a comprehensive bioinformatics tool for functional annotation of sequences and data mining on the resulting annotations, primarily based on the gene ontology GO vocabulary.

Blast2GO optimizes function transfer from homologous sequences through an elaborate algorithm that considers similarity, the extension of the homology, the database of choice, the GO hierarchy, and the quality of the original annotations. The tool includes numerous functions for the visualization, management, and statistical analysis of annotation results, including gene set enrichment analysis. Blast2GO is a suitable tool for plant genomics research because of its versatility, easy installation, and friendly use.

Functional genomics research has expanded enormously in the last decade, and the plant biology research community in particular has extensively included functional genomics approaches in its recent research proposals. The number of Affymetrix plant GeneChips, for example, has doubled in the last two years [ 1 ] and extensive international genomics consortia exist for major crops (see last PAG Conference reports for an updated impression on current plant genomics, http:). No less importantly, many middle-sized research groups are also setting up plant EST projects and producing custom microarray platforms [ 2 ].

This massive generation of plant sequence data and rapid spread of functional genomics technologies among plant research labs has created a strong demand for bioinformatics resources adapted to vegetative species.

Blast2GO: A Comprehensive Suite for Functional Analysis in Plant Genomics

Functional annotation of novel plant DNA sequences is one of the top requirements in plant functional genomics as this holds, to a great extent, the key to the biological interpretation of experimental results. Controlled vocabularies have established themselves along the way as the strategy of choice for the effective annotation of the function of gene products.

The use of controlled vocabularies greatly facilitates the exchange of biological knowledge and the benefit from computational resources that manage this knowledge. The gene ontology GO, http: Many bioinformatics tools and methods have been developed to assist in the assignment of functional terms to gene products (reviewed in [ 8 ]). Fewer resources, however, are available when it comes to the large-scale functional annotation of novel sequence data of nonmodel species, as would be specifically required in many plant functional genomics projects.

Additionally, functional annotation capabilities are usually incorporated in EST analysis pipelines. These resources are valuable tools for the assignment of functional terms to uncharacterized sequences but usually lack high-throughput and data mining capabilities, in the first case, or provide automatic solutions without much user interactivity, in the second.

The philosophy behind B2G development was the creation of an extensive, user-friendly, and research-oriented framework for large-scale function assignments. The main application domain of the tool is the functional genomics of nonmodel organisms and it is primarily intended to support research in experimental labs where bioinformatics support may not be strong.

Since its release in September [ 20 ], more than labs worldwide have become B2G users and the application has been referenced in over thirty peer-reviewed publications www.

Although B2G has a broad species application scope, the project originated in a crop genomics research environment and there is considerable accumulated experience in the use of B2G in plants, including maize, tobacco, citrus, soybean, grape, and tomato. Projects range from functional assignments of ESTs [ 21-24 ] to GO term annotation of custom or commercial plant microarrays [ 25, 26 ], functional profiling studies [ 27-29 ], and functional characterization of specific plant gene families [ 30, 31 ].

In the following sections we explain more extensively the concepts behind Blast2GO. We describe in detail the main functionalities of the application and show a use case that illustrates the applicability of B2G to plant functional genomics research.

Four main driving concepts form the foundation of the Blast2GO software. The target users of Blast2GO are biology researchers working on functional genomics projects in labs where strong bioinformatics support is not necessarily present. Therefore, the application has been conceived to be easy to install, to have minimal setup and maintenance requirements, and to offer an intuitive user interface. B2G has been implemented as a multiplatform Java desktop application made accessible by Java Web Start technology.


This solution offers the higher versatility of a locally running application while assuring automatic updates, provided that an internet connection is available. This implementation has proven to work very efficiently for the fast transfer of new functionalities and bug fixes to users.

Furthermore, access to data in B2G is reinforced by graphical parameters that on one hand allow the easy identification and selection of sequences at various stages of the annotation process and, on the other hand, permit the joint visualization of annotation results and highlighting of the most relevant features. Blast2GO strives to be the application of choice for the annotation of novel sequences in functional genomics projects where thousands of fragments need to be characterized.

In principle, B2G accepts any number of records within the memory resources of the user’s workstation.

During the annotation process, intermediate results can be accessed and modified by the user if desired. Functional annotation in Blast2GO is based on homology transfer. Within this framework, the actual annotation procedure is configurable and permits the design of different annotation strategies.

Blast2GO annotation parameters include the choice of search database, the strength and number of blast results, the extension of the query-hit match, the quality of the transferred annotations, and the inclusion of motif annotation. Data mining on annotation results. Blast2GO is not a mere generator of functional annotations. The application includes a wide range of statistical and graphical functions for the evaluation of the annotation procedure and the final results. In particular, the relative abundance of functional terms can be easily assessed and visualized.

The first release of B2G covered basic application functionalities. Enhanced modules for massive blast, modification of annotation intensity, curation, additional vocabularies, high-performing customizable graphs and pathway charts, data mining and sequence handling, as well as a wide array of input and output formats have since been incorporated into the Blast2GO suite.

Figure 1 shows the basic components of the Blast2GO suite. Functional assignments proceed through an elaborate annotation procedure that comprises a central strategy plus refinement functions.

Next, visualization and data mining engines permit exploiting the annotation results to gain functional knowledge.

Figure 1: Schematic representation of the Blast2GO application. GO annotations are generated through a 3-step process: blast, mapping, and annotation. Additional annotation data-mining tools include statistical charts and gene set enrichment analysis functions.

The Blast2GO annotation procedure consists of three main steps: blast, mapping, and annotation. Once GO terms have been gathered, additional functionalities enable processing and modification of annotation results. The first step in B2G is to find sequences similar to a query set by blast [ 32 ]. B2G accepts nucleotide and protein sequences in FASTA format and supports the four basic blast programs (blastx, blastp, blastn, and tblastx).

Homology searches can be launched against public databases such as the NCBI nr using a query-friendly version of blast (QBlast). This is the default option and in this case, no additional installations are needed.

Alternatively, blast can be run locally against a proprietary FASTA-formatted database, which requires a working www-blast installation. The Make Filtered Blast-GO-BD function in the Tools menu allows the creation of customized databases containing only GO-annotated entries, which can be used in combination with the local blast option.

Other configurable parameters at the blast step are the expectation value (e-value) threshold, the number of retrieved hits, and the minimal alignment length (hsp length), which permits the exclusion of hits with short, low e-value matches from the sources of functional terms.
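The three blast-step filters described above can be sketched in a few lines of Python. This is a minimal illustration and not B2G's own code: the `Hit` record, its field names, and the default cutoff values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One blast hit. Field names are illustrative, not B2G's own."""
    hit_id: str
    e_value: float
    hsp_length: int
    similarity: float  # percent similarity of the query-hit match


def filter_hits(hits, max_e_value=1e-3, max_hits=20, min_hsp_length=33):
    """Keep hits that pass the e-value and hsp-length cutoffs.

    The default cutoffs are assumed values chosen for this sketch.
    Surviving hits are ordered by e-value (best first) and truncated
    to the requested number of hits.
    """
    passing = [
        h for h in hits
        if h.e_value <= max_e_value and h.hsp_length >= min_hsp_length
    ]
    passing.sort(key=lambda h: h.e_value)
    return passing[:max_hits]


hits = [
    Hit("A", 1e-50, 210, 92.0),
    Hit("B", 1e-8, 20, 98.0),   # short match: dropped by the hsp-length filter
    Hit("C", 0.5, 150, 45.0),   # weak match: dropped by the e-value filter
    Hit("D", 1e-20, 120, 80.0),
]
kept = filter_hits(hits)
print([h.hit_id for h in kept])
```

Hit B shows why the hsp-length filter matters: its e-value and similarity look good, but the match is too short to be a reliable source of functional terms.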

Annotation, however, will ultimately be based on sequence similarity levels as similarity percentages are independent of database size and more intuitive than e-values.

Blast2GO parses blast results and presents the information for each sequence in table format. Mapping is the process of retrieving GO terms associated to the hits obtained after a blast search.

B2G performs three different mappings as follows. Identified gene names are searched in the species-specific entries of the gene product table of the GO database.

Annotation is the process of assigning functional terms to query sequences from the pool of GO terms gathered in the mapping step. Function assignment is based on the gene ontology vocabulary.

The B2G annotation algorithm takes into consideration the similarity between query and hit sequences, the quality of the source of GO assignments, and the structure of the GO DAG. The annotation score (AS) is composed of two terms. The first, the direct term (DT), represents the highest similarity value among the hit sequences bearing this GO term, weighted by a factor corresponding to its evidence code (EC).


ECs vary from experimental evidence, such as inferred by direct assay (IDA), to unsupervised assignments such as inferred by electronic annotation (IEA). The second term (AT) of the annotation rule introduces the possibility of abstraction into the annotation algorithm.

Abstraction is defined as the annotation to a parent node when several child nodes are present in the GO candidate pool. This term multiplies the number of total GOs unified at the node by a user-defined factor or GO weight (GOw) that controls the possibility and strength of abstraction. When all ECw’s are set to 1 (no EC control) and the GOw is set to 0 (no abstraction is possible), the annotation score of a given GO term equals the highest similarity value among the blast hits annotated with that term.

If the ECw is smaller than one, the DT decreases and higher query-hit similarities are required to surpass the annotation threshold. If the GOw is not equal to zero, the AT becomes contributing and the annotation of a parent node is possible if multiple child nodes coexist that do not reach the annotation cutoff.
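The behaviour of the two terms can be made concrete with a small Python sketch. It assumes that the annotation score is the sum of DT and AT, which is how the description above reads. The function names, the input layout, the weight values, and the choice to give unlisted evidence codes a weight of 1 are all assumptions for illustration, not B2G's implementation.

```python
def direct_term(hits, go_id, ec_weights):
    """DT: highest similarity among hits bearing go_id, weighted by ECw.

    `hits` is a list of (similarity, [(go_id, evidence_code), ...]) pairs.
    Evidence codes missing from `ec_weights` get weight 1.0 (assumption).
    """
    weighted = [
        similarity * ec_weights.get(ec, 1.0)
        for similarity, annotations in hits
        for term, ec in annotations
        if term == go_id
    ]
    return max(weighted, default=0.0)


def abstraction_term(n_unified_gos, go_weight):
    """AT: number of GOs unified at the node times the GO weight (GOw)."""
    return n_unified_gos * go_weight


def annotation_score(hits, go_id, n_unified_gos, ec_weights, go_weight):
    """AS, assumed here to be the sum DT + AT."""
    return (direct_term(hits, go_id, ec_weights)
            + abstraction_term(n_unified_gos, go_weight))


# Two hits carry the same example term with different evidence codes.
hits = [
    (90.0, [("GO:0003677", "IEA")]),
    (75.0, [("GO:0003677", "IDA")]),
]

# All ECw = 1 and GOw = 0: the score is the highest similarity.
no_control = annotation_score(hits, "GO:0003677", 3, {}, 0)

# Down-weighting IEA lets the IDA hit win; GOw > 0 adds the abstraction term.
weighted = annotation_score(
    hits, "GO:0003677", 3, {"IEA": 0.7, "IDA": 1.0}, 2
)
print(no_control, weighted)
```

In the second call the electronically inferred hit drops below the experimentally supported one, so DT falls from 90 to 75 and a higher query-hit similarity would be needed to pass a given annotation threshold.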

Default values of B2G annotation parameters were chosen to optimize the ratio between annotation coverage and annotation accuracy [ 20 ]. Finally, the annotation rule (AR) selects the lowest terms per branch that exceed a user-defined threshold. The annotation step in B2G can be further adjusted by setting additional filters to the hit sequences considered as annotation source.

A lower limit can be set on the e-value to ensure a minimum confidence at the level of homology. This parameter is of importance to prevent potential function transfer from nonmatching sequence regions of modular proteins. Additionally, the minimal hsp length required at the blast step permits control of the length of the matching region.

Blast2GO includes different functionalities to complete and modify the annotations obtained through the above-defined procedure. Enzyme codes and KEGG pathway annotations are generated from the direct mapping of GO terms to their enzyme code equivalents. B2G launches sequence queries in batch, and recovers, parses, and uploads InterPro results. In this process, B2G ensures that only the lowest term per branch remains in the final annotation set, removing possible parent-child relationships originating from the merging action.
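Keeping only the lowest term per branch amounts to dropping every term that is an ancestor of another term in the same set. The sketch below shows one way to do this in Python. The toy `parents` map is a hand-written fragment standing in for the GO DAG, and the function names are hypothetical.

```python
def ancestors(term, parents):
    """Return all ancestors of `term` in a child -> parents mapping."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def lowest_terms(terms, parents):
    """Drop every term that is an ancestor of another term in the set."""
    terms = set(terms)
    redundant = set()
    for term in terms:
        redundant |= ancestors(term, parents)
    return terms - redundant


# Toy fragment of a DAG (child -> list of parents), for illustration only.
parents = {
    "DNA binding": ["nucleic acid binding"],
    "nucleic acid binding": ["binding"],
    "binding": ["molecular_function"],
    "catalytic activity": ["molecular_function"],
}

merged = {"binding", "DNA binding", "catalytic activity"}
print(sorted(lowest_terms(merged, parents)))
```

Here "binding" is removed because the more specific "DNA binding" lies below it on the same branch, while "catalytic activity" sits on a different branch and is kept.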

Blast2GO incorporates three additional functionalities for the refinement of annotation results. Firstly, the Annex function allows annotation augmentation through the Second Layer concept developed by The Norwegian University of Science and Technology http: Basically, the Second Layer database is a collection of manually curated univocal relationships between GO terms from the different GO categories that permits the inference of biological process and cellular component terms from molecular function annotations.

Secondly, annotation results can be summarized through GOSlim mapping. GOSlim consists of a subset of the gene ontology vocabulary encompassing key ontological terms and a mapping function between the full GO and the GOSlim.
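A mapping function of this kind can be sketched as a walk up the ontology until a slim term is reached. The Python example below is an illustration under assumptions: the `parents` map and the slim subset are toy data, and `map_to_slim` is a hypothetical helper, not the mapping used by B2G or by any GOSlim service.

```python
def map_to_slim(term, parents, slim_terms):
    """Return the closest slim terms at or above `term` in the DAG.

    A term that is itself in the slim subset maps to itself. Otherwise
    each parent is followed upward until a slim term is found.
    """
    if term in slim_terms:
        return {term}
    found = set()
    for parent in parents.get(term, ()):
        found |= map_to_slim(parent, parents, slim_terms)
    return found


# Toy fragment of a DAG (child -> list of parents), for illustration only.
parents = {
    "DNA binding": ["nucleic acid binding"],
    "nucleic acid binding": ["binding"],
    "binding": ["molecular_function"],
    "catalytic activity": ["molecular_function"],
}
slim_terms = {"binding", "catalytic activity"}

annotations = ["DNA binding", "catalytic activity"]
summary = {term: map_to_slim(term, parents, slim_terms) for term in annotations}
print(summary)
```

A term with no slim ancestor maps to an empty set in this sketch. A real mapping would need a rule for that case.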


Different GOSlim mappings are available, adapted to specific biological domains. Thirdly, the manual curation function means that the user has the possibility of editing annotation results and manually modifying GO terms and sequence descriptors. One aspect of the uniqueness of the Blast2GO software is the availability of a wide array of functions to monitor, evaluate, and visualize the annotation process and results.

The purpose of these functions is to help understand how functional annotation proceeds and to optimize performance. Summary statistics charts are generated after each of the annotation steps.


Distribution plots for e-value and similarity within blast results give an idea of the degree of homology that query sequences have in the searched database. Once mapping has been completed, the user can check the distribution of evidence codes in the recovered GO terms and the original database sources of annotations. These charts give an indication of suitable values for B2G annotation parameters. For example, when a good overall level of sequence similarity is obtained for the dataset, the default annotation cutoff value could be raised to improve annotation accuracy.
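The summaries behind such charts are simple counts. The following Python sketch computes a binned similarity distribution and an evidence-code distribution. The function names, the 10-point bin width, and the sample values are assumptions for illustration and do not reproduce B2G output.

```python
from collections import Counter


def similarity_histogram(similarities, bin_width=10):
    """Count similarity percentages per bin, keyed by the bin's lower edge."""
    counts = Counter(int(s // bin_width) * bin_width for s in similarities)
    return dict(sorted(counts.items()))


def evidence_code_distribution(mapped_terms):
    """Count evidence codes over (go_id, evidence_code) pairs."""
    return Counter(ec for _, ec in mapped_terms)


# Made-up sample values, for illustration only.
similarities = [92.0, 88.5, 71.0, 45.2, 99.9]
mapped_terms = [
    ("GO:0003677", "IEA"),
    ("GO:0003677", "IDA"),
    ("GO:0005488", "IEA"),
]

print(similarity_histogram(similarities))
print(evidence_code_distribution(mapped_terms))
```

A histogram concentrated in the upper bins would suggest that the annotation cutoff can be raised, as the paragraph above describes.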