The Network-Based SEED API is a framework for programmatic access to current SEED data. The system is distributed as a small set of Perl packages that the user downloads and installs locally. These packages define an API that the programmer uses to communicate over the network with the SEED environment maintained by the Fellowship for Interpretation of Genomes (FIG), Argonne National Lab (ANL) and the University of Chicago (UC). The distribution installs easily on a Mac or Unix-based system and, with a little extra effort, on a Windows machine. In addition to the packages used in constructing Perl programs, we provide a library of utility programs with predefined commands for extracting data from the SEED.

The bulk of the functionality is offered via SAPserver.pm, a module that supports access to a database of genomic data covering over a thousand genomes. The database is described abstractly by an Entity-Relationship model, is managed via metadata (which makes it straightforward to extend the model), and is implemented on a standard relational database. We also offer:

  1. MODELserver.pm to support the construction and use of metabolic models and flux-based analysis,
  2. ANNOserver.pm to support annotation of DNA and protein sequences, and
  3. RASTserver.pm to support the submission of genomes to be annotated and the retrieval of the annotations.

Installation

Macintosh Distribution

The Macintosh distribution is a .dmg file. You will need to choose a version based on the type of your system.

  • Intel Mac systems will use the link named "myRAST-Intel.dmg". 
  • If you have an older PowerPC-based system (a PowerBook G4, for instance, or a G5 tower system), use the link named "myRAST-PPC.dmg".

When the .dmg file is downloaded to your machine, open it by double-clicking it in the Finder (if your web browser has not already opened it for you). You should see a single file inside the folder:

[Screenshot: PastedGraphic-5.png]

The myRAST application may be run directly from the .dmg folder if you wish to try it without installing. To install it on your machine, simply drag the myRAST icon to your Applications folder.

If you plan to use the command line interface, be sure to put the myRAST bin directory in your path. If you placed myRAST in the Applications folder, you would do it like this using bash:

export PATH=$PATH:/Applications/myRAST.app/bin
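
If the export line above is placed in a shell startup file, it appends the directory again every time the file is read. The following is a minimal sketch of one way to avoid that. The helper name append_unique is hypothetical (it is not part of the myRAST distribution), and the directory is the default location mentioned above.

```shell
# append_unique LIST DIR: print the colon-separated LIST with DIR appended,
# unless DIR is already one of its entries.
# (Hypothetical helper, not shipped with myRAST.)
append_unique() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;            # already listed: unchanged
        *)        printf '%s\n' "${1:+$1:}$2" ;;   # otherwise append DIR
    esac
}

# Assumes myRAST was dragged to the Applications folder, as described above.
PATH=$(append_unique "$PATH" /Applications/myRAST.app/bin)
export PATH
```

Because the function only prints the new value, sourcing the file repeatedly leaves PATH unchanged after the first time.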


Windows Distribution


The Windows distribution of myRAST is packaged as a standard Windows installer. This section describes what you should expect to see while installing. The screenshots are from a Windows 7 system; on a different release of Windows they may look slightly different.

When you download the installer it will be named something similar to myRAST-win32.exe. Run the executable.

If you are using a Vista or Windows 7 system with User Account Control enabled, you will be presented with a dialog asking whether you wish to allow this program to write to your computer:

[Screenshot: myrast-uac.png]

Once you allow the program to run, the first of the installer windows appears:


[Screenshot: myrast-install-1-1.PNG]
Click Next to proceed with the installation:

[Screenshot: myrast-install-2-1.PNG]
Choose the destination directory into which you would like myRAST to be installed, and click Next. If you have installed it in the past and are installing a new release, you may get a dialog asking you to confirm that you wish to install to an already existing location. It is typically safe to accept this option.

[Screenshot: myrast-install-3-1.PNG]
The myRAST distribution can install a set of GnuWin32 utilities that include some useful command line tools, such as cut and grep, familiar to Unix users. These are useful in conjunction with the myRAST command line tools.

[Screenshot: myrast-install-4-1.PNG]
Next we choose the folder in the Start Menu into which the program links will be installed. The default should be suitable.

[Screenshot: myrast-install-5-1.PNG]
Next, the myRAST installer will create a desktop icon or Quick Launch icon if you so desire.

[Screenshot: myrast-install-6-1.PNG]

At this point all options have been specified. You can double-check your choices and go back to change them, or click Install to proceed with the installation.

[Screenshot: myrast-install-7-1.PNG]

The installer will show a status bar during the installation.


[Screenshot: myrast-install-8-1.PNG]

The installation is complete. Click Finish to exit the installer.


[Screenshot: myrast-install-9-1.PNG]
The myRAST links are installed in the Start Menu in the group that you chose during the installation:

[Screenshot: myrast-uninstall-menu.PNG]
Note the Uninstall myRAST option. If you wish to remove myRAST from your computer you may use that item.

There are several links to online documentation in the myRAST folder, as well as the icon called myRAST which brings up the myRAST interface. If you wish to use the command line myRAST tools you may open a new window using the myRAST Shell icon.

If you selected the Desktop Icons option you will also see myRAST icons on your desktop:

[Screenshot: myrast-desktop.PNG]
The myRAST shell is a command shell with the myRAST environment set up for running svr commands and your own Perl server scripts. myRAST is the application you would run to process genomes through RAST.

This completes the installation of myRAST onto your Windows computer.

Ubuntu Distribution

There is now a release of myRAST for Ubuntu Linux. The .deb files are as follows. You will need to install the myrast-runtime package, along with a few required packages:

apt-get install blast2 libwxgtk2.8-0 libdb4.8 libxml2
32-bit packages:
64-bit packages:
Once installed, you may start the myRAST GUI by running the command "myrast".

Linux Distribution

Download the Network-based SEED API distribution, which is called sas.tgz. Depending on your browser and preferences, you may end up with the sas.tgz file or with the uncompressed sas.tar. Use whichever of the examples below corresponds to your file.


Installation

  1. Place the tarball in a directory of your choice (we use sas) and untar it. It will create several subdirectories.
  2. cd to the sas/modules directory and run BUILD_MODULES.
  3. Then put the bin directory in your path and the lib and modules/lib directories in your Perl library path (PERL5LIB), and you should be good to go.

Bash example for the uncompressed file (sas.tar):

mkdir sas
cp ~/Downloads/sas.tar sas
cd sas
TOP=`pwd`
tar -xvf sas.tar
cd modules
./BUILD_MODULES
export PERL5LIB=$PERL5LIB:$TOP/lib:$TOP/modules/lib
export PATH=$PATH:$TOP/bin

Bash example for the compressed file (sas.tgz):

mkdir sas
cp ~/Downloads/sas.tgz sas
cd sas
TOP=`pwd`
tar -zxvf sas.tgz
cd modules
./BUILD_MODULES
export PERL5LIB=$PERL5LIB:$TOP/lib:$TOP/modules/lib
export PATH=$PATH:$TOP/bin

To verify correct installation, try this:
perl -e 'use SeedEnv'
It should produce no errors.
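
When the check above fails, the usual cause is a PERL5LIB that does not include the lib and modules/lib directories. The following is a minimal sketch that wraps the same test in a function and prints a hint instead of a raw Perl error. The function name check_perl_module is hypothetical and not part of the distribution.

```shell
# check_perl_module NAME: succeed if Perl can load the module, fail otherwise.
# (Hypothetical helper; it runs the same "use" test shown above.)
check_perl_module() {
    perl -e "use $1;" 2>/dev/null
}

# Report on SeedEnv without aborting the shell session.
if check_perl_module SeedEnv; then
    echo "SeedEnv loads: PERL5LIB looks correct"
else
    echo "SeedEnv not found: check that PERL5LIB includes the lib and modules/lib directories"
fi
```

The same function can be pointed at any of the server packages, for example SAPserver, to confirm that each one is visible to Perl.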

Server Packages

There are four server packages included in the distribution:

  1. The Genomics ER model server - SAPserver.pm
  2. The MODEL server - MODELserver.pm
  3. The Annotation Support Server - ANNOserver.pm
  4. The RAST server - RASTserver.pm

Utilities

Also included is a package of utilities called SeedUtils.pm, which contains functions that are useful for bioinformatics but do not require access to the databases. The current list of functions and API descriptions is available online.

Test Releases

Following are some test releases of the myRAST tool.

Mac (Intel Leopard) Release 33A22a
Linux (platform independent perl library) Release 33A23


ModelTutorials

Once a genome has been annotated with functional roles, we can determine which chemical reactions those roles implement. For example, in B. subtilis, we annotated genes with functional roles such as "Pyruvate kinase (EC 2.7.1.40)" or "Enolase". A role like pyruvate kinase maps directly to a reaction: converting phosphoenolpyruvate into pyruvate and generating one ATP in the process. Other reaction assignments are more difficult, requiring the presence of multiple genes to form enzyme complexes. Additionally, since many reactions can chemically proceed in either direction, we must determine which directions are implemented by the organism.

You can manually edit your model, adding or removing reactions and changing reaction directionality, by uploading a new model file through the Model View page. See Model View Tutorial Part 6.

In the near future we hope to add the ability to add and remove reactions directly through the SEED website. Additionally, we will allow users to switch between different biomass objective functions and formulate new objective functions.

In Model View we allow you to model growth of your organism under different media conditions. These predictions are made using flux balance analysis.

What does this entail? We start with a model, which consists of the network of metabolic and transport reactions an organism can perform together with a biomass objective function, and add a media formulation, which is just a list of compounds and their concentrations. We then use linear optimization techniques to optimize that biomass objective function. If we are able to solve the problem and produce biomass, we get a list of "fluxes" through reactions that tell us which parts of the cell's metabolism are being used.
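
The optimization described above is commonly written as a linear program. The following is the standard textbook formulation of flux balance analysis, given as a sketch. The symbols S, v, l and u and the name v_biomass are notation introduced for this illustration, and the exact form solved by the SEED Model server is not specified here.

```latex
\begin{aligned}
\max_{v}\quad & v_{\mathrm{biomass}} \\
\text{subject to}\quad & S\,v = 0, \\
& l_j \le v_j \le u_j \quad \text{for every reaction } j .
\end{aligned}
```

Here S is the stoichiometric matrix (rows are compounds, columns are reactions) and v is the vector of reaction fluxes. The constraint S v = 0 states that every internal compound is produced as fast as it is consumed. The media formulation enters through the bounds: a compound that is absent from the media gets an uptake bound of zero. A solution with v_biomass > 0 corresponds to predicted growth.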

You can run flux balance using the Model View pages. See Part 5 of the Model View Tutorial for details.

In the near future we will allow users to upload growth phenotype data. In its basic form, this is simply whether or not an organism grew on a specific medium.

Adding to this, we hope to allow for a number of complex phenotypes:

  1. Phenotype of growth of organism strains where genes have been added or removed.
  2. Gene-level microarray expression data on organisms and strains.

With these simple and complex phenotypes, we hope to allow users to track how closely their models fit observed results and to modify the organism model until predictions and observed phenotypes converge.

The PDF files below contain the presentations on metabolic modeling from the RAST Workshops.


Once we have determined which reactions are implemented by an organism's annotated genes, we can construct the initial model. This is essentially a stoichiometric matrix together with a biomass objective function, the latter being the list of parts that a cell must make to produce the biomass it needs to grow.

However, in most cases this model is not yet functional. If we attempt to grow it on complete media, e.g. LB, it will fail to produce all the parts needed for the biomass function. This is where model gap-filling comes in: we insert candidate transport and standard reactions until the cell successfully grows on complete media. These reactions are clearly labeled in the initial model. In the Model View Tutorial, you can search for gap-filled reactions in the Reaction table.

Usage

The Sapling Server (API)

The SAPserver.pm package offers programmatic access to the data maintained in the Sapling DB within the SEED. The Sapling DB is described by an entity-relationship model that depicts the basic entities maintained within the database and the relationships that we have encoded between them. This is the foundation upon which most of the SEED toolkit rests. The methods offered by Sapling Objects support a rich set of operations against genomic data. Using the methods described in the API, the user has access to genomes, annotations, functional coupling data, protein families, subsystems, and a rapidly growing number of more specialized forms of data.

To see the overall ER diagram and the relations that implement it, see the Sapling webpage.

A complete tutorial is offered in the SAP tutorial.

The Annotation Support Server (API)

The ANNOserver.pm package supports capabilities relating to the annotation of genomes. It supports invocation of standard gene callers (Glimmer3 for protein-encoding genes) and newly developed high-performance methods to assign function to protein sequences or regions of DNA fragments (based on FIGfams and a unique use of K-mers that act as signatures of FIGfams). We include an example application based on these methods that can be used to produce relatively accurate annotation of most microbial genomes within a few minutes.

The RAST server (API)

RAST is a publicly available server for the annotation of microbial genomes. It is maintained by a team at Argonne National Lab and FIG. Currently, it has over 2600 registered users, and several thousand genomes have been run through the service in the last couple of years (often several times!). The RASTserver.pm package was created to support programmatic submission of genomes to RAST, the retrieval of status, and the retrieval of the final set of annotations.

The Model Server (API)

This server provides access to all data associated with the biochemistry database and the genome-scale metabolic models stored within the SEED. This server also provides the user with the ability to run a set of simple flux balance analysis studies with the SEED models. A detailed description of the interface is here.

If you don't want to write Perl programs but would like to use the SEED servers to process your data, we supply a number of predefined shell scripts that provide basic bioinformatics functions using the servers. These scripts are all prefaced with "svr_" and are found in the bin directory of the distribution. These are designed to use stdin and stdout and to be piped together to form more complex processing. 

If you are a Mac or Linux user, you run these scripts from the command line in your terminal shell, where you must first put myRAST in your path, like this:

export PATH=$PATH:/Applications/myRAST.app/bin

If you are a Windows user, you must use the myRAST shell, which is installed with myRAST.

The svr scripts can be directed to use the SEED or the PSEED through the environment variable SAS_SERVER. It defaults to the SEED server; if you want your scripts to access the PSEED, set SAS_SERVER to PSEED, like this in the bash shell:

export SAS_SERVER=PSEED

or like this if you are a Windows user:

set SAS_SERVER=PSEED
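
Exporting SAS_SERVER changes the target for every later command in the session. On Mac or Linux, the following minimal sketch selects the server for a single command instead, using the standard env utility. The wrapper name with_sas_server is hypothetical and not part of the distribution.

```shell
# with_sas_server NAME COMMAND [ARGS...]: run one command with SAS_SERVER set
# to NAME, leaving the variable in the calling shell untouched.
# (Hypothetical wrapper, not shipped with the svr_ scripts.)
with_sas_server() {
    server=$1
    shift
    env SAS_SERVER="$server" "$@"
}

# Example (requires the svr_ scripts on your PATH):
#   with_sas_server PSEED svr_all_genomes complete
```

The wrapper works for executables such as the svr_ scripts. It does not apply to shell functions or aliases, because env starts a new process.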



As a short example of using these scripts, to get a list of all genomes, you could do this at the command line:

svr_all_genomes complete
This produces a two-column table of all genomes in the SEED or PSEED (only the complete genomes if you use the "complete" argument). The first column is the genome name and the second is the ID, like this:

Berardius bairdii 48742.1
Simian immunodeficiency virus 11723.1
Erythrobacter litoralis HTCC2594 314225.3
Bacteriophage N15 40631.1
Bacillus cereus plasmid pPER272 1396.18
Cyanophage P-SSP7 268748.3
Enterococcus faecium plasmid pEF1 1352.12
Lactococcus lactis subsp. lactis Il1403 272623.1
Salmonella enterica subsp. enterica serovar Newport str. SL254 423368.6
Cotton leaf curl Rajasthan virus 223259.1
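
Genome names contain spaces, so downstream tools that split on whitespace cannot simply take "the second field". The following minimal sketch puts the ID first, on the assumption that the ID is the last whitespace-delimited field of each line, as in the sample above. The function name genome_ids_first is hypothetical. If the real output is tab-delimited, a plain cut -f2 would be enough.

```shell
# genome_ids_first: read "name ... id" lines on stdin, write "id<TAB>name".
# Assumes the genome ID is the last whitespace-delimited field of each line.
# (Hypothetical helper, not shipped with the svr_ scripts.)
genome_ids_first() {
    awk 'NF >= 2 {
        id = $NF                 # last field is the genome ID
        $NF = ""                 # drop it; awk rebuilds $0 from the other fields
        sub(/[ \t]+$/, "")       # trim the trailing separator left behind
        print id "\t" $0
    }'
}

# Example (requires the svr_ scripts on your PATH):
#   svr_all_genomes complete | genome_ids_first | sort
```

Runs of spaces inside a name are collapsed to a single space when awk rebuilds the line, which does not matter for the names shown above.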


With the servers, we have supplied a set of RAST batch scripts. Each of these RAST scripts (except for the submission script, which has much more complex arguments) takes as its first two arguments the username and password of a valid account on the RAST server.

A single job may be submitted using svr_submit_RAST_job. This script takes a number of arguments that define the parameters for the submission:
--user username                           RAST login for the submitting user
--passwd password                         RAST password for the submitting user
--genbank filename                        If submitting a GenBank file, the file of input data.
--fasta filename                          If submitting a FASTA file of contigs, the file of input data.
--domain Bacteria or --domain Archaea     Domain of the submitted genome.
--taxon_id taxonomy-id                    The NCBI taxonomy id of the submitted genome
--bioname "genus species str."            Biological name of the submitted genome
--genetic_code ( 11 | 4 )                 Genetic code for the submitted genome, either 11 or 4.
--gene_caller                             Gene caller to use (FIGfam-based RAST gene caller or straight Glimmer-3)
--reannotate_only                         Preserve the original gene calls and use RAST