Building a Metabolic Reconstruction

It is useful to group the genes recognized in a genome into known
pathways, complexes, and nonmetabolic molecular machines.  Here is how
we view this process:

   1. Our annotation team has constructed sets of functional roles
      that are annotated simultaneously because the functional roles
      are related.  The roles may be distinct subunits of a complex
      (e.g., the subunits of the ATP synthase or the ribosomal
      proteins), the functional roles that constitute a pathway
      (e.g., Histidine Degradation), or the components of a
      nonmetabolic molecular machine (e.g., a repair machine, a
      transport cassette, or a 2-component regulatory system).  We
      call each of these sets of roles a "subsystem".  Our annotators
      have carefully assembled the functional roles that make up a
      subsystem and for each one constructed a spreadsheet in which
      each row is a genome and each column is a distinct functional
      role.  The cells of the spreadsheet contain the genes from the
      specific genome that implement the specific functional role.  

   2. Using the examples contained in the manually curated set of
      subsystems, we automatically try to locate the appropriate genes
      within the newly sequenced genome and identify a new instance
      (i.e., a new row in the spreadsheet) of the subsystem.  When we
      can identify all of the genes needed to implement an operational
      version of the subsystem, it substantially increases the
      confidence we have in the assigned functions, and it forms a
      critical piece of information needed to support the generation
      of metabolic models.

   3. Where we recognize only a portion of a subsystem, we may have
      failed to accurately identify some genes, we may have
      misannotated genes, or we may have a new variant of the
      subsystem (e.g., a new variant of a common pathway).

   4. We consider a metabolic reconstruction to simply be the set of
      recognized, operational instances of our subsystem collection.
      This is distinct from an initial estimate of the metabolic
      network itself (which we also provide).  The metabolic
      reconstruction includes information about the nonmetabolic
      machinery supported by the genome.  We are not completely happy
      with the term "metabolic reconstruction", but that is the term
      that has stuck and the one in common usage within our group.
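
The steps above can be sketched in a few lines of Python.  This is only
an illustration, not the code used by our pipeline: the class name
Subsystem, the function metabolic_reconstruction, the role lists, and
the gene identifiers are all hypothetical.  It also makes a
simplifying assumption that an instance is "operational" only when
every role in the subsystem has at least one gene; real subsystems may
have operational variants that use only a subset of the roles.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Subsystem:
    """A curated set of roles plus its spreadsheet (hypothetical model)."""
    name: str
    roles: List[str]  # the spreadsheet columns
    # Spreadsheet rows: genome -> role -> genes implementing that role.
    rows: Dict[str, Dict[str, List[str]]] = field(default_factory=dict)

    def add_row(self, genome: str, role_to_genes: Dict[str, List[str]]) -> None:
        """Add one genome as a new row; roles with no gene get an empty cell."""
        self.rows[genome] = {r: role_to_genes.get(r, []) for r in self.roles}

    def status(self, genome: str) -> str:
        """Classify the genome's instance of this subsystem.

        Assumption: "operational" means every role has at least one gene.
        """
        row = self.rows.get(genome, {})
        filled = [r for r in self.roles if row.get(r)]
        if len(filled) == len(self.roles):
            return "operational"
        if filled:
            return "partial"  # missed genes, misannotation, or a new variant
        return "absent"


def metabolic_reconstruction(subsystems: List[Subsystem], genome: str) -> List[str]:
    """Return the names of the subsystems that are operational in the genome."""
    return [s.name for s in subsystems if s.status(genome) == "operational"]


# Toy data: role lists and gene ids are illustrative, not curated content.
hut = Subsystem("Histidine Degradation", [
    "Histidine ammonia-lyase", "Urocanate hydratase",
    "Imidazolonepropionase", "Formiminoglutamase",
])
atp = Subsystem("ATP synthase (toy)", ["alpha chain", "beta chain"])

hut.add_row("new_genome", {
    "Histidine ammonia-lyase": ["gene_0001"],
    "Urocanate hydratase": ["gene_0002"],
    "Imidazolonepropionase": ["gene_0003"],
    "Formiminoglutamase": ["gene_0004"],
})
atp.add_row("new_genome", {"alpha chain": ["gene_0101"]})  # beta chain not found

print(metabolic_reconstruction([hut, atp], "new_genome"))
```

In this toy run only the first subsystem is reported, because the
second one is missing a gene for one of its roles and is therefore
classified as "partial" (the situation described in item 3).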