Download v0.9.2

Important note:
UnSplicer requires the output of a successful GeneMark-ES run on your genome. GeneMark-ES needs to be run only once per genome.

Requirements:

  • samtools in your path.
  • Perl package Parallel::ForkManager. A copy is included in case you wish to install it manually, but installing it from CPAN may be simpler.
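
Both requirements can be checked before installing. The following is a minimal sketch, not part of the UnSplicer distribution; the function name check_requirements is hypothetical. It assumes that a Perl module counts as installed if 'perl -MParallel::ForkManager -e 1' exits with status 0.

```python
import shutil
import subprocess


def check_requirements():
    """Return a dict mapping each requirement to True (found) or False."""
    # samtools only needs to be discoverable through PATH.
    found = {"samtools": shutil.which("samtools") is not None}

    perl = shutil.which("perl")
    if perl is None:
        # Without a Perl interpreter the module cannot be loaded either.
        found["Parallel::ForkManager"] = False
    else:
        # "perl -MModule -e 1" exits with status 0 only if the module loads.
        result = subprocess.run(
            [perl, "-MParallel::ForkManager", "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        found["Parallel::ForkManager"] = result.returncode == 0
    return found


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```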

Installation:
Extract the tarball on your system, then run the CSH script 'install.csh' located in the folder /src. This builds all of the source code and moves the programs to /bin. Make sure all files in /bin have execute privileges, then add the /bin directory to your path.
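
The execute-privilege check can be automated. This is a sketch under assumptions: the helper names are hypothetical, and bin_dir stands for wherever the /bin folder of the extracted tarball ends up on your system.

```python
import stat
from pathlib import Path


def files_missing_execute_bit(bin_dir):
    """Return names of regular files in bin_dir that their owner cannot execute."""
    missing = []
    for entry in sorted(Path(bin_dir).iterdir()):
        if entry.is_file() and not entry.stat().st_mode & stat.S_IXUSR:
            missing.append(entry.name)
    return missing


def make_executable(bin_dir):
    """Add execute permission for user, group and others (like 'chmod a+x')."""
    for name in files_missing_execute_bit(bin_dir):
        path = Path(bin_dir) / name
        mode = path.stat().st_mode
        path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

Calling files_missing_execute_bit on the /bin folder after installation should return an empty list; if it does not, make_executable fixes the listed files.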

Running:

  • unsplicer_pair.pl (to align paired-end reads)
  • unsplicer_single.pl (for single-end reads)

Note: if /es is the location of a completed run of GeneMark-ES for your genome, then the model directory for UnSplicer will be /es/mod, and the gene predictions will be /es/pred_orig_name.gff (assuming you have mapped the predictions back to chromosome or scaffold coordinates by running 'es.pl -mapback' in the /es directory). No files other than the model file should be located in /es/mod.
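
The layout described in this note can be verified before starting an alignment. The sketch below is illustrative only; check_es_layout is a hypothetical helper, and es_dir is the location of your completed GeneMark-ES run.

```python
from pathlib import Path


def check_es_layout(es_dir):
    """Return a list of problems found in a GeneMark-ES run directory."""
    es_dir = Path(es_dir)
    problems = []

    # The model directory must hold the model file and nothing else.
    mod_dir = es_dir / "mod"
    if not mod_dir.is_dir():
        problems.append(f"missing model directory: {mod_dir}")
    else:
        entries = [p.name for p in mod_dir.iterdir()]
        if len(entries) != 1:
            problems.append(
                f"expected exactly one file in {mod_dir}, found {len(entries)}"
            )

    # This file only exists after the predictions have been mapped back.
    gff = es_dir / "pred_orig_name.gff"
    if not gff.is_file():
        problems.append(f"missing predictions: {gff} (was 'es.pl -mapback' run?)")
    return problems
```

An empty list means the directory matches the layout described above.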

Sample GeneMark-ES parameter files:
These model files can be used with UnSplicer for the associated genome. They were used to generate the results shown in the UnSplicer publication. Ab initio predictions have been made using these models on the reference assemblies (2nd column in the table below). If your genome assembly is a different version, you can create new predictions by following these instructions:

  1. place the downloaded model file in a folder called mod
  2. rename the file to model.0mtx (IMPORTANT!)
  3. run the prediction step of GeneMark-ES: run 'es.pl -pred -mapback dna.fna' in the parent folder of mod, where dna.fna is the reference sequence assembly
  4. the folder mod and the predictions pred_orig_name.gff will be given to UnSplicer as input (in addition to the read sequences and reference assembly file)
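
Steps 1 to 3 above can be scripted. This is a sketch under stated assumptions: the function names are hypothetical, es.pl is assumed to be on your path, and the assembly file is assumed to sit in work_dir (the parent folder of mod).

```python
import shutil
import subprocess
from pathlib import Path


def prepare_model_dir(downloaded_model, work_dir):
    """Steps 1 and 2: copy the model file to work_dir/mod/model.0mtx."""
    mod_dir = Path(work_dir) / "mod"
    mod_dir.mkdir(parents=True, exist_ok=True)
    target = mod_dir / "model.0mtx"
    shutil.copyfile(downloaded_model, target)
    return target


def run_prediction(work_dir, assembly="dna.fna"):
    """Step 3: run the GeneMark-ES prediction step in the parent folder of mod."""
    # The command is the one given in step 3; it raises if es.pl fails.
    subprocess.run(
        ["es.pl", "-pred", "-mapback", assembly],
        cwd=work_dir,
        check=True,
    )
```

The original download is copied rather than moved, so it can be reused for another assembly version.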

Model files        Ab initio predictions
A. thaliana        predictions on TAIR10
D. melanogaster    predictions on r5.42
C. elegans         predictions on Ce10
C. neoformans      predictions on Broad Institute ref. assembly