What is ESO? 

Identification of mutational hotspots

Our software identifies recombination and polymerase slippage sites, which are proven to increase the chance of random mutation.

Identification of methylation (or alternative) sites

The ESO tool also identifies likely methylation sites, which, through epigenetic inheritance, have a high impact on expression in mammalian and insect cells. Alternatively, the user may provide their own PSSMs (position-specific scoring matrices) for custom site identification.

Automatic optimization

As part of our algorithm, we provide a corrected sequence, which maintains translation in the ORF regions, avoids changes in "locked" locations, maintains local GC content levels, avoids problematic sites found, and optimizes codon usage to match the host organism.

Comfortable user interface

Our user interface makes it easy to define and run the optimization process on batches of sequences, and to track the changes made.

Our team

The ESO tool was created by Staubility Group, mentored by Prof. Tamir Tuller from Tel Aviv University of Israel.
The group began its journey in an international competition for synthetic biology - iGEM, in which it received an award for the Best Software Tool, for work including the alpha version of the ESO tool.
Since then, we have been working in collaboration with academic consultants and experts from the biotechnology industry, in order to improve our tool and further adapt it to the needs of our consumers.
Our goal at Staubility is to provide an efficient and high-quality solution to the problem of genomic instability of transgenic organisms. We hope this will advance our vision: helping the Syn-Bio field fulfill its full potential.

Downloads

E.S.O. Tool

Click here to try the E.S.O. tool online for free!

User Guide

Please refer to the user guide for aid in using our software!

Supplementary Materials

Click here to download the supplementary materials that were used to test our software!

Frequently Asked Questions

How do I launch the software?

In order to launch ESO, follow these steps:
1. Download the ESO folder and unzip it.
2. Open the unzipped folder and run the "GUI.exe" file.
3. Your operating system may request permission to run the file - provide these permissions.
4. A CMD window will open, and the main software window will launch a moment later.
For further information, please refer to our User Guide. 

What is a fasta or genbank file? Does the target sequence have to be in these formats?

The Fasta and Genbank formats are text-based file formats commonly used in the synthetic biology community. They are used for representing either nucleotide sequences or amino acid (protein) sequences.
To use ESO, your sequences must be in either Fasta or Genbank format. As is common for larger sequences, these files may also be gzipped (e.g. fasta.gz).
If you want to convert a sequence to FASTA format, please follow the instructions on this website: https://www.ncbi.nlm.nih.gov/WebSub/html/help/fasta.html
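
To make the format concrete, here is a minimal Python sketch of reading a FASTA file, plain or gzipped. It is not ESO's own reader; the function name read_fasta is hypothetical, and the 0-based index it yields simply mirrors the sequence indexing described in the next answer.

```python
import gzip
from pathlib import Path


def read_fasta(path):
    """Yield (index, header, sequence) for each record in a FASTA file.

    Hypothetical helper for illustration; not part of ESO.
    """
    path = Path(path)
    # Gzipped files (e.g. "genes.fasta.gz") are opened in text mode as well.
    opener = gzip.open if path.suffix == ".gz" else open
    header, chunks, index = None, [], 0
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield index, header, "".join(chunks)
                    index += 1
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield index, header, "".join(chunks)
```

Each record starts with a ">" header line, and the sequence may be wrapped over several lines, which the sketch joins back together.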

May I optimize several sequences at once?

Yes. The input to the software is a directory, and all files within will be analyzed.
In addition, a single file may contain multiple sequences. The software will refer to the sequences with an index, starting from 0.
However, in our current beta version, we limit the user to optimizing 10 subsequences at a time.

Am I limited in the number of ORFs or "locked" regions?

No, it is possible to input an arbitrary number of ORFs or "locked" regions, formatted as "start_index_1-end_index_1, start_index_2-end_index_2,...".
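
As an illustration of this format, the following Python sketch parses such a string into pairs of integers. The function name parse_regions is hypothetical and this is not ESO's own parser; whether indices are 0- or 1-based, and whether the end index is inclusive, is not stated here, so the sketch only checks the syntax.

```python
def parse_regions(text):
    """Parse "start-end, start-end, ..." into a list of (start, end) pairs.

    Hypothetical helper for illustration; not part of ESO.
    """
    regions = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing comma or empty input
        start_text, sep, end_text = part.partition("-")
        if not sep:
            raise ValueError(f"expected 'start-end', got {part!r}")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"start is after end in {part!r}")
        regions.append((start, end))
    return regions


print(parse_regions("0-299, 450-600"))  # [(0, 299), (450, 600)]
```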

Am I limited in the length of ORFs?

No. An ORF can be (and by default, is) as long as the sequence.
For reasonable lengths (10-100k nucleotides), our software should return an answer within 30 seconds.

Am I limited in the size of files?

We limit the total file size to 100 MB, as larger inputs might cause the software or even the whole computer to freeze.
If you have a specific need (extremely large files and a sufficiently powerful computer to deal with them), please contact us.
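
If you want to check a folder before loading it, the following Python sketch sums the size of every file under a directory and compares it to the limit. The names total_size and within_limit are hypothetical, and treating 100 MB as 100 * 1024 * 1024 bytes is an assumption; ESO may count sizes differently.

```python
from pathlib import Path

# Assumption: 100 MB is taken as 100 * 1024 * 1024 bytes.
MAX_TOTAL_BYTES = 100 * 1024 * 1024


def total_size(directory):
    """Return the combined size in bytes of all files under directory."""
    return sum(p.stat().st_size for p in Path(directory).rglob("*") if p.is_file())


def within_limit(directory, limit=MAX_TOTAL_BYTES):
    """Return True if the directory's total size does not exceed limit."""
    return total_size(directory) <= limit
```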

How should I use the ESO output?

For your convenience, there are 2 ways you may use the software's output:
1. Manually go through the sites found and select those which you wish to optimize.
2. ESO can optimize up to 10 sequences at a time for you. Tick the OPTIMIZE box and, in addition to the software output, you will also get the optimized sequence, which takes into account ORF translation, "locked" regions, hotspots found, local GC content and codon usage bias. For more information, please refer to our user guide.
Note that you may choose to receive your optimized sequence in Fasta or Genbank format, and may choose to skip creation of a full report.

I'm having a hard time understanding the output, what can I do?

Inside the output directory you will find the optimized sequences in the format selected. In addition, you will receive sub-directories. Each one corresponds to a single input file. The output will match the file hierarchy of the input directory.
Inside each sub-folder, you will find:
1. Optimized subsequences, corresponding to the different subsequences within the input file.
2. A list of the mutational hotspots by category (CSV format), including polymerase slippage, recombination, and if selected - methylation or custom motif sites. Each file includes a 'sequence_number' column corresponding to the sequence index within the input file.
3. An optimization report (zip format) when 'optimize' is specified.
For more information about the output, please refer to our user guide.
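
To work with the hotspot lists programmatically, a short Python sketch like the one below groups the rows of one CSV file by the 'sequence_number' column. Only that column name comes from the description above; the function name hotspots_by_sequence is hypothetical, and the remaining columns are passed through untouched because their names are not listed here.

```python
import csv
from collections import defaultdict


def hotspots_by_sequence(csv_path):
    """Group the rows of a hotspot CSV by their 'sequence_number' value.

    Hypothetical helper for illustration; not part of ESO.
    """
    grouped = defaultdict(list)
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            # Each row is a dict keyed by the CSV header names.
            grouped[int(row["sequence_number"])].append(row)
    return dict(grouped)
```

The result maps each sequence index (starting from 0, as in the input file) to the list of sites reported for that sequence.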

Known Issues

My files aren't loading!

Our file reading system does not currently support non-English characters in file paths - make sure the path to your files contains English characters only.
In addition, our software reads only Fasta or Genbank files with common file extensions, or gzipped versions of them.
For your convenience, when selecting an input folder, our software provides a pop-up, presenting which files were found. Make sure this list matches your expected file list.
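
To predict which files will be picked up, the Python sketch below applies the two rules above to a folder. It is not ESO's own logic: the function names are hypothetical, the extension list is an assumption (only "common file extensions" is specified here), and "English characters only" is interpreted as an ASCII-only path.

```python
from pathlib import Path

# Assumed list; the exact extensions ESO accepts are not given here.
SEQUENCE_SUFFIXES = {".fasta", ".fa", ".gb", ".gbk"}


def is_supported(path):
    """Return True if path is ASCII-only and has an assumed sequence extension."""
    path = Path(path)
    suffixes = [s.lower() for s in path.suffixes]
    # Ignore a trailing ".gz" so that "x.fasta.gz" is judged by ".fasta".
    if suffixes and suffixes[-1] == ".gz":
        suffixes.pop()
    has_known_suffix = bool(suffixes) and suffixes[-1] in SEQUENCE_SUFFIXES
    return has_known_suffix and str(path).isascii()


def find_sequence_files(directory):
    """Return the sorted list of supported files under directory."""
    return sorted(
        p for p in Path(directory).rglob("*") if p.is_file() and is_supported(p)
    )
```

Comparing the result with the list shown in the pop-up can help spot files that were skipped because of their name or location.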

I have this weird error popup, something about GC content...

This issue results from setting incompatible constraints - exclusion regions cannot be modified, and thus if they do not satisfy the GC content constraints, there is no way for the optimizer to resolve the sequence.
The error message details the wider GC content boundaries that the excluded regions can satisfy.
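
To see why the conflict arises, the Python sketch below computes the GC fraction of each unmodifiable region and lists those that fall outside the requested bounds. It is only an illustration: the function names are hypothetical, regions are assumed to be 0-based with an exclusive end, and each region is checked as a whole, whereas ESO may evaluate local GC content differently (for example, over windows).

```python
def gc_fraction(sequence):
    """Return the fraction of G and C bases in a non-empty sequence."""
    if not sequence:
        raise ValueError("sequence is empty")
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def regions_outside_bounds(sequence, regions, gc_min, gc_max):
    """Return (start, end, gc) for regions whose GC fraction is out of bounds.

    Assumption: regions are 0-based (start, end) pairs with end exclusive.
    """
    offenders = []
    for start, end in regions:
        gc = gc_fraction(sequence[start:end])
        if not gc_min <= gc <= gc_max:
            offenders.append((start, end, gc))
    return offenders
```

Since these regions cannot be changed, any region returned here makes the constraints unsatisfiable until the bounds are widened or the region is shortened.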

I'm facing an issue which isn't mentioned here.

Your feedback is important to us, as we are committed to providing the best software we can.
Please submit your issue to tamirtul@gmail.com, and we will address the issue as soon as possible.

Contact Us