SEQtools includes several options for translating nucleotide sequences as well as for back-translating protein sequences. With the Find In All Frames function it is possible to identify the longest ORF in a nucleotide sequence. With the Translate Specific Frame function you can isolate (Copy/Paste) the translated protein sequence. The Translate Forward Frames function provides a link between the translated sequence and the underlying nucleotide sequence. Finally, it is possible to batch translate all nucleotide sequences contained in the project with the Create Protein Files function.
This function enables you to translate a nucleotide sequence in the specified reading frame. You have the
option to display either the longest ORF, the longest fragment or a complete translation in the specified frame.
Largest ORF submenu.
Largest Fragment submenu.
Complete Translation submenu.
With this function it is easy to identify the longest ORF (open reading frame) or fragment (protein region without stop codons) in an unknown nucleotide sequence. The result is displayed in a text form that lists the longest ORFs/fragments in each of the six reading frames and the longest of them all.
With this information you can re-translate the longest ORF/fragment with the Translate Specific Frame
function described above to isolate the protein sequence.
The Result form listing ORFs or fragments in all six reading frames of the nucleotide sequence.
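The search described above can be sketched in Python as follows. This is a minimal illustration and not the SEQtools implementation. It assumes the standard genetic code, takes an ORF to run from an M to the next stop codon, and numbers the forward frames 1-3 and the reverse frames 4-6; all function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (N stays N)."""
    return dna.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def translate(dna):
    """Translate complete codons; codons not in the table (e.g. with N) give X."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Map frame number to translation (assumed: 1-3 forward, 4-6 reverse)."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[offset + 4] = translate(rc[offset:])
    return frames


def longest_fragment(protein):
    """Longest region without stop codons."""
    return max(protein.split("*"), key=len)


def longest_orf(protein):
    """Longest region from an M to the next stop (assumed ORF definition)."""
    best = ""
    for piece in protein.split("*")[:-1]:  # only pieces followed by a stop
        start = piece.find("M")
        if start != -1 and len(piece) - start > len(best):
            best = piece[start:]
    return best


def best_in_all_frames(dna, finder=longest_orf):
    """Return the per-frame results and the frame holding the longest one."""
    results = {frame: finder(p) for frame, p in six_frames(dna).items()}
    best_frame = max(results, key=lambda f: len(results[f]))
    return results, best_frame
```

Calling best_in_all_frames(sequence, finder=longest_fragment) gives the fragment search instead of the ORF search.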
The Translate Forward Frames function displays the translation of the current DNA sequence, or an extract
thereof, in one of the three forward reading frames or in all three forward reading frames simultaneously.
The line numbers correspond to the coordinates of the extracted sequence region. Stop codons are denoted by stars, and uncertain amino acids (codons including one or more N's) by X's. The DNA sequence is displayed with a block length of 3 and a line length of 60 bp, independent of the format selected in the sequence editor form.
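The display conventions above (blocks of 3, lines of 60 bp, stars for stops, X for codons containing N) can be illustrated with a short Python sketch. The function names and the exact column layout are assumptions made for the example; the real form may align the rows differently.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate_codon(codon):
    """Return the amino acid, '*' for a stop, or 'X' if the codon contains N."""
    return CODON_TABLE.get(codon, "X")


def format_frame(dna, frame=1, first_coord=1, line_len=60):
    """Return display lines: a DNA row in blocks of 3, then the amino acid row.

    frame is 1, 2 or 3 (forward only). first_coord is the coordinate of the
    first base of the extracted region, so the line numbers match the
    coordinates of the original sequence.
    """
    seq = dna.upper()[frame - 1:]
    lines = []
    for pos in range(0, len(seq), line_len):
        row = seq[pos:pos + line_len]
        codons = [row[i:i + 3] for i in range(0, len(row), 3)]
        complete = [c for c in codons if len(c) == 3]
        coord = first_coord + (frame - 1) + pos
        lines.append(f"{coord:>6} " + " ".join(codons))
        # Each amino acid is placed under the first base of its codon.
        lines.append(" " * 7 + " ".join(
            translate_codon(c).ljust(3) for c in complete).rstrip())
    return lines
```

For example, "\n".join(format_frame(sequence, frame=2)) produces the text for the second forward frame.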
File menu - contains save and print options for the translated sequence.
View menu - includes the available translate options.
Frame menu - selects the reading frame (forward only) for the translation.
Format menu - allows you to select the line length and whether or not to divide the sequence in blocks of 10 residues.
Attributes menu - contains simple options for annotating the translated nucleotide sequence.
Transfer - with the transfer options it is possible to highlight a nucleotide region (for example, one corresponding to an interesting portion of the translation) and, by clicking Transfer, to transfer the highlights to the normal sequence editor (see below).
Highlights corresponding to the selected region in the Forward Frame Translation form.
Re-translating the highlighted nucleotide region in the normal sequence editor displays the translation with the translated nucleotides displayed above the protein sequence.
Back-translating protein sequences is useful when designing sequencing primers. When a protein sequence is displayed in the normal sequence editor, selecting the Back-Translate option prompts you to select/load a codon usage data file that supplies information about frequently used codons (codon usage) for the particular organism/protein.
When a codon usage data file is successfully loaded into SEQtools, the form below is displayed to enable you to select a degeneration level. Choosing level 1 will result in a primer without degenerate positions, while level 6 will cover all possible degenerate base combinations. The cost in the latter case, of course, is that only a few primers in the mixture have a base sequence exactly matching the nucleotide sequence.
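The trade-off between degeneration level and primer mixture can be illustrated with a small Python sketch. How SEQtools interprets the levels internally is not documented here, so the sketch assumes that level k means "merge the k most frequently used codons of each amino acid into one degenerate codon". The usage values in USAGE are made up for the example and the function names are hypothetical.

```python
from math import prod

# IUPAC nucleotide codes, keyed by the sorted set of bases they stand for.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
         "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N"}
BASES_FOR = {code: bases for bases, code in IUPAC.items()}

# Hypothetical codon usage fractions for four amino acids (illustration only).
USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.74, "AAG": 0.26},
    "F": {"TTT": 0.58, "TTC": 0.42},
    "I": {"ATT": 0.49, "ATC": 0.39, "ATA": 0.12},
}


def degenerate_codon(codons):
    """Merge codons position by position into one IUPAC codon.

    Note: for amino acids with very different codons (e.g. Ser, Leu, Arg)
    a position-wise merge can also cover codons of other amino acids.
    """
    return "".join(IUPAC["".join(sorted(set(column)))]
                   for column in zip(*codons))


def back_translate(protein, usage, level=1):
    """Back-translate using the `level` most frequent codons per amino acid."""
    primer = []
    for aa in protein:
        ranked = sorted(usage[aa], key=usage[aa].get, reverse=True)
        primer.append(degenerate_codon(ranked[:level]))
    return "".join(primer)


def degeneracy(primer):
    """Number of distinct sequences in the primer mixture."""
    return prod(len(BASES_FOR[code]) for code in primer)
```

With these values, level 1 gives a single primer sequence, while higher levels multiply the number of sequences in the mixture, so the fraction of primers matching the template exactly drops to 1/degeneracy(primer).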
The primer sequence after back-translation is displayed in a simple text form. You must then copy/paste the primer sequence into a separate instance of SEQtools opened for handling primers.
This utility is designed to assist you in the analysis of short EST sequences (expressed sequence tags) in cases where functional identification by data base searching has failed and the correct reading frame thus is unknown.
The utility translates all nucleotide sequences of the current project in the selected reading frame(s) and saves each protein sequence in a separate file. The extracted protein sequences can then be searched for example against the Prosite data base of protein motifs, or other data bases including protein signatures.
Complete sequences - the complete translation including X's and stops.
Largest fragments - largest contiguous amino acid region without stops.
N-terminal regions - regions starting with an M and ending at the first downstream stop.
C-terminal regions - regions from the start of the sequence to the first stop.
Frame options - (1) All reading frames. (2) The 3 forward reading frames (A, B, C). (3) The 3 reverse reading frames (D, E, F).
Filter option - allows you to disregard protein sequences shorter than the selected minimum length.
Protein file names - The protein file names are constructed by adding _N to the file names of the DNA sequences, where N denotes the reading frame (1-6, or # for all reading frames in the same file). The two characters can be added in one of four ways: (1) By replacing the extension of the DNA sequence file name with _N. (2) By adding _N to the leftmost six characters of the file name. (3) By adding _N to the rightmost six characters of the file name. (4) By adding _N to the middle six characters of the file name.
In the latter three cases, the protein file names will lack an extension. When the protein files are built, the selected file names are validated to avoid duplicate file names. If the selected naming method yields duplicate names, the building is arrested and the user is advised to select another method of generating protein file names.
In cases where none of the available four methods yields unique protein file names, the original DNA sequence files must be renamed.
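The four naming methods and the duplicate check can be sketched in Python as below. This is only an illustration: it assumes that "file name" in methods 2-4 means the name without its extension, and the function names are hypothetical.

```python
import os


def protein_file_name(dna_file, frame, method=1):
    """Build a protein file name from a DNA file name.

    frame is '1'..'6', or '#' for all reading frames in one file.
    method is 1-4, following the four naming methods listed above.
    Assumption: methods 2-4 take their six characters from the name
    without its extension.
    """
    suffix = f"_{frame}"
    stem, _extension = os.path.splitext(dna_file)
    if method == 1:                       # replace the extension with _N
        return stem + suffix
    if method == 2:                       # leftmost six characters
        return stem[:6] + suffix
    if method == 3:                       # rightmost six characters
        return stem[-6:] + suffix
    if method == 4:                       # middle six characters
        start = max((len(stem) - 6) // 2, 0)
        return stem[start:start + 6] + suffix
    raise ValueError(f"unknown naming method: {method}")


def build_names(dna_files, frame, method):
    """Return the protein file names, or stop if the method gives duplicates."""
    names = [protein_file_name(f, frame, method) for f in dna_files]
    if len(set(names)) != len(names):
        raise ValueError("duplicate protein file names; "
                         "select another naming method")
    return names
```

For a project with files such as est_clone_0001.seq and est_clone_0002.seq, methods 2 and 4 give identical names for both files, so only methods 1 and 3 pass the check.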
File format - The protein files can be saved in either Fasta or GCG format. Each protein file includes a header giving the sequence name, the reading frame and the length of the protein sequence. The protein sequences are broken into lines of 50 characters without line numbering.
Save options - The protein files can either be saved in separate files or in one file per DNA sequence. If the latter option is selected, a 5 x stop separator is inserted between each reading frame if the check box for this option is checked.
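The sequence type, filter, file format and save options listed above can be summarised in a short Python sketch. It is not the SEQtools code: the header layout is an assumed example (the manual only states that it gives the sequence name, the reading frame and the length), the N-terminal region is taken from the first M, and all function names are hypothetical.

```python
def extract_region(protein, mode):
    """Return the part of a translation selected by the sequence type option."""
    if mode == "complete":        # complete translation incl. X's and stops
        return protein
    if mode == "largest":         # largest contiguous region without stops
        return max(protein.split("*"), key=len)
    if mode == "c_terminal":      # from the sequence start to the first stop
        return protein.split("*", 1)[0]
    if mode == "n_terminal":      # from the first M to the next stop
        start = protein.find("M")
        return "" if start < 0 else protein[start:].split("*", 1)[0]
    raise ValueError(f"unknown mode: {mode}")


def passes_filter(protein, min_length):
    """Filter option: disregard sequences shorter than min_length."""
    return len(protein) >= min_length


def fasta_record(name, frame, protein, width=50):
    """Fasta text: header line, then the sequence in lines of 50 characters."""
    header = f">{name} frame={frame} length={len(protein)}"  # assumed layout
    body = [protein[i:i + width] for i in range(0, len(protein), width)]
    return "\n".join([header] + body) + "\n"


def combined_record(name, proteins_by_frame, use_separator=True, width=50):
    """One file per DNA sequence: frames joined, optionally by five stops."""
    separator = "*****" if use_separator else ""
    joined = separator.join(proteins_by_frame[f]
                            for f in sorted(proteins_by_frame))
    return fasta_record(name, "#", joined, width)
```

A protein would typically be passed through extract_region, then passes_filter, and only then written with fasta_record or combined_record.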