It is recommended to open this webpage in a browser other than Firefox.
and relative ants with the LAST aligner (see Methods for more information). The Similarity Table displays the whole-genome alignments between two genome assemblies.
The comparison of two genomes was performed with LAST. To calculate the weighted mean and median of identity and the aligned nucleotide fraction, the following steps were taken:
Building the LAST database:
lastdb -P5 -uNEAR -R01 LASTDB-name Target-Assembly
Training LAST to find optimal scoring-matrix parameters:
last-train -P0 --revsym --matsym --gapsym -E0.05 -C2 LASTDB-name Query-Assembly > last-train.mat
Aligning and filtering for unique best alignments of the query:
lastal -m50 -E0.05 -P0 -C2 -p last-train.mat LASTDB-name Query-Assembly | last-split -m1 > last-split.maf
Get 1-to-1 alignments:
maf-swap last-split.maf | last-split -m1 | maf-swap > maf-swap.maf
Calculate the aligned nucleotide fraction of query and target:
Sum of alignment lengths - (sum of gaps + sum of mismatches)
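The formula above can be sketched as a small shell helper. The function names and the example numbers are hypothetical. The text does not state what the percentage is taken relative to, so the second function assumes the total assembly length as the denominator.

```shell
#!/usr/bin/env bash

# aligned_nucleotides ALN_LEN GAPS MISMATCHES
# Prints the sum of alignment lengths minus (gaps + mismatches).
aligned_nucleotides() {
    local aln_len=$1 gaps=$2 mismatches=$3
    echo $(( aln_len - (gaps + mismatches) ))
}

# aligned_fraction_percent ALIGNED TOTAL_LEN
# ASSUMPTION: the percentage is relative to the total assembly length;
# the denominator is not given in the text.
aligned_fraction_percent() {
    awk -v a="$1" -v t="$2" 'BEGIN { printf "%.2f\n", 100 * a / t }'
}

# Usage with made-up numbers:
#   aligned_nucleotides 1000 30 20        # prints 950
#   aligned_fraction_percent 950 2000     # prints 47.50
```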
Calculate the aligned nucleotide fraction of the query (%):
Get the sum of alignment lengths (Prefix is specific to the reference genome, e.g. "NC_"):
grep -v '^#\|^a\|^p' maf-swap.maf | grep Prefix | awk '{sum+=$4} END {print sum}'
Get the sum of gaps:
grep -v '^#\|^a\|^p' maf-swap.maf | grep Prefix | grep -o "-" | wc -m
Generate BLAST tabular output from the LAST output (BLAST outfmt 6):
last-postmask maf-swap.maf | maf-convert blasttab | awk -F'=' '$2 <= 1e-5' > last.blasttab
Get the sum of mismatches:
awk '{sum+=$5} END {print sum}' last.blasttab
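As a hedged variant of the mismatch sum, the sketch below reads BLAST tabular input, where column 5 holds the mismatch count, and ignores lines starting with "#". The function name sum_mismatches is hypothetical, and the presence of comment lines in the file is an assumption.

```shell
#!/usr/bin/env bash

# sum_mismatches [FILE...]
# Sums column 5 (mismatches) of BLAST tabular data, read from FILE or stdin.
sum_mismatches() {
    # Skip comment lines and lines with fewer than 5 tab-separated fields.
    awk -F'\t' '!/^#/ && NF >= 5 { sum += $5 } END { print sum + 0 }' "$@"
}

# Usage:
#   sum_mismatches last.blasttab
```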
Calculate the aligned nucleotide fraction of the target (%):
Get the sum of alignment lengths:
grep -v '^#\|^a\|^p' maf-swap.maf | grep -v Prefix | awk '{sum+=$4} END {print sum}'
Get the sum of gaps:
grep -v '^#\|^a\|^p' maf-swap.maf | grep -v Prefix | grep -o "-" | wc -m
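The per-genome sums above could also be collected in a single pass over the MAF file. This is a minimal sketch, not the commands used for this page: the function name maf_sums is hypothetical, it assumes standard MAF "s" lines (name in field 2, aligned size in field 4, alignment text in field 7), and it counts "-" only in the alignment text, so strand symbols and hyphens in sequence names are not counted.

```shell
#!/usr/bin/env bash

# maf_sums PREFIX < alignments.maf
# Prints two tab-separated lines: group, sum of aligned sizes, number of gaps.
# "prefix" = sequences whose name starts with PREFIX, "other" = all others.
maf_sums() {
    awk -v prefix="$1" '
        $1 == "s" {
            # Group by whether the sequence name starts with the prefix.
            group = (index($2, prefix) == 1) ? "prefix" : "other"
            len[group] += $4                   # aligned size, gaps excluded
            seq = $7
            gaps[group] += gsub(/-/, "", seq)  # gap characters in alignment text
        }
        END {
            printf "prefix\t%d\t%d\n", len["prefix"], gaps["prefix"]
            printf "other\t%d\t%d\n",  len["other"],  gaps["other"]
        }
    '
}

# Usage (the prefix "NC_" is only an example):
#   maf_sums NC_ < maf-swap.maf
```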
Two parameters were set for the dot plot. Word length is the minimum length of identical subsequences required to create a hit in the dot plot. Window size specifies the window over which an average dot value is calculated.
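To illustrate how a word length produces hits, the sketch below lists every pair of positions at which two short sequences share an identical word. The text does not name the dot plot tool, so this is not that tool's algorithm: the function name dot_hits and the brute-force comparison are assumptions, and the window averaging is not covered.

```shell
#!/usr/bin/env bash

# dot_hits WORD_LENGTH SEQ_A SEQ_B
# Prints "i j" (1-based) for every position pair where the word of length
# WORD_LENGTH starting at i in SEQ_A equals the word starting at j in SEQ_B.
dot_hits() {
    awk -v w="$1" -v a="$2" -v b="$3" 'BEGIN {
        for (i = 1; i + w - 1 <= length(a); i++)
            for (j = 1; j + w - 1 <= length(b); j++)
                if (substr(a, i, w) == substr(b, j, w))
                    print i, j
    }'
}

# Usage with toy sequences:
#   dot_hits 3 ACGTAC GTACG
```

A larger word length yields fewer, more specific hits. The brute-force loop is only practical for short sequences.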