Introduction
From its Git Repo:
SMARTdenovo is a de novo assembler for PacBio and Oxford Nanopore (ONT) data. It produces an assembly from all-vs-all raw read alignments without an error correction stage. It also provides tools to generate accurate consensus sequences, though platform-dependent consensus polishing tools (e.g. Quiver for PacBio or Nanopolish for ONT) are still required for higher accuracy.
SMARTdenovo consists of several separate command line tools: wtzmo for read overlapping, wtgbo to rescue missing overlaps, wtclp for identifying low-quality regions and chimaera, and wtcns or wtmsa to produce better unitig consensus. The smartdenovo.pl script provides a convenient interface to call these programs in one go.
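As a rough sketch of what "in one go" looks like in practice, the helper below only builds the two shell commands for a wrapper-driven run and does not execute anything. The `-p` (output prefix) and `-c 1` (also build a consensus) flags and the generate-a-makefile-then-run-`make` pattern are as I recall them from the project's README, so treat them as assumptions and check them against the usage message of your installed `smartdenovo.pl`. The function name `smartdenovo_commands` and the input file name `reads.fa` are made up for illustration.

```python
import shlex


def smartdenovo_commands(reads, prefix="wtasm", consensus=True,
                         script="smartdenovo.pl"):
    """Return the shell commands for a wrapper-driven SMARTdenovo run.

    Assumption: -p sets the output prefix and -c 1 requests consensus
    generation, as in the README; verify against your installed version.
    """
    makefile = f"{prefix}.mak"
    # Step 1: the wrapper writes a makefile describing the pipeline stages.
    generate = (
        f"{shlex.quote(script)} -p {shlex.quote(prefix)} "
        f"-c {1 if consensus else 0} {shlex.quote(reads)} "
        f"> {shlex.quote(makefile)}"
    )
    # Step 2: make runs the individual tools in order.
    run = f"make -f {shlex.quote(makefile)}"
    return [generate, run]


if __name__ == "__main__":
    for cmd in smartdenovo_commands("reads.fa"):
        print(cmd)
```

Building the commands as strings first makes it easy to log exactly what was run for each assembly attempt before handing them to a shell or job scheduler.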
This tool has not been published yet. (20180313)
My feelings:
- easy to install/use
- not as fast as wtdbg, but fast
- comparatively good results (at least in my case)
- documentation and discussion about this tool are limited.
General usage
# Download sample PacBio from the PBcR website
In practice
An insect
- The species: high heterogeneity, high AT content, high repeat content.
- Genome size: male 790M, female 830M.
commands:
run1, default
stats:
Size_includeN 756816708
SMARTdenovo can also use the zmo overlapper. I tested this option as well, but it generated an assembly of about 17G! (The estimated genome size is about 850M.)
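To put those numbers side by side, here is a small sanity check that divides each assembled size by the estimated genome size. The default-run size is the `Size_includeN` value reported above; the 17G and 850M figures are the approximate values from the text, so the ratios are only indicative. The helper name `size_ratio` is made up for this sketch.

```python
def size_ratio(assembled_bp, expected_bp):
    """Return assembled size as a multiple of the expected genome size."""
    if expected_bp <= 0:
        raise ValueError("expected genome size must be positive")
    return assembled_bp / expected_bp


EXPECTED_BP = 850_000_000        # approximate estimate quoted in the text
DEFAULT_RUN_BP = 756_816_708     # Size_includeN from the default run
ZMO_RUN_BP = 17_000_000_000      # "about 17G", approximate

default_ratio = size_ratio(DEFAULT_RUN_BP, EXPECTED_BP)
zmo_ratio = size_ratio(ZMO_RUN_BP, EXPECTED_BP)

print(f"default run: {default_ratio:.2f}x of the estimate")
print(f"zmo run:     {zmo_ratio:.0f}x of the estimate")
```

A ratio far above 1, as with the zmo run here, is a quick signal that an assembly is badly inflated and not worth carrying into downstream analysis.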
A plant
- The species: high heterogeneity, high repeat content.
- Genome size: 2.1G.
run1, with about 100X data
commands:
run1, default
And the stats I got:
Size_includeN 2103140368
run2, with about 50X data
commands:
run2, 50X
And the stats I got:
Size_includeN 2028605527
This was a very good N50 size! And the assembled size was close to the expected one.
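For readers who want to reproduce this kind of summary, the sketch below computes total size and N50 from sequence lengths using the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. This is not the script that produced the stats above; the function names and the toy lengths are made up for illustration.

```python
def fasta_lengths(lines):
    """Yield the length of each record from an iterable of FASTA lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        raise ValueError("no sequences given")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop at the first sequence that brings us to half the total size.
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    toy = [">a", "ACGTACGT", ">b", "ACGT", "AC", ">c", "ACG"]
    sizes = list(fasta_lengths(toy))
    print("total size:", sum(sizes))
    print("N50:", n50(sizes))
```

With a real assembly, pass an open file handle for the FASTA to `fasta_lengths` instead of the toy list.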
Change notes
- 20180423: created the note.