Documentation


SDM option file configuration

This file contains the options for quality filtering of sequences. Each option is described again in a comment above the relevant flag (comments can be written in this configuration file by starting a line with "#"). Some options start with a "*". These are relaxed parameters (i.e. they have to be less restrictive) for the secondary quality filter. Sequences in this mid-quality range will not be used for OTU building, but are only mapped to the OTUs.
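The comment and "*" conventions described above can be illustrated with a small reader. This is only a sketch: the function name `parse_sdm_options` is hypothetical, the example values are made up, and it assumes each option line holds a flag followed by whitespace and a value, which should be checked against a real option file.

```python
def parse_sdm_options(text):
    """Split option lines into primary and relaxed ("*"-prefixed) settings.

    Assumption: every option line looks like "<flag><whitespace><value>".
    """
    primary, relaxed = {}, {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # flag without a value; ignored in this sketch
        flag, value = parts[0], parts[1].strip()
        if flag.startswith("*"):
            relaxed[flag[1:]] = value  # secondary (mid-quality) filter
        else:
            primary[flag] = value  # primary filter used for OTU building
    return primary, relaxed


# Hypothetical values, chosen only so that the "*" settings are less restrictive.
example = """\
# minimal required sequence length
minSeqLength 250
*minSeqLength 200
# minimal required average quality
minAvgQuality 27
*minAvgQuality 20
"""

primary, relaxed = parse_sdm_options(example)
print(primary)
print(relaxed)
```

Keeping the two sets in separate dictionaries makes it easy to check that each relaxed value really is less restrictive than its primary counterpart.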

Flag | Function
minSeqLength | minimum required sequence length AFTER removal of primers, barcodes and trimming.
maxSeqLength | maximum sequence length.
TruncateSequenceLength | truncate total sequence length to X (length after barcode, adapter and primer removal); set to -1 to deactivate.
minAvgQuality | minimum required average quality AFTER removal of primers, barcodes and trimming.
maxAccumulatedError | probabilistic maximum number of accumulated sequencing errors. Beyond the position where this number is reached, the rest of the sequence is deleted. Complementary to TrimWindowThreshhold. Set to -1 to deactivate this option.
BinErrorModelMaxExpError, BinErrorModelAlpha | binomial error model of expected errors per sequence (see arXiv paper). Set BinErrorModelAlpha to -1 to deactivate this option.
maxAmbiguousNT | maximum number of ambiguous bases in a sequence.
maxHomonucleotide | maximum homonucleotide run length.
QualWindowWidth, QualWindowThreshhold | filter out the whole sequence if one window of quality scores is below QualWindowThreshhold.
TrimWindowWidth, TrimWindowThreshhold | trim the end of a sequence if a window falls below the quality threshold. Useful for removing low-quality trailing ends of a sequence.
keepBarcodeSeq, keepPrimerSeq | keep the barcode / primer sequence in the output fasta file. In a normal 16S analysis this should be deactivated (0) for both barcode and primer.
fastqVersion | set fastqVersion to 1 if you use Sanger, Illumina 1.8+ or NCBI SRA files. Set fastqVersion to 2 if you use Illumina 1.3+ - 1.7+ or Solexa fastq files.
TechnicalAdapter | if one or more files still include a technical adapter (e.g. TCAG for 454), it can be removed by setting this option.
TrimStartNTs | delete the first X NTs (e.g. if the first 5 bases are known to have strange biases).
PEheaderPairFmt | paired-end (PE) header format (1/2). This accommodates the Illumina MiSeq paired-end annotations: 2 = "@XXX 1:0:4" instead of 1 = "@XXX/1". Note that the format will be detected automatically.
AmpliconShortPE | this option should be "T" if your amplicons are possibly shorter than a read in a paired-end sequencing run (e.g. an amplicon of 150 in a 250x2 MiSeq run is "T"). This works in conjunction with keepPrimerSeq (should be "F") and the ReversePrimer field in the mapping file.
RejectSeqWithoutRevPrim, RejectSeqWithoutFwdPrim | set whether sequences lacking the reverse primer or the forward primer (LinkerPrimerSequence), respectively, are rejected (T = reject; F = accept all); default = F.
CheckForMixedPairs | this option should be "T" if your run possibly swaps fwd and rev pairs (this can happen depending on how your sequence provider ran the Illumina machine).
CheckForReversedSeqs | this option should be "T" if you want to double-check whether some reads are completely reverse-translated.
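To make the quality-related flags above more concrete, the following sketch shows one plausible reading of fastqVersion, maxAccumulatedError and TrimWindowWidth / TrimWindowThreshhold. It is an illustration under stated assumptions, not the algorithm sdm actually implements: all function names are hypothetical, fastqVersion 2 is treated as plain Phred+64 (the different Solexa score scale is ignored), and the window is assumed to be evaluated by its mean quality.

```python
def phred_scores(quality_string, fastq_version=1):
    """Decode a FASTQ quality string into integer quality scores."""
    # fastqVersion 1: ASCII offset 33 (Sanger, Illumina 1.8+).
    # fastqVersion 2: ASCII offset 64 (Illumina 1.3+ to 1.7+).
    offset = 33 if fastq_version == 1 else 64
    return [ord(ch) - offset for ch in quality_string]


def truncate_by_accumulated_error(scores, max_accumulated_error):
    """Return the number of leading bases to keep.

    Assumption: per-base error probabilities 10^(-Q/10) are summed, and the
    read is cut before the base that pushes the sum above the limit.
    """
    if max_accumulated_error < 0:
        return len(scores)  # -1 deactivates the option
    total = 0.0
    for position, q in enumerate(scores):
        total += 10 ** (-q / 10)
        if total > max_accumulated_error:
            return position
    return len(scores)


def trim_trailing_window(scores, width, threshold):
    """Return the number of leading bases to keep.

    Assumption: bases are dropped from the end one at a time while the mean
    quality of the last `width` bases is below `threshold`.
    """
    end = len(scores)
    while end >= width:
        window = scores[end - width:end]
        if sum(window) / width >= threshold:
            break
        end -= 1
    return end


# Ten high-quality bases (Q40) followed by four low-quality bases (Q2).
scores = phred_scores("IIIIIIIIII####", fastq_version=1)
print(truncate_by_accumulated_error(scores, 0.5))  # 10
print(trim_trailing_window(scores, width=4, threshold=20))  # 12
```

In this example the accumulated-error cut removes all four low-quality bases, while the window trim keeps two of them because the last window still averages above the threshold. This is one way to see why the two options are described as complementary.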