This file contains the options for quality filtering of sequences. Each option is described again in a comment above the relevant flag (comments can be written in this configuration file by starting the line with "#"). Some options start with a "*". These are relaxed parameters (i.e. they have to be less restrictive) for the secondary quality filter. Sequences in this mid-quality range will not be used for OTU building, but only mapped to the OTUs.
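
To illustrate the comment and "*" conventions described above, here is a minimal Python sketch that separates primary options from relaxed ones. It is not the tool's own parser: the function name `parse_filter_options` is hypothetical, the assumption that a flag and its value are separated by whitespace is not stated on this page, and the numeric values are purely illustrative.

```python
def parse_filter_options(text):
    """Split option lines into primary and relaxed ("*"-prefixed) settings.

    Assumption (not confirmed by this page): each option line has the form
    "<flag><whitespace><value>".
    """
    primary, relaxed = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment line
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # flag without a value; ignored in this sketch
        flag, value = parts[0], parts[1].strip()
        if flag.startswith("*"):
            # relaxed parameter for the secondary quality filter
            relaxed[flag[1:]] = value
        else:
            primary[flag] = value
    return primary, relaxed


# Illustrative values only, not recommended settings.
example = """\
# minimal sequence length after primer/barcode removal
minSeqLength 250
*minSeqLength 200
maxAmbiguousNT 0
"""

primary, relaxed = parse_filter_options(example)
```

In this example the relaxed `*minSeqLength` is smaller, and therefore less restrictive, than the primary `minSeqLength`, as the secondary filter requires.
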
Flag | Function |
---|---|
minSeqLength | minimal required sequence length AFTER removal of Primers, Barcodes and trimming. |
maxSeqLength | maximum sequence length |
TruncateSequenceLength | Truncate total sequence length to X (length after Barcode, Adapter and Primer removal); set to -1 to deactivate. |
minAvgQuality | minimal required average quality AFTER removal of Primers, Barcodes and trimming. |
maxAccumulatedError | Probabilistic max number of accumulated sequencing errors. After this length, the rest of the sequence will be deleted. Complementary to TrimWindowThreshhold. (-1) deactivates this option. |
BinErrorModelMaxExpError, BinErrorModelAlpha | Binomial error model of expected errors per sequence (see Arxiv paper). (BinErrorModelAlpha -1) deactivates this option. |
maxAmbiguousNT | maximum ambiguous bases in Sequence |
maxHomonucleotide | maximum homonucleotide run length |
QualWindowWidth, QualWindowThreshhold | Filter whole sequence if one window of quality scores is below QualWindowThreshhold |
TrimWindowWidth, TrimWindowThreshhold | Trim the end of a sequence if a window falls below the quality threshold. Useful for removing low-quality trailing ends of a sequence. |
keepBarcodeSeq, keepPrimerSeq | keep the Barcode / Primer sequence in the output fasta file. In a normal 16S analysis both should be deactivated (0). |
fastqVersion | set fastqVersion to 1 if you use Sanger, Illumina 1.8+ or NCBI SRA files. Set fastqVersion to 2, if you use Illumina 1.3+ - 1.7+ or Solexa fastq files. |
TechnicalAdapter | if one or more files have a technical adapter still included (e.g. TCAG 454) this can be removed by setting this option |
TrimStartNTs | delete X NTs (e.g. if the first 5 bases are known to have strange biases) |
PEheaderPairFmt | correct PE header format (1/2). This is to accommodate the Illumina miSeq paired end annotations: 2="@XXX 1:0:4" instead of 1="@XXX/1". Note that the format will be automatically detected. |
AmpliconShortPE | This option should be "T" if your amplicons are possibly shorter than a read in a paired end sequencing run (e.g. amplicon of 150 in 250x2 miSeq is "T"). This works in conjunction with keepPrimerSeq (should be "F") and the ReversePrimer field in the mapping file. |
RejectSeqWithoutRevPrim, RejectSeqWithoutFwdPrim | sets whether sequences without a reverse primer or without a forward (LinkerPrimerSequence) primer will be accepted (T=reject ; F=accept all); default=F |
CheckForMixedPairs | this option should be "T" if your run possibly swaps fwd and rev pairs (this can happen depending on how your sequence provider ran the Illumina sequencing). |
CheckForReversedSeqs | this option should be "T" if you want to double check whether some reads are completely reverse-translated. |
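
To make the interplay of `fastqVersion` and `maxAccumulatedError` more concrete, the following Python sketch decodes a FASTQ quality string and finds the position at which the accumulated expected error exceeds a limit. It is an illustration under stated assumptions, not the tool's implementation: the function names are hypothetical, the accumulated error is assumed to be the running sum of per-base error probabilities 10^(-Q/10), and the different quality scale of old Solexa files is ignored.

```python
def phred_scores(quality_string, fastq_version=1):
    """Decode a FASTQ quality string into Phred scores.

    fastq_version 1: Sanger / Illumina 1.8+ (ASCII offset 33)
    fastq_version 2: Illumina 1.3+ to 1.7+ (ASCII offset 64)
    Simplification: Solexa's own quality scale is not handled here.
    """
    offset = 33 if fastq_version == 1 else 64
    return [ord(ch) - offset for ch in quality_string]


def truncate_at_accumulated_error(scores, max_accumulated_error):
    """Return the number of leading bases to keep.

    Assumption: the accumulated error is the running sum of the per-base
    error probabilities 10 ** (-Q / 10). A negative limit deactivates
    the check, mirroring the (-1) convention in the table.
    """
    if max_accumulated_error < 0:
        return len(scores)
    total = 0.0
    for position, q in enumerate(scores):
        total += 10 ** (-q / 10)
        if total > max_accumulated_error:
            return position  # delete this base and everything after it
    return len(scores)


# Four bases at Q40 followed by four bases at Q20 (offset-33 encoding).
scores = phred_scores("IIII5555", fastq_version=1)
keep = truncate_at_accumulated_error(scores, max_accumulated_error=0.015)
trimmed_length = keep
```

Here the four Q40 bases contribute 0.0004 in total and each Q20 base adds 0.01, so the limit of 0.015 is exceeded at the second Q20 base and five bases are kept.
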