spora

Installation

Requirements

The following tools/packages are required:
• Python (Ubuntu installation instructions can be found here)
• conda (Ubuntu installation instructions can be found here)

Basic installation from source

To install spora from source, clone the repository, create and activate the conda environment, then install the package with pip:

git clone https://github.com/matt-sd-watson/spora.git
conda env create -f spora/environments/environment.yml
conda activate ncov_spora
cd spora
pip install .

Usage

Confirm that the installation succeeded by running either of the following commands:

spora
# OR
spora --help

Either command should print the following help output:

usage: 
    	spora -c <config.yaml> 
    	OR
    	spora --focal_list ...<input args>

spora: Python and snakemake outbreak workflow for COVID-19

optional arguments:
  -h, --help            Show the help output and exit.
  -c CONFIG, --config CONFIG
                        Input config file in yaml format, all command line arguments can be passed via the config file.
  -f FOCAL_SEQS, --focal-sequences FOCAL_SEQS
                        Input .txt list or multi-FASTA focal samples for outbreak. Required
  -b BACKGROUND_SEQS, --background-sequences BACKGROUND_SEQS
                        Optional input .txt list or multi-FASTA background samples to add to analysis
  -m MASTER_FASTA, --master-fasta MASTER_FASTA
                        Master FASTA of genomic sequences to select from. Required if either --focal-sequences or --background-sequences are not supplied in
                        FASTA format
  -o OUTDIR, --output-directory OUTDIR
                        Path to the desired output directory. If none is provided, a new folder named spora will be created in the current directory
  -r REFERENCE, --reference REFERENCE
                        .gb file containing the desired COVID-19 reference sequence. Required
  -p PREFIX, --prefix PREFIX
                        Prefix string to label all output files. Default: outbreak
  -t NTHREADS, --nthreads NTHREADS
                        Number of threads to use for processing. Default: 4
  -s, --snps-only       Generate a snps-only FASTA from the input FASTA. Default: False
  -rn, --rename         Rename the FASTA headers to be compatible with NML standards. Default: False
  -nc NAMES_CSV, --names-csv NAMES_CSV
                        Use the contents of a CSV to rename the input FASTA. Requires the following column headers: original_name, new_name
  -ncs, --no-constant-sites
                        Do not enable constant sites to be used for SNPs only tree generation. Default: Enabled
  -fi, --filter         Filter both the focal and background sequences based on genome completeness and length. Default: Not enabled
  -gc GENOME_COMPLETENESS, --genome-completeness GENOME_COMPLETENESS
                        Integer for the minimum genome completeness percentage for filtering. Default: 90
  -gl GENOME_LENGTH, --genome-length GENOME_LENGTH
                        Integer for the minimum genome length for filtering. Default: 29500
  -rp, --report         Generate a summary output report for the spora run. Default: Not enabled
  -v, --version         Show the current spora version then exit.
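
The options listed above can also be assembled programmatically, which helps when spora is launched from another script. The following is a minimal sketch, not part of spora itself: the helper name build_spora_command and the input file names are hypothetical, and only long option names taken from the help output are used.

```python
import shlex


def build_spora_command(focal, reference, master_fasta=None, outdir=None,
                        prefix=None, nthreads=None, filter_seqs=False,
                        report=False):
    """Assemble a spora argument list from options shown in the help output.

    This helper is illustrative only; it is not provided by spora.
    """
    # --focal-sequences and --reference are marked "Required" in the help text.
    cmd = ["spora", "--focal-sequences", focal, "--reference", reference]
    if master_fasta is not None:
        cmd += ["--master-fasta", master_fasta]
    if outdir is not None:
        cmd += ["--output-directory", outdir]
    if prefix is not None:
        cmd += ["--prefix", prefix]
    if nthreads is not None:
        cmd += ["--nthreads", str(nthreads)]
    # --filter and --report are boolean switches that take no value.
    if filter_seqs:
        cmd.append("--filter")
    if report:
        cmd.append("--report")
    return cmd


# Hypothetical input files: a .txt list of focal samples needs a master FASTA.
cmd = build_spora_command(
    "focal_ids.txt",
    "reference.gb",
    master_fasta="all_sequences.fasta",
    nthreads=8,
    report=True,
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once spora is installed and the conda environment is active. Options that are left unset fall back to the defaults stated in the help output.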
  
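
The --names-csv option requires a CSV with the column headers original_name and new_name. It can be convenient to check a file for those headers before starting a run. The sketch below is an assumption about how such a check might look, not spora's own validation code; the helper name load_rename_map and the sample names are hypothetical.

```python
import csv
import io

# Column headers required by --names-csv according to the help output.
REQUIRED_COLUMNS = {"original_name", "new_name"}


def load_rename_map(csv_file):
    """Read an open CSV file and return {original_name: new_name}.

    Raises ValueError if either required column header is missing.
    """
    reader = csv.DictReader(csv_file)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"names CSV is missing columns: {sorted(missing)}")
    return {row["original_name"]: row["new_name"] for row in reader}


# In-memory example with made-up sample names.
example = io.StringIO(
    "original_name,new_name\n"
    "sample_A,ON-0001\n"
    "sample_B,ON-0002\n"
)
print(load_rename_map(example))
```

For a file on disk, open it with open("names.csv", newline="") and pass the file object to the same function.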
