Mapping Via Sbatch
The content on this page applies only to the Ecker Lab servers, but the same idea can be applied to other servers with some modification.
Remember that you need to copy the genome reference files to stampede2 and use the corresponding MappingConfig file for demultiplexing.
After demultiplexing, all the snakemake commands are also summarized in the {output_dir}/snakemake/sbatch directory. The snakemake_cmd.txt file contains the snakemake commands for all PCR index sub-directories. The sbatch.sh file is a submission script that automatically submits all of these commands via yap sbatch. yap sbatch controls the total number of jobs that run in parallel through sbatch.
output_dir
├── snakemake
│ ├── sbatch
│ │ ├── sbatch.sh
│ │ └── snakemake_cmd.txt
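Before submitting, it can help to know how many snakemake commands will be sent to the queue. The snippet below is a minimal sketch: count_commands is a hypothetical helper, not part of yap, and it assumes that snakemake_cmd.txt holds one command per line, which this page does not state explicitly.

```shell
# Hypothetical helper (not part of yap): count the non-empty lines
# in a command file.
# Assumption: snakemake_cmd.txt stores one snakemake command per line.
count_commands() {
    grep -c -v '^[[:space:]]*$' "$1"
}

# usage:
# count_commands {output_dir}/snakemake/sbatch/snakemake_cmd.txt
```

If the assumption holds, the number printed is the number of jobs that yap sbatch will eventually submit.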
# this command runs all the snakemake commands for mapping
yap sbatch \
--project_name mc-V2 \
--command_file_path $SCRATCH/{lib_name}/snakemake/sbatch/snakemake_cmd.txt \
--working_dir $SCRATCH/{lib_name}/snakemake/sbatch \
--time_str 12:00:00
I assume you put the output_dir in your stampede2 $SCRATCH directory with the same name. If you do not follow this assumption, you need to modify the sbatch and snakemake commands yourself.
# get your scratch dir location
# on tacc login node
echo $SCRATCH
# and then make a soft link to your home dir
# on tacc login node
ln -s $SCRATCH ~/scratch
# on local server, upload the output_dir to the tacc scratch dir
rsync -arv {output_dir} tacc:scratch/
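After the upload, it may be worth confirming that the library directory on $SCRATCH contains the two files used for submission. This is only a sketch: check_sbatch_layout is a hypothetical helper, not part of yap, and it checks nothing beyond the presence of sbatch.sh and snakemake_cmd.txt.

```shell
# Hypothetical helper (not part of yap): verify that a library directory
# contains the submission files described on this page.
check_sbatch_layout() {
    lib_dir="$1"
    for f in sbatch.sh snakemake_cmd.txt; do
        if [ ! -f "$lib_dir/snakemake/sbatch/$f" ]; then
            # report the first missing file and stop
            echo "missing: $lib_dir/snakemake/sbatch/$f" >&2
            return 1
        fi
    done
    echo "layout ok: $lib_dir"
}

# usage on the tacc login node:
# check_sbatch_layout "$SCRATCH/{lib_name}"
```

A non-zero return value suggests that the upload went to a different path than the one the sbatch commands expect.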
Just like qsub, you only need to execute sbatch.sh. It generates an sbatch script for each snakemake command and submits them, then waits for all commands to finish before exiting. I do not recommend running this as a separate sbatch job, because the execution time is long. You can instead run it in a screen session or with nohup.
# open a screen
screen -R sbatch
# in that screen, activate the mapping environment
conda activate mapping
# run the submitter interactively
sh sbatch.sh
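The nohup route mentioned above could look like the following sketch. run_with_nohup is a hypothetical wrapper and sbatch.log is an assumed log file name; neither comes from yap. The wrapper starts the script in the background, redirects its output to the log, and prints the process ID so you can check on it later.

```shell
# Hypothetical wrapper: start a script with nohup in the background,
# send stdout and stderr to a log file, and print the background PID.
run_with_nohup() {
    script="$1"
    log="$2"
    nohup sh "$script" > "$log" 2>&1 &
    echo "$!"
}

# usage, from the snakemake/sbatch directory with the mapping
# environment already activated (sbatch.log is an assumed name):
# run_with_nohup sbatch.sh sbatch.log
```

Unlike screen, nohup gives you no session to re-attach to, so progress has to be followed through the log file, for example with tail -f.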
After mapping, you can rsync the whole output_dir from the remote server back to the same location. If you rsync to the same path, you may skip the FASTQ files because they are unchanged during mapping.
# the {output_dir} is the same dir uploaded to tacc
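# the trailing slashes copy the contents of the remote dir into {output_dir} rather than nesting a {lib_name} dir inside it
rsync -arv --exclude "*fq.gz" tacc:scratch/{lib_name}/ {output_dir}/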
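Skipping the FASTQ files is only safe if the local copies are still in place. As a minimal sketch, the hypothetical helper list_fastq (not part of yap) lists the files under a directory whose names match the same "*fq.gz" pattern used in the exclude option, so you can confirm they exist locally before relying on it.

```shell
# Hypothetical helper (not part of yap): list the files under a directory
# whose names match the "*fq.gz" pattern excluded from the download.
list_fastq() {
    find "$1" -type f -name '*fq.gz' | sort
}

# usage on the local server:
# list_fastq {output_dir}
```

An empty listing would suggest that the local FASTQ files were moved or deleted, in which case the exclude option should probably be dropped.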