Prepare PlateInfo & bcl2fastq SampleSheet
Prepare a bcl2fastq sample sheet for demultiplexing the Illumina primer index
Related Commands
# Print out template of the plate info
yap default-plate-info
# Make bcl2fastq sample sheet based on the plate info file
yap make-sample-sheet
Purpose
In this step, we prepare a PlateInfo file that contains sample metadata and PCR index information of the plates in your library. This information is used to generate a SampleSheet file for bcl2fastq to demultiplex the PCR index when generating raw FASTQ files.
Input
Library preparation information, most importantly:
The sample metadata (you determine what is necessary for your experiment)
The plate IDs of this library
The barcode information of this library
Step 1: Prepare a PlateInfo file
Get the PlateInfo file template and manually edit plate and barcode information
# print V1 template
yap default-plate-info -v V1
# print V2 template
yap default-plate-info -v V2
PlateInfo Examples
This example contains four plates
Each plate is indexed by one PCR index
Primer names must be unique for the same sequencing run.
PlateInfo section must be tab-separated (\t).
When demultiplexing, yap automatically separates the single multiplex group into six, so during mapping each plate has six commands, the same as in the "V2 Index Library" example below.
[CriticalInfo]
n_random_index=384
input_plate_size=384
pool_id=Pool_9
tube_label=Pool_9
email=your_email_address1@salk.edu;your_email_address2@salk.edu;your_email_address3@salk.edu
[LibraryInfo]
lib_comp_date=200518
project=DVC
organism=mm
dev_stage_age=P120
tissue_cell_type=VC
lib_type=snmCT-seq
sequencer=NovaSeq
se_pe=pe
read_length=150
requested_by=HL
[PlateInfo]
plate_id multiplex_group primer_name
DVC200116_P120_VC_B_M_1_4 1 D11
DVC200116_P120_VC_B_M_1_6 1 N6
DVC200116_P120_VC_B_M_1_8 1 K22
DVC200116_P120_VC_B_M_2_4 1 P8
# each plate is indexed by a single PCR index, use multiplex_group 1 in this case
This example contains four plates
Each plate has six multiplex groups
All primer names must be unique for the same sequencing run.
PlateInfo section must be tab-separated (\t).
[CriticalInfo]
n_random_index=384
input_plate_size=384
pool_id=Pool_9
tube_label=Pool_9
email=your_email_address1@salk.edu;your_email_address2@salk.edu;your_email_address3@salk.edu
[LibraryInfo]
lib_comp_date=200518
project=DVC
organism=mm
dev_stage_age=P120
tissue_cell_type=VC
lib_type=snmCT-seq
sequencer=NovaSeq
se_pe=pe
read_length=150
requested_by=HL
[PlateInfo]
plate_id multiplex_group primer_name
DVC200116_P120_VC_B_M_1_4 1 D11
DVC200116_P120_VC_B_M_1_4 2 F9
DVC200116_P120_VC_B_M_1_4 3 I10
DVC200116_P120_VC_B_M_1_4 4 C16
DVC200116_P120_VC_B_M_1_4 5 D2
DVC200116_P120_VC_B_M_1_4 6 I4
DVC200116_P120_VC_B_M_1_6 1 N6
DVC200116_P120_VC_B_M_1_6 2 J17
DVC200116_P120_VC_B_M_1_6 3 N12
DVC200116_P120_VC_B_M_1_6 4 L19
DVC200116_P120_VC_B_M_1_6 5 D7
DVC200116_P120_VC_B_M_1_6 6 J5
DVC200116_P120_VC_B_M_1_8 1 K22
DVC200116_P120_VC_B_M_1_8 2 J13
DVC200116_P120_VC_B_M_1_8 3 E10
DVC200116_P120_VC_B_M_1_8 4 M21
DVC200116_P120_VC_B_M_1_8 5 J6
DVC200116_P120_VC_B_M_1_8 6 C4
DVC200116_P120_VC_B_M_2_4 1 P8
DVC200116_P120_VC_B_M_2_4 2 J19
DVC200116_P120_VC_B_M_2_4 3 I21
DVC200116_P120_VC_B_M_2_4 4 C9
DVC200116_P120_VC_B_M_2_4 5 P10
DVC200116_P120_VC_B_M_2_4 6 J4
Starting with v1.3.5, if each plate in your library is indexed by a single PCR index, you can put each plate in a single row with multiplex_group = 1. The demultiplexing pipeline will recognize this and automatically separate the files into multiple multiplex groups.
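The two V2 rules above (tab-separated PlateInfo section, unique primer names within a sequencing run) are easy to check before running the pipeline. The following is a minimal sketch, not part of yap: the function names read_plate_info_rows and check_v2_plate_info are hypothetical, and the parsing assumes the file layout shown in the examples (section names in square brackets, comment lines starting with #).

```python
from collections import Counter


def read_plate_info_rows(text):
    """Return (header, rows) from the [PlateInfo] section of a PlateInfo file.

    Hypothetical helper: assumes sections are marked as [Name] and that
    lines starting with '#' are comments, as in the examples above.
    """
    in_section = False
    table = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            in_section = stripped == "[PlateInfo]"
            continue
        if not in_section or not stripped or stripped.startswith("#"):
            continue
        # Split on tabs only, so space-separated rows are detected as errors.
        table.append(line.split("\t"))
    if not table:
        raise ValueError("no [PlateInfo] section found")
    return table[0], table[1:]


def check_v2_plate_info(text):
    """Return a list of human-readable problems (empty if none were found)."""
    header, rows = read_plate_info_rows(text)
    if header != ["plate_id", "multiplex_group", "primer_name"]:
        return [f"unexpected header (is it tab-separated?): {header}"]
    problems = []
    for row in rows:
        if len(row) != 3:
            problems.append(f"row does not have 3 tab-separated fields: {row}")
    # Primer names must be unique within the same sequencing run.
    primer_counts = Counter(row[2] for row in rows if len(row) == 3)
    for primer, n in primer_counts.items():
        if n > 1:
            problems.append(f"primer_name {primer} is used {n} times")
    return problems


example = "\n".join([
    "[CriticalInfo]",
    "n_random_index=384",
    "[PlateInfo]",
    "plate_id\tmultiplex_group\tprimer_name",
    "DVC200116_P120_VC_B_M_1_4\t1\tD11",
    "DVC200116_P120_VC_B_M_1_6\t1\tN6",
    "DVC200116_P120_VC_B_M_1_8\t1\tD11",  # deliberately duplicated primer
])
print(check_v2_plate_info(example))
```

An empty list means neither problem was detected; it does not guarantee that yap will accept the file.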
This example contains eight plates
Every two plates share a primer quarter
Possible primer_quarter values are:
SetB_Q1, SetB_Q2, SetB_Q3, SetB_Q4
Set1_Q1, Set1_Q2, Set1_Q3, Set1_Q4
Each primer_quarter appears no more than twice for the same sequencing run.
PlateInfo section must be tab-separated (\t).
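The primer_quarter rules listed above can be checked with a few lines of Python. This is only an illustrative sketch: check_primer_quarters is a hypothetical helper, not a yap function, and the plate names in the usage example are placeholders.

```python
from collections import Counter

# The eight primer_quarter values listed above.
VALID_QUARTERS = {f"{s}_Q{q}" for s in ("SetB", "Set1") for q in range(1, 5)}


def check_primer_quarters(rows):
    """Check (plate_id, primer_quarter) pairs from a V1 [PlateInfo] section.

    Returns a list of problems: unknown primer_quarter values, and values
    that are used by more than two plates in the same sequencing run.
    """
    problems = []
    counts = Counter(quarter for _, quarter in rows)
    for quarter, n in sorted(counts.items()):
        if quarter not in VALID_QUARTERS:
            problems.append(f"unknown primer_quarter: {quarter}")
        elif n > 2:
            problems.append(f"{quarter} is used by {n} plates (maximum is 2)")
    return problems


# Placeholder plate IDs; SetB_Q1 is used three times and SetC_Q1 is not valid.
rows = [
    ("plate_1", "SetB_Q1"),
    ("plate_2", "SetB_Q1"),
    ("plate_3", "SetB_Q1"),
    ("plate_4", "SetC_Q1"),
]
for problem in check_primer_quarters(rows):
    print(problem)
```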
[CriticalInfo]
n_random_index=8
input_plate_size=384
pool_id=Pool_73
tube_label=Pool_73
email=your_email_address@salk.edu
[LibraryInfo]
lib_comp_date=180101
project=CEMBA
organism=mm
dev_stage_age=P56
tissue_cell_type=9C
lib_type=snmCseq2
sequencer=NovaSeq
se_pe=pe
read_length=150
requested_by=HL
[PlateInfo]
plate_id primer_quarter
CEMBA190530_9C_1 SetB_Q1
CEMBA190530_9C_2 SetB_Q1
CEMBA190530_9C_3 SetB_Q2
CEMBA190530_9C_4 SetB_Q2
CEMBA190620_9C_1 SetB_Q3
CEMBA190620_9C_2 SetB_Q3
CEMBA190620_9C_3 SetB_Q4
CEMBA190620_9C_4 SetB_Q4
Step 2: Run yap make-sample-sheet
yap make-sample-sheet takes a V1 or V2 PlateInfo file and generates a bcl2fastq sample sheet. The sample sheet and name pattern are generated automatically so that the pipeline can parse cell information during and after mapping.
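If you drive this step from a script, the command can be assembled from the flags documented in the help output of yap make-sample-sheet -h (--plate_info_path, --output_prefix, and the optional --header_path). The sketch below is an assumption-laden illustration: the helper names and the file names Pool_9_plate_info.txt and Pool_9 are hypothetical, and the command is only executed when you call run_make_sample_sheet yourself.

```python
import shutil
import subprocess


def make_sample_sheet_command(plate_info_path, output_prefix, header_path=None):
    """Build the argument list for `yap make-sample-sheet`.

    Only flags that appear in the command's help output are used.
    """
    cmd = [
        "yap", "make-sample-sheet",
        "--plate_info_path", str(plate_info_path),
        "--output_prefix", str(output_prefix),
    ]
    if header_path is not None:
        cmd += ["--header_path", str(header_path)]
    return cmd


def run_make_sample_sheet(plate_info_path, output_prefix, header_path=None):
    """Run the command; raises if yap is not installed or exits with an error."""
    if shutil.which("yap") is None:
        raise RuntimeError("yap was not found on PATH")
    cmd = make_sample_sheet_command(plate_info_path, output_prefix, header_path)
    subprocess.run(cmd, check=True)


# Hypothetical file names, shown only to illustrate the resulting command line.
print(" ".join(make_sample_sheet_command("Pool_9_plate_info.txt", "Pool_9")))
```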
See usage in the command line
yap make-sample-sheet -h
usage: yap make-sample-sheet [-h] --plate_info_path PLATE_INFO_PATH
--output_prefix OUTPUT_PREFIX
[--header_path HEADER_PATH]
optional arguments:
-h, --help show this help message and exit
Required inputs:
--plate_info_path PLATE_INFO_PATH, -p PLATE_INFO_PATH
Path of the plate information file. (default: None)
--output_prefix OUTPUT_PREFIX, -o OUTPUT_PREFIX
Output prefix, will generate 2 sample sheets, 1 for
miseq, 1 for novaseq (default: None)
Optional inputs:
--header_path HEADER_PATH
Path to the sample sheet header that contains
sequencer info. Will use default if not provided.
(default: None)
Output
PlateInfo file recording library preparation information
Three sample sheets for bcl2fastq demultiplexing: one for MiSeq and two for NovaSeq (v1 or v1.5).
Output after bcl2fastq
Each raw FASTQ file generated by bcl2fastq with the SampleSheet from this step contains reads from multiple cells (8 in V1 and 64 in V2).
The random index is located at the 5' end of R1.
The PCR index is trimmed from both R1 and R2 and put into the read name.
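The cell counts quoted above match the example PlateInfo files if one assumes that the random indexes are split evenly across the multiplex groups of a plate: 8 random indexes in a single group for V1, and 384 random indexes across six groups for V2. The short sketch below only illustrates that arithmetic; the even-split relationship is inferred from the numbers on this page, and cells_per_fastq is a hypothetical helper rather than a yap function.

```python
def cells_per_fastq(n_random_index, n_multiplex_groups=1):
    """Number of cells whose reads share one raw FASTQ file.

    Assumes the random indexes are divided evenly among the multiplex groups.
    """
    if n_random_index % n_multiplex_groups != 0:
        raise ValueError("random indexes do not divide evenly into groups")
    return n_random_index // n_multiplex_groups


print(cells_per_fastq(8))       # V1 example: n_random_index=8
print(cells_per_fastq(384, 6))  # V2 example: n_random_index=384, six groups
```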