Building the SNP database
Andrew Potter
2025-05-12
build_snp_database.Rmd
Creating a new variantCell project
If you haven’t already done so, you can create a new project by using the command:
#Create new variantCell object
project <- variantCell$new()
to initialize the project.
Building the SNP database
To build the SNP database, first you need to add each sample individually, optionally specifying donor and recipient (please see tutorial for identifying donor/recipient).
# add one sample to SNP database
addSampleData = function(
sample_id,
vireo_path = NULL,
cellsnp_path,
cell_data,
data_type = "seurat",
prefix_text,
donor_type = NULL,
non_transplant_mode = FALSE,
min_cells = 0,
min_alt_frac = 0,
normalize = TRUE,
scale.factor = 10000,
sample_metadata = NULL
)
This is an example, setting donor and recipient (for transplant mode):
# usage example
project$addSampleData(
sample_id = "Sample1",
vireo_path = "/path/to/donor_ids.tsv",
cellsnp_path = "/path/to/cellSNP/dir/",
cell_data = seuratObj,
data_type = 'seurat',
prefix_text = "Patient1_Sample1_",
normalize = TRUE,
donor_type = c(donor0 = "Donor", donor1 = "Recipient")
)
For initial analysis, it is recommended not to perform any filtering (min_cells=0, min_alt_frac=0) as this will help facilitate downstream DE analysis and plotting. If normalize is set to TRUE, size factor normalization is performed, using total DP (SNP counts) for each cell. Downstream analysis can then optionally use the normalized SNP counts.
After the samples have been added, you can build a unified SNP
database by executing the function: buildSNPDatabase()