Reads epigenomic annotation files from a CSV manifest and builds a unified epiRomics database for downstream analysis. Supports optional extra columns for ChIP/histone peak files (signal, pval, qval, peak).
Arguments
- db_file
a character string giving the path to a properly formatted CSV manifest that lists the epigenomic data files (see the vignette for details).
- txdb_organism
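a character string naming the TxDb object associated with your data, written as in the example below ("TxDb.Hsapiens.UCSC.hg38.knownGene::TxDb.Hsapiens.UCSC.hg38.knownGene").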
- genome
a character string naming the genome assembly associated with your data (e.g. "mm10", "hg38", "rn6", "dm6"). The value must match the assembly referenced by txdb_organism and by the CSV manifest's genome column; epiRomics itself is organism-agnostic.
- organism
a character string naming the OrgDb annotation package associated with your data (e.g. "org.Hs.eg.db").
- extraCols
named character vector of extra columns to read from ChIP/histone BED files. Default is NULL (no extra columns). Set to c(signal = "numeric", pval = "numeric", qval = "numeric", peak = "numeric") to read narrowPeak columns.
- data_dir
optional character string specifying the root directory for resolving relative file paths in the CSV manifest. When provided, any relative path in the path column is prefixed with data_dir. This is especially useful with cached data from cache_data, where the CSV uses relative paths. Default is NULL (paths used as-is).
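The effect of data_dir on manifest paths can be sketched in base R. This is only an illustration of the documented behavior, not the package's implementation: resolve_paths is a hypothetical helper, the file names are made up, and the rule used here to detect absolute paths (leading "/", "~", or a drive letter) is an assumption.

```r
# Hypothetical helper (not part of epiRomics): prefix relative paths with data_dir.
resolve_paths <- function(paths, data_dir = NULL) {
  # With data_dir = NULL, paths are returned as-is.
  if (is.null(data_dir)) {
    return(paths)
  }
  # Assumption: a path is absolute if it starts with "/", "~", or a drive letter.
  is_abs <- grepl("^(/|~|[A-Za-z]:)", paths)
  ifelse(is_abs, paths, file.path(data_dir, paths))
}

# Made-up manifest paths: one relative, one absolute.
manifest_paths <- c("histone/H3K27ac.bed", "/data/ATAC.bed")

resolved <- resolve_paths(manifest_paths, data_dir = "cache")
print(resolved)
```

Only the relative entry gains the prefix; the absolute entry is left unchanged.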
Examples
## build_database reads external BED/BigWig files from a CSV manifest.
## Confirm that a missing file produces a clean error:
tryCatch(
  build_database("nonexistent.csv",
    txdb_organism = paste0("TxDb.Hsapiens.UCSC.hg38.knownGene::",
                           "TxDb.Hsapiens.UCSC.hg38.knownGene"),
    genome = "hg38", organism = "org.Hs.eg.db"),
  error = function(e) message(e$message)
)
#> build_database: The following files do not exist: nonexistent.csv
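The column layout that the extraCols value above refers to can be shown with base R alone. In the narrowPeak format, six standard BED fields are followed by signalValue, pValue, qValue, and peak, which is what the four named entries map onto. The sketch below does not call build_database and does not reflect how epiRomics parses files internally; the names chosen for the first six columns, the temporary file, and the peak values are all illustrative assumptions.

```r
# The extraCols value suggested in the argument description.
extra_cols <- c(signal = "numeric", pval = "numeric",
                qval = "numeric", peak = "numeric")

# Illustrative names and types for the six standard BED fields (an assumption).
bed6 <- c(chrom = "character", start = "integer", end = "integer",
          name = "character", score = "numeric", strand = "character")
col_classes <- c(bed6, extra_cols)

# Write a single made-up narrowPeak record to a temporary file.
tmp <- tempfile(fileext = ".narrowPeak")
writeLines("chr1\t100\t200\tpeak1\t500\t.\t12.5\t8.1\t6.3\t45", tmp)

# Read it back, typing columns 7 to 10 according to extra_cols.
peaks <- read.table(tmp, sep = "\t",
                    col.names = names(col_classes),
                    colClasses = unname(col_classes))
print(peaks[, names(extra_cols)])
```

With extraCols left at NULL, only the standard BED fields would be read and these four columns would not be available downstream.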