plot_enrichment_profile

plot_enrichment_profile(fileNames, sampleNames, bedFiles, basemod, outDir, threshA=129, threshC=129, windowSize=1000, colorA='#053C5E', colorC='#BB4430', colors=['#2D1E2F', '#A9E5BB', '#610345', '#559CAD', '#5E747F'], dotsize=0.5, smooth=50, min_periods=10, cores=None)
fileNames

name(s) of BAM file(s) with Mm and Ml tags

sampleNames

name(s) of sample for output file name labelling; valid names contain [a-zA-Z0-9_].
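
A quick way to check names before calling the function is a regular expression over the allowed character set. This is a minimal sketch: `is_valid_sample_name` is a hypothetical helper, not part of dimelo, and it assumes the rule means that every character must be a letter, digit, or underscore.

```python
import re

# Hypothetical helper (not part of dimelo): assumes a valid name is a
# non-empty string made up only of letters, digits, and underscores.
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_sample_name(name):
    return _VALID_NAME.fullmatch(name) is not None


print(is_valid_sample_name("test_1"))     # True
print(is_valid_sample_name("my sample"))  # False
```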

bedFiles

specified windows for region(s) of interest; optional 4th column in bed file to specify strand of region of interest as + or -. Default is to consider regions as all +. Reads will be oriented with respect to strand. Only reads overlapping regions defined in bed file will be extracted, regardless of windowSize. Plots are centered at the center of the bed file regions.
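
As an illustration of the expected input, the sketch below writes a small tab-separated bed file that includes the optional fourth strand column. The coordinates and the file name `regions.bed` are made up for the example.

```python
import csv
import os
import tempfile

# Made-up regions: (chrom, start, end, strand). The 4th column is optional;
# without it, every region is treated as "+".
regions = [
    ("chr1", 10000, 10500, "+"),
    ("chr1", 25000, 25400, "-"),
]

# Write the regions to a temporary directory as a tab-separated bed file.
bed_path = os.path.join(tempfile.mkdtemp(), "regions.bed")
with open(bed_path, "w", newline="") as handle:
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(regions)

with open(bed_path) as handle:
    print(handle.read())
```

The resulting path can then be passed as `bedFiles`.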

basemod

One of the following:

  • 'A' - extract mA only

  • 'CG' - extract mCpG only

  • 'A+CG' - extract mA and mCpG

outDir

directory to output plot

threshA

threshold above which to call an A base methylated (mA); default 129

threshC

threshold above which to call a C base methylated (mCpG); default 129
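
To make the thresholds concrete: in the SAM tags specification, the per-base modification scores stored in the Ml tag (ML in current versions of the specification) are integers from 0 to 255, where value N covers the probability range N/256 to (N+1)/256. The sketch below applies a threshold to a handful of made-up scores. `call_modified` is a hypothetical helper, not a dimelo function, and the strict "greater than" comparison is an assumption.

```python
def call_modified(ml_values, thresh=129):
    # Hypothetical helper: assumes a strict "greater than" comparison.
    return [value > thresh for value in ml_values]


ml_values = [3, 128, 129, 130, 255]  # made-up per-base scores
print(call_modified(ml_values))      # [False, False, False, True, True]

# Lower edge of the probability bin for the default threshold.
print(129 / 256)                     # 0.50390625
```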

windowSize

window size around center point of feature of interest to plot (+/-); default 1000 bp
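
One way to picture how `windowSize` and strand interact is to convert a genomic position into a coordinate relative to the region center. This is a sketch of that idea, not dimelo's implementation: the integer midpoint rule and the helper name `relative_position` are assumptions.

```python
def relative_position(pos, start, end, strand="+", window_size=1000):
    """Return pos relative to the region center, or None if it falls
    outside +/- window_size."""
    center = (start + end) // 2   # assumed midpoint rule
    offset = pos - center
    if strand == "-":
        offset = -offset          # orient with respect to strand
    return offset if abs(offset) <= window_size else None


print(relative_position(10300, 10000, 10500))              # 50
print(relative_position(10300, 10000, 10500, strand="-"))  # -50
print(relative_position(12000, 10000, 10500))              # None
```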

colorA

color in hex for mA; default #053C5E

colorC

color in hex for mCG; default #BB4430

colors

color list in hex for overlay plots; default is ['#2D1E2F', '#A9E5BB', '#610345', '#559CAD', '#5E747F']

dotsize

size of points; default is 0.5

smooth

window over which to smooth aggregate curve; default of 50 bp

min_periods

minimum number of bases to consider for smoothing; default of 10 bp
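
The interplay of `smooth` and `min_periods` can be illustrated with a plain rolling mean: each position is averaged over a centered window, and a window holding fewer than `min_periods` observed values yields no estimate. This is a sketch under that assumption; the source does not state how dimelo smooths the curve, and `rolling_mean` is a hypothetical helper.

```python
def rolling_mean(values, window, min_periods):
    """Centered rolling mean. None marks a missing input value or a window
    with fewer than min_periods observed values."""
    half = window // 2  # positions taken on each side of the center
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        observed = [v for v in values[lo:hi] if v is not None]
        if len(observed) >= min_periods:
            smoothed.append(sum(observed) / len(observed))
        else:
            smoothed.append(None)
    return smoothed


# Made-up fraction-modified values; None means no coverage at that position.
fractions = [0.0, None, 0.5, 1.0, None, None, 0.5]
print(rolling_mean(fractions, window=3, min_periods=2))
# [None, 0.25, 0.75, 0.75, None, None, None]
```

A larger `min_periods` suppresses estimates in sparsely covered stretches, at the cost of gaps in the smoothed curve.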

cores

number of cores over which to parallelize; default is all available
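
A common way to implement this kind of default, shown here as a sketch rather than as dimelo's code, is to fall back to the machine's CPU count when `cores` is `None`. `resolve_cores` is a hypothetical helper, and capping the request at the available count is an extra assumption.

```python
import multiprocessing


def resolve_cores(cores=None):
    # Hypothetical helper: None means "use every available core".
    available = multiprocessing.cpu_count()
    if cores is None:
        return available
    # Assumed behavior: clamp the request to the range [1, available].
    return max(1, min(cores, available))


print(resolve_cores())   # number of CPUs on this machine
print(resolve_cores(1))  # 1
```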

Example

For single file and region:

>>> import dimelo as dm
>>> dm.plot_enrichment_profile("dimelo/test/data/mod_mappings_subset.bam", "test", "dimelo/test/data/test.bed", "A+CG", "dimelo/dimelo_test", windowSize=500, dotsize=1)

To overlay multiple regions of interest, pass lists of sample names and bed files. Conversely, to overlay multiple samples over a single region, pass a list of bam files:

>>> dm.plot_enrichment_profile("dimelo/test/data/mod_mappings_subset.bam", ["test1","test2"], ["dimelo/test/data/test.bed","dimelo/test/data/test.bed"], "A", "dimelo/dimelo_test", windowSize=500, dotsize=1)

Return

  • Aggregate profile of fraction of bases modified centered at features of interest

  • Single molecules centered at features of interest

  • Base abundance centered at features of interest

dimelo-plot-enrichment-profile - CLI interface

Plot DiMeLo enrichment profile

dimelo-plot-enrichment-profile [-h] -f FILENAMES [FILENAMES ...] -s SAMPLENAMES
                               [SAMPLENAMES ...] -b BEDFILES [BEDFILES ...] -m {A,CG,A+CG} -o
                               OUTDIR [-t SMOOTH] [-n MIN_PERIODS] [--colorA COLORA]
                               [--colorC COLORC] [--colors COLORS [COLORS ...]] [-d DOTSIZE]
                               [-A THRESHA] [-C THRESHC] [-w WINDOWSIZE] [-p CORES]

dimelo-plot-enrichment-profile optional arguments

  • -h, --help - show this help message and exit

  • -A THRESHA, --threshA THRESHA - threshold above which to call an A base methylated (default: 129)

  • -C THRESHC, --threshC THRESHC - threshold above which to call a C base methylated (default: 129)

  • -w WINDOWSIZE, --windowSize WINDOWSIZE - window size around center point of feature of interest to plot (+/-) (default: 1000)

  • -p CORES, --cores CORES - number of cores over which to parallelize (default: None, which uses all available cores)

dimelo-plot-enrichment-profile required arguments

  • -f FILENAMES, --fileNames FILENAMES - bam file name(s)

  • -s SAMPLENAMES, --sampleNames SAMPLENAMES - sample name(s) for output file labelling

  • -b BEDFILES, --bedFiles BEDFILES - name of bed file(s) defining region(s) of interest

  • -m BASEMOD, --basemod BASEMOD - which base modification to extract; one of A, CG, A+CG

  • -o OUTDIR, --outDir OUTDIR - directory to output plot

dimelo-plot-enrichment-profile smoothing options

  • -t SMOOTH, --smooth SMOOTH - window over which to smooth aggregate curve (default: 50)

  • -n MIN_PERIODS, --min_periods MIN_PERIODS - minimum number of bases to consider for smoothing (default: 10)

dimelo-plot-enrichment-profile plotting options

  • --colorA COLORA - color in hex (e.g. "#BB4430") for mA (default: #053C5E)

  • --colorC COLORC - color in hex (e.g. "#BB4430") for mCG (default: #BB4430)

  • --colors COLORS - color list in hex (e.g. "#BB4430") for overlay plots (default: ['#2D1E2F', '#A9E5BB', '#610345', '#559CAD', '#5E747F'])

  • -d DOTSIZE, --dotsize DOTSIZE - size of points (default: 0.5)
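
Putting the flags together, the single-file Python example above corresponds to a command line like the one assembled below. The sketch only builds and prints the command. To run it yourself, for example with `subprocess.run(cmd, check=True)`, the `dimelo-plot-enrichment-profile` executable is assumed to be on your `PATH`.

```python
import shlex

# Flags taken from the usage synopsis; paths match the Python example.
cmd = [
    "dimelo-plot-enrichment-profile",
    "-f", "dimelo/test/data/mod_mappings_subset.bam",
    "-s", "test",
    "-b", "dimelo/test/data/test.bed",
    "-m", "A+CG",
    "-o", "dimelo/dimelo_test",
    "-w", "500",
    "-d", "1",
]

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))
```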