barAnno()
barAnno() draws a ggplot2-based barplot …
As input, barAnno() takes a named list of data frames with the column annotation.
First, the peak coordinates have to be read, and the coordinate columns must be named seqnames, start and end. ^Note: Since I will work with lists, I will use the purrr package for most of the data transformations.^
# read the peaks into a list
library(magrittr) # provides the %>% pipe
peak_list <- list.files("../testdata", "peak", full.names = TRUE, recursive = TRUE) %>%
purrr::set_names(c("PeakX", "PeakY")) %>%
purrr::map(~read.delim(.x)) %>%
# the files have the already annotated peaks, but we
# will use only the coordinates to make the annotation ourselves
# the columns seqnames, start and end are required
purrr::map(~dplyr::select(.x, seqnames, start, end))
peak_list[[1]][1:5,]
## seqnames start end
## 1 chr7 3664442 3664743
## 2 chr8 83199578 83200817
## 3 chr13 20147802 20148111
## 4 chr4 56639205 56639610
## 5 chr18 61614281 61614892
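Because the seqnames, start and end columns are required, it can help to check every element of the list before annotating. The following is a minimal sketch on made-up data; has_coord_cols() is a hypothetical helper written for this illustration and is not part of any package used here.

```r
# Toy peak sets standing in for the files read above (coordinates are made up)
toy_peaks <- list(
  PeakX = data.frame(seqnames = c("chr1", "chr2"),
                     start    = c(100L, 500L),
                     end      = c(200L, 900L)),
  PeakY = data.frame(chrom = "chr3", start = 10L, end = 50L)  # wrong column name
)

# Hypothetical helper: TRUE when a data frame has all required coordinate columns
has_coord_cols <- function(df) {
  all(c("seqnames", "start", "end") %in% colnames(df))
}

# One logical value per peak set, named after the list elements
coord_check <- vapply(toy_peaks, has_coord_cols, logical(1))
coord_check
```

Any set flagged FALSE would need its columns renamed before being converted to genomic ranges.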
Then, we can use ChIPseeker::annotatePeak() to annotate the peaks into genomic regions.
# Load annotations for mouse mm10 genome
library(TxDb.Mmusculus.UCSC.mm10.knownGene)
library(org.Mm.eg.db)
# Annotate peaks into csAnno object
peak_list_csAnno <- peak_list %>%
# convert to granges
purrr::map(~plyranges::as_granges(.x)) %>%
# annotate peaks
purrr::map(~ChIPseeker::annotatePeak(peak = .x, tssRegion = c(-2500, 2500),
TxDb = TxDb.Mmusculus.UCSC.mm10.knownGene,
annoDb = "org.Mm.eg.db", level = "transcript"))
Now that the peaks are annotated into the different genomic regions and we have a csAnno object for each set of peaks, we can retrieve the anno data frame inside each csAnno object. This is not strictly necessary, since the corresponding transformations are already done inside barAnno(), but it is advisable if we want to do more with these data.
peak_list <- peak_list_csAnno %>% purrr::map(~tibble::as_tibble(.x))
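As one example of "doing more" with the retrieved data frames, the detailed labels in the annotation column can be collapsed into coarse categories. The sketch below assumes labels of the form "Intron (transcript, intron 1 of 5)"; the rows, the transcript IDs and the region column name are made up for illustration.

```r
# Made-up annotation labels, assumed to follow the "Category (details)" pattern
toy_anno <- data.frame(
  annotation = c("Promoter (<=1kb)",
                 "Intron (uc009abc.1/12345, intron 1 of 5)",
                 "Distal Intergenic",
                 "Exon (uc009abc.1/12345, exon 2 of 6)"),
  stringsAsFactors = FALSE
)

# Drop everything from the first " (" onwards to keep only the coarse category
toy_anno$region <- sub(" \\(.*$", "", toy_anno$annotation)
toy_anno$region
```

Labels without a parenthesised detail, such as "Distal Intergenic", are left unchanged.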
We can run barAnno() using as input either the list of annotation data frames:
barAnno(anno_list = peak_list)
or the list of csAnno objects returned by ChIPseeker::annotatePeak():
barAnno(anno_list = peak_list_csAnno)
barAnno(anno_list = peak_list, fill_position = FALSE)
barAnno(anno_list = peak_list, anno_num = 2) # default
barAnno(anno_list = peak_list, anno_num = 3)
barAnno(anno_list = peak_list, anno_num = "all")
barAnno(anno_list = peak_list,
anno_names = c("Condition X", "Condition Y"),
names_order = c("Condition Y", "Condition X"))
We can also give all the peak sets the same name, in which case they are counted as a single set.
## Write counts
We can write the number and percentage of observations in each annotation by specifying counts_label = TRUE. The angle and the size of these labels can be controlled through counts_angle and counts_size. Here are some examples:
barAnno(anno_list = peak_list, counts_label = TRUE)
barAnno(anno_list = peak_list, anno_num = 3, counts_label = TRUE, counts_angle = 0)
barAnno(anno_list = peak_list, anno_names = c("cond1", "cond1"), protein = c("protX", "protY"),
counts_label = TRUE)
barAnno(anno_list = peak_list, counts_label = TRUE, counts_size = 5)
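To make clear what the labels report, the number and percentage of peaks per annotation can be computed by hand. This is only a sketch on made-up data, not the code that barAnno() runs internally; count_anno() is a hypothetical helper.

```r
# Toy stand-in for the annotated peak sets (categories are made up)
toy_list <- list(
  PeakX = data.frame(annotation = c("Promoter", "Promoter", "Distal", "Distal")),
  PeakY = data.frame(annotation = c("Promoter", "Distal", "Distal", "Distal", "Distal"))
)

# Hypothetical helper: count and percentage of peaks per annotation in one set
count_anno <- function(df) {
  n <- table(df$annotation)  # categories are sorted alphabetically
  data.frame(annotation = names(n),
             n          = as.integer(n),
             percent    = round(100 * as.integer(n) / sum(n), 1))
}

# One summary table per peak set
counts <- lapply(toy_list, count_anno)
counts$PeakY
```

Within each set the percentages add up to 100, which corresponds to the full height of a filled bar.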
barAnno(anno_list = peak_list, width = 0.6) # default
barAnno(anno_list = peak_list, width = 0.9)
barAnno(anno_list = peak_list, width = 1)
barAnno(anno_list = peak_list, main = "This is a title", subtitle = "This is a subtitle",
xlab = "This is the X-axis label", ylab = "This is the Y-axis label")
barAnno(anno_list = peak_list, legend_position = "none")
barAnno(anno_list = peak_list, legend_position = "left")
barAnno(anno_list = peak_list, legend_position = "bottom")
barAnno(anno_list = peak_list, legend_position = "top")
barAnno(anno_list = peak_list, xangle = 60)
If the length of the color_palette argument in barAnno() is 1 (e.g. Set2, the default), the function uses a predefined palette passed through scale_fill_brewer(). Available palettes can be found here: https://ggplot2.tidyverse.org/reference/scale_brewer.html. For example:
barAnno(anno_list = peak_list, color_palette = "Set1")
barAnno(anno_list = peak_list, color_palette = "Pastel2")
barAnno(anno_list = peak_list, color_palette = "Oranges")
However, if color_palette is a character vector of length greater than 1 that contains valid color names (e.g. c("blue", "gold3")), the function takes these colors and passes them through scale_fill_manual() to fill the bars. If the number of annotated regions is greater than the default of 2 (i.e. anno_num = 3 or anno_num = "all"), color_palette must contain as many colors as there are annotated regions.
barAnno(anno_list = peak_list, anno_num = "all",
color_palette = c("blue", "gold3", "pink", "darkgreen", "darkred", "orange", "purple"))
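The rule described above, where a single value is read as a palette name and several values as explicit colors, could be sketched as follows. pick_fill_scale() is a hypothetical helper written for this illustration; it mirrors the described behaviour and is not barAnno()'s actual code.

```r
library(ggplot2)

# Hypothetical helper mirroring the rule described above
pick_fill_scale <- function(color_palette = "Set2") {
  if (length(color_palette) == 1) {
    # a single value is treated as a ColorBrewer palette name
    scale_fill_brewer(palette = color_palette)
  } else {
    # several values are treated as explicit fill colors
    scale_fill_manual(values = color_palette)
  }
}

brewer_scale <- pick_fill_scale("Set1")
manual_scale <- pick_fill_scale(c("blue", "gold3"))
```

Either scale object can then be added to a ggplot with +, like any other ggplot2 scale.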
The colors can also be generated with other functions, such as rainbow() or circlize::rand_color().
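As a small sketch of generating the colors programmatically, the block below builds seven colors with rainbow(), assuming seven annotated regions as in the anno_num = "all" example with seven manual colors.

```r
# Seven colors, assuming one per annotated region
region_colors <- grDevices::rainbow(7)

# col2rgb() errors on invalid color specifications, so this doubles as a validity check
rgb_values <- grDevices::col2rgb(region_colors)
dim(rgb_values)
```

The resulting vector could then be supplied as color_palette = region_colors.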
Since barAnno() outputs a ggplot2-based bar plot, it can be further customized with scales or theme, etc.
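The sketch below illustrates this kind of customization. Since barAnno() is not called here, a plain ggplot2 bar plot on made-up counts stands in for its output; the same layers would be added to the returned object in the same way.

```r
library(ggplot2)

# Stand-in for the object returned by barAnno(): a filled bar plot on made-up counts
toy_counts <- data.frame(
  sample     = rep(c("PeakX", "PeakY"), each = 2),
  annotation = rep(c("Promoter", "Distal"), times = 2),
  n          = c(30, 70, 45, 55)
)
p <- ggplot(toy_counts, aes(x = sample, y = n, fill = annotation)) +
  geom_col(position = "fill", width = 0.6)

# Further customization with ordinary ggplot2 layers
p_custom <- p +
  scale_y_continuous(labels = scales::percent) +  # show proportions as percentages
  theme_minimal() +
  theme(legend.position = "bottom")
```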