Last updated: 2023-01-04
chromRegions()
chromRegions() takes a file with the sizes of the chromosomes and draws a ggplot2-based bar plot. It then takes a list of regions in a BED-like format and draws them onto each chromosome. The inputs chrom_sizes and regions_list can be supplied either as paths to the files where the information is stored or as data frames (for regions_list, a list of paths or a list of data frames).
The chrom_sizes input looks like this:
chrom_sizes = "../testdata/mm10.chrom.sizes"
chrom_sizes %>% read.delim(header = F) %>% head()
## V1 V2
## 1 1 195471971
## 2 10 130694993
## 3 11 122082543
## 4 12 120129022
## 5 13 120421639
## 6 14 124902244
The regions_list input is a list of file paths (character) or data frames. It can have any number of elements and has the following structure:
regions_list <- list("Regions1" = "../testdata/mm10.regions.tsv",
"Regions2" = "../testdata/mm10.regions2.tsv",
"Regions3" = "../testdata/mm10.regions3.tsv")
regions_list[[1]] %>% read.delim(header = F) %>% head()
## V1 V2 V3 V4 V5 V6
## 1 1 57348975 57377520 region2 28545 -
## 2 1 91403055 91406029 region4 2974 -
## 3 1 92992344 92997067 region6 4723 -
## 4 1 125174891 125177979 region10 3088 +
## 5 10 18796805 18831930 region18 35125 -
## 6 10 20310505 20312760 region19 2255 -
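Since regions_list also accepts data frames, a region set can be built in memory instead of being read from a file. The following is a minimal sketch, assuming that chromRegions() expects the same six BED-like columns as the files above (chromosome, start, end, name, length, strand). The column names V1 to V6 mimic what read.delim(header = F) produces; whether the function relies on those names or only on column positions is an assumption. The coordinates are copied from the output shown above.

```r
# Build a BED-like region set in memory:
# chromosome, start, end, name, length, strand.
# Column names V1..V6 mimic read.delim(header = F); this naming is an assumption.
regions_df <- data.frame(
  V1 = c("1", "1", "10"),
  V2 = c(57348975, 91403055, 18796805),
  V3 = c(57377520, 91406029, 18831930),
  V4 = c("region2", "region4", "region18"),
  V5 = c(28545, 2974, 35125),
  V6 = c("-", "-", "-"),
  stringsAsFactors = FALSE
)

# Sanity check: in this data the fifth column equals end - start.
stopifnot(all(regions_df$V3 - regions_df$V2 == regions_df$V5))

# Named list with the same shape as regions_list above.
regions_in_memory <- list("Regions1" = regions_df)
```

The resulting regions_in_memory object could then be passed as regions_list in place of the list of file paths.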
The default run requires only the chrom_sizes and regions_list arguments, supplied either as paths to files or as data frames.
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes", regions_list = regions_list)
# Read the data
chrom_sizes = "../testdata/mm10.chrom.sizes" %>% read.delim(header = F)
regions = regions_list %>% purrr::map(~read.delim(.x, header = F))
chromRegions(chrom_sizes = chrom_sizes, regions_list = regions)
The order in which the chromosomes are plotted can be set with the chr_order argument:
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
chr_order = c(1,3,4,2,5,6,7,9,8,19,11,10,12,13,15,14,16,17,18,"X","Y","M", "MT") )
Genome assemblies of many species often contain chromosomes or scaffolds with unusual names that clutter the plot. These can be excluded using the chr_exclude argument, which takes a vector of regular expressions matching the chromosomes to exclude. By default, chr_exclude removes the most common unusual chromosome names; change the chr_exclude argument to remove more chromosomes or to keep all of them.
An example that excludes all the chromosomes that contain a dot in the name:
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
chr_exclude = c("\\."))
The following example does not remove any chromosome:
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
chr_exclude = c(" "))
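To preview which chromosome names a set of chr_exclude patterns would match, the patterns can be tried on a vector of names with base R. This is only a sketch: the chromosome names below are hypothetical, and using grepl() with the patterns joined by "|" is an assumption about the matching, not a description of what chromRegions() does internally.

```r
# Hypothetical chromosome names, including scaffold-like names with a dot.
chrom_names <- c("1", "2", "X", "GL456210.1", "JH584304.1", "MT")

# Patterns in the style accepted by chr_exclude (regular expressions).
patterns <- c("\\.")

# Join the patterns into a single alternation and flag the matching names.
to_drop <- grepl(paste(patterns, collapse = "|"), chrom_names)
kept <- chrom_names[!to_drop]
print(kept)

# A single space matches none of these names, so nothing would be excluded.
none_dropped <- !any(grepl(" ", chrom_names))
```

With these names, the escaped dot drops the two scaffold-like entries and keeps the rest, while the single-space pattern matches nothing.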
Note that, because the chromosome names are converted to an ordered factor using chr_order, any chromosome name that does not appear in chr_order will be grouped and plotted in an NA category.
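The NA category follows from how R factors work: values that are not among the factor levels become NA. A minimal sketch with hypothetical chromosome names (the variable names are illustrative and not taken from the function's source):

```r
# Hypothetical chromosome names found in the input.
chrom_names <- c("1", "2", "X", "MT")

# An order that omits "MT".
chr_order <- c("1", "2", "X")

# Values missing from the levels become NA in the ordered factor.
chrom_factor <- factor(chrom_names, levels = chr_order, ordered = TRUE)
print(chrom_factor)

# Only the last element ("MT") is NA.
missing_from_order <- chrom_names[is.na(chrom_factor)]
```

Comparing the chromosome names in the input against chr_order in this way is a quick check for names that would end up in the NA category.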
chromRegions() allows flipping the axes by setting the coord_flip argument to TRUE.
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
coord_flip = T)
By default, chromRegions() draws a line/rectangle and a point in the middle of each region. To avoid drawing the points, set the draw_points argument to FALSE.
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
draw_points = F)
By default, the regions are colored by region set (i.e. each element in regions_list). This can be controlled with the colors argument, which accepts a character vector of valid color names with the same length as regions_list.
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
colors = c("Black", "Yellow", "Green"))
To color by strand instead, set color_by to "strand".
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
colors = c("Darkgreen", "Darkred"),
color_by = "strand")
Now, imagine that we have regions without a defined strand (e.g. most ChIP-seq peaks). In this case, color_by is internally converted to "region" and the regions are colored by region set (i.e. the elements in regions_list). The following example uses a single region set whose strand values are converted to “.”.
# Read and format regions file to have strand as "."
regions_no_strand <- list("Regions1" = read.delim("../testdata/mm10.regions.tsv", header = F) %>% dplyr::mutate(V6 = "."))
# Draw the plot
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_no_strand,
color_by = "strand",
colors = c("Gold3", "Darkgreen"))
A title and a subtitle can be supplied through the title and subtitle arguments, respectively. By default, both are set to NULL, and each accepts a character vector of length 1.
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
title = "This is a title",
subtitle = "This is a subtitle")
The axis labels can also be set or removed through the xlab and ylab arguments. By default, the X axis label is set to “Chromosome” and the Y axis label is set to "". Either label can be removed by setting the corresponding argument to NULL, or changed to any other value.
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
xlab = NULL,
ylab = "Position in the chromosome")
Finally, a caption can be included in the bottom-right corner by setting the caption argument. By default, caption is set to NULL; it can be set to TRUE or to any character string. A character string is written as-is in the bottom-right corner, whereas TRUE writes the number of regions in the input region sets.
Here is an example with a character string:
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
caption = "This is a caption")
On the other hand, if caption is set to TRUE, the caption shows the number of regions in the input region sets (regions_list).
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
caption = TRUE)
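To check the numbers reported by caption = TRUE independently, the regions in each set can be counted directly. This is a sketch on two small hypothetical region sets; the caption string built here is purely illustrative, since the exact wording that chromRegions() writes is not shown in this document.

```r
# Two small hypothetical region sets (chromosome, start, end).
regions <- list(
  Regions1 = data.frame(V1 = c("1", "1", "10"), V2 = c(100, 500, 900), V3 = c(200, 650, 1000)),
  Regions2 = data.frame(V1 = c("2", "3"), V2 = c(10, 40), V3 = c(30, 80))
)

# Number of regions (rows) in each set.
n_regions <- vapply(regions, nrow, integer(1))
print(n_regions)

# Illustrative caption text; the format used by chromRegions() may differ.
caption_text <- paste0(names(n_regions), ": ", n_regions, collapse = "; ")
```

When the region sets are stored as file paths, they would first need to be read into data frames, as in the purrr::map() call shown earlier.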
By default, the legend is placed at the bottom of the plot. This can be changed by setting the legend argument to one of “bottom”, “right”, “top”, “left” or “none” (no legend). The legend argument is passed to ggpubr::theme_pubr().
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
legend = "right")
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
legend = "none")
### Y axis
By default, the Y axis text is not plotted, because y_text_size is set to NULL. It can be plotted by setting the y_text_size argument to a number, which is used as the size of the text on the Y axis.
chromRegions(chrom_sizes = "../testdata/mm10.chrom.sizes",
regions_list = regions_list,
y_text_size = 10)
Since chromRegions() outputs a ggplot2-based bar plot, it can be further customized like any other ggplot2-based plot.