This function randomly permutes genomic locations within a genome defined by a genome file. Note: each shuffled interval stays on its original chromosome. Cross-chromosome shuffling is not supported at this point.

shuffle_bed(x, genome = NULL, excluded_region = NULL, sort = TRUE, seed = NULL)

Arguments

x

A GRanges.

genome

The reference genome for the BED data. genome can be a valid genome name accepted by GenomeInfoDb::Seqinfo() (e.g. GRCh37), hs37-1kg (a genome shipped with this package), or a custom chromosome size file (local or remote). If NULL, the function tries to obtain this information from the input data. Refer to read_bed().

excluded_region

A GRanges containing regions that you want to exclude from the shuffle. Shuffled intervals are guaranteed not to fall within such regions.

sort

Logical value indicating whether to sort the shuffled features. When the input data set is large, sorting can be very expensive. If you don't need the output to be sorted, set sort to FALSE.

seed

An integer seed for the random number generator. If NULL, the seed is randomly chosen.

Value

A GRanges containing shuffled features.
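
The behavior described above (each interval keeps its width and its chromosome, avoids excluded regions, and is reproducible for a fixed seed) can be illustrated with a minimal base-R sketch. This is not the implementation used by shuffle_bed(): the helper shuffle_within_chrom(), its max_tries argument, the retry-based placement strategy, and the toy chromosome sizes are all assumptions made for illustration, and plain data frames with 0-based, half-open coordinates stand in for GRanges.

```r
# Hypothetical helper (not part of bedtorch): move each interval to a random
# position on its own chromosome, retrying until it avoids excluded regions.
shuffle_within_chrom <- function(bed, chrom_sizes, excluded = NULL,
                                 seed = NULL, max_tries = 1000L) {
  if (!is.null(seed)) set.seed(seed)
  out <- bed
  for (i in seq_len(nrow(bed))) {
    chrom <- as.character(bed$chrom[i])
    width <- bed$end[i] - bed$start[i]
    # Excluded regions on the same chromosome only
    excl <- if (is.null(excluded)) NULL else
      excluded[excluded$chrom == chrom, , drop = FALSE]
    placed <- FALSE
    for (attempt in seq_len(max_tries)) {
      # Draw a 0-based start such that the interval fits in the chromosome
      new_start <- sample.int(chrom_sizes[[chrom]] - width + 1L, 1L) - 1L
      new_end <- new_start + width
      # Half-open intervals overlap when each starts before the other ends
      overlaps <- !is.null(excl) &&
        any(new_start < excl$end & new_end > excl$start)
      if (!overlaps) {
        placed <- TRUE
        break
      }
    }
    if (!placed) stop("could not place interval ", i)
    out$start[i] <- new_start
    out$end[i] <- new_end
  }
  # Sort by chromosome and start, analogous to sort = TRUE
  out[order(out$chrom, out$start), , drop = FALSE]
}

# Toy input: chromosome sizes are made up for the example
bed <- data.frame(chrom = c("21", "21", "22"),
                  start = c(100L, 5000L, 300L),
                  end   = c(150L, 5020L, 340L))
chrom_sizes <- c("21" = 100000L, "22" = 80000L)
excluded <- data.frame(chrom = "21", start = 0L, end = 50000L)

shuffled <- shuffle_within_chrom(bed, chrom_sizes,
                                 excluded = excluded, seed = 1)
print(shuffled)
```

In this sketch, calling the helper again with the same seed yields the same result, and the final ordering step is the part that sort = FALSE would skip in the real function.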

References

Manual page of bedtools shuffle: https://bedtools.readthedocs.io/en/latest/content/tools/shuffle.html

Examples

# Load BED tables
tbl <- read_bed(system.file("extdata", "example_merge.bed", package = "bedtorch"), use_gr = FALSE, genome = "hs37-1kg")
excluded <- read_bed(system.file("extdata", "example_intersect_y.bed", package = "bedtorch"), use_gr = FALSE)

# Basic usage
result <- shuffle_bed(tbl)
head(result)
#>    chrom    start      end score
#> 1:    21  7597606  7597609     7
#> 2:    21  7690377  7690390     1
#> 3:    21 14565410 14565415     4
#> 4:    21 20645610 20645615     2
#> 5:    21 24075302 24075308     9
#> 6:    21 26125197 26125203     5
#> -------
#> genome: hs37-1kg.

# Shuffle the data while excluding certain regions, with the RNG seed set to 1
result <- shuffle_bed(tbl, genome = "hs37-1kg", excluded_region = excluded, seed = 1)
head(result)
#>    chrom    start      end score
#> 1:    21  6841070  6841083     1
#> 2:    21 12201288 12201295     5
#> 3:    21 17633255 17633260     4
#> 4:    21 18120575 18120578     7
#> 5:    21 21612316 21612320     7
#> 6:    21 30845248 30845254     5
#> -------
#> genome: hs37-1kg.