Calculates the mean ChIP signal for each ORF in the genome. The function goes through every included feature in the supplied gff data and, for each one, adds the ORF length (in bp) and the mean of the signal collected from the supplied ChIP-seq data. The output can then be used to analyse mean signal as a function of ORF length. The function takes as input the wiggle data as a list of 16 chromosomes (output of readall_tab).

Note: Our wiggle data always contains gaps (missing chromosome coordinates and ChIP-seq signal). The function handles these by skipping affected genes. The number of skipped genes in each chromosome is printed to the console, as well as the final count (and percentage) of skipped genes.
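The per-ORF calculation and the gap handling described above can be illustrated with a minimal sketch. This is not the hwglabr implementation: the helper name mean_signal_sketch, the column names position, signal, start and end, the inclusive length definition (end - start + 1), and the rule "skip an ORF when fewer positions are covered than its length" are all assumptions made for illustration.

```r
# Hypothetical helper for one chromosome; NOT the hwglabr source code.
# Assumes 'chr_signal' has columns 'position' and 'signal', and
# 'gff_chr' has columns 'start' and 'end' (all names are assumptions).
mean_signal_sketch <- function(chr_signal, gff_chr) {
  # ORF length in bp, assuming inclusive start and end coordinates
  gff_chr$length <- gff_chr$end - gff_chr$start + 1
  gff_chr$mean_signal <- NA_real_
  skipped <- 0

  for (i in seq_len(nrow(gff_chr))) {
    # Signal positions that fall inside this ORF
    in_orf <- chr_signal$position >= gff_chr$start[i] &
      chr_signal$position <= gff_chr$end[i]

    # Assumed gap rule: skip the ORF if any position is missing
    if (sum(in_orf) < gff_chr$length[i]) {
      skipped <- skipped + 1
      next
    }
    gff_chr$mean_signal[i] <- mean(chr_signal$signal[in_orf])
  }

  # Report the skipped count and percentage for this chromosome
  message("Skipped genes: ", skipped, " (",
          round(100 * skipped / nrow(gff_chr), 1), "%)")
  gff_chr
}

# Toy data: positions 11 to 14 are missing, so the second ORF is skipped
chr_signal <- data.frame(position = c(1:10, 15:20),
                         signal = c(rep(2, 10), rep(4, 6)))
gff_chr <- data.frame(seqname = "chrI",
                      start = c(1, 8, 15),
                      end = c(5, 16, 20))
result <- mean_signal_sketch(chr_signal, gff_chr)
```

In this toy example the skipped ORF stays in the result with NA for mean_signal, mirroring the behaviour documented for the real function.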

signal_per_orf_length(inputData, gff, gffFile, saveFile = FALSE)

Arguments

inputData

List of wiggle data for the 16 chromosomes (output of readall_tab). No default.

gff

Optional data frame of the gff providing the ORF coordinates. Must be provided if gffFile is not. No default. Note: You can use the function gff_read in hwglabr to load your selected gff file.

gffFile

Optional string indicating the path to the gff file providing the ORF coordinates. Must be provided if gff is not. No default.

saveFile

Boolean indicating whether the output should be written to a .txt file (in the current working directory). If saveFile = FALSE, the output is printed to the screen or stored in an R object (if assigned). Defaults to FALSE.

Value

A data frame equivalent to the supplied gff table with the following additional columns:

  1. length: Length of the ORF in bp

  2. mean_signal: Mean of the signal collected from the supplied data

Note: Skipped genes are included in the output with 'NA' for mean_signal.
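Because skipped genes carry NA in mean_signal, they usually need to be dropped before relating signal to ORF length. The following sketch uses a small made-up data frame (orf_signal, with invented values) that stands in for the function's output; only the length and mean_signal column names come from this page.

```r
# Made-up stand-in for the returned data frame (values are illustrative only)
orf_signal <- data.frame(length = c(300, 1200, 2500, 900),
                         mean_signal = c(1.8, 1.1, NA, 1.4))

# Count the skipped genes, then keep only ORFs with a computed mean signal
n_skipped <- sum(is.na(orf_signal$mean_signal))
complete <- orf_signal[!is.na(orf_signal$mean_signal), ]

# Rank correlation between ORF length and mean signal
rho <- cor(complete$length, complete$mean_signal, method = "spearman")

# Simple linear fit of mean signal against log10 ORF length
fit <- lm(mean_signal ~ log10(length), data = complete)
```

Spearman correlation and a log-scale linear fit are just two possible choices here; the appropriate analysis depends on the data.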

Examples

# NOT RUN {
signal_per_orf_length(WT, gff = gff)

signal_per_orf_length(WT, gffFile = "S288C_annotation_modified.gff", saveFile = TRUE)
# }