star_to_mat returns a count matrix from STAR ReadsPerGene.out.tab files.

star_to_mat(dir, rgx, column, rm_ens_vers = TRUE)

Arguments

dir

A character scalar indicating the directory containing all the STAR ReadsPerGene.out.tab files.

rgx

A character scalar representing a regex used to parse out the sample name from the name of the ReadsPerGene.out.tab file.

column

An integer indicating the column to extract counts from: 1 = unstranded, 2 = 1st read strand aligned with RNA, 3 = 2nd read strand aligned with RNA.

rm_ens_vers

Logical indicating whether the version number (a trailing period followed by integers) should be removed from each gene ID.
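
As a rough illustration of what rgx and rm_ens_vers describe, the sketch below uses base R only. The file names, the example regex, and the use of regmatches() and sub() are assumptions made for illustration; this page does not show how star_to_mat applies these arguments internally.

```r
# Hypothetical file names, for illustration only
files <- c("sampleA_ReadsPerGene.out.tab", "sampleB_ReadsPerGene.out.tab")

# Assumed behaviour: rgx is matched against each file name and the
# matched text becomes the sample name
rgx <- "^[^_]+"
sample_names <- regmatches(files, regexpr(rgx, files))

# Assumed behaviour of rm_ens_vers = TRUE: drop a trailing ".<digits>"
gene_ids <- c("ENSG00000141510.17", "ENSG00000012048.23")
gene_ids_noversion <- sub("\\.[0-9]+$", "", gene_ids)

print(sample_names)
print(gene_ids_noversion)
```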

Value

A matrix where the column names are the sample names and the row names are the gene names.

Details

This function reads in all the '*ReadsPerGene.out.tab' files in the specified directory and combines them into a single count matrix.
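
The process described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: read_counts_sketch is a hypothetical helper, the mock files contain made-up values, and two behaviours are assumed rather than documented here, namely that summary rows whose ID starts with "N_" are dropped and that column counts only the count columns (so it is offset by one to skip the gene ID column in the file).

```r
# Write two tiny mock files (made-up values) laid out like STAR's
# ReadsPerGene.out.tab: gene ID followed by three count columns
dir <- file.path(tempdir(), "star_sketch")
dir.create(dir, showWarnings = FALSE)
mock <- function(counts) {
  data.frame(
    id = c("N_unmapped", "N_multimapping", "N_noFeature", "N_ambiguous",
           "ENSG00000000003.15", "ENSG00000000005.6"),
    unstranded = c(10, 5, 3, 1, counts),
    strand1 = c(10, 5, 40, 0, 0, 0),
    strand2 = c(10, 5, 4, 1, counts)
  )
}
write.table(mock(c(20, 7)), file.path(dir, "s1_ReadsPerGene.out.tab"),
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
write.table(mock(c(31, 0)), file.path(dir, "s2_ReadsPerGene.out.tab"),
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)

# Hypothetical helper mirroring the documented arguments
read_counts_sketch <- function(dir, rgx, column, rm_ens_vers = TRUE) {
  files <- list.files(dir, pattern = "ReadsPerGene\\.out\\.tab$",
                      full.names = TRUE)
  tabs <- lapply(files, read.delim, header = FALSE, stringsAsFactors = FALSE)
  # Assumption: drop the summary rows (IDs starting with "N_")
  tabs <- lapply(tabs, function(x) x[!grepl("^N_", x[[1]]), ])
  # Assumption: `column` indexes count columns, so skip the gene ID column.
  # Also assumes every file lists the same genes in the same order.
  mat <- do.call(cbind, lapply(tabs, function(x) x[[column + 1]]))
  ids <- tabs[[1]][[1]]
  if (rm_ens_vers) ids <- sub("\\.[0-9]+$", "", ids)
  rownames(mat) <- ids
  colnames(mat) <- regmatches(basename(files), regexpr(rgx, basename(files)))
  mat
}

# One column per sample, one row per gene, using the unstranded counts
mat <- read_counts_sketch(dir, rgx = "^[^_]+", column = 1)
print(mat)
```

A real call to star_to_mat would take the same four arguments; only the internals shown here are guesses.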