R Modules

In mathematics, a module is one of the fundamental algebraic structures used in abstract algebra. A module over a ring is a generalization of the notion of vector space over a field, wherein the corresponding scalars are the elements of an arbitrary given ring (with identity) and a multiplication (on the left and/or on the right) is defined between elements of the ring and elements of the module. A module taking its scalars from a ring R is called an R-module.

Modules are very closely related to the representation theory of groups. They are also one of the central notions of commutative algebra and homological algebra, and are used widely in algebraic geometry and algebraic topology.

In R programming, the term has a different meaning. The key idea of the modules package is to provide a unit of source code that has its own scope. The main and most reliable infrastructure for such organizational units in the R ecosystem is the package. Modules can be used as stand-alone, ad hoc substitutes for a package or as subunits within a package.

We can create a module using the modules::module function. A module is similar to a function definition; it consists of:

  • the body of the module
  • the environment in which it is created (defined implicitly)
  • the environment used for the search path, in most cases baseenv() (defined implicitly)
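As a minimal sketch of the definition above, assuming the modules package from CRAN is installed, a module can be created from a braced body. The names greeter, prefix, and hello are made up for illustration.

```r
# Assumes install.packages("modules") has been run beforehand.
library(modules)

# The braced expression is the body of the module.
greeter <- module({
  prefix <- "Hello, "                           # hypothetical helper value
  hello <- function(name) paste0(prefix, name)  # paste0 is found via baseenv()
})

# Members are accessed with `$`, much like elements of a list.
greeter$hello("R")
```

The body only uses functions from base R, so it runs with the default search path without any extra imports.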

Like classes and objects, modules offer a way to group functions into one entity. They behave as first-class citizens in the sense that they can be treated like any other data structure in R:

  • they can be created anywhere, including inside another module
  • they can be passed to functions
  • they can be returned from functions

In addition, they provide:

  • local namespace features by declaring imports and exports
  • encapsulation by introducing a local scope
  • code reuse by various modes of composition
  • interchangeability with other modules implementing the same interface
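The features listed above can be sketched as follows, again assuming the modules package is installed. The module names median_mod and mean_mod and the functions center, helper, and apply_center are hypothetical and chosen only for this example.

```r
library(modules)

# A module with a declared import and a declared export.
median_mod <- module({
  import("stats", "median")   # make only stats::median visible inside the module
  export("center")            # only `center` is part of the public interface

  center <- function(x) x - helper(x)
  helper <- function(x) median(x)   # not exported, so hidden from users
})

# A second module implementing the same interface with the mean.
mean_mod <- module({
  export("center")
  center <- function(x) x - mean(x)
})

# Modules are ordinary values, so they can be passed to functions.
apply_center <- function(mod, x) mod$center(x)

apply_center(median_mod, c(1, 2, 6))
apply_center(mean_mod, c(1, 2, 6))
```

Because both modules expose a function called center, either one can be handed to apply_center without changing the calling code.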

R packages are collections of R functions, compiled code, and sample data. They are stored in a directory called “library” in the R environment. By default, a set of packages is installed along with R.
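The library directories and the packages that come with R can be inspected from the console. The sketch below uses only base R functions; the variable names are arbitrary.

```r
# Directories in which R looks for installed packages.
lib_dirs <- .libPaths()
print(lib_dirs)

# Packages that are part of every R installation (priority "base").
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)

# Location of one installed package on disk.
stats_dir <- find.package("stats")
print(stats_dir)
```

The exact paths printed depend on the operating system and on how R was installed.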

A core set of packages is included with the installation of R, with more than 15,000 additional packages available at the Comprehensive R Archive Network (CRAN), Bioconductor, Omegahat, GitHub, and other repositories. 

Other R package resources include Crantastic, a community site for rating and reviewing all CRAN packages, and R-Forge, a central platform for the collaborative development of R packages, R-related software, and projects. 

The Bioconductor project provides R packages for the analysis of genomic data used in Bioinformatics. This includes object-oriented data-handling and analysis tools for data from Affymetrix, cDNA microarray, and next-generation high-throughput sequencing methods. 

The following packages are commonly used in R:

To load data

  • DBI 
  • odbc 
  • RMySQL
  • XLConnect/xlsx 
  • foreign 
  • haven 

To manipulate data

  • tidyverse 
  • dplyr 
  • tidyr 
  • stringr 
  • lubridate 

To visualize data

  • ggplot2 
  • ggvis 
  • rgl 
  • htmlwidgets 
  • googleVis 

To model data

  • tidymodels 
  • car 
  • mgcv 
  • lme4/nlme 
  • randomForest 
  • multcomp 
  • vcd 
  • glmnet 
  • survival 
  • caret 

To report results

  • shiny 
  • R Markdown 
  • xtable 

For Spatial data

  • sp/maptools 
  • maps 
  • ggmap 

For Time Series and Financial data

  • zoo 
  • xts 
  • quantmod 

To write high performance R code

  • Rcpp 
  • data.table 
  • parallel 
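Of the three packages above, parallel ships with R itself, so it can be tried without installing anything. The following is a small sketch, assuming the machine allows two local worker processes to be started; slow_square is a made-up stand-in for an expensive computation.

```r
library(parallel)

# Hypothetical slow task: pause briefly, then square the input.
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

cl <- makeCluster(2)                    # two workers: an arbitrary choice
res <- parLapply(cl, 1:4, slow_square)  # distribute the inputs across workers
stopCluster(cl)                         # always release the workers

unlist(res)
```

For tasks this small, the overhead of starting workers outweighs any gain; the pattern pays off only when each call does substantial work.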

To work with the web

  • XML 
  • jsonlite 
  • httr 

To write our own R packages

  • devtools 
  • testthat 
  • roxygen2 
