Initialization

Make sure that the R6 and cgraph package are loaded. Both packages are avaiable on CRAN.

  library(R6)
  library(cgraph)

A computational graph can be initialized by:

x <- cgraph$new()

This yields a new cgraph object which holds all the data associated with the graph (i.e. its nodes and their values) as well as all the methods that can be performed on the graph.

Nodes

A computational graph consists of a set of nodes. There are four types of nodes: constants, inputs, parameters, and operations. They have three common properties:

  • Name, how it is named.
  • Type, what kind of node it is.
  • Value, the value attached to it.

They differ in whether or not they are assigned a fixed value and whether or not they are differenatiated when differentiating a graph.

Type Fixed Value? Differentiated?
Constant Yes No
Input No No
Parameter No Yes
Operation No Yes

Constants

Constant nodes are added to a computational graph by function const().

  cst <- const(10, name = "a")

They represent some fixed value that can be consumed by other nodes in the graph.

Inputs

Input nodes are added to a computational graph by function input().

  ipt <- input(name = "ipt")

Input nodes behave similarly as constant nodes but are not given a fixed value upon creation. Instead, they function as placeholders. Their value needs to be provided once a graph is evaluated of differentiated.

Parameters

Parameter nodes are added to a computational graph by function parm().

  prm <- parm(10, name = " prm")

Parameter nodes are given a fixed value upon creation but this value is subject to an optimization process and will likely change over time.

Operations

Operation nodes are added to a computational graph by calling operators on a set of existing nodes. For example, to subtract node node2 from node1, we call:

  opr <- cg_sub(node1, node2, name = "opr")

This adds an operation node to the graph that is currently active. Operation nodes behave as placeholders. Their value is calculated by performing an operation on existing nodes in the graph.


NOTE

When no name is provided to const(), input(), parm(), the resulting node is added to the graph under an automatically generated name.


The cgraph package provides operators for many base R functions. These operators are named cg_<name> where <name> is the name of the base R function. For example, to apply the base mean function to a node, you can call operator cg_mean(). The cgraph package also provides overloaded operators for many base inflix operations like addition or subtraction. For example, to subtract node2 from node1, we can simply call:

  opr <- node1 - node2

However, due to the nature of inflix functions, we are not able to supply a name for this operation. Instead, the operation node is added to the active graph under an automatically generated name.


NOTE

It is possible to define custom operators by function opr(). See the documentation of function $opr() for more information on how this can be done.


Active Graph

All nodes are added to the computational graph that is currently active. This also applies to operator nodes that are created by operators like cg_mean or overloaded inflix operators like + or -. A cgraph object is set to be the active graph upon creation. Method $active() can be called on an existing cgraph object to change the active graph.

x <- cgraph$new()
y <- cgraph$new()

# this node is added to y
a <- parm(10)

x$active()

# this node is added to x
b <- parm(20)

NOTE

Only one computational graph can be active at a time.


Plot a Graph

It can be useful to visualize the dependencies of nodes in a computational graph. This can be done by calling function plot on a cgraph object.

x <- cgraph$new()

a <- const(2)

b <- const(4)

c <- const(6)

d <- (a ^ b + b ^ c)

e <- (b ^ a + c ^ b)

plot(x)

The plot function draws a node for each node in the graph. An edge is drawn from one node to the other if the former node is needed to compute the latter node.


NOTE

The RGraphviz package needs to be installed to plot a computational graph. This package can be downloaded from the Bioconductor repository.


Evaluating a Single Node

A node can be evaluated by function val(). The behavior of var() depends on the type of node that is provided to the function. For a constant node, input node, or parameter node, the function retrieves the value that is assigned to the node. For an operation node, the function evaluates the node based on its parents. If these parents are unavailable, the function will traverse the graph and try to evaluate all ancestors that are needed to evaluate the parents.

x <- cgraph$new()

a <- const(1)

b <- const(2)

c <- a + b

d <- c - b

val(d)
## [1] 1

NOTE

For computationally efficiency, nodes evaluated by val() are cached. Their values are temporary stored so that they do not have to be re-evaluated if the function is called again. For example, if we call val(d) again in the code block above, d is not evaluated by adding a to b and than substracting b from the result. Instead, d is evaluated by subtracting b from the cached value of c that is evaluated in the previous call of val(d).


Dynamic Control Flow

Function val() can be used to dynamically build a computational graph. By combing val() with control statements of R, we can build a graph whose structure depends on the values of the nodes. This allows, for example, to construct a graph that calculates Catalan numbers.

x <- cgraph$new()

c <- const(1)

i <- 0

while(val(c) < 42)
{
  c <- const((4 * i + 2) / (i + 2)) * c
  
  i <- i + 1
  
  print(val(c))
}
## [1] 1
## [1] 2
## [1] 5
## [1] 14
## [1] 42

Change the Value of a Node

The value of a node can be changed by function set().

x <- cgraph$new()

a <- const(10)

set(a, 20)

val(a)
## [1] 20

However, one should be carefull when applying set() on an operation node. Consider the following example:

x <- cgraph$new()

a <- const(2)

b <- const(4)

c <- a + b

set(c, 5)

val(c)
## [1] 6

Although, we fixed the value of c to 5, the node evaluates to 6 because it is re-evaluated based on a and b.

Moreover, function set() removes the value assigned to all descendants of the provided node in the computational graph. The value of these descendants is no longer valid and needs to be re-evaluated. Hence, one should be carefull when applying set() subsequently on nodes that depend on each other. Consider the following example:

x <- cgraph$new()

a <- const(2)

b <- a ^ const(2)

c <- b - const(2)

set(a, 3)
set(b, 4)

val(c)
## [1] 2
set(b, 4)
set(a, 3)

val(c)
## [1] 7

The order in which we change a and b matters. If we change a before b, than c is evaluated based on the fixed value of b. If, however, we change b before a, than c is evaluated based on the value of b, which in turn is evaluated based on a.

Run a Graph

In some cases, it is desireable to evaluate a target node and also retrieve the value of all its ancestors in the computational graph. This procedure is called a forward-pass.

Function run() can be used to perform a forward-pass.

x <- cgraph$new()

a <- input(name = "a")

b <- const(4, name = "b")

c <- cg_pow(a, b, name = "c")

values <- run(c, list(a = 2))

as.list(values)
## $c
## [1] 16
## 
## $a
## [1] 2

In order to perform a forward-pass, all ancestors of the target node must be available or they must be able to be computed at run-time. Argument values of run() can be used to substitute values for nodes that do not yet have a value (i.e. inputs). The output of run() is an environment containing the value of the target node and the values of all its ancestors that are (re)-evaluated.


NOTE

An important difference between run() and val() is that, when traversing trough the ancestors of the target node, val() will not evaluate a node if it is already available (e.g. when it has a fixed value or is cached), whereas run() always (re)-evaluates a node regardsless whether it is already available.


Differentiate a Graph

A nice feature of a computational graph is that its nodes can be differentiated by reservse automatic differenation. This is done by traversing the ancestors of a given target node in the reverse direction and applying elementary rules of calculus to differentiate the nodes. The partial derivatives of the target node with respect to all its ancestors can be efficiently evaluated in a single backward-pass through the graph.

Function gradients() can be used to perform a backward-pass.

x <- cgraph$new()

a <- parm(2, name = "a")

b <- const(2, name = "b")

c <- cg_pow(a, b, name = "c")

values <- run(c)

grads <- gradients(c, values)

as.list(grads)
## $a
## [1] 4
## 
## $c
## [1] 1

In order to perform a backward-pass, all ancestors of the target node must have a value. These values can be collected by first performing a forward-pass for the node using function run(). The environent obtained by run() can be supplied to the values argument of gradients().

Currently, the cgraph package can only differentiate scalar target nodes. In case the target node is a vector or array, argument index of gradients() can be used to specific which element of the vector or array is differentiated. The output of gradients() is an environment containing the partial derivative of the target node with respect to each ancestor of the node that is an operator node or parameter node. These derivatives have the same shape as the nodes.