Submitting an R job is very similar to submitting a serial job, as shown on Submitting Jobs.

When executing R scripts with R CMD BATCH, output is directed to an `.Rout` file. This file has the same name as the submitted script but with the extension `Rout` instead of `R`. For instance, the following examples will save their output to the file `Rcode.Rout`.


serial_R.submit

#!/bin/sh
#SBATCH --time=00:30:00
#SBATCH --mem-per-cpu=1024
#SBATCH --job-name=TestJob
#SBATCH --error=TestJob.%J.stderr
#SBATCH --output=TestJob.%J.stdout

module load R/3.3

R CMD BATCH Rcode.R

You will notice that the example is very similar to the serial example. The important line is the module load command, which tells Tusker to load the R framework into the environment so jobs may use it.
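
For reference, the `Rcode.R` used above can be any serial R script and needs no special changes to run in batch mode. The script below is only an illustrative sketch (its contents are not part of the site documentation); its printed output would appear in `Rcode.Rout` once the job completes.

Rcode.R

# a simple serial computation; all printed output goes to Rcode.Rout
x <- rnorm(1000)
summary(x)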

Multicore (parallel) R submission

Submitting a multicore R job to SLURM is very similar to Submitting an OpenMP Job, since both are running multicore jobs on a single node. Below is an example:

parallel_R.submit

#!/bin/sh
#SBATCH --ntasks-per-node=16
#SBATCH --nodes=1
#SBATCH --time=00:30:00
#SBATCH --mem-per-cpu=1024
#SBATCH --job-name=TestJob
#SBATCH --error=TestJob.%J.stderr
#SBATCH --output=TestJob.%J.stdout

module load R/3.3

R CMD BATCH Rcode.R

The above example will submit a single job which can use up to 16 cores.  

Be sure to set limits in your R code so you use no more than 16 cores, or your performance will suffer. For example, when using the parallel package's mclapply function:

parallel.R
library("parallel")
...
# generate five vectors of random numbers in parallel, using at most 16 cores
mclapply(rep(4, 5), rnorm, mc.cores=16)
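
If you would rather not hard-code the core count, you can read it from the environment SLURM provides to the job. The sketch below assumes the `SLURM_NTASKS_PER_NODE` variable is populated (it is set when `--ntasks-per-node` is specified, as in the submit script above); the filename is only an example.

parallel_env.R

library("parallel")
# use the core count allocated by SLURM rather than a hard-coded 16
ncores <- as.integer(Sys.getenv("SLURM_NTASKS_PER_NODE", unset = "1"))
mclapply(rep(4, 5), rnorm, mc.cores = ncores)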

Multinode R submission with Rmpi

Submitting a multinode MPI R job to SLURM is very similar to Submitting an MPI Job, since both are running multicore jobs on multiple nodes. Below is an example of running Rmpi on Crane on 2 nodes and 32 cores:

Rmpi.submit

#!/bin/sh
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --time=00:30:00
#SBATCH --mem-per-cpu=1024
#SBATCH --job-name=TestJob
#SBATCH --error=TestJob.%J.stderr
#SBATCH --output=TestJob.%J.stdout

module load compiler/gcc/4.9 openmpi/1.10 R/3.3
export OMPI_MCA_mtl=^psm
mpirun -n 1 R CMD BATCH Rmpi.R

When you run an Rmpi job on Crane, please include the line export OMPI_MCA_mtl=^psm in your submit script. If you run an Rmpi job on Tusker, you do not need to add this line; this is because Tusker and Crane use different InfiniBand cards. Regardless of how many cores your job uses, the Rmpi package should always be run with mpirun -n 1 because it spawns additional processes dynamically.

Please find below an example of an Rmpi R script provided by The University of Chicago Research Computing Center:

Rmpi.R
library(Rmpi)

# initialize an Rmpi environment
ns <- mpi.universe.size() - 1
mpi.spawn.Rslaves(nslaves=ns)

# send these commands to the slaves
mpi.bcast.cmd( id <- mpi.comm.rank() )
mpi.bcast.cmd( ns <- mpi.comm.size() )
mpi.bcast.cmd( host <- mpi.get.processor.name() )

# all slaves execute this command
mpi.remote.exec(paste("I am", id, "of", ns, "running on", host))

# close down the Rmpi environment
mpi.close.Rslaves(dellog = FALSE)
mpi.exit()

Adding packages

Users are allowed to install R packages into their own home directories.  Instructions are provided by OSU.
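
As a quick illustration, installing a package into a personal library from an interactive R session might look like the sketch below. The package name and library path are only examples, not a required configuration; consult the linked instructions for the site-specific procedure.

install_package.R

# create a personal library in the home directory (path is only an example)
dir.create("~/R/libs", recursive = TRUE, showWarnings = FALSE)
# install into that library from CRAN
install.packages("doParallel", repos = "https://cran.r-project.org", lib = "~/R/libs")
# add the personal library to the search path for this session
.libPaths("~/R/libs")
library(doParallel)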
