- summary of the times spent in different function calls
- memory usage report
Feb., 2021
\(\textrm{Surface circle} = \left ( \frac{\textrm{Surface circle}}{\textrm{Surface square}} \right ) * (\textrm{Surface square})\)
is always valid. Knowing that \(\textrm{Surface circle} = \pi * r^2\), \(\pi\) can be computed as:
\(\pi = \frac{1}{r^2} \left ( \frac{\textrm{Surface circle}}{\textrm{Surface square}} \right ) * (\textrm{Surface square})\)
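The ratio in parentheses is approximated with a Monte Carlo process: random points are thrown uniformly into the square, and the fraction of points that land inside the circle estimates the ratio of the two surfaces.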
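As a worked step, assuming the setup read off the code in this section (points drawn uniformly from the square \([-1,1] \times [-1,1]\), so that \(r = 1\) and \(\textrm{Surface square} = 4\)), the formula for \(\pi\) reduces to:
\(\pi \approx \frac{1}{1^2} \left ( \frac{N_{\textrm{hits}}}{N} \right ) * 4 = 4 * \frac{N_{\textrm{hits}}}{N}\)
where \(N\) is the number of points thrown so far and \(N_{\textrm{hits}}\) is the number of those points that fell inside the circle. The symbols \(N\) and \(N_{\textrm{hits}}\) are introduced here only for illustration; in the code they correspond to the loop counter and the hits variable.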
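sim <- function(l) {
  c <- rep(0, l)   # running estimates of pi
  hits <- 0        # points that fell inside the circle
  # despite its name, pow2 returns the distance of x from the origin
  pow2 <- function(x) {
    x2 <- sqrt( x[1]*x[1] + x[2]*x[2] )
    return(x2)
  }
  for (i in 1:l) {
    x <- runif(2, -1, 1)
    if ( pow2(x) <= 1 ) {
      hits <- hits + 1
    }
    dens <- hits/i
    pi_partial <- dens*4
    c[i] <- pi_partial
  }
  return(c)
}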
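The accuracy of the calculation increases with the number of iterations (the statistical error of a Monte Carlo estimate shrinks roughly as 1/sqrt(N), with N the number of iterations):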
size <- 100000
res <- sim(size)
plot(res[1:size], type='l', xlab="Nr. iterations", ylab="Pi")
lines(rep(pi,size)[1:size], col = 'red')
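To put numbers on the convergence seen in the plot, the following minimal sketch tabulates the absolute error at a few iteration counts. It is a vectorised stand-in for sim (the same estimator, but not the loop used in this section), and the fixed seed and the names pi_running and checkpoints are assumptions made only for this illustration.

```r
set.seed(42)                       # assumption: fixed seed so the run is repeatable
n <- 100000
x <- runif(n, -1, 1)
y <- runif(n, -1, 1)
inside <- (x * x + y * y) <= 1     # TRUE when the point falls inside the unit circle

# Running estimate of pi after 1, 2, ..., n points
pi_running <- 4 * cumsum(inside) / seq_len(n)

# Absolute error at a few checkpoints
checkpoints <- c(100, 1000, 10000, 100000)
convergence <- data.frame(
  iterations = checkpoints,
  estimate   = pi_running[checkpoints],
  abs_error  = abs(pi_running[checkpoints] - pi)
)
print(convergence)
```

The error does not fall monotonically, because each run is random, but its typical size decreases as more points are thrown.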
The system.time function is included in R by default:
size <- 500000
system.time( res <- sim(size) )
##    user  system elapsed
##    1.58    0.00    1.58
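The three numbers are, in order, the CPU time spent in the R process itself (user), the CPU time spent by the operating system on behalf of the process (system), and the wall-clock time (elapsed). The sketch below shows how the individual fields can be read from the returned object; Sys.sleep is used here only as a stand-in workload that takes wall-clock time without using the CPU.

```r
# Time an expression that waits but does almost no computation
timing <- system.time(Sys.sleep(0.2))

class(timing)            # a "proc_time" object, i.e. a named numeric vector
timing[["user.self"]]    # CPU time used by the R process
timing[["sys.self"]]     # CPU time used by the system on behalf of R
timing[["elapsed"]]      # wall-clock time, close to the 0.2 s we slept
```

For a CPU-bound function such as sim, user and elapsed are close to each other, as in the output above.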
Another way to obtain execution times is by using the tictoc package:
install.packages("tictoc")
One can nest tic and toc calls and save the outputs to a log (kept in memory and read back with tic.log()):
library("tictoc")
size <- 1000000
sim2 <- function(l) {
  c <- rep(0, l)
  hits <- 0
  pow2 <- function(x) {
    x2 <- sqrt( x[1]*x[1] + x[2]*x[2] )
    return(x2)
  }
  tic("only for-loop")    # start the inner timer
  for (i in 1:l) {
    x <- runif(2, -1, 1)
    if ( pow2(x) <= 1 ) {
      hits <- hits + 1
    }
    dens <- hits/i
    pi_partial <- dens*4
    c[i] <- pi_partial
  }
  toc(log = TRUE)         # stop the inner timer and log the result
  return(c)
}
tic("Total execution time")
res <- sim2(size)
## only for-loop: 2.96 sec elapsed
toc(log = TRUE)
## Total execution time: 2.97 sec elapsed
tic.log()
## [[1]]
## [1] "only for-loop: 2.96 sec elapsed"
##
## [[2]]
## [1] "Total execution time: 2.97 sec elapsed"
tic.clearlog()
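The log can also be read back as numbers rather than formatted strings. The following is a minimal sketch, assuming tictoc is installed (the guard simply skips the example otherwise); the labels and the short Sys.sleep workload are made up for illustration.

```r
if (requireNamespace("tictoc", quietly = TRUE)) {
  library(tictoc)
  tic.clearlog()

  tic("outer block")
  tic("inner block")
  Sys.sleep(0.1)                   # stand-in workload
  toc(log = TRUE, quiet = TRUE)    # closes "inner block"
  toc(log = TRUE, quiet = TRUE)    # closes "outer block"

  # format = FALSE returns the raw entries instead of formatted strings
  raw_log <- tic.log(format = FALSE)
  timings <- data.frame(
    label   = vapply(raw_log, function(entry) entry$msg, character(1)),
    seconds = vapply(raw_log, function(entry) unname(entry$toc - entry$tic),
                     numeric(1))
  )
  print(timings)

  tic.clearlog()
}
```

The inner timer is logged first because it is closed first, which matches the order of the entries shown above.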
Rprof should be present in your R installation. For a graphical analysis, we will use the proftools package, which needs to be installed if it is not already available. For R versions < 3.5 the instructions are:
install.packages("proftools")
source("http://bioconductor.org/biocLite.R")
biocLite(c("graph","Rgraphviz"))
while for R >= 3.5 one needs to run:
install.packages("proftools")
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install()
BiocManager::install(c("graph","Rgraphviz"))
The profiling is performed with the following lines:
size <- 500000
Rprof("Rprof.out")
res <- sim(size)
Rprof(NULL)
The collected profile can then be summarized with:
summaryRprof("Rprof.out")
## $by.self
##         self.time self.pct total.time total.pct
## "runif"      0.82    51.25       0.82     51.25
## "sim"        0.56    35.00       1.60    100.00
## "pow2"       0.22    13.75       0.22     13.75
##
## $by.total
##                          total.time total.pct self.time self.pct
## "sim"                          1.60    100.00      0.56    35.00
## "block_exec"                   1.60    100.00      0.00     0.00
## "call_block"                   1.60    100.00      0.00     0.00
## "eval"                         1.60    100.00      0.00     0.00
## "evaluate"                     1.60    100.00      0.00     0.00
## "evaluate::evaluate"           1.60    100.00      0.00     0.00
## "evaluate_call"                1.60    100.00      0.00     0.00
## "FUN"                          1.60    100.00      0.00     0.00
## "generator$render"             1.60    100.00      0.00     0.00
## "handle"                       1.60    100.00      0.00     0.00
## "in_dir"                       1.60    100.00      0.00     0.00
## "knitr::knit"                  1.60    100.00      0.00     0.00
## "lapply"                       1.60    100.00      0.00     0.00
## "process_file"                 1.60    100.00      0.00     0.00
## "process_group"                1.60    100.00      0.00     0.00
## "process_group.block"          1.60    100.00      0.00     0.00
## "render"                       1.60    100.00      0.00     0.00
## "render_one"                   1.60    100.00      0.00     0.00
## "rmarkdown::render"            1.60    100.00      0.00     0.00
## "rmarkdown::render_site"       1.60    100.00      0.00     0.00
## "sapply"                       1.60    100.00      0.00     0.00
## "suppressMessages"             1.60    100.00      0.00     0.00
## "timing_fn"                    1.60    100.00      0.00     0.00
## "withCallingHandlers"          1.60    100.00      0.00     0.00
## "withVisible"                  1.60    100.00      0.00     0.00
## "runif"                        0.82     51.25      0.82    51.25
## "pow2"                         0.22     13.75      0.22    13.75
##
## $sample.interval
## [1] 0.02
##
## $sampling.time
## [1] 1.6
Here you can see that runif is the most expensive call in our code (about 51% of the run time), followed by the body of sim itself (35%) and pow2 (14%). A graphical output can be obtained through the proftools package:
library(proftools)
p <- readProfileData(filename = "Rprof.out")
plotProfileCallGraph(p, style=google.style, score="total")
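The by.self table returned by summaryRprof can also be inspected programmatically, which is convenient for ranking functions without reading the full listing. The sketch below uses only base R; the helper functions draw_points, count_hits and estimate_pi are hypothetical and exist only so that the profiler has separate functions to attribute time to. The workload size is an assumption chosen so that the sampler collects enough samples.

```r
# Hypothetical helpers: each independent task lives in its own function
draw_points <- function(n) matrix(runif(2 * n, -1, 1), ncol = 2)
count_hits  <- function(p) sum(rowSums(p * p) <= 1)

estimate_pi <- function(batches, n) {
  hits <- 0
  for (b in seq_len(batches)) {
    hits <- hits + count_hits(draw_points(n))
  }
  4 * hits / (batches * n)
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)      # start sampling every 10 ms
pi_hat <- estimate_pi(batches = 200, n = 50000)
Rprof(NULL)                            # stop sampling

# Rank functions by the share of time spent in their own code
by_self <- summaryRprof(prof_file)$by.self
top <- head(by_self[order(-by_self$self.pct), c("self.time", "self.pct")], 3)
print(top)

unlink(prof_file)                      # remove the temporary profile file
```

Because each task is wrapped in a named function, the profile reports it on its own row instead of folding everything into the caller.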
The rbenchmark package is not included by default in R installations, so one most probably needs to install it first:
install.packages("rbenchmark")
Then we can benchmark our function sim():
library(rbenchmark)
size <- 500000
bench <- benchmark(sim(size), replications=10)
bench
##        test replications elapsed relative user.self sys.self user.child sys.child
## 1 sim(size)           10   15.03        1     14.97        0         NA        NA
The elapsed time is the total over the 10 replications we specified in the benchmark function, i.e. about 1.5 s per call of sim(size).
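In the output above, 15.03 s for 10 replications of a call that takes about 1.5 s indicates that elapsed is accumulated over all replications, so a per-call figure is obtained by dividing by the replications column. A minimal sketch, assuming rbenchmark is installed (the guard skips the example otherwise); draw_pairs is a hypothetical stand-in workload and per_call is a column added here for illustration:

```r
if (requireNamespace("rbenchmark", quietly = TRUE)) {
  library(rbenchmark)

  # Hypothetical workload, much cheaper than sim() so the example runs quickly
  draw_pairs <- function(n) runif(2 * n, -1, 1)

  bench_small <- benchmark(draw_pairs(10000),
                           replications = 20,
                           columns = c("test", "replications", "elapsed"))

  # Time per single call, derived from the accumulated elapsed time
  bench_small$per_call <- bench_small$elapsed / bench_small$replications
  print(bench_small)
}
```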
If the microbenchmark package is not installed, do as usual:
install.packages("microbenchmark")
and do the benchmarking with:
library(microbenchmark)
bench2 <- microbenchmark(sim(size), times=10)
bench2
## Unit: seconds
##       expr      min       lq     mean  median       uq      max neval
##  sim(size) 1.437368 1.448631 1.493887 1.45538 1.488293 1.788415    10
In this case we obtain more statistics of the benchmarking process, such as the minimum, quartiles, mean, median, and maximum over the 10 evaluations.
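These statistics can be extracted as a regular data frame with summary(), optionally in a fixed time unit. A minimal sketch, assuming microbenchmark is installed (the guard skips the example otherwise); norm2 is a hypothetical stand-in for the distance computation done in pow2, used so that the example runs in well under a second:

```r
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  library(microbenchmark)

  # Hypothetical stand-in for the distance computation in pow2
  norm2 <- function(x) sqrt(x[1] * x[1] + x[2] * x[2])

  mb <- microbenchmark(norm2(c(0.3, 0.4)), times = 100)

  # summary() returns a data frame; unit fixes the reported time unit
  stats_ms <- summary(mb, unit = "ms")
  print(stats_ms[, c("expr", "min", "median", "max", "neval")])
}
```

The median is usually the most robust of these numbers, since a single slow evaluation (for example one that triggers garbage collection) can shift the mean and the maximum noticeably.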
Timing your R code is useful to see what parts require optimization or a better package.
system.time and tic-toc will give you a single evaluation of the time taken by some R code.
The rbenchmark and microbenchmark functions will give statistics over independent replications of the code.
Profiling functions return more useful information if independent tasks in your code are enclosed in functions (remember pow2 and runif in the Pi calculation).
Once you know where the bottlenecks of your code are, working on a few of the most expensive ones can be more effective than working on many less significant functions.