Benchmark Experiments

R package benchmark

In statistical learning, benchmarking is the methodology of comparing learners or algorithms with respect to a certain performance measure. The benchmarking process abstractly consists of three levels: Setup, Execution and Analysis. (1) The Setup defines the design of a benchmark experiment: the data set, candidate algorithms, performance measures and a suitable resampling strategy are declared. (2) In the Execution level the defined setup is executed. Here, computational aspects play a major role; an important example is the parallel computation of the experiment on different computers. (3) In the Analysis level the computed raw performance measures are analyzed using exploratory and inferential methods. This package is mainly concerned with the Analysis level, where a major objective is the derivation of a statistically correct order of the candidate algorithms.
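
The three levels can be illustrated with a minimal sketch in base R. It does not use the benchmark package's own functions; the data set (mtcars), the three candidate models, the RMSE measure, the 50 bootstrap replications and the Friedman test are all assumptions chosen for illustration.

```r
## (1) Setup: data set, candidate algorithms, performance measure, resampling.
## Everything below is illustrative base R, not the benchmark package's API.
set.seed(1)
data <- mtcars
B <- 50  # number of bootstrap replications (arbitrary choice)

candidates <- list(
  linear    = function(d) lm(mpg ~ wt, data = d),
  quadratic = function(d) lm(mpg ~ wt + I(wt^2), data = d),
  additive  = function(d) lm(mpg ~ wt + hp, data = d)
)

## Root mean squared error of a fitted model on data d
rmse <- function(fit, d) sqrt(mean((d$mpg - predict(fit, newdata = d))^2))

## (2) Execution: fit each candidate on a bootstrap sample and
## evaluate it on the corresponding out-of-bag observations.
perf <- t(sapply(seq_len(B), function(b) {
  idx   <- sample(nrow(data), replace = TRUE)
  train <- data[idx, ]
  test  <- data[-unique(idx), ]
  sapply(candidates, function(fit_fun) rmse(fit_fun(train), test))
}))

## (3) Analysis: exploratory summaries and a global inferential test.
## perf is a B x 3 matrix: one row per replication, one column per candidate.
print(round(apply(perf, 2, median), 2))         # median RMSE per candidate
print(colMeans(t(apply(perf, 1, rank))))        # mean rank per candidate
ft <- friedman.test(perf)                       # replications act as blocks
print(ft)
```

Each row of perf is one resampling replication, so the candidates are compared on identical learning and test samples. This pairing is what allows a blocked test such as the Friedman test in the Analysis step.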


The stable version of benchmark is available on CRAN; issue the following from within R to install and load it:

R> install.packages("benchmark")
R> library("benchmark")


benchplot: Exploratory and inferential analysis of the monks3 benchmark experiment.

R> demo("benchplot", package = "benchmark")

lsbenchplot-uci621: Exploratory and inferential analysis of the UCI domain benchmark experiment.

The analysis of six common learning algorithms on well-known UCI data sets.
R> demo("lsbenchplot-uci621", package = "benchmark")

lsbenchplot-uci621-atypes: Archetypal analysis of the UCI domain benchmark experiment.

In general, comparing algorithms often means comparing with a "best" or "worst" algorithm, i.e., comparing with an extreme algorithm (the benchmark). However, with more than one performance measure or more than one data set, no uniquely defined extreme values are available. In that case, archetypal analysis can be used to compute data-driven benchmark algorithms.
R> demo("lsbenchplot-uci621-atypes", package = "benchmark")

lsbenchplot-gh: Exploratory and inferential analysis of the Grasshopper domain benchmark experiment.

R> demo("lsbenchplot-gh", package = "benchmark")


The development version is available on R-Forge.