progressr: An Introduction
Henrik Bengtsson
Source: vignettes/progressr-01-intro.md
The progressr package provides a minimal API for reporting progress updates in R. Its design separates the representation of progress updates from how they are presented. What type of progress to signal is controlled by the developer. How these progress updates are rendered is controlled by the end user. For instance, some users may prefer visual feedback such as a horizontal progress bar in the terminal, whereas others may prefer auditory feedback. The progressr framework is also designed to work out of the box with parallel and distributed processing, especially with the futureverse ecosystem.
Design motto:
The developer is responsible for providing progress updates but it’s only the end user who decides if, when, and how progress should be presented. No exceptions will be allowed.
Two Minimal APIs - One For Developers and One For End-Users
A simple example
Assume that we have a function slow_sum() for adding up the values in a vector. It is so slow that we would like to provide progress updates to whoever might be interested in it. With the progressr package, this can be done as:
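slow_sum <- function(x) {
  # One progress step per element of 'x'
  p <- progressr::progressor(along = x)
  sum <- 0
  for (kk in seq_along(x)) {
    Sys.sleep(0.1)  # simulate slow work
    sum <- sum + x[kk]
    # Signal one step of progress, with a message
    p(message = sprintf("Adding %g", x[kk]))
  }
  sum
}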
Note how there are no arguments (e.g. .progress = TRUE) in the code that specify how progress is presented. This is by design: the only task for the developer is to decide where in the code it makes sense to signal that progress has been made. As we will see next, it should be up to the end user, and the end user only, to decide whether they want to receive progress updates or not, and, if so, in what format. Asking them to specify a special “progress” argument adds a lot of friction, clutters up the code, and, importantly, might not even be possible for end users to do (e.g. they call a package function that in turn calls the progress reporting function of interest).
Now, if we call this function without further settings:
y <- slow_sum(1:10)
the default is that there will be no progress updates. To get progress updates, we need to request them to be “handled”, which we do by:
progressr::handlers(global = TRUE)
After this, progress will be reported:
> y <- slow_sum(1:10)
|==================== | 40%
> y <- slow_sum(10:1)
|======================================== | 80%
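Global handling is not the only way to request progress. As a minimal sketch, progress can also be reported for a single call by wrapping it in progressr::with_progress(), which leaves the global setting untouched. The function slow_sum() is repeated here only to keep the snippet self-contained; in a non-interactive session progressr reports nothing by default, but the code runs the same.

```r
library(progressr)

# Same function as above, repeated so that this snippet runs on its own
slow_sum <- function(x) {
  p <- progressor(along = x)
  sum <- 0
  for (kk in seq_along(x)) {
    Sys.sleep(0.1)
    sum <- sum + x[kk]
    p(message = sprintf("Adding %g", x[kk]))
  }
  sum
}

# Report progress for this call only; the assignment takes place
# in the calling environment, so 'y' is available afterwards
with_progress(y <- slow_sum(1:5))
```

Calls made outside of with_progress() are unaffected by this and keep following the global setting.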
To disable reporting again, do:
progressr::handlers(global = FALSE)
Customizing how progress is reported
By default, progressr presents progress via the built-in utils::txtProgressBar(). This appears as a rudimentary ASCII-based horizontal progress bar in the R terminal. See help("handler_txtprogressbar") for how to customize the look of “txtprogressbar”, e.g. colorization and Unicode. There are many other ways to report on progress, including visually, audibly, and via notification systems. You can also use a mix of these.
See the ‘Customizing How Progress is Reported’ vignette for examples.
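As a minimal sketch of such customization, the snippet below first tweaks the default handler and then combines a visual and an auditory handler. It assumes the char argument of handler_txtprogressbar() and, for the second part, that the beepr package is installed; the guard is there only so that the sketch runs either way.

```r
library(progressr)

# Customize the default text progress bar: draw it with '#' instead of '='
# (assumes the 'char' argument of handler_txtprogressbar())
handlers(handler_txtprogressbar(char = "#"))

# Combine a visual and an auditory handler. The latter needs the
# 'beepr' package, so fall back to the visual handler alone if missing.
if (requireNamespace("beepr", quietly = TRUE)) {
  handlers("txtprogressbar", "beepr")
} else {
  handlers("txtprogressbar")
}

# Inspect which handlers are currently set
current <- handlers()
```

Each call to handlers() with arguments replaces the previous selection, so only the last choice above remains in effect.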
Additional Features
Support for progressr elsewhere
Note that progression updates by progressr are designed to work out of the box with any iterator framework in R. See the different package vignettes for details. Prominent examples are:
- lapply() etc. of base R
- map() etc. by the purrr package
- llply() etc. by the plyr package
- foreach() iterations by the foreach package
and near-live progress reporting in parallel and distributed processing via the future framework:
- future_lapply() etc. by the future.apply package
- future_map() etc. by the furrr package
- llply() etc. by the plyr and doFuture packages
- foreach() iterations via the foreach and doFuture packages
- bplapply() etc. by the BiocParallel and doFuture packages
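To illustrate that the same progress code can serve both sequential and parallel iteration, here is a minimal sketch. The helper slow_sqrts() and its map_fun argument are hypothetical names introduced for this example only, and the parallel part runs only if the future.apply package is installed.

```r
library(progressr)

# Hypothetical helper: 'map_fun' is any lapply()-like function
slow_sqrts <- function(xs, map_fun = lapply) {
  p <- progressor(along = xs)
  map_fun(xs, function(x) {
    Sys.sleep(0.1)          # simulate slow work
    p(sprintf("x=%g", x))   # signal one step of progress
    sqrt(x)
  })
}

# Sequential processing with base R
with_progress(y1 <- slow_sqrts(1:4))

# Parallel processing via the future framework, if available
if (requireNamespace("future.apply", quietly = TRUE)) {
  future::plan(future::multisession, workers = 2)
  with_progress(y2 <- slow_sqrts(1:4, map_fun = future.apply::future_lapply))
  future::plan(future::sequential)  # reset to sequential processing
}
```

The function body is identical in both cases; only the iterator passed by the caller changes.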
Other uses of progressr are:
Use regular output as usual alongside progress updates
In contrast to other progress-bar frameworks, output from message(), cat(), print() and so on will not interfere with progress reported via progressr. For example, say we have:
slow_sqrt <- function(xs) {
  p <- progressor(along = xs)
  lapply(xs, function(x) {
    # Regular output, emitted alongside the progress updates
    message("Calculating the square root of ", x)
    Sys.sleep(2)
    p(sprintf("x=%g", x))
    sqrt(x)
  })
}
we will get:
> library(progressr)
> handlers(global = TRUE)
> handlers("progress")
> y <- slow_sqrt(1:8)
Calculating the square root of 1
Calculating the square root of 2
- [===========>-----------------------------------] 25% x=2
This works because progressr briefly buffers any output internally and only releases it when the next progress update is received, just before the progress is re-rendered in the terminal. This is why you see a two-second delay when running the above example. Note that, if we use progress handlers that do not output to the terminal, such as handlers("beepr"), then output does not have to be buffered and will appear immediately.
Comment: When signaling a warning using warning(msg, immediate. = TRUE), the message is immediately output to the standard-error stream. However, this is not possible to emulate when warnings are intercepted using calling handlers. This is a limitation of R that cannot be worked around. Because of this, the above call will behave the same as warning(msg); that is, all warnings will be buffered by R internally and released only when all computations are done.
Sticky messages
As seen above, some progress handlers present the progress message as part of their output, e.g. the “progress” handler will display the message as part of the progress bar. It is also possible to “push” the message up together with other terminal output. This can be done by adding the class attribute "sticky" to the progression signaled. This works for several progress handlers that output to the terminal. For example, with:
slow_sum <- function(x) {
  p <- progressr::progressor(along = x)
  sum <- 0
  for (kk in seq_along(x)) {
    Sys.sleep(0.1)
    sum <- sum + x[kk]
    # Every fifth step, signal a "sticky" message without advancing progress
    p(sprintf("Step %d", kk), class = if (kk %% 5 == 0) "sticky", amount = 0)
    # Advance progress by one step
    p(message = sprintf("Adding %g", x[kk]))
  }
  sum
}
we get
and