14 Expressions
When we use R, we write code which is then passed to the console to be executed (evaluated). Before the code is executed though, it is just an expression.
An expression can therefore be defined as a section of R code that has not yet been fully evaluated. That does not mean that every expression will evaluate successfully. For example, mean() is a syntactically valid expression, but it will error when it is evaluated because mean is missing its required argument.
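As a rough sketch of that distinction, the snippet below holds mean() unevaluated with quote() (covered in the next section) and only then tries to evaluate it. The names bad_call and result are purely illustrative, and tryCatch() is used so that the error is captured as a message instead of stopping the script.

```r
# Capturing the code succeeds, even though mean() has no arguments
bad_call <- quote(mean())
is.call(bad_call)

# Evaluation is the step that fails; tryCatch() returns the error message as a string
result <- tryCatch(
  eval(bad_call),
  error = function(e) conditionMessage(e)
)
print(result)
```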
Expressions themselves are made up of 4 constituent parts: calls, constants, names and pairlists. For now though, we’re not going to look at the bits that make up expressions, but instead we’ll focus on expressions as a whole.
14.1 Creating expressions
Creating an expression (an unevaluated piece of code) is done in base R using the quote() function. Confusingly, the expression() function in R doesn’t create an expression in the sense we’re talking about (it returns an expression vector, which is a list-like container of expressions), so use quote() instead.
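To see the difference for yourself, here is a small sketch comparing the two functions on the same code. The names e1 and e2 are just placeholders.

```r
e1 <- quote(x + 10)       # a single unevaluated call
e2 <- expression(x + 10)  # an expression vector holding one element

class(e1)                 # "call"
class(e2)                 # "expression"
length(e2)                # 1

# The element inside the expression vector is the same object quote() gives us
identical(e2[[1]], e1)    # TRUE
```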
When creating single line expressions, you can just provide the expression directly within the quote()
function:
quote(x + 10)
## x + 10
When providing multiple line expressions, wrap the argument in {}
like this:
quote({
  x + 10
  y - 5
})
## {
## x + 10
## y - 5
## }
Unfortunately, testing whether something is an expression in R isn’t that easy, because the base R functions are made for the constituent parts of the expression (e.g. is.call()
, is.name()
, etc.). Instead, you can use the is_expression()
function from the rlang
package to test whether something is an expression:
rlang::is_expression(
  quote(1 + 1)
)
## [1] TRUE
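For comparison, this is roughly what the base R route looks like: each predicate only recognises one kind of constituent part, so no single base function answers the question for every case. This is only an illustration of the base predicates, not a reimplementation of is_expression().

```r
is.call(quote(x + 10))   # TRUE: a call to `+`
is.name(quote(x))        # TRUE: a bare name (symbol)
is.call(quote(x))        # FALSE: a name is not a call
is.numeric(quote(10))    # TRUE: a constant quotes to itself
```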
14.2 Evaluating expressions
Once you’ve created your expression, you can evaluate it using the eval()
function:
my_expr <- quote(1 + 1)
eval(my_expr)
## [1] 2
Of course in this example, this is essentially just the same as 1 + 1
as we’re evaluating the expression in the same environment in which it was created. However, the eval()
function accepts an envir
parameter where you can pass an environment for the expression to be evaluated in:
new_environ <- new.env()
new_environ$num <- 10
my_expr <- quote(num + 5)
eval(my_expr) # this will error because num doesn't exist in the environment we're calling from
## Error in eval(my_expr): object 'num' not found
eval(my_expr, new_environ) # this won't error because num exists in new_environ
## [1] 15
Using this, you can create expressions in one environment without evaluating them, and then evaluate them later in different environments to where they were created.
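A short sketch of that idea: the same expression is created once and then evaluated against three different sets of values. The names env_a and env_b are made up for the example; the last call relies on eval() also accepting a list in place of an environment.

```r
my_expr <- quote(num + 5)

env_a <- new.env()
env_a$num <- 10

env_b <- new.env()
env_b$num <- 100

eval(my_expr, env_a)           # 15
eval(my_expr, env_b)           # 105
eval(my_expr, list(num = 1))   # 6
```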
14.3 Substitution
As well as hard coding the objects and names in our expression, we can substitute in values from our environment. For example, let’s say we wanted to create an x <- y + 1 expression, but we wanted to change what the value of y was when we created it. We could achieve this by using the substitute() function. substitute() takes two parameters: expr, which is the expression to work on, and env, which must be an environment or a list and contains the objects you want to substitute.
substitute(x <- y + 1, list(y = 1))
## x <- 1 + 1
As you can see, this doesn’t evaluate the expression; it simply substitutes the provided names with the values provided in the env parameter. This can be a really powerful tool for building up expressions.
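The substituted value doesn’t have to be a constant. As a sketch of building up a larger expression, here we substitute another expression in for y and then evaluate the result in its own environment. The names built and target_env are illustrative only.

```r
# Replace y with the unevaluated call z * 2
built <- substitute(x <- y + 1, list(y = quote(z * 2)))
built                     # x <- z * 2 + 1

# Nothing has been evaluated yet; do that now in an environment that defines z
target_env <- new.env()
target_env$z <- 4
eval(built, target_env)
target_env$x              # 9
```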
14.4 Quasiquotation
A subject related to expressions and substitution is quasiquotation, used heavily in the tidyverse packages. Quasiquotation is the process of quoting code (creating an expression) while selectively unquoting (evaluating) parts of that expression.
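Base R has its own small version of this in bquote(), which quotes its argument but evaluates anything wrapped in .(). The sketch below uses it only to illustrate the idea; the tidyverse packages use their own operators from rlang rather than bquote().

```r
y <- 10

bquote(x + y)       # x + y   (everything stays quoted)
bquote(x + .(y))    # x + 10  (y is unquoted, so its value is inserted)
```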
A good example of quasiquotation in action is the dplyr package. Within the dplyr functions, you provide column names to various analysis and data manipulation functions. When you provide those names, however, you provide them as bare names (i.e. not in quotation marks): dplyr::mutate(data, new_column = old_column + 1). Those column names are then quoted (as in quote()) and evaluated in the context of the dataset that you’ve provided:
test_df <- data.frame(col_1 = c(1, 2, 3))
eval(quote(col_1), envir = test_df)
## [1] 1 2 3
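To make the pattern a bit more concrete, here is a minimal base R sketch of a function that accepts a bare column name. pull_col() is a hypothetical helper written for this example, and it is not how dplyr is implemented (dplyr relies on rlang’s tidy evaluation); it just shows the quote-then-evaluate idea using substitute() and eval().

```r
# Hypothetical helper: evaluate a bare column expression inside a data frame
pull_col <- function(data, col) {
  col_expr <- substitute(col)            # capture what the caller typed, unevaluated
  eval(col_expr, data, parent.frame())   # look names up in data first, then the caller
}

test_df <- data.frame(col_1 = c(1, 2, 3))

pull_col(test_df, col_1)        # 1 2 3
pull_col(test_df, col_1 * 2)    # 2 4 6
```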
I won’t go into quasiquotation here because Hadley Wickham’s chapters on the subject in his Advanced R book summarise the topic much better than I ever could. But if you’re interested, I would recommend using the tidyverse packages and trying to understand how quoting and unquoting have been implemented in those packages. If you can get your head round it and even implement similar ideas in your own projects, you can greatly expand your flexibility and efficiency.