class: center, middle # Replicate and the Apply Family of Functions - 1 ## Introductory Computer Programming ### Deepayan Sarkar
--- # Loop alternatives * R has several general purpose functions that provide alternatives to loops * We will discuss two of the main approaches today: - `replicate()` for repeating a simulation experiment - Various `*apply()` functions to loop over varying function arguments * These are not necessarily more efficient than loops (unlike true vectorized operations) * However, they are conceptually simpler (and easier to parallelize) --- layout: true # Replicating a simulation experiment --- * Consider this example from our previous assignment: ```r n <- 20 x <- sample(n, 2) max(x) %% min(x) == 0 ``` ``` [1] FALSE ``` * To estimate the probability of this event, we want to run it several times and compute proportion --- * We have seen how to do this using a loop * An alternative approach is to use the `replicate()` function ```r replicate(10, { x <- sample(n, 2) max(x) %% min(x) == 0 }) ``` ``` [1] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE ``` * `replicate()` takes the _expression_ supplied as its second argument and evauates it multiple times ```r s <- replicate(5000, { x <- sample(n, 2) max(x) %% min(x) == 0 }) sum(s) / length(s) # estimated probability of success ``` ``` [1] 0.2398 ``` --- layout: true # Return value of `replicate()` --- * The return value of `replicate()` depends on what the expression returns when evaluated * In the above example, each evaluation results in a scalar logical * In that case, the results are aggregated to produce a vector -- * If the result is always a vector of same length (greater than one), the result is combined into a matrix ```r quantile(rnorm(500), probs = c(0.25, 0.5, 0.75)) ``` ``` 25% 50% 75% -0.66396457 -0.01230398 0.57424684 ``` ```r replicate(6, quantile(rnorm(500), probs = c(0.25, 0.5, 0.75))) ``` ``` [,1] [,2] [,3] [,4] [,5] [,6] 25% -0.64963116 -0.70360965 -0.67474961 -0.69725569 -0.67091756 -0.61606464 50% 0.03544182 -0.06543701 -0.01299035 -0.09834945 -0.03744973 0.06074351 75% 0.67997507 0.66759673 0.69118021 0.64266682 0.60572714 0.69812549 ``` --- * If the result has inconsistent length, the result is a list ```r set.seed(20200101) replicate(6, unique(sample(10, 5, replace = TRUE))) ``` ``` [[1]] [1] 1 4 5 8 [[2]] [1] 4 7 8 6 [[3]] [1] 10 9 5 2 1 [[4]] [1] 6 4 1 2 [[5]] [1] 6 9 2 3 [[6]] [1] 1 9 6 2 4 ``` --- * This may sometimes lead to unanticipated behaviour ```r set.seed(20200101) replicate(2, unique(sample(10, 5, replace = TRUE))) ``` ``` [,1] [,2] [1,] 1 4 [2,] 4 7 [3,] 5 8 [4,] 8 6 ``` -- * When output length is known to be variable, it is better to explicitly disable simplification ```r set.seed(20200101) replicate(2, unique(sample(10, 5, replace = TRUE)), simplify = FALSE) # always return list ``` ``` [[1]] [1] 1 4 5 8 [[2]] [1] 4 7 8 6 ``` --- layout: false class: middle, center # Questions?