First, import
datastepr
.
The basic idea behind this package was inspired by SAS data steps. In
each step, the environment is populated by a slice of data from a data
frame. Then, operations are performed. The environment is whole-sale
appended to a results
data frame. Then, the datastep
repeats.
Let’s begin with a brief tour of the dataStepClass
.
First, create an instance.
Please read the dataStepClass documentation before continuing, which is extensive.
Our example will be Euler’s method for solving differential equations. In fact, it is unimportant if you understand the method itself. The differential equation to be solved is given below: $$ \dfrac{dy}{dx} = xy $$
First, we will set initial values. The x values are the series of x values over which the method will be applied.
Our initial y value will only be for the first iteration of the data-step.
Now here is our stair function. First, begin
is called,
setting up an evaluation environment in the function’s
environment()
. Next, only in the first step, initialize y.
Note, importantly, that without another set call later (or a manual
override of continue), the data step would only run once. A lag of x is
stored in all but the first step. This is important, because after the
set
call, x is overwritten using a slice of the dataframe
above. Then, a new y is estimated using the new x, the lag of x, and the
derivative estimate (in all but the first step). Next, a derivative is
estimated (see equation above). Finally, we output the results.
stairs = function(...) {
step$begin(environment())
if (step$i == 1) step$set(y_initial)
if (step$i > 1) lagx = x
step$set(xFrame)
if (step$i > 1) y = y + dydx*(x - lagx)
dydx = x*y
step$output()
step$end(stairs)
}
stairs()
Let’s take a look at our results!
dydx | x | y | lagx |
---|---|---|---|
0 | 0 | 1 | NA |
1 | 1 | 1 | 0 |
4 | 2 | 2 | 1 |
18 | 3 | 6 | 2 |
96 | 4 | 24 | 3 |
600 | 5 | 120 | 4 |
4320 | 6 | 720 | 5 |
35280 | 7 | 5040 | 6 |
322560 | 8 | 40320 | 7 |
3265920 | 9 | 362880 | 8 |