There's lots of discussion around the new |> operator & whether a(b(x)) is "better" than x |> b() |> a(), performance, debugging, etc. Note: the two are identical to #rstats once parsed:
e0 <- quote(a(b(x)))
e1 <- quote(x |> b() |> a())
> identical(e0, e1)
[1] TRUE
/1 https://twitter.com/henrikbengtsson/status/1334703130378788866
This is because |> is processed during *parsing*, which is the first step any programming language performs on code. Parsing runs nothing! It just deconstructs the human-readable code into an abstract syntax tree (AST). Evaluation happens afterward.
/2
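A minimal sketch of the parse-vs-evaluate split, assuming R >= 4.1.0 (needed for |>). The stop() call is just an illustrative stand-in for "code that would fail if it ever ran":

```r
# Quoting only parses: nothing runs, so no error is raised here
expr <- quote(stop("never evaluated") |> print())

# The parser has already rewritten the pipe into a nested call
is.call(expr)
identical(expr, quote(print(stop("never evaluated"))))

# The error only shows up once we explicitly evaluate the call
res <- tryCatch(eval(expr), error = function(e) conditionMessage(e))
res
```

The point is that the pipe is already gone by the time anything could be evaluated.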
We can also see this identity using {lobstr}:
> lobstr::ast(g(f(x)))
+- g
└-+- f
  └- x
> lobstr::ast(x |> f() |> g())
+- g
└-+- f
  └- x
/3
We can only tell the difference from the 'srcref' attribute, which keeps the original source text:
> parse(text="g(f(x))")
expression(g(f(x)))
> parse(text="x |> f() |> g()")
expression(x |> f() |> g())
but that's just for display; R evaluates the two expressions in exactly the same way, with the same performance.
/4
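To check that the printed difference really is only the attached source text, here is a small sketch, assuming R >= 4.1.0. It parses the same string with and without keep.source and compares the resulting calls; the variable names are mine:

```r
src <- "x |> f() |> g()"
with_src    <- parse(text = src, keep.source = TRUE)
without_src <- parse(text = src, keep.source = FALSE)

# The source text lives in an attribute of the expression vector
has_srcref_with    <- !is.null(attr(with_src, "srcref"))
has_srcref_without <- !is.null(attr(without_src, "srcref"))

# The call itself is the same either way, and deparses without the pipe
same_call <- identical(with_src[[1]], without_src[[1]])
deparse(with_src[[1]])
```

With keep.source = FALSE, printing the parsed expression should show g(f(x)) instead of the piped form.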
This duality of g(f(x)) and x |> f() |> g() in the parser makes it "safe" to introduce |> into the R language bc it's 100% backward compatible w/ the existing R ecosystem. No matter how hard you try, you shouldn't be able to find a case where f(x) works, but x |> f() doesn't
/5
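One way to probe this is with functions that inspect their own call or their unevaluated argument, which is where a runtime pipe could in principle leak through. A sketch assuming R >= 4.1.0; show_call() and show_arg() are hypothetical helpers written for this illustration:

```r
show_call <- function(x) deparse(sys.call())      # the call as R sees it
show_arg  <- function(x) deparse(substitute(x))   # the unevaluated argument

y <- 42

# Both forms report the same call and the same argument expression
same_call <- identical(show_call(y), y |> show_call())
same_arg  <- identical(show_arg(y),  y |> show_arg())

y |> show_call()
```

Even functions relying on non-standard evaluation can't tell that a pipe was used, since all they ever see is show_call(y).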
Basically, we don't have to worry about surprising corner cases and side effects showing up next month, in a year, or ten years from now (*)
(*) This claim/prediction is soo gonna come back to me
/6
In contrast, other core-level changes to R are much more complicated to introduce. For example, getting to the point where if (1:2 == 1) { ... } produces an error in R (as it should) is a much slower roll-out process since it breaks some existing code
/7
Now, the magrittr %>% pipe didn't have the luxury of being able to work at the parser level, so its authors had to try to achieve the above at runtime … and they did it very well
I think it's important to understand that |> is not the same as %>% ... but they're certainly similar
/8
Some important differences: Static code inspection can be done with code using |> but that can't be done reliably with %>%, or any other infix operator that can be redefined at runtime. This matters in, for instance, 'R CMD check' and parallel processing
/9
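A rough sketch of why static inspection differs, assuming R >= 4.1.0. magrittr doesn't need to be installed here because quote() only parses. The redefined %>% is a deliberately silly, hypothetical stand-in for "any package or script can rebind this operator":

```r
native   <- quote(x |> f() |> g())
magrittr <- quote(x %>% f() %>% g())

all.names(native)                               # only g, f and x: no trace of the pipe
pipe_in_ast <- "%>%" %in% all.names(magrittr)   # the pipe is still a call to resolve

# Hypothetical redefinition: what the code does now depends on runtime state
`%>%` <- function(lhs, rhs) "not a pipe anymore"
x <- 1
f <- g <- identity
result_custom <- x %>% f() %>% g()
result_native <- x |> f() |> g()
```

A tool that walks the AST can treat the |> version as plain g(f(x)), while for %>% it has to assume which function the operator is bound to when the code runs.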
Watching from the sideline, the original author & subsequent maintainers have tried their best to have %>% emulate as far as possible what we're getting with |>. They made another big leap towards harmonizing it further in magrittr 2.0.0 (Nov 2020) https://www.tidyverse.org/blog/2020/08/magrittr-2-0/
/10
Although I'm not a heavy user, it's been fascinating to follow the evolution of magrittr %>% and its uptake from the perspective of the R language and the R community. It's a process that's been going on for many years
/11
The road from magrittr %>% to base R |> shows there's a path for community-driven language changes to #rstats. It started out as a proof-of-concept, was picked up by several, and embraced by many more to a point of no return, and R Core listened and brought it in
/12
I should clarify that the above is based on my understanding and interpretation of magrittr %>% and base R |>, and their history. I haven't contributed to either but I can say: Good job, good job.
I'm now handing over the mic to those who know this better than I do.
/13