Nested pipe chains in dplyr / left_join
Published: 2020-05-23 16:04:45 | Category: MsSql | Source: Internet
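While trying to build a grouped lagged variable (which lag alone cannot produce), the suggested solution was to pull out the distinct rows, lag over them, and join the result back onto the data. I would prefer to do this without creating an intermediate object, in the middle of a chain. However, it does not work the way I expect, and the problem seems to be an interaction between the use of `.` and the nested chain inside left_join.

require(tidyverse)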
#> Loading required package: tidyverse
df <- data.frame(Team = c("A","A","A","A","B","B","B","C","C","D","D"),
                 Date = c("2016-05-10","2016-05-10","2016-05-10","2016-05-10","2016-05-12","2016-05-12",
                          "2016-05-12","2016-05-15","2016-05-15","2016-05-30","2016-05-30"),
                 Points = c(1,4,3,2,1,5,6,1,2,3,9)
)
#This works:
df %>% left_join(x = ., y = df %>%
                   distinct(Team, Date) %>%
                   mutate(Date_Lagged = lag(Date)))
#> Joining, by = c("Team", "Date")
#>    Team       Date Points Date_Lagged
#> 1     A 2016-05-10      1        <NA>
#> 2     A 2016-05-10      4        <NA>
#> 3     A 2016-05-10      3        <NA>
#> 4     A 2016-05-10      2        <NA>
#> 5     B 2016-05-12      1  2016-05-10
#> 6     B 2016-05-12      5  2016-05-10
#> 7     B 2016-05-12      6  2016-05-10
#> 8     C 2016-05-15      1  2016-05-12
#> 9     C 2016-05-15      2  2016-05-12
#> 10    D 2016-05-30      3  2016-05-15
#> 11    D 2016-05-30      9  2016-05-15
#And this works:
df %>% left_join(x = ., y = .)
#> Joining, by = c("Team", "Date", "Points")
#>    Team       Date Points
#> 1     A 2016-05-10      1
#> 2     A 2016-05-10      4
#> 3     A 2016-05-10      3
#> 4     A 2016-05-10      2
#> 5     B 2016-05-12      1
#> 6     B 2016-05-12      5
#> 7     B 2016-05-12      6
#> 8     C 2016-05-15      1
#> 9     C 2016-05-15      2
#> 10    D 2016-05-30      3
#> 11    D 2016-05-30      9
#This doesn't work despite the fact that `.` is df.
df %>% left_join(x = ., y = . %>%
                   distinct(Team, Date) %>%
                   mutate(Date_Lagged = lag(Date)))
#> Error in UseMethod("tbl_vars"): no applicable method for 'tbl_vars' applied to an object of class "c('fseq','function')"
#Desired output
distinct(df, Team, Date) %>%
  mutate(Date_Lagged = lag(Date)) %>%
  right_join(., df) %>%
  select(Team, Date, Points, Date_Lagged)
#> Joining, by = c("Team", "Date")
#>    Team       Date Points Date_Lagged
#> 1     A 2016-05-10      1        <NA>
#> 2     A 2016-05-10      4        <NA>
#> 3     A 2016-05-10      3        <NA>
#> 4     A 2016-05-10      2        <NA>
#> 5     B 2016-05-12      1  2016-05-10
#> 6     B 2016-05-12      5  2016-05-10
#> 7     B 2016-05-12      6  2016-05-10
#> 8     C 2016-05-15      1  2016-05-12
#> 9     C 2016-05-15      2  2016-05-12
#> 10    D 2016-05-30      3  2016-05-15
#> 11    D 2016-05-30      9  2016-05-15
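Created on 2018-06-12 by the reprex package (v0.2.0).

Solution

To make your code work, wrap the `.` in the y argument in braces. A bare `.` on the left-hand side of %>% makes magrittr build a function (the 'fseq' named in the error message) instead of evaluating the chain, whereas {.} is an ordinary expression that evaluates to df:

df %>% left_join(x = ., y = {.} %>%
                   distinct(Team, Date) %>%
                   mutate(Date_Lagged = lag(Date)))
Joining, by = c("Team", "Date")
   Team       Date Points Date_Lagged
1     A 2016-05-10      1        <NA>
2     A 2016-05-10      4        <NA>
3     A 2016-05-10      3        <NA>
4     A 2016-05-10      2        <NA>
5     B 2016-05-12      1  2016-05-10
6     B 2016-05-12      5  2016-05-10
7     B 2016-05-12      6  2016-05-10
8     C 2016-05-15      1  2016-05-12
9     C 2016-05-15      2  2016-05-12
10    D 2016-05-30      3  2016-05-15
11    D 2016-05-30      9  2016-05-15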
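
The difference between a bare dot and a braced dot can also be seen outside the original data. The following is a minimal sketch, assuming the magrittr pipe that dplyr attaches; the names toy, f, lookup and joined are illustrative and do not come from the question.

```r
library(dplyr)  # attaches the magrittr pipe %>% as well

# Hypothetical toy data, smaller than the data frame in the question.
toy <- data.frame(Team = c("A", "A", "B"),
                  Date = c("2016-05-10", "2016-05-10", "2016-05-12"),
                  Points = c(1, 4, 3),
                  stringsAsFactors = FALSE)

# Bare dot on the left-hand side: magrittr does not evaluate anything yet,
# it returns a function (a functional sequence).
f <- . %>%
  distinct(Team, Date) %>%
  mutate(Date_Lagged = lag(Date))
class(f)  # c("fseq", "function")

# Calling that function on the data gives the lagged lookup table.
lookup <- f(toy)

# Braces turn the dot into an ordinary expression, so the nested chain is
# evaluated on the piped data and left_join receives a data frame for y.
joined <- toy %>%
  left_join(x = .,
            y = {.} %>%
              distinct(Team, Date) %>%
              mutate(Date_Lagged = lag(Date)),
            by = c("Team", "Date"))

# Compare with joining the lookup table built through the function route.
all.equal(joined, left_join(toy, lookup, by = c("Team", "Date")))
```

Passing by explicitly only silences the "Joining, by = ..." message; it does not change which columns are matched here. Parentheses, as in (.), should have the same effect as the braces, since either way the left-hand side is no longer the bare placeholder.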
Alternatively, you can refer to df by name inside the join:

df %>% left_join(df %>%
                   distinct(Team, Date) %>%
                   mutate(Date_Lagged = lag(Date)))
