R - nth occurrence in a data frame

- February 15, 2013

This question already has an answer here:

Create counter with multiple variables [duplicate] (6 answers)

I have a data.frame with 2 columns (firstname and state).

    my.df = data.frame(firstname = c('john', 'paul', 'john', 'sarah', 'haley', 'paul', 'john'),
                       state = c('vic', 'nsw', 'vic', 'qld', 'tas', 'nsw', 'vic'))

    firstname state
         john   vic
         paul   nsw
         john   vic
        sarah   qld
        haley   tas
         paul   nsw
         john   vic

I would like to include an additional column that lists the nth occurrence of each value in the firstname column. For example, 'john' appears in rows 1, 3 and 7, so the new column would list '1' in row 1, '2' in row 3 (as it is the second time 'john' is listed) and '3' in row 7 (as it is the third time 'john' is listed).

My desired outcome would appear as follows:

    firstname state index
         john   vic     1
         paul   nsw     1
         john   vic     2
        sarah   qld     1
        haley   tas     1
         paul   nsw     2
         john   vic     3

Any assistance would be appreciated.

Or, if you're feeling dplyr-ishly loopless:

    library(dplyr)

    # count rows within each firstname group, in order of appearance
    new.df <- my.df %>%
        group_by(firstname) %>%
        mutate(index = 1:n())

Or you can use row_number() in place of 1:n() inside mutate().
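A minimal sketch of the row_number() variant, assuming dplyr is installed and reusing the my.df data from the question. The name new.df just follows the dplyr snippet above, and the trailing ungroup() is an optional extra, not something the answer asked for.

```r
library(dplyr)

# sample data from the question
my.df <- data.frame(
  firstname = c('john', 'paul', 'john', 'sarah', 'haley', 'paul', 'john'),
  state     = c('vic', 'nsw', 'vic', 'qld', 'tas', 'nsw', 'vic')
)

new.df <- my.df %>%
  group_by(firstname) %>%
  mutate(index = row_number()) %>%  # position of each row within its group
  ungroup()                         # drop the grouping once the counter exists

print(new.df)
```

The row order of my.df is left as it was; only the index column is added.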

Or using data.table:

    library(data.table)

    # setDT() converts my.df by reference; .N is the number of rows in each group
    setDT(my.df)[, index := seq_len(.N), by = firstname]

Or base R:

    # ave() applies FUN within each firstname group (the argument name must be upper-case FUN)
    with(my.df, ave(seq_along(firstname), firstname, FUN = seq_along))
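The ave() call returns only the vector of counts, so it still needs to be attached to the data frame. A small sketch of that last step, assuming the my.df from the question and the column name index from the desired outcome:

```r
# sample data from the question
my.df <- data.frame(
  firstname = c('john', 'paul', 'john', 'sarah', 'haley', 'paul', 'john'),
  state     = c('vic', 'nsw', 'vic', 'qld', 'tas', 'nsw', 'vic')
)

# running count of each firstname, stored as a new column
my.df$index <- with(my.df, ave(seq_along(firstname), firstname, FUN = seq_along))

print(my.df)
```

This needs no extra packages, which may matter if dplyr or data.table are not available.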
