How to select multiple years from a dataset of multiple years

I have a dataset of daily values covering about 20 years, from 1996 to 2013. I only need some of those years: I want to create a file containing just the 10 years from 2004 to 2013.

I know how to use grep to select one specific year:

Df <- Df[grep("2013", Df$Year), ] 

      

Is it possible to select several years at the same time?

I tried doing

Df[grep(c("2004", "2005", "2006"), Df$Year), ] 

      

but it doesn't work.



3 answers


Yes, just put the last digits inside a character class.

Df <- Df[grep("201[345]", Df$Year), ] 

      

This selects the rows in which 2013, 2014, or 2015 appears in the Year column.
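As a quick check of the character class, here is a small sketch on made-up data. The toy data frame and the name Sub are assumptions for illustration, not the asker's actual data.

```r
# Made-up example data; Year is stored as character
Df <- data.frame(Year = as.character(2010:2016), Value = 1:7,
                 stringsAsFactors = FALSE)

# "201[345]" matches "201" followed by exactly one of 3, 4 or 5
Sub <- Df[grep("201[345]", Df$Year), ]
Sub$Year
```

Note that the pattern is not anchored, so it would also match a longer string that merely contains one of those years.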



For the 10 years from 2004 to 2013 mentioned in the question:

Df <- Df[grep("20(0[4-9]|1[0-3])", Df$Year), ] 
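If the years are easier to list than to express as a hand-written regular expression, one possible sketch is to build the alternation from a vector. grep() works with a single pattern, which is why passing c("2004", "2005", "2006") directly does not do what was intended. The toy data and the names years, pattern, and Sub are assumptions, and the ^...$ anchors assume the Year column holds only the four-digit year.

```r
# Made-up example data; Year is stored as character
Df <- data.frame(Year = as.character(1996:2013), Value = 1:18,
                 stringsAsFactors = FALSE)

# Collapse the wanted years into one pattern: "^(2004|2005|...|2013)$"
years <- 2004:2013
pattern <- paste0("^(", paste(years, collapse = "|"), ")$")

# grepl() returns a logical vector, one element per row
Sub <- Df[grepl(pattern, Df$Year), ]
nrow(Sub)
```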

      



Maybe this could help:

Df <- Df[(as.numeric(Df$Year) >= 2004) & (as.numeric(Df$Year) <= 2013),]

      



or in a more compact form, as suggested by @DavidArenburg:

Df <- Df[as.numeric(Df$Year) %in% 2004:2013, ]
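One caveat worth sketching: if Year happens to be stored as a factor, as.numeric() returns the internal level codes rather than the years, so converting through as.character() first is safer. The toy data below is made up, and storing Year as a factor is an assumption chosen to show the pitfall.

```r
# Made-up data; Year deliberately stored as a factor
Df <- data.frame(Year = factor(1996:2013), Value = 1:18)

# On a factor, as.numeric() gives level codes (1, 2, ...), not 1996, 1997, ...
codes <- as.numeric(Df$Year)

# Converting to character first recovers the actual years
yr <- as.numeric(as.character(Df$Year))
Sub <- Df[yr %in% 2004:2013, ]
nrow(Sub)
```

If Year is already numeric or character, the as.character() step is harmless.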

      



Why use grep when you can use subset or something similar?

subset(Df, Year >= 2004 & Year <= 2013)

Or filter from dplyr:

library(dplyr)
Df %>% filter(Year >= 2004 & Year <= 2013)

Or data.table, if you are using that package:

library(data.table)
setDT(Df)[Year >= 2004 & Year <= 2013]
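Since the question asks for a file, here is a minimal sketch of writing the filtered rows out with base R. The toy data, the name Sub, and the output path are hypothetical. A temporary directory is used here only so the example is safe to run.

```r
# Made-up data standing in for the real dataset (two rows per year)
Df <- data.frame(Year = rep(1996:2013, each = 2), Value = 1:36)

# Keep only 2004 to 2013
Sub <- subset(Df, Year >= 2004 & Year <= 2013)

# Hypothetical output path; replace with the real destination
out_file <- file.path(tempdir(), "data_2004_2013.csv")
write.csv(Sub, out_file, row.names = FALSE)
```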

      
