scala - How to read some specific files from a collection of files as one RDD
I have a collection of files in a directory, and I want to read specific files from among them into one RDD. For example:
2000.txt 2001.txt 2002.txt 2003.txt 2004.txt 2005.txt 2006.txt 2007.txt 2008.txt 2009.txt 2010.txt 2011.txt 2012.txt
and I want to read a specific range of these files as one RDD. For example:
range = 4, starting at 2004, reads the files 2004.txt, 2005.txt, 2006.txt, 2007.txt into one RDD (data)
How can I do this in Spark with Scala?
Because Spark's textFile is backed by Hadoop's FileInputFormat, its path argument accepts a comma-separated list of directories, files, and wildcards. Hence something like this should work (untested):
// textFile takes a single path String, which may be a comma-separated list of
// paths, so join the per-year files with "," (note the .txt suffix to match the file names).
def datedRange(fromYear: Int, years: Int) =
  sc.textFile(Seq.tabulate(years)(x => fromYear + x).map(y => s"/path/to/dir/$y.txt").mkString(","))
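For completeness, a minimal runnable sketch of how this could be wired up and called (the app name, local master, base path, and the call datedRange(2004, 4) are illustrative assumptions, not from the original answer):

import org.apache.spark.{SparkConf, SparkContext}

object ReadYearRange {
  def main(args: Array[String]): Unit = {
    // Local setup for testing; in a real job the SparkContext usually already exists.
    val sc = new SparkContext(new SparkConf().setAppName("ReadYearRange").setMaster("local[*]"))

    // Builds "/path/to/dir/2004.txt,...,/path/to/dir/2007.txt" and reads it as one RDD[String].
    def datedRange(fromYear: Int, years: Int) =
      sc.textFile(Seq.tabulate(years)(x => fromYear + x).map(y => s"/path/to/dir/$y.txt").mkString(","))

    val data = datedRange(2004, 4) // range = 4 starting at 2004
    println(data.count())          // total number of lines across the four files

    sc.stop()
  }
}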