regex - How to split array of strings from two sides?
I have an array of strings (n=1000) in this format:
strings<-c("gsm1264936_2202_4866_28368_150cgy-gcsf6-m3_mouse430a+2.cel.gz", "gsm1264937_2202_4866_28369_150cgy-gcsf6-m4_mouse430a+2.cel.gz", "gsm1264938_2202_4866_28370_150cgy-gcsf6-m5_mouse430a+2.cel.gz")
I'm wondering whether there is an easy way to get this:
strings2<-c("2202_4866_28368_150cgy-gcsf6-m3_mouse430a+2.cel", "2202_4866_28369_150cgy-gcsf6-m4_mouse430a+2.cel", "2202_4866_28370_150cgy-gcsf6-m5_mouse430a+2.cel")
That is, trim off the "gsm1234567" part (with its underscore) at the front and the ".gz" at the end.
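Just use gsub. The pattern has two alternatives joined by | ("or"). The first matches, at the start of the string (^), digits and alphabetical symbols repeated 0 or more times (*) until the first _ is encountered, and includes that _. The second matches the piece .gz when it sits at the end of the string ($). Both matches are replaced with an empty string.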
gsub("^([[:alnum:]]*_)|(\\.gz)$", "", strings)
[1] "2202_4866_28368_150cgy-gcsf6-m3_mouse430a+2.cel"
[2] "2202_4866_28369_150cgy-gcsf6-m4_mouse430a+2.cel"
[3] "2202_4866_28370_150cgy-gcsf6-m5_mouse430a+2.cel"
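If the single alternation is hard to read, the same trimming can probably be done in two anchored sub() calls, one per side. This is only a sketch of an equivalent approach, not part of the original answer; the names no_prefix, trimmed and one_pass are made up for illustration.

```r
# Sample input copied from the question
strings <- c(
  "gsm1264936_2202_4866_28368_150cgy-gcsf6-m3_mouse430a+2.cel.gz",
  "gsm1264937_2202_4866_28369_150cgy-gcsf6-m4_mouse430a+2.cel.gz",
  "gsm1264938_2202_4866_28370_150cgy-gcsf6-m5_mouse430a+2.cel.gz"
)

# Step 1: drop the leading run of letters/digits up to and including the first "_"
no_prefix <- sub("^[[:alnum:]]*_", "", strings)

# Step 2: drop a literal ".gz" only when it ends the string
trimmed <- sub("\\.gz$", "", no_prefix)

# Compare with the one-pass gsub() that uses the alternation
one_pass <- gsub("^([[:alnum:]]*_)|(\\.gz)$", "", strings)
identical(trimmed, one_pass)
```

Because each pattern is anchored (^ or $), sub() is enough for each step; gsub() is only needed in the one-pass version, where two separate matches per string have to be replaced.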
Edit
I had forgotten to escape the dot in .gz.
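To see why the escape matters, here is a small sketch on two made-up file names (they are hypothetical and not from the question). An unescaped . matches any character, so the pattern can also eat a name that merely ends in "gz".

```r
# Hypothetical inputs: the second one ends in "ngz", not ".gz"
x <- c("sample.cel.gz", "sample_ngz")

# Unescaped dot: "." matches any single character, so "ngz" is removed too
unescaped <- sub(".gz$", "", x)

# Escaped dot: only a literal ".gz" at the end is removed
escaped <- sub("\\.gz$", "", x)

# A bracket expression is an alternative way to write a literal dot
bracketed <- sub("[.]gz$", "", x)

print(unescaped)
print(escaped)
print(bracketed)
```

With the strings from the question both variants happen to give the same result, since every name really ends in ".gz"; the escaped form is simply the safer one.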