sublimetext2 - Regex to Select a Sub-Set of a Regex Select
I haven't had any luck searching on this, and I believe that's because I don't know the key terms to use to explain what I'm looking for. I have the following regex, which I'm using to distinguish internal links on a set of HTML pages from external links:
(?<=a href=")[^http](.*?)(\.html")
So it won't select "http://www.example.com/foo/bar.html" from:
<a href="http://www.example.com/foo/bar.html">bar</a>
but it will select "/foo/bar.html" from:
<a href="/foo/bar.html">bar</a>
This is working great. I want to do a subselect on the selected string "/foo/bar.html" to isolate the ".html" part. Is this possible? Possibly with a substring or a lookbehind/lookahead? I've set up an example here:
https://www.regex101.com/r/gz6bp5/2
This is for a global find/replace in the Sublime Text editor. I believe I am restricted to regex for this. I understand that find/replace with variables is possible, but I have not been able to find an example of it in action.
Edit: To clarify, the regex I have to distinguish between external and internal links works great (although imperfectly, as commenters have noted). The question is how to select the ".html" portion of the match.
Thanks in advance!
This seems to do the trick:
(?<=a href=")(?!http)[^"]*\/([^"]+)(?=">)
The idea:
- Use a look-behind, (?<=a href="), to ensure we are inside a link anchor.
- Use a look-ahead, (?=">), to ensure the anchor ends.
- Use a negative look-ahead, (?!http), to ensure the link doesn't start with http.
- Use a greedy match, [^"]*, to consume the characters up to the last slash without crossing the quote boundary.
- Grab the characters after the last slash and before the quote boundary in a capture group, ([^"]+).
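The steps above can be checked outside the editor. The following is a minimal sketch in Python, assuming Python's `re` module behaves like the editor's regex engine for this pattern; the helper name `last_segment` and the sample strings are illustrative only.

```python
import re

# Pattern from the answer, unchanged. The look-behind is fixed-width,
# so Python's re module accepts it.
PATTERN = re.compile(r'(?<=a href=")(?!http)[^"]*\/([^"]+)(?=">)')


def last_segment(html):
    """Return the text after the last slash of the first internal link, or None."""
    match = PATTERN.search(html)
    return match.group(1) if match else None


# Sample inputs taken from the question.
internal = '<a href="/foo/bar.html">bar</a>'
external = '<a href="http://www.example.com/foo/bar.html">bar</a>'

print(last_segment(internal))  # the internal link yields its file name
print(last_segment(external))  # the external link is rejected by (?!http)
```

The capture group holds only the part after the last slash, so the whole match and the captured file name can be handled separately.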
Problems you may encounter:
- This is valid HTML: <a target="_blank" href="bob.html">
- This is a valid link: <a href="ftp://bob.html">
You can build regexes to deal with these as well, though.
To deal with the target issue, drop the look-behind and the final look-ahead:
<a[^>]*href="(?!http)[^"]*\/([^"]+)
Now we are matching a string that starts with <a and looking for href=" inside of it. With the final look-ahead (?=">) dropped, we are able to handle anchors with many attributes.
To deal with ftp, use the following:
<a[^>]*href="(?!(http|ftp))[^"]*\/([^"]+)
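As a rough check of the relaxed pattern, here is a small Python sketch. The helper name `filename_of` and the sample tags are assumptions made for illustration. Note that the pattern still requires at least one slash in the link, so the sample path includes one.

```python
import re

# Relaxed pattern from the answer: no look-behind, no trailing look-ahead.
# Group 1 is the (http|ftp) alternation inside the negative look-ahead;
# group 2 is the text after the last slash.
PATTERN = re.compile(r'<a[^>]*href="(?!(http|ftp))[^"]*\/([^"]+)')


def filename_of(tag):
    """Return the file name of an internal link inside an <a> tag, or None."""
    match = PATTERN.search(tag)
    return match.group(2) if match else None


# Hypothetical sample tags.
with_target = '<a target="_blank" href="/foo/bob.html">'
ftp_link = '<a href="ftp://bob.html">'
http_link = '<a href="http://www.example.com/foo/bar.html">'

print(filename_of(with_target))  # extra attribute before href is tolerated
print(filename_of(ftp_link))     # rejected by the negative look-ahead
print(filename_of(http_link))    # rejected by the negative look-ahead
```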
Now you can wrap the beginning of the string in a capture group:
(<a[^>]*href="(?!(http|ftp))[^"]*\/)([^"]+)
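and alter $1 (the part before filename.extension) and $3 (the filename.extension) as you see fit. Note that $2 is the (http|ftp) group inside the negative look-ahead, which never holds any text in a successful match.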
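To show the capture groups used in a replacement, here is a minimal Python sketch. It assumes Python's `re.sub` as a stand-in for the editor's find/replace; `rename_targets` and `html_to_php` are hypothetical helpers, and swapping ".html" for ".php" is only an example transformation. In this pattern the file name is the third capturing group, because the (http|ftp) alternation inside the look-ahead also counts as a group.

```python
import re

# Final pattern from the answer, unchanged.
# group 1: everything up to and including the last slash
# group 2: (http|ftp) inside the negative look-ahead, never filled on a match
# group 3: filename.extension
PATTERN = re.compile(r'(<a[^>]*href="(?!(http|ftp))[^"]*\/)([^"]+)')


def rename_targets(html, transform):
    """Rewrite the file name of every internal link using transform(name)."""
    return PATTERN.sub(lambda m: m.group(1) + transform(m.group(3)), html)


def html_to_php(name):
    """Hypothetical transform: swap a trailing .html for .php."""
    return name[:-len(".html")] + ".php" if name.endswith(".html") else name


page = (
    '<a href="/foo/bar.html">bar</a> '
    '<a href="http://www.example.com/foo/bar.html">ext</a>'
)
print(rename_targets(page, html_to_php))
```

Only the internal link is rewritten; the external link fails the negative look-ahead and is left as it was.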
An example is at: https://www.regex101.com/r/gz6bp5/3.