regex - How to match a hex sequence of characters and replace it with white space in PHP
I have text that I need to clean of certain characters. The characters are shown in the pictures attached to the question. I want to replace them with a white space (x20).
My attempt was to use preg_replace:
$result = preg_replace("/[\xef\x82\xac\x09|\xef\x81\xa1\x09]/", "\x20", $string);
For this particular case the approach works, but in other cases it won't, because, for example, I had text with a comma (,) that was matched by x82 and removed from the text.
How do I write a regex that searches for the exact sequence ef 82 ac 09, or the other one, ef 81 a1 09, and not for each byte separately (ef, 82, ac, 09)?
1.) Your pattern matches any one of 6 different hex bytes, or the pipe character, because everything is inside a character class. You probably wanted to use a group (?:...|...) for matching the different byte sequences.
2.) Your byte sequences do not match the image; it seems you mixed up 2 bytes. The picture shows ef 82 a1 09 and ef 81 ac 09, whereas your attempt uses \xef\x82\xac\x09 | \xef\x81\xa1\x09.
3.) When testing with your input sample
$str = "de la nouvelle; fourniture $ option :"; foreach (preg_split("//u", $str) as $v) { var_dump($v, bin2hex($v)); echo "\n"; }
it turned out that the 09 is too much. The characters to be removed are ef81ac and ef82a1, so the right regex is (?:\xef\x81\xac|\xef\x82\xa1):
$result = preg_replace("/(?:\xef\x81\xac|\xef\x82\xa1)/", "\x20", $string);
See the test at eval.in.
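To make the difference between a character class and a group concrete, here is a minimal sketch. The sample string is hypothetical (it is not the text from the question): it contains the two byte sequences ef 81 ac and ef 82 a1, plus a euro sign (bytes e2 82 ac) and a literal pipe, both of which should survive. The last variant is an assumed alternative rather than part of the answer above: with the /u modifier the same two characters can be written as the code points U+F06C and U+F0A1, which only works if the input is valid UTF-8.

```php
<?php
// Hypothetical sample: U+F06C (bytes ef 81 ac), a price with a euro sign
// (bytes e2 82 ac), a literal pipe, and U+F0A1 (bytes ef 82 a1).
$string = "\xef\x81\xacprice: 5 \xe2\x82\xac | done\xef\x82\xa1";

// Character class: every listed byte, and the "|", matches on its own,
// so the bytes 82 and ac inside the euro sign are replaced as well.
$byClass = preg_replace("/[\xef\x81\xac|\xef\x82\xa1]/", "\x20", $string);

// Non-capturing group: only the two complete 3-byte sequences match.
$byGroup = preg_replace("/(?:\xef\x81\xac|\xef\x82\xa1)/", "\x20", $string);

// Assumed alternative: with the /u modifier the same characters can be
// written as code points. This requires $string to be valid UTF-8.
$byCodePoint = preg_replace('/[\x{F06C}\x{F0A1}]/u', "\x20", $string);

var_dump($byGroup === " price: 5 \xe2\x82\xac | done "); // bool(true)
var_dump($byCodePoint === $byGroup);                      // bool(true)
var_dump(strpos($byClass, "\xe2\x82\xac") === false);     // bool(true): euro sign was damaged
```

The group version leaves the euro sign and the pipe untouched, while the character class version corrupts both. That is the same kind of collateral damage as the removed comma described in the question.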