regex - split sql statements in php on semicolons (but not inside quotes) -
i have system causing errors when users use semicolon in free format field. have traced down simple explode statement:
$array = explode( ";", $sql );
because line in subroutine called on system replace line split things properly, without breaking rest of system. thought onto winner str_getcsv, isn't sophisticated enough either. @ following example
$sql = "begin;insert table_a (a, b, c) values('42', '12', '\'ab\'c; def');insert table_b (d, e, f) values('42', '43', 'xy\'s z ;uvw') on duplicate key update f='xy\'s z ;uvw';commit;"; $array = str_getcsv($sql, ";", "'"); foreach( $array $value ) { echo $value . "<br><br>"; }
when run outputs following:
begin
insert table_a (a, b, c) values('42', '12', '\'ab\'c
def')
insert table_b (d, e, f) values('42', '43', 'xy\'s z
uvw') on duplicate key update f='xy\'s z
uvw'
commit
so doesn't notice semicolons inside quotes. (as far can see quoted strings different places in system in single quotes, possible @ times double quotes, not sure that.)
can tell me how this? suspect can complicated regex, on head.
(*skip)(*fail) magic
this live php demo shows output of 2 options below (with or without semi-colon).
this need:
$splits = preg_split('~\([^)]*\)(*skip)(*f)|;~', $sql);
see demo see splitting on right semi-colons.
output:
[0] => begin [1] => insert table_a (a, b, c) values('42', '12', '\'ab\'c; def') [2] => insert table_b (d, e, f) values('42', '43', 'xy\'s z ;uvw') [3] => commit [4] =>
the empty item #4 match on other side of final ;
. other option keep semi-colons (see below).
option 2: keep semi-colons
if want keep semi-colons, go this:
$splits = preg_split('~\([^)]*\)(*skip)(*f)|(?<=;)(?![ ]*$)~', $sql);
output:
[0] => begin; [1] => insert table_a (a, b, c) values('42', '12', '\'ab\'c; def'); [2] => insert table_b (d, e, f) values('42', '43', 'xy\'s z ;uvw'); [3] => commit;
explanation
this problem classic case of technique explained in question "regex-match pattern, excluding..."
in left side of alternation |
, regex \([^)]*\)
matches complete (parentheses)
deliberately fails, after engine skips next position in string. right side matches ;
word want, , know right ones because not matched expression on left. safe split on it.
in option 2, keep semi-colons, our match on right matches position, no characters. position asserted lookbehind (?<=;)
, asserts ;
precedes position, , negative lookahead (?![ ]*$)
, asserts follows not optional spaces end of string (so avoid last empty match).
sample code
please examine live php demo.
reference
Comments
Post a Comment