Re: Regex expert needed
On Monday 02 May 2005 15:06, Alex Malinovich wrote:
> On Sat, 2005-04-30 at 21:42 +0100, Alan Chandler wrote:
> > I am using PHP and I am trying to parse a string into substrings
> > delimited by a single character. In some instances this is a ',' in
> > others it will be '='.
> > I think the PHP function preg_split is the one to use, but I need to find
> > the right regular expression to match the delimiter character. Not being
> > an expert, and seeing that it's not just an obvious single character (see
> > below), it has confused me completely.
> > Unfortunately a subset of the substrings themselves may have ',' or '='
> > signs in them, although each string in this subset will itself have had
> > the addslashes() function performed on it to escape internal quote
> > characters and then completely surrounded in a pair of single quote
> > characters.
> > In other words my regular expression must not match a ',' (or in
> > the other case a '=') if it is inside an unescaped quote.
> > So can any regular expression experts out there help me define the right
> > one to match the criteria I have described?
> Some sample data would be useful. Your explanation is pretty good, but
> as soon as you started with "Unfortunately ...", that made it much more
> difficult to deal with (for me at least) without sample data.
For instance, I want to parse this string
Form = 'This is a bad bad thing, but not as bad as it would be if \"x = 5\",
so thats it',Connection=5
Into the strings
'This is a bad bad thing, but not as bad as it would be if \"x=5\", so thats
using the PHP preg_split() function. So I first want a regex that finds the
commas to split on (but not the commas inside the quoted string), and then
to put each resulting substring through another preg_split() call that
treats = as the delimiter. Again, inside the quoted string, I want to ignore
the equals sign in x=5.
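
One way this two-stage split might be expressed with preg_split() is sketched
below. It is only a sketch under assumptions: the helper name
split_outside_quotes() is made up for illustration, and the pattern relies on
the PCRE backtracking verbs (*SKIP)(*F), which need a reasonably recent PCRE.
The idea is to match a whole single-quoted section first (treating a backslash
plus the following character as one unit), throw that match away, and only then
allow the delimiter to match.

```php
<?php
// Hypothetical helper (not from the original post): split $input on a
// single-character delimiter, ignoring delimiters that sit inside a
// single-quoted, backslash-escaped section.
function split_outside_quotes(string $input, string $delimiter): array
{
    // Regex text: '(?:[^'\\]|\\.)*'
    // i.e. a quote, then any run of "ordinary character" or
    // "backslash + any character", then the closing quote.
    $quoted = "'(?:[^'\\\\]|\\\\.)*'";

    // A quoted section is matched and then discarded with (*SKIP)(*F),
    // so the delimiter alternative can only match outside quotes.
    $pattern = '/' . $quoted . '(*SKIP)(*F)|' . preg_quote($delimiter, '/') . '/s';

    $parts = preg_split($pattern, $input);
    if ($parts === false) {
        throw new RuntimeException('preg_split failed');
    }

    // Drop the whitespace around each piece (assumed not to be significant).
    return array_map('trim', $parts);
}

// The sample string from this message, written as a nowdoc so that the
// backslashes and quotes are taken literally.
$input = <<<'TXT'
Form = 'This is a bad bad thing, but not as bad as it would be if \"x = 5\", so thats it',Connection=5
TXT;

// Stage 1: split on commas; stage 2: split each piece on equals signs.
foreach (split_outside_quotes($input, ',') as $pair) {
    print_r(split_outside_quotes($pair, '='));
}
```

If a quote is never closed, the quoted alternative cannot match, and any
delimiters after the stray quote are treated as real ones, which may or may not
be the behaviour wanted.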
In the end I actually did this by brute force, scanning the string character
by character in PHP and using logic to determine where the boundaries are.
It's just that a regex ought to be much more elegant.
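
For comparison, a character-by-character scan of the kind described above
could look roughly like this. It is a minimal sketch, not the code I actually
used: the function name scan_split() is hypothetical, and it assumes that only
single quotes delimit quoted sections and that a backslash escapes the next
character only inside such a section.

```php
<?php
// Hypothetical illustration of a brute-force scan: split $input on
// $delimiter, but keep delimiters that appear inside single quotes.
function scan_split(string $input, string $delimiter): array
{
    $fields  = [];
    $current = '';
    $inQuote = false;
    $length  = strlen($input);

    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];

        if ($inQuote && $char === '\\' && $i + 1 < $length) {
            // Inside quotes a backslash escapes the next character:
            // copy both through without interpreting the second one.
            $current .= $char . $input[++$i];
        } elseif ($char === "'") {
            // An unescaped single quote opens or closes a quoted section.
            $inQuote = !$inQuote;
            $current .= $char;
        } elseif ($char === $delimiter && !$inQuote) {
            // A delimiter outside quotes ends the current field.
            $fields[] = trim($current);
            $current  = '';
        } else {
            $current .= $char;
        }
    }

    // Whatever is left after the last delimiter is the final field.
    $fields[] = trim($current);
    return $fields;
}

// Same sample string as in this message.
$input = <<<'TXT'
Form = 'This is a bad bad thing, but not as bad as it would be if \"x = 5\", so thats it',Connection=5
TXT;

foreach (scan_split($input, ',') as $pair) {
    print_r(scan_split($pair, '='));
}
```

The scan is longer than a single pattern, but each rule (escape, quote,
delimiter) is spelled out in its own branch, which makes it easier to adjust
if the quoting rules change.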