If I have a String which is delimited by a character, let’s say this:
a-b-c
and I want to keep the delimiters in the result, I can split on look-behind & look-ahead assertions, like:
string.split("((?<=-)|(?=-))");
which results in
a, -, b, -, c
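As a minimal sketch of the split above (the class and method names here are made up for illustration, not part of the original question), this is how it behaves in Java 8 or later:

```java
import java.util.Arrays;

final class KeepDelimiters {
    // Splits at every zero-width position that is directly after a '-'
    // or directly before a '-', so the delimiters survive as their own tokens.
    static String[] splitKeepingDelimiters(String input) {
        return input.split("((?<=-)|(?=-))");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKeepingDelimiters("a-b-c")));
        // prints [a, -, b, -, c]
    }
}
```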
Now, if one of the delimiters is escaped, like this:
a-b\-c
and I want to honor the escape, I worked out a regex like this:
((?<=-(?!(?<=\\-)))|(?=-(?!(?<=\\-))))
ergo
string.split("((?<=-(?!(?<=\\\\-)))|(?=-(?!(?<=\\\\-))))");
Now, this works & results in:
a, -, b\-c
(I'd remove the backslash afterwards with string.replace("\\", ""); I haven't found a way to include that in the regex.)
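A small sketch of the escape-aware split followed by the backslash removal, assuming Java 8 or later. The class, constant, and method names are hypothetical, and the unescape step simply drops every backslash, so it does not handle an escaped backslash:

```java
import java.util.Arrays;

final class EscapedDelimiterSplit {
    // The regex from the question, written as a Java string literal.
    static final String ESCAPE_AWARE = "((?<=-(?!(?<=\\\\-)))|(?=-(?!(?<=\\\\-))))";

    static String[] split(String input) {
        return input.split(ESCAPE_AWARE);
    }

    // Removes every backslash from a token (literal replace, not a regex).
    static String unescape(String token) {
        return token.replace("\\", "");
    }

    public static void main(String[] args) {
        String[] parts = split("a-b\\-c");          // the text a-b\-c
        System.out.println(Arrays.toString(parts)); // prints [a, -, b\-c]
        System.out.println(unescape(parts[2]));     // prints b-c
    }
}
```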
My problem is one of understanding.
The way I understood it, the regex would be, in words,
split ((if '-' is before (unless ('\-' is before))) or (if '-' is after (unless ('\-' is before))))
Why shouldn’t the last part be “unless \ is before”? If ‘-’ is after, that means we’re between ‘\’ & ‘-’, so only \ should be before, not \\-. Yet it doesn’t work if I alter the regex to reflect that, like this:
((?<=-(?!(?<=\\-)))|(?=-(?!(?<=\\))))
Result: a, -, b\, -c
What is the reason for this? Where is my error in reasoning?
Why shouldn’t the last part be “unless \ is before”?
In
(?=-(?!(?<=\\-)))
    ^here
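the cursor is already after the -, so the check "\ is before" would always be false: the character immediately before the current position is always -, never \. With (?<=\\) in that place, the negative look-ahead always succeeds, and the split before the escaped - is not suppressed.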
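To see this concretely, the following sketch lists the positions at which each regex produces a zero-width match in a-b\-c (index 4 is the gap between \ and -). It assumes plain java.util.regex; the class and method names are invented for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SplitPositions {
    // Collects the start index of every (zero-width) match of regex in input.
    static List<Integer> positions(String regex, String input) {
        List<Integer> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.start());
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "a-b\\-c"; // the text a-b\-c
        // Regex from the question: the inner look-behind checks for \- .
        String original = "((?<=-(?!(?<=\\\\-)))|(?=-(?!(?<=\\\\-))))";
        // Altered regex: the inner look-behind of the second branch checks for \ only.
        String altered = "((?<=-(?!(?<=\\\\-)))|(?=-(?!(?<=\\\\))))";

        System.out.println(positions(original, input)); // prints [1, 2]
        System.out.println(positions(altered, input));  // prints [1, 2, 4]
    }
}
```

The extra match at index 4 in the altered version is the unwanted split between \ and -.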
Maybe an easier regex would be (written with Java string escaping)
(?<=(?<!\\\\)-)|(?=(?<!\\\\)-)
(?<=(?<!\\\\)-) will check if we are after a - that has no \ before it.
(?=(?<!\\\\)-) will check if we are before a - that has no \ before it.
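A short sketch of the simpler regex in use, again assuming Java 8 or later and with hypothetical class and method names:

```java
import java.util.Arrays;

final class SimplerEscapeSplit {
    // Split after a '-' not preceded by '\', or before a '-' not preceded by '\'.
    static final String UNESCAPED_DASH = "(?<=(?<!\\\\)-)|(?=(?<!\\\\)-)";

    static String[] split(String input) {
        return input.split(UNESCAPED_DASH);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("a-b\\-c"))); // prints [a, -, b\-c]
        System.out.println(Arrays.toString(split("a\\-b-c"))); // prints [a\-b, -, c]
    }
}
```

Placing the negative look-behind in front of the - inside each look-around means the escape check is made at the position just before the delimiter, regardless of which side of it the cursor is on.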