If I have a String which is delimited by a character, let’s say this:
a-b-c
and I want to keep the delimiters, I can split on look-behind and look-ahead assertions, like:
string.split("((?<=-)|(?=-))");
which results in
a
-
b
-
c
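For reference, a minimal, self-contained version of the split above might look like the sketch below. The class and method names (KeepDelimiters, splitKeepingDelimiters) are made up for illustration; only String.split and the regex come from the question.

```java
import java.util.Arrays;

class KeepDelimiters {
    // Splits at the zero-width positions directly after or before a '-',
    // so each '-' ends up as its own token instead of being discarded.
    static String[] splitKeepingDelimiters(String input) {
        return input.split("((?<=-)|(?=-))");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKeepingDelimiters("a-b-c")));
        // prints [a, -, b, -, c]
    }
}
```

Because both alternatives are zero-width, the split consumes no characters, which is why the delimiters survive as separate tokens.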
Now, if one of the delimiters is escaped, like this:
a-b\-c
and I want to honor the escape, I figured out that I can use a regex like this:
((?<=-(?!(?<=\\-))) | (?=-(?!(?<=\\-))))
ergo
string.split("((?<=-(?!(?<=\\\\-)))|(?=-(?!(?<=\\\\-))))");
Now, this works & results in:
a
-
b\-c
(I would remove the backslash afterwards with string.replace("\\", ""); I haven’t found a way to include that in the regex.)
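Put together, the split and the follow-up backslash removal could be sketched like this. EscapedSplit and splitAndUnescape are hypothetical names chosen for the example, and the sketch assumes a backslash only ever appears as an escape in front of a delimiter.

```java
import java.util.Arrays;

class EscapedSplit {
    // Regex from the question, written as a Java string literal:
    // split after or before a '-' unless that '-' is preceded by a backslash.
    static final String ESCAPE_AWARE =
            "((?<=-(?!(?<=\\\\-)))|(?=-(?!(?<=\\\\-))))";

    static String[] splitAndUnescape(String input) {
        String[] tokens = input.split(ESCAPE_AWARE);
        for (int i = 0; i < tokens.length; i++) {
            // String.replace works on literal text, so "\\" is one backslash.
            tokens[i] = tokens[i].replace("\\", "");
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "a-b\\-c" in Java source is the text a-b\-c
        System.out.println(Arrays.toString(splitAndUnescape("a-b\\-c")));
        // prints [a, -, b-c]
    }
}
```

Note that replace("\\", "") strips every backslash in a token, not only the ones that escape a delimiter, so this only fits input where that assumption holds.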
My problem is one of understanding.
The way I understood it, the regex would be, in words,
split ((if '-' is before (unless ('\-' is before))) or (if '-' is after (unless ('\-' is before))))
Why shouldn’t the last part be “unless \ is before”? If ‘-’ is after, that means we’re between ‘\’ and ‘-’, so only \ should be before, not \\-. Yet it doesn’t work if I alter the regex to reflect that, like this:
((?<=-(?!(?<=\\-))) | (?=-(?!(?<=\\))))
Result:
a
-
b\
-c
What is the reason for this? Where is my error in reasoning?
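“Why shouldn’t the last part be ‘unless \ is before’?”
In
(?=-(?!(?<=\\-)))
the inner (?!...) is evaluated right after the - has been matched, so at that point the cursor is already after the -. A check for "\ is before" can therefore never be true there, because the character immediately before the current position is always that -. With (?<=\\) the negative look-ahead always succeeds and the escape is never detected; the look-behind has to include the - itself, as in (?<=\\-), to reach the character in front of it.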
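The effect can be checked in isolation with a small sketch. CursorAfterDash and found are hypothetical helper names; the two patterns are cut-down versions of the look-arounds discussed above, consuming the - instead of wrapping it in a look-ahead so the cursor position is easy to see.

```java
import java.util.regex.Pattern;

class CursorAfterDash {
    // Returns true if the regex matches somewhere in the text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String escapedDash = "\\-"; // the two characters \-

        // After consuming '-', the previous character is '-', never '\'.
        System.out.println(found("-(?<=\\\\)", escapedDash));  // false
        // Including the '-' in the look-behind reaches the backslash.
        System.out.println(found("-(?<=\\\\-)", escapedDash)); // true
    }
}
```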
Maybe an easier regex would be
(?<=(?<!\\\\)-)|(?=(?<!\\\\)-)
(shown with Java string escaping, so \\\\ matches one literal backslash).
(?<=(?<!\\\\)-) checks whether we are after a - that has no \ before it.
(?=(?<!\\\\)-) checks whether we are before a - that has no \ before it.
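A quick way to try the simpler regex is the sketch below. The class name SimplerSplit and the constant UNESCAPED_DASH are illustrative only; the pattern itself is the one from this answer.

```java
import java.util.Arrays;

class SimplerSplit {
    // Split after or before a '-' that is not preceded by a backslash.
    static final String UNESCAPED_DASH = "(?<=(?<!\\\\)-)|(?=(?<!\\\\)-)";

    static String[] split(String input) {
        return input.split(UNESCAPED_DASH);
    }

    public static void main(String[] args) {
        // "a-b\\-c" in Java source is the text a-b\-c
        System.out.println(Arrays.toString(split("a-b\\-c"))); // prints [a, -, b\-c]
        System.out.println(Arrays.toString(split("a-b-c")));   // prints [a, -, b, -, c]
    }
}
```

Here the negative look-behind sits in front of the -, so it inspects the character before the delimiter directly and no nested look-ahead is needed.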