Simple question
January 7th, 2010, 04:37 AM
|
|
Registered User
|
|
Join Date: Jan 2010
Posts: 5
Time spent in forums: 41 m 57 sec
Reputation Power: 0
|
|
|
Simple question
hi,
i have these strings:
STRING:TEXT
STRING:\:->
etc
i want to split them to {STRING,TEXT} and {STRING,:->}
i tried many options but i was sure this would work:
[^\\]:
it removes the G from string and keeps the \.
i am using Java
String[] strArr = str.split("[^\\]:");
BTW i can choose any delimiter i want before the :-> it could be \\ or xx or anything i choose for it to be.
any solutions?
Thanks
|

January 7th, 2010, 07:53 AM
|
|
Registered User
|
|
Join Date: Jan 2010
Location: Norcross, GA
|
|
Try this:
(?<!\\):
The (?<!...) part is called a zero-width negative lookbehind assertion, which is a fancy term for "check that something (in this case, a backslash) is NOT to the left, but don't actually consume it, just look". That's why your G is being removed: your pattern [^\\]: matches the character before the colon as well as the colon itself, so the split throws both away.
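To make the difference concrete, here is a minimal Java sketch comparing the two patterns on the asker's second string. The class and method names are made up for illustration. It assumes the patterns are written as Java string literals, where each regex backslash has to be doubled again.

```java
import java.util.Arrays;

public class LookbehindSplitDemo {
    // Consuming version: the regex [^\\]: matches one non-backslash character
    // plus the colon, so that character is swallowed as part of the delimiter.
    static String[] splitConsuming(String s) {
        return s.split("[^\\\\]:");
    }

    // Lookbehind version: the regex (?<!\\): matches only the colon, and only
    // when the character to its left is not a backslash.
    static String[] splitLookbehind(String s) {
        return s.split("(?<!\\\\):");
    }

    public static void main(String[] args) {
        String input = "STRING:\\:->"; // the characters STRING:\:->
        System.out.println(Arrays.toString(splitConsuming(input)));  // [STRIN, \:->]
        System.out.println(Arrays.toString(splitLookbehind(input))); // [STRING, \:->]
    }
}
```

Note that the lookbehind version still leaves the backslash in the second piece; it only decides where to split.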
|

January 7th, 2010, 12:04 PM
|
|
Registered User
|
|
Join Date: Jan 2010
Posts: 5
thanks, still not working
Thanks i did try it and it doesn't work
i tried a few others and they didn't work either
String[] strArr = str.split("(?<!\\):");
is the ! and the ^ the same?
is the syntax correct for java?
when i try what you sent me i get (note how it ignores the second \) :
java.util.regex.PatternSyntaxException: Unclosed group near index 7
(?<!\):
^
at java.util.regex.Pattern.error(Unknown Source)
at java.util.regex.Pattern.accept(Unknown Source)
at java.util.regex.Pattern.group0(Unknown Source)
at java.util.regex.Pattern.sequence(Unknown Source)
at java.util.regex.Pattern.expr(Unknown Source)
at java.util.regex.Pattern.compile(Unknown Source)
at java.util.regex.Pattern.<init>(Unknown Source)
at java.util.regex.Pattern.compile(Unknown Source)
at java.lang.String.split(Unknown Source)
at java.lang.String.split(Unknown Source)
|

January 7th, 2010, 12:35 PM
|
|
Registered User
|
|
Join Date: Jan 2010
Location: Norcross, GA
|
|
Hm, my gut tells me that this is more of a string error than a regex error. The error message you pasted shows only a single "\", which would escape the ")", which explains why the regex engine thinks that the group is not closed.
It has been many years since I've touched Java, so I'm afraid I can't offer much advice in that area. Try double escaping the "\" by typing in "\\\\" and see what happens. I know, that looks terribly ugly.
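To answer your other questions, no, "!" and "^" are not really equivalent. The "!" in (?<!...) makes the lookbehind negative without matching any characters, whereas "^" at the start of [...] negates a character class that still matches (and consumes) one character.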
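The two levels of escaping can be checked directly. The sketch below (class and helper names are hypothetical) compiles both spellings with java.util.regex.Pattern. In a Java string literal, "\\" is one backslash, so "(?<!\\):" reaches the regex engine as (?<!\): and the ")" is treated as an escaped literal. Writing "\\\\" hands the engine \\, which is one literal backslash.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class EscapingLevelsDemo {
    // Returns true if the given regex compiles, false on a syntax error.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Engine sees (?<!\): so the group is never closed.
        System.out.println(compiles("(?<!\\):"));   // false
        // Engine sees (?<!\\): which is a lookbehind for one literal backslash.
        System.out.println(compiles("(?<!\\\\):")); // true
    }
}
```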
|

January 7th, 2010, 01:49 PM
|
|
Registered User
|
|
Join Date: Jan 2010
Posts: 5
ok, almost there :->
i removed the \\ . it was too confusing for now.
i am using x as a delimiter
"STRING:x:->"
what i want is {STRING,:->}
what i am getting is {STRING,x:->}
with this code:
String[] strArr = str.split("(?<!x):");
how do i "capture" the delimiter as well (in this case the 'x')?
Thanks again for your help
|

January 7th, 2010, 03:24 PM
|
|
Registered User
|
|
Join Date: Jan 2010
Location: Norcross, GA
|
|
Well, not capturing the "\" was the only tricky part. If you no longer care about it, then the regex to split on ":x" becomes:
:x
You don't even need regex anymore.
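One caveat for Java: String.split always interprets its argument as a regex, although ":x" contains no special characters and so behaves as a literal. As a rough sketch of the regex-free route, a hypothetical helper (not something from this thread) could use indexOf and substring:

```java
import java.util.Arrays;

public class LiteralSplitDemo {
    // Splits s at the first occurrence of the literal marker, without any regex.
    // Returns a single-element array if the marker is absent.
    static String[] splitOnce(String s, String marker) {
        int i = s.indexOf(marker);
        if (i < 0) {
            return new String[] { s };
        }
        return new String[] { s.substring(0, i), s.substring(i + marker.length()) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnce("STRING:x:->", ":x"))); // [STRING, :->]
        // Same result through split, since ":x" has no regex metacharacters.
        System.out.println(Arrays.toString("STRING:x:->".split(":x")));      // [STRING, :->]
    }
}
```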
|

January 9th, 2010, 01:05 PM
|
|
Registered User
|
|
Join Date: Jan 2010
Posts: 5
ok, correction
ok, i didn't make myself clear.
i need for this string
"STRING:\:->"
to get
{STRING,:->}
and for STRING:\hello
to get {STRING,\hello}
so...i need to remove the \: if it exists, or the : in the rest of the cases. that is why i need regex.
Thanks again for all your help
|

January 10th, 2010, 08:16 PM
|
|
Registered User
|
|
Join Date: Jan 2010
Location: Norcross, GA
|
|
|
The original pattern I gave is pretty much the solution for splitting your input into pairs. After that is done, then you could do a string replace on the second value of each pair to replace all "\:" with ":".
Or, if the input you're trying to parse always starts with "STRING:", then you probably don't even need regex since the part you want to parse away is static. For example, you could just do a string split (not regex split) with "STRING:".
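The split-then-replace suggestion could look roughly like this in Java. This is only a sketch: the class and method names are invented, and applying the replacement to every piece rather than only the second one is an assumption made here for simplicity.

```java
import java.util.Arrays;

public class SplitThenUnescapeDemo {
    static String[] parse(String s) {
        // Step 1: split on colons that are not preceded by a backslash.
        String[] parts = s.split("(?<!\\\\):");
        // Step 2: turn each escaped colon "\:" back into a plain ":".
        // String.replace works on literal text, so no regex escaping is needed.
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].replace("\\:", ":");
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("STRING:\\:->")));   // [STRING, :->]
        System.out.println(Arrays.toString(parse("STRING:\\hello"))); // [STRING, \hello]
    }
}
```

Both of the asker's examples come out as requested: the backslash in front of the colon is dropped, while the one in front of "hello" is kept.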
|

January 13th, 2010, 05:06 AM
|
|
Registered User
|
|
Join Date: Jan 2010
Posts: 5
thanks
Ok, thanks lonekorean, i will try and see how to resolve this.
maybe the best thing is to remove the first char after the split if it is '\'. i thought there would be a cleaner solution...but i guess not
Thanks for all your help
|

January 20th, 2010, 06:22 AM
|
 |
kill 9, $$;
|
|
Join Date: Sep 2001
Location: Shanghai, An tSín
|
|
Quote (originally posted by regex):
Ok, thanks lonekorean, i will try and see how to resolve this.
maybe the best thing is to remove the first char after the split if it is '\'. i thought there would be a cleaner solution...but i guess not
Thanks for all your help
Just the first character?
Generally, this sort of situation arises when using a backslash to 'escape' another character (in this case to differentiate between the colon character that is a delimiter and the one that's actually part of the data). Are you absolutely guaranteed that the only colons in the data that need escaping are going to be at the very start of the segment?
If not, then you need to unescape all the colon characters after you've done your split. On first instinct, that might suggest that you just remove all backslash characters. However, there might be backslash characters in the data too, which would themselves be escaped (i.e. represented by \\) to differentiate themselves from the backslashes that are designed to escape things. So basically you need to remove any backslash characters that aren't themselves preceded by another backslash.
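When escaped backslashes are possible, one alternative to a pure regex is a single left-to-right scan that splits and unescapes in the same pass. The sketch below is not from the thread and assumes a specific convention: "\:" means a literal colon, "\\" means a literal backslash, and a backslash before any other character is kept as-is (which matches the asker's STRING:\hello example). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EscapeAwareSplitDemo {
    // Assumed convention: "\:" is a literal colon, "\\" is a literal backslash,
    // any other backslash is ordinary data, and a bare ":" is the delimiter.
    static List<String> splitAndUnescape(String s) {
        List<String> fields = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean hasNext = i + 1 < s.length();
            if (c == '\\' && hasNext && (s.charAt(i + 1) == ':' || s.charAt(i + 1) == '\\')) {
                // Escape sequence: keep the escaped character, drop the backslash.
                current.append(s.charAt(i + 1));
                i++;
            } else if (c == ':') {
                // Unescaped colon: end of the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(splitAndUnescape("STRING:\\:->"));   // [STRING, :->]
        System.out.println(splitAndUnescape("STRING:\\hello")); // [STRING, \hello]
        // Characters a\\:b : an escaped backslash followed by a real delimiter.
        System.out.println(splitAndUnescape("a\\\\:b"));        // [a\, b]
    }
}
```

The last case is where a plain (?<!\\): lookbehind would go wrong, because the colon is preceded by a backslash even though that backslash is itself escaped.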
|