Regex Programming
 
  #1  
regex, January 7th, 2010, 04:37 AM
Simple question

Hi,
I have these strings:
STRING:TEXT
STRING:\:->
etc.

I want to split them into {STRING,TEXT} and {STRING,:->}.

I tried many options, but I was sure this would work:
[^\\]:

It removes the G from STRING and keeps the \.
I am using Java:
String[] strArr = str.split("[^\\]:");

BTW, I can choose any delimiter I want before the :-> (it could be \\ or xx or anything I choose).

Any solutions?
Thanks

  #2  
lonekorean, January 7th, 2010, 07:53 AM
Try this:

Code:
(?<!\\):


The (?<!...) part is called a zero-width negative lookbehind assertion, which is a fancy term for "check that something is NOT immediately to the left (in this case, a backslash), but don't consume it as part of the match, just look". That's why your G is being removed: your pattern was actually matching the character before the colon, so that character became part of the delimiter that split throws away.
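
A minimal Java sketch of the difference, assuming the input text really contains one backslash before the escaped colon. The class and method names are made up for illustration, and the backslashes are doubled once more because Java string literals have their own escaping:

```java
import java.util.Arrays;

class LookbehindSplit {
    // Java literal "(?<!\\\\):" is the regex (?<!\\): which means
    // "a colon that is not immediately preceded by a backslash".
    static String[] splitOnUnescapedColon(String s) {
        return s.split("(?<!\\\\):");
    }

    // Java literal "[^\\\\]:" is the regex [^\\]: which matches
    // "any non-backslash character followed by a colon", so the
    // character before the colon is swallowed by the delimiter.
    static String[] splitConsumingPreviousChar(String s) {
        return s.split("[^\\\\]:");
    }

    public static void main(String[] args) {
        String input = "STRING:\\:->"; // the text STRING:\:->
        System.out.println(Arrays.toString(splitOnUnescapedColon(input)));      // [STRING, \:->]
        System.out.println(Arrays.toString(splitConsumingPreviousChar(input))); // [STRIN, \:->]
    }
}
```

Note that the lookbehind version keeps the backslash in the second piece; it only decides where to split.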

  #3  
regex, January 7th, 2010, 12:04 PM
thanks, still not working

Thanks, I did try it and it doesn't work.
I tried a few others and they didn't work either.
String[] strArr = str.split("(?<!\\):");

Is the ! the same as the ^?
Is the syntax correct for Java?

When I try what you sent me, I get this (note how it ignores the second \):
java.util.regex.PatternSyntaxException: Unclosed group near index 7
(?<!\):
^
at java.util.regex.Pattern.error(Unknown Source)
at java.util.regex.Pattern.accept(Unknown Source)
at java.util.regex.Pattern.group0(Unknown Source)
at java.util.regex.Pattern.sequence(Unknown Source)
at java.util.regex.Pattern.expr(Unknown Source)
at java.util.regex.Pattern.compile(Unknown Source)
at java.util.regex.Pattern.<init>(Unknown Source)
at java.util.regex.Pattern.compile(Unknown Source)
at java.lang.String.split(Unknown Source)
at java.lang.String.split(Unknown Source)

  #4  
lonekorean, January 7th, 2010, 12:35 PM
Hm, my gut tells me that this is more of a string error than a regex error. The error message you pasted shows only a single "\", which would escape the ")", which explains why the regex engine thinks that the group is not closed.

It has been many years since I've touched Java, so I'm afraid I can't offer much advice in that area. Try double escaping the "\" by typing in "\\\\" and see what happens. I know, that looks terribly ugly.
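
A small sketch of the two escaping levels in Java, offered as an illustration rather than a definitive diagnosis. The class name and the `compiles` helper are hypothetical; `Pattern.compile` and `PatternSyntaxException` are the standard `java.util.regex` types:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class EscapingLevels {
    // Returns true if the given regex text compiles without a syntax error.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Two backslashes in source: the regex engine sees (?<!\):
        // so \) is an escaped parenthesis and the group never closes.
        String twoBackslashes = "(?<!\\):";
        // Four backslashes in source: the regex engine sees (?<!\\):
        // so the lookbehind contains one literal backslash.
        String fourBackslashes = "(?<!\\\\):";
        System.out.println(twoBackslashes + " compiles: " + compiles(twoBackslashes));
        System.out.println(fourBackslashes + " compiles: " + compiles(fourBackslashes));
    }
}
```

The Java compiler halves the backslashes first, and the regex engine then interprets what is left, which is why a literal backslash in the pattern takes four characters in the source.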

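To answer your other question: no, "!" and "^" are not really equivalent. Inside square brackets, "^" negates a character class, which still matches (and consumes) one character. The "!" in (?<!...) makes the lookbehind negative, and a lookbehind consumes nothing.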

  #5  
regex, January 7th, 2010, 01:49 PM
ok, almost there :->

I removed the \\ since it was too confusing for now.
I am using x as the delimiter:
"STRING:x:->"

What I want is {STRING,:->}
What I am getting is {STRING,x:->}

with this code:
String[] strArr = str.split("(?<!x):");
How do I "capture" the delimiter as well (in this case the 'x')?
Thanks again for your help

  #6  
lonekorean, January 7th, 2010, 03:24 PM
Well, not capturing the "\" was the only tricky part. If you no longer care about it, then the regex to split on ":x" becomes:

Code:
:x


You don't even need regex anymore.
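
In Java, String.split always treats its argument as a regex, so a regex-free version needs something like indexOf and substring. A rough sketch, with a hypothetical class and helper name, that splits once at the first literal ":x":

```java
import java.util.Arrays;

class LiteralSplit {
    // Splits s at the first occurrence of the literal delimiter.
    // Returns a one-element array if the delimiter does not occur.
    static String[] splitOnce(String s, String delimiter) {
        int i = s.indexOf(delimiter);
        if (i < 0) {
            return new String[] { s };
        }
        return new String[] { s.substring(0, i), s.substring(i + delimiter.length()) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnce("STRING:x:->", ":x"))); // [STRING, :->]
    }
}
```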

  #7  
regex, January 9th, 2010, 01:05 PM
ok, correction

OK, I didn't make myself clear.
For this string
"STRING:\:->"
I need to get
{STRING,:->}

and for STRING:\hello
I need to get {STRING,\hello}


So I need to remove the \: if it exists, or just the : in the rest of the cases. That is why I need regex.
Thanks again for all your help

  #8  
lonekorean, January 10th, 2010, 08:16 PM
The original pattern I gave is pretty much the solution for splitting your input into pairs. After that is done, then you could do a string replace on the second value of each pair to replace all "\:" with ":".
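
A sketch of that split-then-replace idea in Java, assuming a single backslash is the escape character. The class and method names are invented for the example, and it unescapes every piece rather than only the second one, which should be harmless for the inputs shown in this thread:

```java
import java.util.Arrays;

class SplitThenUnescape {
    static String[] parse(String line) {
        // Step 1: split on colons that are not preceded by a backslash.
        String[] parts = line.split("(?<!\\\\):");
        // Step 2: turn each escaped colon \: back into a plain colon.
        // String.replace works on literal text, not on a regex.
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].replace("\\:", ":");
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("STRING:\\:->")));   // [STRING, :->]
        System.out.println(Arrays.toString(parse("STRING:\\hello"))); // [STRING, \hello]
    }
}
```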

Or, if the input you're trying to parse always starts with "STRING:", then you probably don't even need regex since the part you want to parse away is static. For example, you could just do a string split (not regex split) with "STRING:".

  #9  
regex, January 13th, 2010, 05:06 AM
thanks

OK, thanks lonekorean, I will try and see how to resolve this.
Maybe the best thing is to remove the first char after the split if it is '\'. I thought there would be a cleaner solution, but I guess not.
Thanks for all your help

  #10  
ishnid, January 20th, 2010, 06:22 AM
Quote:
Originally Posted by regex
OK, thanks lonekorean, I will try and see how to resolve this.
Maybe the best thing is to remove the first char after the split if it is '\'. I thought there would be a cleaner solution, but I guess not.
Thanks for all your help


Just the first character?

Generally, this sort of situation arises when using a backslash to 'escape' another character (in this case to differentiate between the colon character that is a delimiter and the one that's actually part of the data). Are you absolutely guaranteed that the only colons in the data that need escaping are going to be at the very start of the segment?

If not, then you need to unescape all the colon characters after you've done your split. On first instinct, that might suggest that you just remove all backslash characters. However, there might be backslash characters in the data too, which would themselves be escaped (i.e. represented by \\) to differentiate themselves from the backslashes that are designed to escape things. So basically you need to remove any backslash that is acting as an escape (i.e. one that isn't itself escaped by a preceding backslash) and keep the character that follows it.
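
One way to handle both cases is to skip the regex and scan the text once from left to right, which sidesteps the question of how many backslashes precede a colon. This is a different technique from the lookbehind split suggested earlier in the thread, sketched under the assumption that a backslash only escapes a colon or another backslash and is kept as-is before anything else (so \hello survives). The class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

class EscapeAwareSplit {
    static List<String> split(String s) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean escapesNext = c == '\\' && i + 1 < s.length()
                    && (s.charAt(i + 1) == ':' || s.charAt(i + 1) == '\\');
            if (escapesNext) {
                // Drop the escaping backslash, keep the escaped character.
                current.append(s.charAt(i + 1));
                i++;
            } else if (c == ':') {
                // An unescaped colon ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("STRING:\\:->"));   // [STRING, :->]
        System.out.println(split("STRING:\\hello")); // [STRING, \hello]
        System.out.println(split("a\\\\:b"));        // [a\, b]
    }
}
```

The last call shows the case a simple lookbehind gets wrong: in a\\:b the backslash is itself escaped, so the colon after it is a real delimiter.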
