Regex Programming
 
  #1  
February 25th, 2009, 01:39 PM
fishtoprecords
Using regex to parse arguments

I'm working on parsing a string from an RFC, and I can't get my regex to work. So I've written a small Java program to test. I don't understand the results, so I can't figure out what I'm doing wrong.

The applicable section deals with a "type=" string.

The regex that I'm using is:
Code:
type=(HOME|WORK|PREF|MSG|CELL)(,(HOME|WORK|PREF|MSG|CELL))*(;type=(HOME|WORK|PREF|MSG|CELL)(,(HOME|WORK|PREF|MSG|CELL))*)*

The spec allows either a series of type=X parameters separated by semicolons,
type=X;type=Y;type=Z
or a single parameter with a comma-separated list of values,
type=X,Y,Z
where X, Y, and Z are keywords.

Code:
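import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TelTypeTest {
    // Keywords allowed as type values.
    private static final String teltypesarg = "HOME|WORK|PREF|MSG|CELL";
    // One "type=" parameter: a keyword followed by zero or more ",keyword" items.
    private static final String teltypeseq = "type=(" + teltypesarg + ")(,(" + teltypesarg + "))*";
    // One or more "type=" parameters separated by semicolons.
    private static final String teltypefull = teltypeseq + "(;" + teltypeseq + ")*";
    static final Pattern teltypesPat = Pattern.compile(teltypefull, Pattern.CASE_INSENSITIVE);

    public static void main(String[] args) {
        String[] tests = {
            "type=CELL,pref:(301) 996-1054",
            "type=INTERNET;type=WORK;type=pref:jiabr@comcast.net",
            "type=CELL,pref,msg:(703) 304-8914",
        };
        System.out.println(teltypefull);
        for (String s : tests) {
            System.out.println(s);
            Matcher m = teltypesPat.matcher(s);
            if (m.find()) {
                // Print every capturing group; groups that did not participate are null.
                for (int j = 1; j <= m.groupCount(); j++) {
                    System.out.println("gc: " + j + " = " + m.group(j));
                }
            }
        }
    }
}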


It seems to work fine for the "type=X;type=Y" form.
But the output doesn't look like a proper greedy match for a series of keywords separated by commas, such as:

Code:
type=CELL,pref,msg:(703) 304-8914
gc: 1 = CELL
gc: 2 = ,msg
gc: 3 = msg
gc: 4 = null
gc: 5 = null
gc: 6 = null
gc: 7 = null


Thanks
pat

  #2  
February 26th, 2009, 05:56 PM
OmegaZero
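Java's java.util.regex isn't PCRE itself, but it uses Perl-style syntax and behaves much the same way here. One of the limitations they share is that the regex
Code:
( pattern )*

only captures the last time it matches. Earlier iterations of the group are matched but overwritten, which is why you see only ",msg" and "msg".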
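To make the behaviour concrete, here is a minimal sketch that reuses the single "type=" part of the pattern from the first post. The class name LastIterationDemo and the helper groups() are hypothetical names chosen for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastIterationDemo {
    // Same shape as teltypeseq in the first post: keyword, then repeated ",keyword".
    private static final String KW = "HOME|WORK|PREF|MSG|CELL";
    private static final Pattern P =
        Pattern.compile("type=(" + KW + ")(,(" + KW + "))*", Pattern.CASE_INSENSITIVE);

    // Returns groups 0..groupCount() of the first match, or null if nothing matches.
    static String[] groups(String input) {
        Matcher m = P.matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] out = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] g = groups("type=CELL,pref,msg:(703) 304-8914");
        // The whole match covers all three keywords...
        System.out.println("0 = " + g[0]); // type=CELL,pref,msg
        // ...but the repeated groups 2 and 3 hold only their final iteration.
        System.out.println("1 = " + g[1]); // CELL
        System.out.println("2 = " + g[2]); // ,msg
        System.out.println("3 = " + g[3]); // msg
    }
}
```

So the match itself is greedy and complete; it is only the per-group capture that loses "pref".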

You could try this instead for a comma-separated list; it captures the whole list in a single group:
Code:
( pattern (?: , pattern )* )
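As a rough sketch of that idea in Java, the pattern below wraps the whole keyword list in one capturing group and uses non-capturing groups inside it, then splits the captured text on commas. The class name WholeListCapture and the method keywords() are made up for this example, and it handles only a single "type=" parameter.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeListCapture {
    // Non-capturing alternation so the only numbered group is the whole list.
    private static final String KW = "(?:HOME|WORK|PREF|MSG|CELL)";
    private static final Pattern LIST =
        Pattern.compile("type=(" + KW + "(?:," + KW + ")*)", Pattern.CASE_INSENSITIVE);

    // Returns the keywords of the first "type=" list found, or an empty list.
    static List<String> keywords(String input) {
        Matcher m = LIST.matcher(input);
        if (!m.find()) {
            return Collections.emptyList();
        }
        // Group 1 is the full list, e.g. "CELL,pref,msg".
        return Arrays.asList(m.group(1).split(","));
    }

    public static void main(String[] args) {
        System.out.println(keywords("type=CELL,pref,msg:(703) 304-8914")); // [CELL, pref, msg]
    }
}
```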


Though if I were doing it, I would use String.split(), first on the semicolon, then on the equals sign, then on the comma.
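A minimal sketch of the split approach might look like the following. It assumes, based on the test strings in the first post, that the parameters end at the first colon; the class name SplitTypes and the method types() are hypothetical. It does not check the values against the keyword list, so that validation would still need to be added.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class SplitTypes {
    // Collects every value of every "type=" parameter, upper-cased.
    static List<String> types(String line) {
        // Assumption: everything before the first ':' is the parameter section.
        int colon = line.indexOf(':');
        String params = colon >= 0 ? line.substring(0, colon) : line;

        List<String> result = new ArrayList<>();
        for (String param : params.split(";")) {          // type=X;type=Y
            String[] nameValue = param.split("=", 2);     // name=value
            if (nameValue.length == 2 && nameValue[0].equalsIgnoreCase("type")) {
                for (String value : nameValue[1].split(",")) { // X,Y,Z
                    result.add(value.toUpperCase(Locale.ROOT));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(types("type=CELL,pref,msg:(703) 304-8914"));                   // [CELL, PREF, MSG]
        System.out.println(types("type=INTERNET;type=WORK;type=pref:jiabr@comcast.net")); // [INTERNET, WORK, PREF]
    }
}
```

Because nothing here depends on capture groups, both the semicolon form and the comma form yield every keyword.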