JavaScript - Regex "digit-digit"

November 7th, 2012, 03:53 PM
Registered User
Join Date: Nov 2012
Posts: 4
I am trying to create a regular expression that will parse digit input like: digit1 - digit2 OR digit1 - digit2,digit3 - digit4,digit5 - digit6,digit7 - digit8...
With examples:
ex1. 1-3
ex2. 1-3,5-7
ex3. 1-3,5-7,10-15
ex4. 1-3,7-10,12-15,19-25
...
Please note that the last character must not be "," and a range must not contain two or more "-", as in 1-5-10!
If anyone can help me with a JavaScript regex, thanks in advance!

November 7th, 2012, 04:55 PM
Still alive
Join Date: Mar 2007
Location: Washington, USA
\d is a digit and (...)* will repeat whatever is inside there as many times as possible, possibly not at all. Also ^ marks the beginning of the string and $ marks the end.
Hint: the expression will have the number stuff repeated twice.
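For anyone who wants to see those building blocks in isolation first, here is a minimal sketch. The sample strings and variable names are made up for illustration, and it is deliberately not the finished expression.

```javascript
// Each building block from the hint, tried on its own with RegExp.prototype.test.
const oneDigit = /\d/;            // \d matches a single digit anywhere in the string
const repeatedGroup = /^(ab)*$/;  // (...)* repeats the group zero or more times
const anchored = /^\d$/;          // ^ and $ force the whole string to be one digit

console.log(oneDigit.test("a7b"));       // true
console.log(repeatedGroup.test(""));     // true (zero repetitions)
console.log(repeatedGroup.test("abab")); // true
console.log(repeatedGroup.test("aba"));  // false
console.log(anchored.test("7"));         // true
console.log(anchored.test("77"));        // false
```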

November 7th, 2012, 05:03 PM
Registered User
Quote: | Originally Posted by requinix \d is a digit and (...)* will repeat whatever is inside there as many times as possible, possibly not at all. Also ^ marks the beginning of the string and $ marks the end.
Hint: the expression will have the number stuff repeated twice. |
Thank you. I have read those in regex tutorials, but I don't know how to write the whole regex string!
Note that by "..." I mean "etc."; it is not part of the input.

November 7th, 2012, 06:05 PM
Still alive
And note that with "..." I mean whatever you want to put in there. It is not part of the regex.
You should have all the parts you need to construct the expression. Give it a first shot and we'll take it from there.

November 8th, 2012, 01:09 AM
It is not clear to me whether you know which digits you want to match, or whether you want to match any digit and "-" combination.
Take example #1 (1-3): do you want to match 1, 2, or 3, or do you want to match any digit range? In the first case, I would use a character class such as [1-3] or [123]; in the second case, I would use something like \d\-\d.
Please be more specific.
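To make the difference between the two readings concrete, here is a small sketch. The ^ and $ anchors and the variable names are my own additions for the demo, and the dash is left unescaped, which is equivalent to \- outside a character class.

```javascript
// Reading 1: the string is one of the characters 1, 2 or 3.
const oneToThree = /^[1-3]$/;
// Reading 2: the string is any digit, a literal dash, any digit.
const digitDashDigit = /^\d-\d$/;

console.log(oneToThree.test("2"));       // true
console.log(oneToThree.test("1-3"));     // false
console.log(digitDashDigit.test("1-3")); // true
console.log(digitDashDigit.test("4-9")); // true
console.log(digitDashDigit.test("2"));   // false
```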

November 8th, 2012, 03:50 PM
Registered User
Any digit, something like
X-Y
X-Y,K-M
where X,Y,K,M are any possible digit

November 9th, 2012, 03:15 AM
You could start with something like this:
which should match (a digit, a dash, a digit and an optional comma) or (a digit and an optional comma), the whole thing at least once or repeated any number of times.
However, I do not know enough about your data, and the above expression may match things that you do not want to match (for example, it will match a single number in your data).
If you don't want this to happen, and if a single digit (without an interval) cannot occur, then it might be better to try this:
which will match (a number, a dash, a number and a comma) zero or more times, followed by (a digit, a dash and a digit).
There may be further refinements (for example, can there be spaces between two intervals?), but it all depends on your real data, and we don't know enough about that.
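The two expressions this post refers to are not shown in this copy of the thread. Going only by the descriptions in parentheses, they might have looked roughly like the sketch below. Both patterns are reconstructions from the prose, not the poster's verbatim code, and the names `loose` and `strict` are hypothetical.

```javascript
// Reconstructed from the prose descriptions only (an assumption, not the original code).
// (digit, dash, digit, optional comma) or (digit, optional comma), one or more times:
const loose = /(\d-\d,?|\d,?)+/;
// (digit, dash, digit, comma) zero or more times, then (digit, dash, digit):
const strict = /(\d-\d,)*(\d-\d)/;

console.log(loose.test("5"));             // true: a lone digit is accepted
console.log(strict.test("5"));            // false: at least one interval is required
console.log("1-3,5-7".match(strict)[0]);  // "1-3,5-7"
```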

November 9th, 2012, 05:29 AM
pollyanna
Join Date: Jul 2012
Location: Germany
Hi,
yeah, your description is pretty confusing. You're talking about single digits all the time, but your example includes numbers with multiple digits.
So I guess what you actually want is
Code:
/^\d+-\d+(?:,\d+-\d+)*$/
If you also want to rule out leading zeros (like in 01), that would be
Code:
/^(?:[1-9]\d*|0)-(?:[1-9]\d*|0)(?:,(?:[1-9]\d*|0)-(?:[1-9]\d*|0))*$/
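A quick way to check both patterns is to run them over the examples from the first post plus a few strings that should be rejected. This is only a sketch: the `invalid` list and the variable names are my own choices.

```javascript
const ranges = /^\d+-\d+(?:,\d+-\d+)*$/;
const noLeadingZeros =
  /^(?:[1-9]\d*|0)-(?:[1-9]\d*|0)(?:,(?:[1-9]\d*|0)-(?:[1-9]\d*|0))*$/;

// The four examples from the original question.
const valid = ["1-3", "1-3,5-7", "1-3,5-7,10-15", "1-3,7-10,12-15,19-25"];
// Trailing comma, chained dashes, a lone number, an empty string.
const invalid = ["1-3,", "1-5-10", "1", ""];

console.log(valid.every((s) => ranges.test(s)));   // true
console.log(invalid.some((s) => ranges.test(s)));  // false
console.log(ranges.test("01-3"));                  // true
console.log(noLeadingZeros.test("01-3"));          // false
console.log(noLeadingZeros.test("0-10"));          // true
```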

November 9th, 2012, 09:55 AM
Jacques, the original requirement did not say that the string should contain only these patterns, so I think the string start and end anchors (^ and $) should probably be taken away (unless, of course, the OP clarifies this point differently).
I did not pay attention to the examples with 2-digit numbers. If this can happen, I would change my suggestion to:
Code:
(\d+-\d+,)*(\d+-\d+)
But again, a regex is often a delicate balance between matching everything that needs to be matched and not matching what should not be matched. For this, a very precise description of the data is needed, and we are very far from that.

November 9th, 2012, 10:26 AM
Join Date: Jan 2004
Location: New Springfield, OH
Is this for parsing page ranges? If so, these work.
Code:
^(\s*\d+\s*\-\s*\d+\s*,?|\s*\d+\s*,?)+$ (allows spaces)
^(\d+-\d+,?|\d+,?)+$ (does not allow spaces)
^(\d+|\d+-\d+)(,?=(\d+|\d+-\d+))*$ (a different approach)
As mentioned, you could remove the ^ and $ if these patterns should match inside other strings. If you're using this for some form of validation, they should be left in so that a match doesn't allow any extra information.
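Here is a sketch of how the three patterns behave in JavaScript on a few made-up inputs. One thing worth knowing: in the third pattern, `,?=` is an optional comma followed by a literal "=", not a lookahead. The `thirdAssumed` variant is only my guess at what was intended (a plain comma between items) and is not part of the post.

```javascript
const withSpaces = /^(\s*\d+\s*\-\s*\d+\s*,?|\s*\d+\s*,?)+$/;
const noSpaces = /^(\d+-\d+,?|\d+,?)+$/;
const thirdAsPosted = /^(\d+|\d+-\d+)(,?=(\d+|\d+-\d+))*$/;
// Assumption: the separator was meant to be a plain comma.
const thirdAssumed = /^(\d+|\d+-\d+)(,(\d+|\d+-\d+))*$/;

console.log(withSpaces.test("1 - 3, 5 - 7"));  // true
console.log(noSpaces.test("1-3,5,7-9"));       // true: single pages are allowed
console.log(noSpaces.test("1-3,"));            // true: a trailing comma is accepted
console.log(thirdAsPosted.test("1-3,5-7"));    // false
console.log(thirdAsPosted.test("1-3=5-7"));    // true: "=" is matched literally
console.log(thirdAssumed.test("1-3,5-7"));     // true
console.log(thirdAssumed.test("1-3,"));        // false
```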
Last edited by Nilpo : November 9th, 2012 at 10:28 AM.

November 9th, 2012, 10:52 AM
pollyanna
Quote: | Originally Posted by Laurent_R Jacques, the original requirement did not say that the string should contain only these patterns, so I think the string start and end anchors (^ and $) should probably be taken away (unless, of course, the OP clarifies this point differently). |
A substring match obviously would make no sense in this case. Since he doesn't extract anything, it would be the same as simply checking for \d+-\d+. The optional stuff after that makes no difference (unless he's specifically looking for index information or something).
The whole task only makes sense if he wants to check if a complete string matches this pattern.
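A small sketch of that point: without anchors, a yes/no test with the long pattern gives the same answer as a test with plain \d+-\d+. The sample strings and variable names are made up for the demo.

```javascript
const full = /\d+-\d+(?:,\d+-\d+)*/;  // unanchored list pattern
const short = /\d+-\d+/;              // just one range, anywhere in the string

const samples = ["1-3,5-7", "abc 1-3, xyz", "1-5-10", "1-3,", "no ranges", "7"];

// Both columns agree for every sample, including the ones that should be invalid.
for (const s of samples) {
  console.log(JSON.stringify(s), full.test(s), short.test(s));
}
```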
Quote: | Originally Posted by Laurent_R I did not pay attention to the examples with 2-digit numbers. If this can happen, I would change my suggestion to:
Code:
(\d+-\d+,)*(\d+-\d+)
|
Putting the comma in the first pattern will force the regex parser to backtrack at the last entry. So it's better to make the first part mandatory and put the comma in the last part (which also makes more sense when you read it).
------------
Quote: | Originally Posted by Nilpo
Code:
^(\s*\d+\s*\-\s*\d+\s*,?|\s*\d+\s*,?)+$ (allows spaces)
^(\d+-\d+,?|\d+,?)+$ (does not allow spaces)
^(\d+|\d+-\d+)(,?=(\d+|\d+-\d+))*$ (a different approach)
|
He specifically ruled out lists ending with a comma, so that won't work.

November 9th, 2012, 01:00 PM
Quote: | Originally Posted by Jacques1 A substring match obviously would make no sense in this case. Since he doesn't extract anything, it would be the same as simply checking for \d+-\d+. The optional stuff after that makes no difference (unless he's specifically looking for index information or something).
|
We just don't know whether the OP wants to extract something or not. I used capturing parens because I suspected the aim was to capture the ranges; you used non-capturing parens because you suspected something different. That's the point: the description of the requirement is far too vague. Therefore, we can only give some tips, but not figure out a complete solution.
Quote: | Originally Posted by Jacques1
Putting the comma in the first pattern will force the regex parser to backtrack at the last entry. So it's better to make the first part mandatory and put the comma in the last part (which also makes more sense when you read it).
|
No, I do not think there is backtracking in my regex: it just gradually matches the string with the first part of the regex, and when the first part fails, it tries the second part. If the second part matches, there is no backtracking; if it fails, yes, it backtracks, but just once. I have just tried it on a string with 5,000 ranges, the result is immediate.
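For anyone who wants to repeat that experiment in JavaScript, here is a rough sketch. The way the 5,000 ranges are generated and the variable names are my own assumptions, and the timing will vary by engine and machine, so no figure is given.

```javascript
const unanchored = /(\d+-\d+,)*(\d+-\d+)/;

// Build "0-1,2-3,...,9998-9999": 5,000 ranges with no trailing comma.
const parts = [];
for (let i = 0; i < 5000; i++) {
  parts.push(`${2 * i}-${2 * i + 1}`);
}
const input = parts.join(",");

const start = Date.now();
const m = unanchored.exec(input);
const elapsedMs = Date.now() - start;

console.log(m[0] === input);  // true: the whole list was consumed
console.log(m[2]);            // "9998-9999", the last range
console.log(`elapsed: ${elapsedMs} ms`);
```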
To tell the truth, my regex will successfully match the string even if the last range is followed by a comma (but it will not capture the comma), and yours will too if you remove the end-of-line anchor. My assumption was that having a comma at the end of the matches is not wrong, but that what was required was that it should match the last interval even if there is no comma at the end. Here again, the requirement is vague.
If this is to be avoided, then I would have to change my regex to prevent the match if there is a trailing comma. I would then add that the last matched interval must be followed by something other than a comma or by the end of the string:
Code:
(\d+-\d+,)*(\d+-\d+)[^,]|$
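One caveat when trying this in JavaScript: | has the lowest precedence, so the expression as written means "the whole range pattern followed by a non-comma" OR "end of string", and the bare $ branch succeeds on any input. The sketch below shows that, plus a `grouped` variant that is only my guess at the intent. Even grouped, an unanchored pattern can shorten the last number by backtracking and so still accept some trailing commas. `anchored` is the pattern from earlier in the thread, shown for comparison.

```javascript
const asPosted = /(\d+-\d+,)*(\d+-\d+)[^,]|$/;
// Assumption: the alternation was meant to apply only to "[^,] or end of string".
const grouped = /(\d+-\d+,)*(\d+-\d+)(?:[^,]|$)/;
const anchored = /^\d+-\d+(?:,\d+-\d+)*$/;

console.log(asPosted.test("no ranges here")); // true: the bare $ branch matches
console.log(asPosted.test("1-3,"));           // true, for the same reason
console.log(grouped.test("1-3,5-7"));         // true
console.log(grouped.test("1-3,5-7,"));        // false
console.log(grouped.test("1-3,15-27,"));      // true: "15-2" followed by "7"
console.log(anchored.test("1-3,15-27,"));     // false
```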

November 9th, 2012, 03:04 PM
Registered User
It seems there are several solutions for my task. Thank you for all your posts and ideas!