Regex Programming
 
#1
September 17th, 2012, 10:42 PM
jemagee
Newbie Please Help

I was progressing (I thought) pretty well at understanding regex using Ruby, but something I thought should work is not working.

<title>Philadelphia 76ers vs. New York Knicks

Code:
/<title>(\w+\s){1,3}\svs/


In my mind should return:
<title>Philadelphia 76ers vs

Obviously it doesn't or I wouldn't be here. I'm guessing that when you use the ranges on the metacharacters there is a trick to grouping them together?

I was running through the tutorial at regular-expressions.info and thought I knew what I was doing, but after 30 minutes I don't wanna keep throwing things at it that will frustrate me. I know I'm missing either a trick or something dead easy. Regular expressions have always been a bugaboo for me; they just don't click. I thought they were clicking, and it's vital for my project that I get it, but I just don't see this.

Any help - even a push in the right direction would be vastly appreciated.

Thank you

#2
September 18th, 2012, 04:49 PM
Kravvitz
I suspect that it's not working because your code requires two spaces (or other white-space characters) before the "vs".

Try this, which will require one or more white-space characters (technically, one plus zero or more):
Code:
/<title>(\w+\s){1,3}\s*vs/
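
A quick way to check the two-whitespace explanation is to run both patterns against the sample string from the first post. This is only a sketch; the variable names are made up for illustration.

```ruby
title = "<title>Philadelphia 76ers vs. New York Knicks"

# Original pattern: the group already ends in \s, so the extra \s
# demands a second whitespace character before "vs".
original = /<title>(\w+\s){1,3}\svs/
# Suggested pattern: \s* makes the extra whitespace optional.
relaxed = /<title>(\w+\s){1,3}\s*vs/

original_match = title.match(original)
relaxed_match  = title.match(relaxed)

puts original_match.inspect  # => nil
puts relaxed_match[0]        # => <title>Philadelphia 76ers vs
```

Note that relaxed_match[0] is the whole match; what ends up in the capture group is a separate question.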

#3
September 18th, 2012, 07:38 PM
jemagee
Quote:
Originally Posted by Kravvitz
I suspect that it's not working because your code requires two spaces (or other white-space characters) before the "vs".

Try this, which will require one or more white-space characters (technically, one plus zero or more):
Code:
/<title>(\w+\s){1,3}\s*vs/


Sadly, that didn't work. Let's say we drop the vs issue:


Code:
/<title>(\w+\s){1,3}/


That still won't yield the right result - it yields only 76ers.

One solution I came up with today at work is

Code:
/\w+\svs/

which does yield 76ers vs, which I can work with.

However, this was just the first half of my problem.

Here is the full line of what I'm working with

Code:
<title>Philadelphia 76ers vs. New York Knicks - Box Score - January 11, 2012 - ESPN</title>


What I'd LIKE to isolate is Philadelphia 76ers AND New York Knicks

If you know the NBA you know where I'm going with this: the city and team names on either side of the vs will vary from file to file, but I need to extract that information to populate a database.
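
One possible way to pull both names out of the full line is a pair of lazy groups anchored on " vs. " and the first " - ". The pattern below is a hypothetical sketch, not something posted in this thread, and it assumes every title follows the same "A vs. B - Box Score - ..." layout.

```ruby
line = "<title>Philadelphia 76ers vs. New York Knicks - Box Score - January 11, 2012 - ESPN</title>"

# Hypothetical pattern (an assumption, not from the thread):
#   first  (.+?) lazily captures everything up to " vs. "
#   second (.+?) lazily captures everything up to the first " - "
teams_pattern = /<title>(.+?)\s+vs\.\s+(.+?)\s+-\s/

m = line.match(teams_pattern)
first_team, second_team = m.captures if m

puts first_team   # => Philadelphia 76ers
puts second_team  # => New York Knicks
```

Because the groups use .+? rather than \w+, the number of words in each name doesn't matter.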

#4
September 18th, 2012, 08:39 PM
jemagee
Solved?

So after some trial and error and a bit more study I believe I have a solution (that I've tested by adding additional words between <title> and vs).

For anyone interested in such a scan - what worked in Ruby was this

Code:
/<title>([\w+\s+]*)vs/


The combination of parentheses and [] works where the parentheses alone didn't - and ([]) works differently than [()]

I confirmed that this works by adding additional words and the regex works each time.
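
One detail worth knowing about this pattern: inside [...] a "+" is a literal plus sign, not a quantifier, so [\w+\s+] means "one word character, whitespace character, or plus sign", and the * after the class does the repeating. A small sketch (variable names and the "A+B" string are made up for illustration):

```ruby
title = "<title>Philadelphia 76ers vs. New York Knicks"

with_plus    = /<title>([\w+\s+]*)vs/  # pattern from the post above
without_plus = /<title>([\w\s]*)vs/    # same idea, no literal "+" in the class

# String#[] with a regexp and a group number returns that capture.
captured_a = title[with_plus, 1]
captured_b = title[without_plus, 1]
puts captured_a.inspect  # => "Philadelphia 76ers "
puts captured_b.inspect  # => "Philadelphia 76ers "

# The difference only shows up when the text contains a "+".
plus_title = "<title>A+B vs. C"
puts plus_title[with_plus, 1].inspect     # => "A+B "
puts plus_title[without_plus, 1].inspect  # => nil
```

Both captures keep the trailing space, so a .strip is probably wanted before using the value.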

#5
September 18th, 2012, 09:11 PM
Kravvitz
Oh, right. With the "{1,3}" after the capturing group, the group only keeps what it matched on its last repetition.

Congrats on finding a solution yourself. In case you're interested in an alternative solution...

The "?:" at the beginning of the inner pair of parentheses makes it just a plain (non-capturing) group instead of a capturing group.
Code:
/<title>((?:\w+\s){1,3})\s*vs/
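
To see the difference side by side, here is a short sketch that runs the repeated capturing group and the wrapped non-capturing version against the sample string from the first post (variable names are illustrative only):

```ruby
title = "<title>Philadelphia 76ers vs. New York Knicks"

# Repeated capturing group: only the last repetition is kept.
repeated = title.match(/<title>(\w+\s){1,3}\s*vs/)
# Non-capturing inner group inside a capturing outer group:
# the outer group spans every repetition.
wrapped = title.match(/<title>((?:\w+\s){1,3})\s*vs/)

puts repeated[1].inspect  # => "76ers "
puts wrapped[1].inspect   # => "Philadelphia 76ers "
puts wrapped[1].strip     # => Philadelphia 76ers
```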

#6
September 19th, 2012, 10:05 AM
jemagee
I haven't fully completed the regular-expressions.info tutorial yet, but my understanding of the ?: was that it had to do with back references?

In the end, after some thought, the information I'm going to need is what comes right before the vs and right before the - (I need only the team name to identify the team abbreviation from my other table for insertion into another table; it's all part of a parsing system I'm building to download NBA box scores and shot charts). I'll have to figure those out, but I was glad I solved the issue, because regex has always been a problem for me. I'm glad I toughed it out. I'll look at that ?: more deeply so I can learn the difference between a plain group and a capturing group.

I think what you're saying is that if it's a capturing group it's looking for the same thing over and over that it captured the first time (like Philadelphia, repeatedly?), but I thought that only applied to back references?
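
For what it's worth, the distinction asked about here can be checked directly. As far as I understand it, a quantifier on a capturing group re-applies the pattern on each repetition (so different text can match each time, and the group keeps only the last), while a back reference like \1 is what demands the same text again. A (?:...) group just opts out of numbering, so it can't be the target of a back reference. A minimal sketch, with made-up test strings:

```ruby
# Quantified capturing group: each repetition may match different text;
# the group ends up holding only the last repetition.
m = "Philadelphia 76ers vs".match(/(\w+\s){2}vs/)
last_repetition = m[1]
puts last_repetition.inspect  # => "76ers "

# Back reference: \1 must match the same text that group 1 captured.
same_word_twice = /(\w+)\s\1/
repeated_word  = "New New York".match?(same_word_twice)
different_word = "New York".match?(same_word_twice)
puts repeated_word   # => true
puts different_word  # => false
```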
