Python Programming
 
  #1  
September 17th, 2012, 06:00 AM
SachinS
Regex help

Hi,
I have a dump from the web which I need to read in order to extract the key/value pairs it contains, but I have not been able to find the fastest way to do this.

Currently I split on the parameter name and then loop over the pieces to concatenate the values. Could somebody help with a regular expression, or a more speed/CPU-efficient way of getting the values? For example, at the end of parsing, the parameter Build should have the concatenated value shown below.

Build = "R_Fzzz_v1, R_Fxxx_v1, R_Fyyy_v1"

The input data looks like the following:

<input name="Build" type="hidden" value="R_Fzzz_v1">
<input name="Build" type="hidden" value="R_Fxxx_v1">
<input name="Build" type="hidden" value="R_Fyyy_v1">
<input name="SDChangeNote" type="hidden" value="">
<input name="$SDTestResponsiblePersons" type="hidden" value="">
<input name="$SDTLStates" type="hidden" value="Passed">
<input name="$SDTLBuilds" type="hidden" value="">
<input name="$SDTLCases" type="hidden" value="">
<input name="Versions" type="hidden" value="SS1 SS.1.5">
<input name="Versions" type="hidden" value="SS2 SS4.26">
<input name="Versions" type="hidden" value="SS1 SS_4.28">
<input name="Versions" type="hidden" value="SS1 SS4.28">
<input name="Group" type="hidden" value="team1">
<input name="Group" type="hidden" value="team2">
<input name="SDType" type="hidden" value="Release">

  #2  
September 17th, 2012, 11:05 AM
b49P23TIvg
If this program is not fast enough, I can provide significantly faster code using flex.
Lambert Electronics, USA. NY.
b49p23tivg at stny.rr.com

Code:
data = '''
    <input name="Build" type="hidden" value="R_Fzzz_v1">
    <input name="Build" type="hidden" value="R_Fxxx_v1">
    <input name="Build" type="hidden" value="R_Fyyy_v1">
    <input name="SDChangeNote" type="hidden" value="">
    <input name="$SDTestResponsiblePersons" type="hidden" value="">
    <input name="$SDTLStates" type="hidden" value="Passed">
    <input name="$SDTLBuilds" type="hidden" value="">
    <input name="$SDTLCases" type="hidden" value="">
    <input name="Versions" type="hidden" value="SS1 SS.1.5">
    <input name="Versions" type="hidden" value="SS2 SS4.26">
    <input name="Versions" type="hidden" value="SS1 SS_4.28">
    <input name="Versions" type="hidden" value="SS1 SS4.28">
    <input name="Group" type="hidden" value="team1">
    <input name="Group" type="hidden" value="team2">
    <input name="SDType" type="hidden" value="Release">
'''

import collections, re, pprint

# Map each input name to the list of values seen for it, in order of appearance.
result = collections.defaultdict(list)

# Matches every double-quoted string on a line. Not fully general: it ignores
# single-quoted attributes and assumes that values never contain a '"' character.
findall = re.compile(r'"[^"]*"').findall

for line in data.split('\n'):
    line = line.strip()
    if line.startswith('<input name=') and (' value="' in line):
        strings = findall(line)
        key = strings[0][1:-1]      # first quoted string: the name attribute
        value = strings[-1][1:-1]   # last quoted string: assumed to be the value attribute
        result[key].append(value)

print('**** displaying the dictionary determined from your data****')
pprint.pprint(result)  # This dictionary of lists is probably what you actually want as output.

print('\n'*3+'**** displaying the environment you request****')
your_environment = {key: ', '.join(values) for (key, values) in result.items()}
pprint.pprint(your_environment)

print('\n'*3+'****use your parameter? I assumed you mean "variable" ****')
# Run a statement with your_environment as its globals. Note that exec also adds
# '__builtins__' to the dict, and keys such as '$SDTLStates' are not valid variable names.
exec('print("the value of variable Versions is "+Versions)', your_environment)



Output for you lazy heads who won't bother to run it:
Code:
$ python p.py
**** displaying the dictionary determined from your data****
defaultdict(<type 'list'>, {'$SDTLBuilds': [''], 'SDType': ['Release'], 'Group': ['team1', 'team2'], '$SDTestResponsiblePersons': [''], 'Versions': ['SS1 SS.1.5', 'SS2 SS4.26', 'SS1 SS_4.28', 'SS1 SS4.28'], 'SDChangeNote': [''], 'Build': ['R_Fzzz_v1', 'R_Fxxx_v1', 'R_Fyyy_v1'], '$SDTLCases': [''], '$SDTLStates': ['Passed']})



**** displaying the environment you request****
{'$SDTLBuilds': '',
 '$SDTLCases': '',
 '$SDTLStates': 'Passed',
 '$SDTestResponsiblePersons': '',
 'Build': 'R_Fzzz_v1, R_Fxxx_v1, R_Fyyy_v1',
 'Group': 'team1, team2',
 'SDChangeNote': '',
 'SDType': 'Release',
 'Versions': 'SS1 SS.1.5, SS2 SS4.26, SS1 SS_4.28, SS1 SS4.28'}



****use your parameter? I assumed you mean "variable" ****
the value of variable Versions is SS1 SS.1.5, SS2 SS4.26, SS1 SS_4.28, SS1 SS4.28

Last edited by b49P23TIvg : September 17th, 2012 at 11:07 AM. Reason: Added the point of the message
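
Since the thread asks for a regular expression, here is one possible variation on the code above. It is only a sketch and not part of the original answer. It scans the whole dump in one pass with a single pattern that captures the name and value attributes directly, instead of taking the first and last quoted strings on each line. The names INPUT_RE and collect_inputs are made up for illustration, and the pattern assumes that name comes before value and that both are double-quoted, as in the sample data. Whether it is actually faster would need to be measured on the real dump.

```python
import re
from collections import defaultdict

# Hypothetical pattern: assumes name="..." precedes value="..." inside each
# <input> tag and that both attributes use double quotes.
INPUT_RE = re.compile(r'<input\s+name="([^"]*)"[^>]*?\svalue="([^"]*)"')


def collect_inputs(text):
    """Return {name: 'v1, v2, ...'} for every <input> tag found in text."""
    grouped = defaultdict(list)
    for match in INPUT_RE.finditer(text):
        name, value = match.groups()
        grouped[name].append(value)
    # Join the values of each name in order of appearance.
    return {name: ', '.join(values) for name, values in grouped.items()}


sample = '''
<input name="Build" type="hidden" value="R_Fzzz_v1">
<input name="Build" type="hidden" value="R_Fxxx_v1">
<input name="Build" type="hidden" value="R_Fyyy_v1">
<input name="SDChangeNote" type="hidden" value="">
<input name="Group" type="hidden" value="team1">
<input name="Group" type="hidden" value="team2">
'''

joined = collect_inputs(sample)
print(joined['Build'])
```

Because the pattern is applied to the whole text with finditer, there is no need to split the dump into lines first, and tags that share a line are still found.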

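If robustness matters more than raw speed, for example when the attribute order or quoting in the dump may vary, the same dictionary could probably be built with the standard library's HTML parser instead of a regex. This is a minimal sketch assuming Python 3 (where the module is html.parser); the class name InputCollector is invented for illustration and is not from the thread.

```python
from collections import defaultdict
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collects the value attribute of every <input> tag, grouped by name."""

    def __init__(self):
        super().__init__()
        self.values = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attributes = dict(attrs)
        if 'name' in attributes and 'value' in attributes:
            # A bare attribute with no value is reported as None; store '' instead.
            self.values[attributes['name']].append(attributes['value'] or '')


sample_html = '''
<input name="Versions" type="hidden" value="SS1 SS.1.5">
<input name="Versions" type="hidden" value="SS2 SS4.26">
<input name="$SDTLStates" type="hidden" value="Passed">
<input name="$SDTLBuilds" type="hidden" value="">
'''

collector = InputCollector()
collector.feed(sample_html)
collector.close()

# Same ", "-joined form that the original question asks for.
parsed = {name: ', '.join(values) for name, values in collector.values.items()}
print(parsed)
```

The parser does not care about attribute order or quote style, at the cost of more work per tag than a precompiled regex.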