#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Location
    NYC
    Posts
    3
    Rep Power
    0

    Splitting a query string on a & char to get different arguments; Python:


    I'm looking to split the current query string on the '&' character so I can get the individual query arguments. I then want to write those arguments to different files, e.g. p_file.txt, blog_file.txt, portfolio_file.txt, etc. I have been stuck trying to split a list of queries and have not been able to make it work. I am open to help.

    from urllib.parse import urlparse

    def parse_file():
        # Open the file for reading; the with block closes it when done
        with open("URLlist.txt", 'r') as infile:
            # For every line in the file, do something with that line
            for line in infile:
                # Each line we read ends with a newline character,
                # so strip it before parsing the URL into its components
                url = urlparse(line.strip())
                # If the url has a query component (i.e. url.query)...
                if url.query:
                    # ...then print it out. The line was already stripped,
                    # so the query carries no trailing newline.
                    queryvars = url.query
                    print(queryvars)
                    #for q in queryvars:
                    #    print(q)

    parse_file()

    I'm pretty sure I need to use urlparse.parse_qs (urllib.parse.parse_qs in Python 3). However, I do not know how to integrate it into the code, nor do I know how to get it to write the separate arguments into different files, e.g. p=47 would go to p_file.txt, and iframe=57 would go into iframe_file.txt.
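For reference, a minimal sketch of what parse_qs returns for a query string like the one described in the post. The helper file_name_for is hypothetical; it only encodes the naming rule from the examples (p goes to p_file.txt), and the import assumes Python 3.

```python
from urllib.parse import parse_qs

def file_name_for(key):
    # Hypothetical naming rule taken from the examples in the post:
    # "p" -> "p_file.txt", "iframe" -> "iframe_file.txt"
    return key + "_file.txt"

# parse_qs splits the query on '&' and maps each key to a list of values
args = parse_qs("p=47&iframe=57")

for key, values in args.items():
    print(file_name_for(key), values)
```

Each key maps to a list because the same argument may appear more than once in a query string.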
  2. #2
  3. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,995
    Rep Power
    481
    Please show some representative sample lines from your file
    URLlist.txt
    and also show what you'd like done with them. Are there special cases, such as a quoted ampersand that does not mark a split, or &0123 denoting some strange unicode character that also doesn't mark a place to split lines?
    [code]Code tags[/code] are essential for python code and Makefiles!
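The special cases asked about matter because a plain split on '&' treats every ampersand as a separator. As a small sketch (the query strings here are made up for illustration), the standard library's parse_qsl in Python 3 decodes a percent-encoded ampersand as part of the value instead of splitting on it:

```python
from urllib.parse import parse_qsl

# '%26' is a percent-encoded ampersand: it belongs to the value,
# so it does not mark a place to split
pairs = parse_qsl("q=a%26b&p=1")

# Arguments with an empty value are dropped by default...
without_blanks = parse_qsl("a=&b=1")
# ...unless keep_blank_values=True is passed
with_blanks = parse_qsl("a=&b=1", keep_blank_values=True)

print(pairs)
print(without_blanks)
print(with_blanks)
```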
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Location
    NYC
    Posts
    3
    Rep Power
    0
    Originally Posted by b49P23TIvg
    Please show some representative sample lines from your file
    URLlist.txt
    and also show what you'd like done with them. Are there special cases, such as a quoted ampersand that does not mark a split, or &0123 denoting some strange unicode character that also doesn't mark a place to split lines?
    Sample: my file is a list of URLs, which I am not permitted to post right now.

    As for special cases, I have none for now. I am just looking to get this code up and running ASAP.
  6. #4
  7. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,995
    Rep Power
    481
    Replace the secret but irrelevant parts with gibberish.
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Location
    NYC
    Posts
    3
    Rep Power
    0
    asdasdasdasdasdasd./tony/?p=8327
    asdasdas casdasdsa./tony/?p=8346
    gfgfdgdfgfdgdfgdfgfdgdf/tony/?p=921
    fdsgssdfgfgdg/foreclosure-myths-debunked
    sdafsaf dfdfdfd/?attachment_id=350
    asfdsfsafsfsaf/2011/08/20/wide-ranges-of-jobs-at-sacramento-are-available

    Luckily, the query is everything after the ?, and that is the part I need to split.
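A quick sketch of how urlparse treats lines like the samples above, assuming Python 3. Even without a scheme such as http://, the part after the ? ends up in the query attribute, and lines without a ? give an empty string:

```python
from urllib.parse import urlparse

# Three of the anonymised sample lines from the post
sample_lines = [
    "asdasdasdasdasdasd./tony/?p=8327",
    "fdsgssdfgfgdg/foreclosure-myths-debunked",
    "sdafsaf dfdfdfd/?attachment_id=350",
]

# .query holds everything after the '?', or '' when there is none
queries = [urlparse(line).query for line in sample_lines]
print(queries)
```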
  10. #6
  11. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,995
    Rep Power
    481
    Maybe you want something like this?
    Code:
    # setup
    import io
    
    inp = io.StringIO('''asdasdasdasdasdasd./tony/?p=8327
    asdasdas casdasdsa./tony/?p=8346
    gfgfdgdfgfdgdfgdfgfdgdf/tony/?p=921
    fdsgssdfgfgdg/foreclosure-myths-debunked
    sdafsaf dfdfdfd/?attachment_id=350
    asfdsfsafsfsaf/2011/08/20/wide-ranges-of-jobs-at-sacramento-are-available
    ''')
    
    
    
    # Initially key_and_file associates keys with file names.
    # On first use each value is replaced by an open file object. (I should have used a different variable.)
    # The following program logic depends on the file name being of type str.
    
    key_and_file = {                           # reads like a jail break
            'p': 'name_for_p_file',
            'attachment_id': 'another_appropriate_file_name',
        }
    
    for line in inp:
        fields = line.strip().split('?')           # strip the trailing newline first
        if len(fields) != 2:
            continue
        for argument in fields[1].split('&'):      # one key=value pair per argument
            subfields = argument.split('=')
            if len(subfields) != 2:
                continue
            (key, value,) = subfields
            if key not in key_and_file:
                continue
            if isinstance(key_and_file[key], str):     # need to open the file
                key_and_file[key] = open(key_and_file[key], 'w') # I think you want to write to these files
            key_and_file[key].write(value + '\n')                # one value per line
    
    for (key, value,) in key_and_file.items():    # close any open files.
        if not isinstance(value, str):
            value.close()
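As an alternative to splitting by hand, here is a hedged sketch built on the parse_qs family mentioned in the first post, assuming Python 3. The function name route_queries, the out_dir parameter, and the <key>_file.txt naming are assumptions for illustration, not something the thread settles on. It also assumes keys are safe to use as file names, which may not hold for arbitrary URLs.

```python
import os
from collections import defaultdict
from urllib.parse import urlparse, parse_qsl

def route_queries(lines, out_dir="."):
    """Append each query value to <key>_file.txt inside out_dir.

    Returns a dict mapping each key to the path that was written.
    """
    values_by_key = defaultdict(list)
    for line in lines:
        query = urlparse(line.strip()).query
        # parse_qsl splits on '&' and yields (key, value) pairs;
        # lines without a query yield nothing
        for key, value in parse_qsl(query):
            values_by_key[key].append(value)

    written = {}
    for key, values in values_by_key.items():
        # Assumed naming rule: "p" -> "p_file.txt"
        path = os.path.join(out_dir, key + "_file.txt")
        with open(path, "a") as out:
            for value in values:
                out.write(value + "\n")
        written[key] = path
    return written
```

Because an open file iterates over its lines, calling route_queries on the opened URLlist.txt should work for the original file. Opening each output file once in append mode avoids tracking open handles in the dictionary.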