#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2018
    Posts
    3
    Rep Power
    0

    Assistance Parsing Info


    Hello everyone, I need some assistance parsing a variable-length field for some info. It is currently a single field in a spreadsheet, and I need to grab an identifier from it to store in a new field so users can sort.

    An example of the field in question:

    ObjectId:"https:\/\/someone.com\/sites\/TYAB\/Collaboration\/CooperDivision\/FGF"
    ObjectId:"https:\/\/someone.com\/teams\/OAH\/Solomon"
    ObjectId:"https:\/\/someone.com\/sites\/MDH\/cfh\/CYresources"
    ObjectId:"https:\/\/someone.com\/teams\/DPSC\/DVL\/SupportServices"
    ObjectId:"https:\/\/someone.com\/sites\/MDHR"
    ObjectId:"https:\/\/someone.com\/teams\/TYAB"

    The unique identifier comes after the first "/" following the word "sites" or the word "teams".

    Just FYI, the "\/" you see is a "\" and then a "/" which makes it look like a V.

    So, I'm looking to grab TYAB, OAH, MDH, DPSC, MDHR and TYAB from these particular lines. I have thousands of entries to parse through like this.

    Any help would be appreciated on just how I should approach this.

    Thank you.
  2. #2
  3. Impoverished Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,796
    Rep Power
    9646
    So look for "sites" or "teams", the escaped slash, then the unique identifier.

    What regex have you tried and how did it not work?
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2018
    Posts
    3
    Rep Power
    0
    Well, I know that if I start with the following I can begin to pinpoint the start, but I'm not sure how to address "sites" and "teams" after that. I know I ultimately want everything between the 4th "\/" and the 5th "\/", or the end of the line, as not all entries will have a fifth.

    com\\\/

    I'm guessing the easier way is to target the 4th and 5th "\/", but I'm having trouble getting my head around it. I'm new to this.
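    The positional idea described above (take whatever sits between the 4th and 5th "\/") can be sketched without a regex by splitting on the two-character separator. This is only an illustration in Python, not something from the thread: the helper name `identifier_by_position` is made up, and it assumes every value has the same `https:\/\/host\/sites-or-teams\/...` layout as the samples in the first post.

```python
# Sample values copied from the first post; each contains literal backslash + slash pairs.
SAMPLES = [
    r'ObjectId:"https:\/\/someone.com\/sites\/TYAB\/Collaboration\/CooperDivision\/FGF"',
    r'ObjectId:"https:\/\/someone.com\/teams\/OAH\/Solomon"',
    r'ObjectId:"https:\/\/someone.com\/sites\/MDHR"',
]

SEPARATOR = "\\/"  # one backslash followed by one forward slash


def identifier_by_position(value):
    """Return the piece between the 4th and 5th separators (hypothetical helper)."""
    # Drop the closing quote so it does not stick to a trailing identifier.
    parts = value.rstrip('"').split(SEPARATOR)
    # parts looks like: ['ObjectId:"https:', '', 'someone.com', 'sites', 'TYAB', ...]
    return parts[4] if len(parts) > 4 else None


by_position = [identifier_by_position(v) for v in SAMPLES]
print(by_position)
```

    Counting separators is brittle if the host part ever gains an extra path segment, which is one reason to anchor on "sites" or "teams" instead.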
  6. #4
  7. Impoverished Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,796
    Rep Power
    9646
    You don't even have to care about the "com" or really anything else on that line, because each line has only one "sites\/" or "teams\/". That's all you need.

    Code:
    (sites|teams)\/([A-Z]+)
    Needs one or two more backslashes to deal with the one already in there.
    Last edited by requinix; March 5th, 2018 at 11:22 AM.
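    As a rough illustration of the backslash point, here is how the same pattern might look in Python (an assumption; the thread never says which editor or engine is in use). In Python's `re`, `\\` matches the one literal backslash in the data and "/" needs no escaping, so a single extra backslash is enough. The function name `find_identifier` is hypothetical.

```python
import re

# "\\" in the regex matches one literal backslash; "/" is not special in Python.
PATTERN = re.compile(r'(sites|teams)\\/([A-Z]+)')


def find_identifier(value):
    """Return the uppercase identifier after sites\\/ or teams\\/, or None (hypothetical helper)."""
    match = PATTERN.search(value)
    # Group 1 is "sites" or "teams"; group 2 is the identifier.
    return match.group(2) if match else None


samples = [
    r'ObjectId:"https:\/\/someone.com\/sites\/TYAB\/Collaboration\/CooperDivision\/FGF"',
    r'ObjectId:"https:\/\/someone.com\/teams\/DPSC\/DVL\/SupportServices"',
    r'ObjectId:"https:\/\/someone.com\/teams\/TYAB"',
]
found = [find_identifier(s) for s in samples]
print(found)
```

    Engines that treat "/" as a delimiter need it escaped as well, which is where the second extra backslash comes from.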
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2018
    Posts
    3
    Rep Power
    0
    I modified it slightly to the following:

    (sites|teams)(\\\/)([A-Z]+)

    I had to do this as it wasn't finding the "\/" correctly.

    So, now it will highlight either sites\/(Identifier) or teams\/(Identifier) but I only want the Identifier

    Example of what is being highlighted in my current search:

    sites\/MDHR
    teams\/MDHR

    I'm trying to target just the "MDHR" piece which is the identifier.

    After I get this then I will have to figure out how I transition the identifier over to a new field or file, but baby steps for me right now.
  10. #6
  11. Impoverished Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,796
    Rep Power
    9646
    The identifier is the third capture group, so you can refer to it as $3 or \3 (some tools accept both), depending on your editor/regex engine. It's okay if the search highlights the rest of the match.
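    For the later step of moving the identifier into a new field, one possible approach is a short script rather than an editor replace. The sketch below uses Python with the pattern from post #5 unchanged and reads group 3. It assumes the spreadsheet can be exported as rows with a column named "ObjectId"; the column names and the helper `add_identifier_column` are made up for the example.

```python
import csv
import io
import re

# The pattern from post #5, unchanged: group 3 holds the identifier.
PATTERN = re.compile(r'(sites|teams)(\\\/)([A-Z]+)')


def add_identifier_column(rows, source_field="ObjectId", new_field="Identifier"):
    """Yield a copy of each row with the extracted identifier added (field names are assumptions)."""
    for row in rows:
        match = PATTERN.search(row[source_field])
        yield {**row, new_field: match.group(3) if match else ""}


# Stand-in rows; real data would come from the exported spreadsheet.
rows = [
    {"ObjectId": r'ObjectId:"https:\/\/someone.com\/sites\/MDH\/cfh\/CYresources"'},
    {"ObjectId": r'ObjectId:"https:\/\/someone.com\/teams\/DPSC\/DVL\/SupportServices"'},
]
tagged = list(add_identifier_column(rows))

# Written to memory here; replace with open("tagged.csv", "w", newline="") for a real file.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["ObjectId", "Identifier"])
writer.writeheader()
writer.writerows(tagged)
print([r["Identifier"] for r in tagged])
```

    Rows that do not match get an empty identifier instead of raising an error, so they are easy to spot when sorting.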
