March 5th, 2018, 08:40 AM
-
Assistance Parsing Info
Hello everyone, I need some assistance parsing a variable length field for some info. It is currently a single field in a spreadsheet and I need to grab an identifier to store in a new field so users can sort.
An example of the field in question:
ObjectId:"https:\/\/someone.com\/sites\/TYAB\/Collaboration\/CooperDivision\/FGF"
ObjectId:"https:\/\/someone.com\/teams\/OAH\/Solomon"
ObjectId:"https:\/\/someone.com\/sites\/MDH\/cfh\/CYresources"
ObjectId:"https:\/\/someone.com\/teams\/DPSC\/DVL\/SupportServices"
ObjectId:"https:\/\/someone.com\/sites\/MDHR"
ObjectId:"https:\/\/someone.com\/teams\/TYAB"
The unique identifier comes right after the first "/" following the word "sites" or the word "teams".
Just FYI, the "\/" you see is a "\" and then a "/" which makes it look like a V.
So, I'm looking to grab TYAB, OAH, MDH, DPSC, MDHR and TYAB from these particular lines. I have thousands of entries to parse through like this.
Any help would be appreciated on just how I should approach this.
Thank you.
March 5th, 2018, 08:56 AM
-
So look for "sites" or "teams", the escaped slash, then the unique identifier.
What regex have you tried and how did it not work?
March 5th, 2018, 09:16 AM
-
Well, I know that if I start with the following I can begin to pinpoint the start, but I'm not sure how to handle "sites" and "teams" after that. Ultimately I want everything between the 4th "/" and the 5th "\" (or the end of the line, since not every entry has a fifth).
com\\\/
I'm guessing the easier way is to target the 4th "/" and the 5th "\", but I'm having trouble getting my head around it. I'm new to this.
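For what it's worth, the "count the slashes" idea can be done without any regex at all. Here is a rough Python sketch, assuming every value has the same shape as the samples in the first post (the function name identifier_by_position is made up for illustration). It splits on the two-character sequence backslash + slash and takes the piece after "sites" or "teams":

```python
# Sample values copied from the first post; the backslash before each slash is literal.
rows = [
    r'ObjectId:"https:\/\/someone.com\/sites\/TYAB\/Collaboration\/CooperDivision\/FGF"',
    r'ObjectId:"https:\/\/someone.com\/teams\/OAH\/Solomon"',
    r'ObjectId:"https:\/\/someone.com\/sites\/MDHR"',
]

def identifier_by_position(field):
    # Drop the trailing quote, then split on backslash + slash.
    parts = field.rstrip('"').split("\\/")
    # parts looks like: ['ObjectId:"https:', '', 'someone.com', 'sites', 'TYAB', ...]
    # Index 4 is the piece between the 4th and 5th escaped slash.
    return parts[4] if len(parts) > 4 else None

split_ids = [identifier_by_position(r) for r in rows]
print(split_ids)  # ['TYAB', 'OAH', 'MDHR']
```

This only holds while the identifier is always in the same position; anchoring on "sites" or "teams" is more forgiving if the URL shape ever varies.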
March 5th, 2018, 09:48 AM
-
You don't even have to care about the "com" or really anything else on that line because there's only one "sites\/" or one "teams\/" in each entry. That's all you need.
Code:
(sites|teams)\/([A-Z]+)
Needs one or three more backslashes to deal with the one already in there.
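To illustrate the backslash count, here is a small sketch in Python's re module (other engines and editors may treat escapes differently, so take it as an assumption about your tool). With no extra backslash the pattern only looks for a plain "/", so it misses the "\" that is actually in the data; one extra or three extra both work:

```python
import re

# One sample line from the first post; the r prefix keeps the backslashes literal.
line = r'ObjectId:"https:\/\/someone.com\/teams\/OAH\/Solomon"'

# \/ is just an escaped "/", so the literal backslash in the data is not matched.
no_extra = re.search(r'(sites|teams)\/([A-Z]+)', line)

# \\ matches the literal backslash, then / matches the slash.
one_extra = re.search(r'(sites|teams)\\/([A-Z]+)', line)

# \\ matches the literal backslash, then \/ matches the (escaped) slash.
three_extra = re.search(r'(sites|teams)\\\/([A-Z]+)', line)

print(no_extra)              # None
print(one_extra.group(2))    # OAH
print(three_extra.group(2))  # OAH
```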
Last edited by requinix; March 5th, 2018 at 11:22 AM.
March 5th, 2018, 10:20 AM
-
I modified it slightly to the following:
(sites|teams)(\\\/)([A-Z]+)
I had to do this as it wasn't finding the "\/" correctly.
So now it highlights either sites\/(Identifier) or teams\/(Identifier), but I only want the Identifier.
Example of what is being highlighted in my current search:
sites\/MDHR
teams\/MDHR
I'm trying to target just the "MDHR" piece which is the identifier.
After I get this then I will have to figure out how I transition the identifier over to a new field or file, but baby steps for me right now.
March 5th, 2018, 11:23 AM
-
The identifier will be $3 or \3 or both depending on your editor/regex engine. It's okay if it highlights the rest.
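If you end up scripting it rather than using an editor, pulling out the third group and putting it in a new column could look roughly like this in Python. This is only a sketch: the column names "Identifier" and "ObjectId" and the helper extract_identifier are made up, the rows are a few samples from the first post, and [A-Z]+ assumes identifiers are always uppercase letters.

```python
import csv
import io
import re

# Same pattern as in the post above: group 3 is the identifier.
PATTERN = re.compile(r'(sites|teams)(\\\/)([A-Z]+)')

def extract_identifier(field):
    # Return the identifier, or an empty string when the field does not match.
    match = PATTERN.search(field)
    return match.group(3) if match else ""

fields = [
    r'ObjectId:"https:\/\/someone.com\/sites\/MDH\/cfh\/CYresources"',
    r'ObjectId:"https:\/\/someone.com\/teams\/DPSC\/DVL\/SupportServices"',
    r'ObjectId:"https:\/\/someone.com\/teams\/TYAB"',
]

# Pair each original value with its identifier in a new column.
records = [{"Identifier": extract_identifier(f), "ObjectId": f} for f in fields]

# Write to an in-memory buffer here; swap in open("out.csv", "w", newline="") for a real file.
out = io.StringIO(newline="")
writer = csv.DictWriter(out, fieldnames=["Identifier", "ObjectId"])
writer.writeheader()
writer.writerows(records)

print([r["Identifier"] for r in records])  # ['MDH', 'DPSC', 'TYAB']
```

Once the identifier sits in its own column, sorting in the spreadsheet works as usual.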