Perl Programming
 
  #1  
February 28th, 2013, 09:33 AM
gw1500se
Complex (for me) string manipulations

I am trying to write a script that modifies tables in Wiki markup (not that the markup really matters) without changing the rest of the page content. The makeup of my string is as follows:
1) Text preceding a table
2) A table
3) Text following the table
4) If there is another table, go back to 2)
5) Otherwise, done

That is essentially how I want to split the string. The start of a table is identified by a double vertical bar, or double pipe (||). There are many pipes (including double pipes) in the table to denote cells, and the end of a row is denoted by a pipe followed by a new line. The real problem (for me) is finding the end of the table. When a row ends (|\n) and the next thing found is something other than white space or a pipe, the table has ended. But that is only known after the fact.

I could use some help figuring out how to construct the logic for this. I can find the start of a table with index and the double pipe, then split the string at that point for 1). What I don't know how to do is find the last pipe of that table without running into the first pipe of the next table. Once I do that I'm home free, as I can just look for the double pipe again in what is left of the string. TIA
__________________
There are 10 kinds of people in the world. Those that understand binary and those that don't.

Last edited by gw1500se : February 28th, 2013 at 11:18 AM.

  #2  
February 28th, 2013, 10:57 AM
Laurent_R
If the volume of your data is not too big, one easy approach would be to store your input in an array of lines. It is then easier to go back to the previous line when you find a line end.

Another approach is deferred action: always keep two lines (current and next) in memory, and only do what you want to do to line n after you have checked what line n + 1 contains.
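
The deferred-action idea could be sketched roughly as follows. This is only an illustration under assumptions that are not stated in the thread: every table line begins (after optional white space) with a pipe, cells do not contain new lines, and blank lines between two table lines belong to the table. The name `split_segments` is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: split wiki text into [ type, string ] segments,
# where type is 'text' or 'table'.
sub split_segments {
    my ($input) = @_;
    my @segments;
    my $pending = '';    # blank lines seen after a table line, owner still unknown

    for my $line ( split /^/m, $input ) {    # each line keeps its "\n"
        if ( $line =~ /^\s*$/ && @segments && $segments[-1][0] eq 'table' ) {
            $pending .= $line;    # defer until the next non-blank line is seen
            next;
        }
        my $type = $line =~ /^\s*\|/ ? 'table' : 'text';
        if ( @segments && $segments[-1][0] eq $type ) {
            $segments[-1][1] .= $pending . $line;    # same kind: extend it
        }
        else {
            push @segments, [ $type, $pending . $line ];    # kind changed
        }
        $pending = '';
    }
    # Blank lines left over after a final table line are treated as text.
    push @segments, [ 'text', $pending ] if length $pending;
    return @segments;
}

my $page = "intro\n\n|| h ||\n\n| a | b |\n\nafter\n|| h2 ||\n| c |\n";
printf "%-5s %d chars\n", $_->[0], length $_->[1] for split_segments($page);
```

Joining the segment strings back together gives the original input, so table segments can be edited in place and everything else passes through untouched.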

It would be much easier to help you if you provided some sample lines of your input.

  #3  
February 28th, 2013, 11:12 AM
gw1500se
Thanks for the reply. This exists as a single string, but I suppose I could break it up on new line characters. What I can't predict is how well behaved the string will be. Suppose a new line character shows up in the middle of a cell (although unusual, it is not illegal)?

Here is a sample of the Wiki markup:
Code:
h2. Available FBE's\\

|| h4. Products ||

| [ADBU-FBE-1.2_rh72_WAAS|PBE:ADBU-FBE-1.2_rh72_WAAS] | [FBE-1.0-fc10-i386_Gateway-7908|PBE:FBE-1.0-fc10-i386_Gateway-7908] | [SPVTG_FBE-1.0_fc10-i386_SCM-G6-RP|PBE:SPVTG_FBE-1.0_fc10-i386_SCM-G6-RP] |
| [SPVTG_FBE-1.0_fc10-i386_SCM-GW7908|PBE:SPVTG_FBE-1.0_fc10-i386_SCM-GW7908] | [SPVTG_FBE-1.0_fc13-i386_SCM-G8|PBE:SPVTG_FBE-1.0_fc13-i386_SCM-G8] | [SPVTG_FBE-1.0_fc13-i386_SCM-G8-RP|PBE:SPVTG_FBE-1.0_fc13-i386_SCM-G8-RP]\\ |
| [SPVTG_FBE-1.0_fc6-i386_SCM-RTN|PBE:SPVTG_FBE-1.0_fc6-i386_SCM-RTN] | [SPVTG_FBE-1.0_ubuntu10.04-i386_CSWBU-YBE|PBE:SPVTG_FBE-1.0_ubuntu10.04-i386_CSWBU-YBE] | [SPVTG_FBE-1.1_fc10-i386_SCM-G6-RP|PBE:SPVTG_FBE-1.1_fc10-i386_SCM-G6-RP] |
| [SPVTG_FBE-1.0_fc3-i386_SCM-NGP|PBE:SPVTG_FBE-1.0_fc3-i386_SCM-NGP]\\ |

h5.  \\

|| h4. Shrinkwraps ||

| [cel5.03-i386-1.0|PBE:cel5.03-i386-1.0] | [cel5.03-i386-2.0|PBE:cel5.03-i386-2.0] | [cel5.03-i386-2.1|PBE:cel5.03-i386-2.1] | [cel5.03-x86_64-1.0|PBE:cel5.03-x86_64-1.0] |
| [cel5.03-x86_64-2.0|PBE:cel5.03-x86_64-2.0] | [cel5.03-x86_64-2.1|PBE:cel5.03-x86_64-2.1] | [cel5.50-x86_64-1.0|PBE:cel5.50-x86_64-1.0] | [cel5.50-x86_64-1.1|PBE:cel5.50-x86_64-1.1] |
| [cel6.20-x86_64-1.0|PBE:cel5.50-x86_64-1.0] | [cel6.20-x86_64-1.1|PBE:cel5.50-x86_64-1.1] | [f10-i386_1.0|PBE:f10-i386_1.0] | [f13-i386_1.0|PBE:f13-i386_1.0] |
| [f15-i386_1.0|PBE:f15-i386_1.0] | [fc10-1.1_i386-SCM-G6-RP|PBE:fc10-1.1_i386-SCM-G6-RP] | [fc3-1.1_i386-SCM_NGP|PBE:fc3-1.1_i386-SCM_NGP] | [fc3-i386_1.0|PBE:fc3-i386_1.0] |
| [fc6-i386_1.0|PBE:fc6-i386_1.0] | [GIS-suse-8.1-i386_1.0|PBE:GIS-suse-8.1-i386_1.0]\\ | [rh7.3-i386_1.0|PBE:rh7.3-i386_1.0] | [rh72_adbu_i386_1.0|PBE:rh72_adbu_i386_1.0] |
| [rhel3-i386_1.0|PBE:rhel3-i386_1.0] | [rhel4.8-i386_1.0|PBE:rhel4.8-i386_1.0] | [suse-8.1-i386_1.0|PBE:suse-8.1-i386_1.0] | [ubuntu10.04-amd64_1.0-ALPHA|PBE:ubuntu10.04-amd64_1.0-ALPHA] |
| [ubuntu10.04-i386_1.0|PBE:ubuntu10.04-i386_1.0] |

h5.  \\


  #4  
February 28th, 2013, 01:23 PM
gw1500se
I guess I got it. It was a brute force method using index and rindex but it works.

  #5  
March 1st, 2013, 01:43 AM
Laurent_R
Quote:
Originally Posted by gw1500se
Thanks for the reply. This exists as a single string, but I suppose I could break it up on new line characters. What I can't predict is how well behaved the string will be. Suppose a new line character shows up in the middle of a cell (although unusual, it is not illegal)?


You could split on pipe + new line:

Perl Code:
my @lines = split /\|\n/, $input;

  #6  
March 1st, 2013, 06:52 AM
gw1500se
Thanks, but that would not distinguish one table from the next, nor separate out the intervening text.

  #7  
March 1st, 2013, 09:20 AM
Laurent_R
It would not distinguish by itself, but it would enable you to look up what comes at the beginning of the next line to figure out whether you are at the end of a table or not.
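
A rough sketch of that look-ahead might read as follows. The lookbehind form of the split is a variation on the one-liner above, chosen so that each chunk keeps its trailing pipe and new line. It assumes rows end with a pipe immediately followed by a new line (no trailing spaces) and that a table starts on a line opening with `||`. The helper name `segments_from_row_ends` is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [ type, string ] segments ('text' or 'table').
sub segments_from_row_ends {
    my ($input) = @_;
    # Same split points as /\|\n/, but the lookbehind leaves "|\n" on each chunk.
    my @chunks = split /(?<=\|\n)/, $input;
    my @segments;

    for my $chunk (@chunks) {
        # A chunk that opens with a pipe while a table is open is another row.
        if ( @segments && $segments[-1][0] eq 'table' && $chunk =~ /^\s*\|/ ) {
            $segments[-1][1] .= $chunk;
            next;
        }
        # Otherwise it is text, optionally followed by the first row of a table.
        my ( $text, $table ) = $chunk =~ /\A(.*?)(^[ \t]*\|\|.*)?\z/ms;
        if ( length $text ) {
            if ( @segments && $segments[-1][0] eq 'text' ) {
                $segments[-1][1] .= $text;
            }
            else {
                push @segments, [ 'text', $text ];
            }
        }
        push @segments, [ 'table', $table ] if defined $table;
    }
    return @segments;
}

my $page = "intro\n\n|| h ||\n\n| a\nb | c |\n\nafter\n\n|| h2 ||\n| d |\n\nend\n";
print join( ' ', map { $_->[0] } segments_from_row_ends($page) ), "\n";
```

Because the chunks are cut at row ends rather than at every new line, a new line in the middle of a cell (as in the `| a\nb | c |` row of the demo string) stays inside its row.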

  #8  
March 1st, 2013, 10:23 AM
gw1500se
That brings me back to my brute force method: finding non-table text amid any amount of white space, then using rindex to go back to the end of that table. I'm satisfied that what I have is doing what I need. Thanks.
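
The actual script was not posted, so the following is only a guess at what such an index / rindex approach might look like. It assumes a table starts at a `||` and that the first later line which is neither blank nor opening with a pipe marks the non-table text. `table_spans` is a hypothetical name.

```perl
use strict;
use warnings;

# Returns one [ $start, $end ] pair per table: $start is the offset of
# the opening "||", $end the offset of the table's last pipe.
sub table_spans {
    my ($input) = @_;
    my @spans;
    my $pos = 0;
    while ( ( my $start = index( $input, '||', $pos ) ) >= 0 ) {
        # Find the first later line that is neither blank nor opens with a pipe.
        pos($input) = $start;
        my $limit = $input =~ /^[ \t]*[^|\s]/mg ? $-[0] : length $input;
        # The last pipe before that line closes the table.
        my $end = rindex( $input, '|', $limit );
        push @spans, [ $start, $end ];
        $pos = $limit;    # continue the search in what is left of the string
    }
    return @spans;
}

my $page = "intro\n\n|| h ||\n\n| a | b |\n\nafter\n|| h2 ||\n| c |\n";
for my $span ( table_spans($page) ) {
    my ( $start, $end ) = @$span;
    print "table at offsets $start to $end\n";
}
```

With the offsets in hand, `substr` can pull out or replace each table while the surrounding text is left alone.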
