Regex Programming
 
#1
November 10th, 2012, 10:31 AM
benwenger
Regex to replace white space between brackets

I know how to replace everything between brackets, but not how to replace only part of it. I need a regex that replaces every whitespace character between curly brackets with &nbsp;.

Example:
$string="lorum {ipsum dolor sit} et amed {nucas nullum} est";
After the regex:
lorum {ipsum&nbsp;dolor&nbsp;sit} et amed {nucas&nbsp;nullum} est

#2
November 10th, 2012, 01:01 PM
Laurent_R
First, in which language? I ask because this is quite complicated, and some regex constructs are not available in every language.

I tend to think that you should not try to do this with pure regexes. I managed to do it, but only by applying a succession of regexes to:
- extract the first substring between curly brackets (something like: $sub = $1 if $string =~ /(\{.*?\})/),
- modify the substring, replacing the spaces,
- replace the first substring with the modified substring,
- start again in the string from the end of the first substring, and so on until there is no more match.

This is quite complicated, and I think you should probably use other methods: find the curlies (the index function in Perl), extract a substring, modify it with a regex, use a substring function to replace the original substring, and continue from there.
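
For readers outside Perl and PHP, here is a minimal Python sketch of the index-based route just described. It is only an illustration under assumptions: the function name nbsp_in_braces is made up for this example, braces are assumed not to nest, and a plain str.replace stands in for the "modify it with a regex" step.

```python
def nbsp_in_braces(text):
    """Replace spaces inside {...} groups using plain string searches."""
    out = []
    pos = 0
    while True:
        start = text.find("{", pos)                        # next opening curly
        end = text.find("}", start + 1) if start != -1 else -1
        if start == -1 or end == -1:                       # no complete group left
            out.append(text[pos:])
            break
        out.append(text[pos:start])                        # text before the group, untouched
        out.append(text[start:end + 1].replace(" ", "&nbsp;"))
        pos = end + 1                                      # carry on after the closing curly
    return "".join(out)


print(nbsp_in_braces("lorum {ipsum dolor sit} et amed {nucas nullum} est"))
```

Building the result from pieces avoids rewriting the original string in place, so the search position never has to be corrected after a replacement.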

#3
November 11th, 2012, 04:07 AM
benwenger
It's in PHP.

#4
November 11th, 2012, 10:42 AM
Laurent_R
Hi,

This is a start in Perl terms:

Code:
$_ = "lorum {ipsum dolor sit} et amed {nucas nullum} est";
$s = $1 if /(\{.*?\})/; $s2 = $s; $s2 =~ s/ /&nbsp;/g; s/\Q$s\E/$s2/; print;

which prints:
Code:
lorum {ipsum&nbsp;dolor&nbsp;sit} et amed {nucas nullum} est

(I know that the work is only half done, but keep on reading.)

This relies fairly heavily on Perl's default $_ special variable to keep the syntax concise (but less readable for those who don't know Perl). I don't know enough PHP to write it in PHP, but rewritten in simpler Perl, without those special features, it would look more or less like this:

Code:
$string = "lorum {ipsum dolor sit} et amed {nucas nullum} est";
$s = $1 if $string =~ /(\{.*?\})/;   # capture the first {...} group
$s2 = $s;                            # work on a copy
$s2 =~ s/ /&nbsp;/g;                 # replace the spaces in the copy
$string =~ s/\Q$s\E/$s2/;            # \Q...\E: treat the braces literally
print $string;

But this, of course, changes only the first string between curly braces. How to continue from there? Well, in Perl, we can wrap this in a while loop and slightly modify the regular expression so that the next match starts only where the previous match left off. Such a possibility most probably exists in PHP too, but it is almost certainly very different from the way it is done in Perl, which relies on some of its unique features, so there is no point in my giving you the full solution along this route in Perl.
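
To make the idea of restarting each search where the previous match ended a bit more concrete, here is a rough Python sketch. It is not the Perl mechanism, only an analogue: it assumes non-nested braces, uses the optional pos argument of a compiled pattern's search method, and the name replace_resuming is invented for the example.

```python
import re

BRACED = re.compile(r"\{.*?\}")   # same non-greedy pattern as in the Perl code


def replace_resuming(text):
    pos = 0
    while True:
        m = BRACED.search(text, pos)      # look only from where the last match ended
        if m is None:                     # no further group: done
            return text
        fixed = m.group(0).replace(" ", "&nbsp;")
        text = text[:m.start()] + fixed + text[m.end():]
        pos = m.start() + len(fixed)      # skip past the rewritten group


print(replace_resuming("lorum {ipsum dolor sit} et amed {nucas nullum} est"))
```

The position has to be recomputed from the length of the replacement, because the rewritten group is longer than the original one.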

An alternative method is to modify this line:

Code:
$s = $1 if $string =~ /(\{.*?\})/;

so that the match occurs only if there is at least one space in the string between the curly braces. In this case, if we wrap this in a while loop, the first iteration matches the first substring and replaces the spaces within it; in the second iteration the first substring is no longer recognized (there is no space left in it), so the second substring is matched and modified, and so on until there is nothing left to be matched.

Here is the new expression, which matches the substring between curly braces only if it contains at least one space (using negated character classes):
Code:
$s = $1 if $string =~  /(\{(?:[^} ]+ )+[^} ]+\})/;

So wrapping the code below:
Code:
$string = "lorum {ipsum dolor sit} et amed {nucas nullum} est";
$s = $1 if $string =~ /(\{(?:[^} ]+ )+[^} ]+\})/;
$s2 = $s;
$s2 =~ s/ /&nbsp;/g;
$string =~ s/\Q$s\E/$s2/;

in a while loop that knows how to stop when there is no longer a match will do the trick. (The stop test should be on whether the match itself succeeds; testing whether $1 is defined is not reliable, because Perl leaves $1 unchanged after a failed match.) I'll leave it there for you to try to implement that in PHP.
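
As a rough illustration of such a loop (in Python rather than Perl or PHP, and with the made-up name replace_until_clean), the sketch below keeps searching from the start of the string and stops as soon as the pattern no longer matches. Note the assumption built into the pattern itself: groups with leading, trailing or doubled spaces, such as "{ a b }", are not matched and are left unchanged.

```python
import re

# Same pattern as above: a braced group that still contains at least one
# single space between runs of non-space, non-"}" characters.
NEEDS_FIX = re.compile(r"\{(?:[^} ]+ )+[^} ]+\}")


def replace_until_clean(text):
    while True:
        m = NEEDS_FIX.search(text)
        if m is None:                     # stop on match failure
            return text
        fixed = m.group(0).replace(" ", "&nbsp;")
        text = text[:m.start()] + fixed + text[m.end():]


print(replace_until_clean("lorum {ipsum dolor sit} et amed {nucas nullum} est"))
```

The loop terminates because a group that has been fixed contains no space any more and therefore can never match again.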

But, as I said before, we are getting into fairly hairy regexes and a relatively complicated algorithm. Have you considered the other proposal I made? Use the built-in PHP functions (rather than regexes) to find the first { and the first }, extract the substring between them, use a regex only to modify that substring, and insert the replacement into the original string. Then repeat the process with the next substring until you are done. I think it will be much easier and far more readable than using pure regex constructs.

#5
November 11th, 2012, 01:42 PM
benwenger
Thanks a lot for your comprehensive answer.
I've tried to translate some of your thoughts and code into PHP, but to no avail.
I then came up with a totally different approach that works fine.
First I put all matches (text between curly brackets) into an array using
PHP Code:
 preg_match_all('/{([^}]*)}/', $text, $matched);


I then loop through the array, replacing the matches in the original string as follows:

PHP Code:
if (count($matched[0]) > 0) {
    foreach ($matched[0] as $match) {
        $match2 = str_replace(" ", "&nbsp;", $match);
        $text = str_replace($match, $match2, $text);
    }
}

Probably not the most efficient way in terms of speed and resources, but it works for me.

#6
November 11th, 2012, 02:44 PM
Jacques1
What you're doing is completely unnecessary effort. PHP (and I'm sure also Perl) can replace patterns with the return value of a callback function. So you don't need all this. In PHP, it's simply

PHP Code:
 $test = "lorum {ipsum dolor sit} et amed {nucas nullum} est";
$result = preg_replace_callback('/\{[^}]*\}/', function ($match) { return str_replace(' ', '&nbsp;', $match[0]); }, $test);

var_dump($result);


If you have an outdated PHP version (older than 5.3), you need to define the callback function with a normal function declaration and then pass its name as a string to preg_replace_callback.
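
The same pattern of "regex finds the group, a named function builds the replacement" exists in other languages as well. As a minimal sketch in Python (the function name spaces_to_nbsp is just an illustration), re.sub accepts a function as its replacement argument and calls it once per match:

```python
import re


def spaces_to_nbsp(match):
    """Callback: receives one match object, returns its replacement text."""
    return match.group(0).replace(" ", "&nbsp;")


test = "lorum {ipsum dolor sit} et amed {nucas nullum} est"
result = re.sub(r"\{[^}]*\}", spaces_to_nbsp, test)
print(result)
```

Here the function object itself is passed, whereas older PHP versions expect the function name as a string.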

Note that this does not work with nested braces (like "{ abc { def } xyz}"). If you want to do this, it will get more complicated.
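
One possible way to cope with nesting is to drop the regex and scan the string character by character while counting the brace depth. This is a different technique from the callback above, shown here only as a hedged Python sketch: the name nbsp_in_nested_braces is invented, and the code assumes balanced braces (after an unclosed "{", every following space would be replaced).

```python
def nbsp_in_nested_braces(text):
    depth = 0                      # how many "{" are currently open
    out = []
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
        elif ch == " " and depth > 0:
            ch = "&nbsp;"          # a space inside at least one pair of braces
        out.append(ch)
    return "".join(out)


print(nbsp_in_nested_braces("{ abc { def } xyz}"))
```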

#7
November 11th, 2012, 05:56 PM
Laurent_R
You are absolutely right, Jacques. I did not think of using a callback function as the replacement part.

#8
November 13th, 2012, 12:48 PM
Laurent_R
For the benefit of potentially interested readers, here is a way to use a (sort of callback) function within the replacement part of the s/// statement in Perl:

Code:
sub remove_sp { my $t = shift; $t =~ s/ /&nbsp;/g; return $t; }   # lexical copy, so the caller's $_ is left alone
my $string = "lorum {ipsum dolor sit} et amed {nucas nullum} est";
$string =~ s/(\{[^}]+\})/remove_sp($1)/eg;


$string now contains: "lorum {ipsum&nbsp;dolor&nbsp;sit} et amed {nucas&nbsp;nullum} est".

It is also possible to inline the replacement code directly (with the /e modifier, the replacement part is evaluated as Perl code), e.g.:

Code:
$string =~ s/(\{[^}]+\})/(my $t = $1) =~ s! !&nbsp;!g; $t/ge;


This last solution was proposed by another member of this forum, OmegaZero, after I reported that I had trouble finding the exact right syntax.

He suggested an even shorter and simpler form:

Code:
$string =~ s/(\{[^}]+\})/join '&nbsp;', split ' ', $1/ge; 
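
One behaviour of this split/join form is worth knowing: Perl's split ' ' splits on runs of any whitespace, so doubled spaces or tabs inside the braces collapse into a single &nbsp;, unlike the s/ /&nbsp;/g versions, which replace each space one for one. A small Python sketch of the same idea (join_words is a made-up name; str.split() with no argument behaves similarly) shows the effect:

```python
import re


def join_words(match):
    # split() with no argument splits on runs of any whitespace and
    # ignores leading and trailing whitespace.
    return "&nbsp;".join(match.group(0).split())


sample = "lorum {ipsum  dolor\tsit} et amed {nucas nullum} est"
print(re.sub(r"\{[^}]+\}", join_words, sample))
```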

© 2003-2013 by Developer Shed. All rights reserved.