Hi folks,
This is not a problem that I really need to solve, but I got partly stuck while trying to answer someone's question in another section of this forum.
In a question posted in the Regex section of this forum (http://forums.devshed.com/regex-programming-147/regex-to-replace-white-space-between-brackets-933871.html), the original poster asked the following question:
Quote:
| Originally Posted by benwenger I know how to replace everything between brackets but not how to replace parts of it. I need a regex to replace all white space between curly brackets with
example
$string="lorum {ipsum dolor sit} et amed {nucas nullum} est";
after regex
lorum {ipsum dolor sit} et amed {nucas nullum} est |
I warned that this was a bit complicated for a single regex, and came up with a rather tedious solution: iterate through the string with a while loop, extract each {...} substring, apply a simple substitution to it to replace the spaces, and put the modified substring back in place of the original. Each iteration handles the next {...} substring, and so on until the job is done.
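For reference, that loop could look roughly like the sketch below. It is reconstructed from the description above, not copied from the other thread. The underscore in $replacement is only a hypothetical stand-in for the replacement text, and the variable names are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the replacement text (an assumption for this sketch).
my $replacement = '_';
my $string = "lorum {ipsum dolor sit} et amed {nucas nullum} est";

# Progressive matching: each iteration finds the next {...} chunk.
while ($string =~ /(\{[^}]*\})/g) {
    my $chunk  = $1;
    my $start  = $-[1];             # offset where the chunk starts
    my $length = $+[1] - $-[1];     # length of the chunk

    # Replace the spaces in a copy of the chunk only.
    (my $fixed = $chunk) =~ s/ /$replacement/g;

    # Put the modified chunk back in place of the original one.
    substr($string, $start, $length) = $fixed;

    # Modifying the string resets the search position, so set it
    # explicitly to just after the chunk that was written back.
    pos($string) = $start + length $fixed;
}

print "$string\n";
```

With the underscore stand-in, this should print the string with the spaces inside each pair of braces replaced and the rest left alone.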
That solution works, but it looks tedious and rather ugly. I also said that I personally would probably not use regexes to locate the substrings at all, but rather find each substring with the index function, extract it, modify it with a regex, and replace it in place, doing the whole thing in a while loop until the job is done.
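A minimal sketch of that index-based variant, under the same assumptions (the underscore is a hypothetical stand-in for the replacement text, the variable names are invented for the example, and braces are assumed not to nest):

```perl
use strict;
use warnings;

# Hypothetical stand-in for the replacement text (an assumption for this sketch).
my $replacement = '_';
my $string = "lorum {ipsum dolor sit} et amed {nucas nullum} est";

my $from = 0;    # where to resume the search for the next '{'
while ((my $open = index($string, '{', $from)) >= 0) {
    my $close = index($string, '}', $open);
    last if $close < 0;                 # no closing brace: stop

    my $len   = $close - $open + 1;     # length of the chunk, braces included
    my $chunk = substr($string, $open, $len);

    # The only regex involved: replace the spaces inside the chunk.
    $chunk =~ s/ /$replacement/g;

    # Four-argument substr writes the modified chunk back in place.
    substr($string, $open, $len, $chunk);

    # Continue after the chunk that was just written back.
    $from = $open + length $chunk;
}

print "$string\n";
```

It does the job, but there is a lot of bookkeeping for what is conceptually a single substitution.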
But then another poster came up with an illuminating remark about something I had not thought of for one second (I knew it in theory, but I have probably never used this functionality, so it had not come to mind):
Quote:
| Originally Posted by Jacques1 What you're doing is completely unnecessary effort. PHP (and I'm sure also Perl) can replace patterns with the return value of a callback function. |
Jacques1 then gave a piece of PHP code to achieve the required result (the original question was about PHP); I am not reproducing it because it is not relevant here.
Of course. This is sooooh much better.
So, for the fun of it, I tried to do it in Perl, but found that it was not as easy as I thought.
I finally succeeded in doing it this way, using a (sort of) callback function:
Code:
sub remove_sp {
    my $str = shift;    # work on a copy rather than clobbering the global $_
    $str =~ s/ / /g;
    return $str;
}
my $test = "lorum {ipsum dolor sit} et amed {nucas nullum} est";
$test =~ s/(\{[^}]*\})/remove_sp($1)/eg;
This works fine: $test now contains "lorum {ipsum dolor sit} et amed {nucas nullum} est", which was the required result.
It is pretty good and far better than the regex progressive match constructs within a while loop that I had suggested originally.
But I only came up with that solution, with its separate function definition, as a fall-back option, after trying unsuccessfully to inline the body of the remove_sp function above as an anonymous function in the replacement part of the s/// expression.
I tried all kinds of ways to inline an anonymous function, but, for example, something like this:
Code:
$test =~ s/(\{[^}]*\})/{$_=$1; s/ / /g}/eg
or
Code:
$test =~ s/(\{[^}]*\})/{$1 =~ s/ / /g}/eg
gave me an "Unmatched right curly bracket" error. I played with a number of variations on that, but I still can't work out how to do it. I must be missing something, or perhaps making a silly mistake.
In brief, I am fairly sure it should be possible to do this with an anonymous or inline subroutine in the replacement part of the s/// statement, and I would like to understand why I can't find the right way to do it. Does anyone have an idea how to solve this?
Thanks for your thoughts.