#1
Sed Problem
Hi, I am faced with a problem with the sed command. I want to find a string and replace it within a file.
E.g. change <a href="BuildPage.cgi?body=index.html"> to <a href=BuildPage.cgi?body=index.html">. What command should I use? Thanks for helping.

#2
echo <string> | sed 's/"//'
But I guess you want to delete the second " too:
echo <string> | sed 's/"//g'
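
To make the difference between the two commands concrete, here is a small sketch run on the link from the first post. The variable names are only for illustration; without the g flag sed replaces just the first match on each line.

```shell
# Sample line taken from the question in post #1.
line='<a href="BuildPage.cgi?body=index.html">'

# Without the g flag only the first double quote on the line is removed.
first_only=$(printf '%s\n' "$line" | sed 's/"//')

# With the g flag every double quote on the line is removed.
all_quotes=$(printf '%s\n' "$line" | sed 's/"//g')

printf '%s\n' "$first_only" "$all_quotes"
```

Note that the g form strips every double quote on the line, not only the ones that belong to the href.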
#3
Thanks!
Yes, actually I want to remove both double quotes here.
But the file I want to change has lots of double quotes, and I only want to remove the ones around the href value. How can I do that? Is it something like this?
sed 's/href=\"*.*\"/href=*.*/'

#4
You need a good regular expression to do this.
I can give an example, but it depends on how the HTML page was written. More specifically, on how the <a href......> line is written:
- Is it by itself on a line?
- Do you use title="..." in the <a> tag?
- Are there multiple HTML statements on one line?
- etc.
Just 3 of many examples:
<a href="BuildPage.cgi?body=index.html">
<a href="BuildPage.cgi?body=index.html" title="Build Page">Name</a>
<a href="BuildPage.cgi?body=index.html">Name</a><p class="article">
For the above examples the following statement will do the trick:
cat <htm_file> | sed 's/a href="\(.*\)html"/a href=\1html/'
But you might need to change the regular expression for your specific needs. Hope this helps.
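
One case where the expression may need changing is a line that holds more than one link. The sketch below uses a made-up input line (a.html and b.html are hypothetical file names) to show what the greedy .* does there: it runs from the first href=" to the last html" on the line, so only the outermost pair of quotes is removed.

```shell
# Hypothetical line with two links; not one of the examples from the post.
line='<a href="a.html">A</a> <a href="b.html">B</a>'

# Same expression as in the post. The greedy .* spans both links.
greedy=$(printf '%s\n' "$line" | sed 's/a href="\(.*\)html"/a href=\1html/')

printf '%s\n' "$greedy"
```

The first opening quote and the last closing quote are gone, but the quotes in between are left in place, so it is worth checking whether any line in the real file carries more than one link.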
#5
I tried the code you gave me, but it doesn't work.
The output I want is <a href=BuildPage.cgi?body=index.html> but the output I get is <a href=html>. Everything between href= and html is gone. Besides this, the href does not always end with html, for example <a href="BuildPage.cgi?body=index.html#adac">, and I want to remove the double quotes there too. Do you have a better solution? Thank you.

#6
Sorry, but the solution I gave does work:
$ cat somepage.html
<a href="BuildPage.cgi?body=index.html">
<a href="BuildPage.cgi?body=index.html" title="Build Page">Name</a>
<a href="BuildPage.cgi?body=index.html">Name</a><p class="article">
$ cat somepage.html | sed 's/a href="\(.*\)html"/a href=\1html/'
<a href=BuildPage.cgi?body=index.html>
<a href=BuildPage.cgi?body=index.html title="Build Page">Name</a>
<a href=BuildPage.cgi?body=index.html>Name</a><p class="article">
As you can see, only the " around BuildPage.cgi?body=index.html are stripped; all the rest stays the same. Did you make a typo?
As for the ......html#sometag" case, you need to adjust the regular expression for that. This will take care of the example you gave (and still works with the 3 example lines shown above):
sed 's/a href="\(.*\)html[#]*[a-z]*"/a href=\1html/'
Last edited by druuna : November 14th, 2003 at 06:29 AM.

#7
Thanks for your help, I have now solved the first problem. The second problem is the "typo" (the #abc part after html).
The code you gave me removes that trailing part, but I actually want it to remain. For example, <a href="BuildPage.cgi?body=index.html#abc"> should change to <a href=BuildPage.cgi?body=index.html#abc>

#8
Maybe you misunderstood me.
Both examples I gave will do what you want. The second one will take care of the 'html#abc' stripping.
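
For readers following along: on the #abc example, the second expression matches the #abc part outside the captured group, so it is left out of the output, which is what post #7 describes. A minimal sketch of one alternative, assuming every href value is double-quoted and contains no double quote itself, is to capture everything up to the closing quote with [^"]*. The function and variable names are only for illustration and are not from the thread.

```shell
# Sample lines taken from the thread.
frag='<a href="BuildPage.cgi?body=index.html#abc">'
titled='<a href="BuildPage.cgi?body=index.html" title="Build Page">Name</a>'

# Second expression from post #6: #abc is matched outside \( \), so it is dropped.
dropped=$(printf '%s\n' "$frag" | sed 's/a href="\(.*\)html[#]*[a-z]*"/a href=\1html/')

# Sketch (assumption: no double quote inside the href value):
# capture up to the closing quote so the whole value, #abc included, stays in \1.
strip_href_quotes() {
    sed 's/a href="\([^"]*\)"/a href=\1/g'
}

kept=$(printf '%s\n' "$frag" | strip_href_quotes)
kept_titled=$(printf '%s\n' "$titled" | strip_href_quotes)

printf '%s\n' "$dropped" "$kept" "$kept_titled"
```

Because [^"]* cannot run past a double quote, the title="..." attribute is left alone, and with the g flag the same expression can be applied to several links on one line.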
#9
Thanks druuna, I have solved the problem. Thanks for your guidance and help.
It helped me a lot. I appreciate it very much. All the best!