#1
Extracting blocks from XML file
I have a large XML file and I want to extract blocks from it.
For example, there is a large block: <tag> .... </tag>. I want to check whether a specific string occurs within the block; if it does, extract the entire block from <tag> to </tag>. I tried using nawk, but I didn't know how to write the "if" part. It's a Sun machine, so the OS is SunOS.
#2
sed -n "/\<${1}\>/,/\<\/${1}\>/p" INFILE > OUTFILE
would probably do you, passing the tag as the parameter $1
#3
Not really, Nuth.
"/\<${1}\>/,/\<\/${1}\>/p"
a) ${1} is not wrong, but the braces are not needed and are confusing in this context. Use $1 instead:
"/\<$1\>/,/\<\/$1\>/p"
b) In sed, \<abc\> matches 'abc' at word boundaries. It happens to match '<abc>', but it also matches ',abc,', 'abc', ' abc ' and so on. You need to match the literal '<abc>' and '</abc>', so use:
"/<$1>/,/<\/$1>/p"
c) Assuming the 'tag' lines are NOT wanted in the output AND each tag is on its own line, a correct sed command would be:
sed -n "/<tag>/,/<\/tag>/{;/<tag>/d;/<\/tag>/d;p;}"
Replace tag with $1 if you want. If you like precision:
"/^<tag>$/,/^<\/tag>$/{;/^<tag>$/d;/^<\/tag>$/d;p;}"
On a production system, I would also check for blanks.
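The sed commands in this thread print every <tag> block. The original question also asked to keep a block only when it contains a specific string, which nobody addressed. A minimal sketch of that "if" part follows, using awk to buffer each block and test it before printing. The function name extract_blocks and the AWK override variable are made up for illustration. It assumes the tags carry no attributes, blocks do not nest, and the search string contains no backslashes (awk -v would interpret them as escapes).

```shell
# Hypothetical helper: print each <tag>...</tag> block that contains a fixed string.
# Usage: extract_blocks TAG STRING FILE
# Assumptions: tags have no attributes, blocks are not nested,
# and STRING contains no backslashes.
extract_blocks() {
    tag=$1
    needle=$2
    file=$3
    # AWK can be overridden (for example AWK=nawk); the sketch needs -v support.
    ${AWK:-awk} -v otag="<$tag>" -v ctag="</$tag>" -v needle="$needle" '
        index($0, otag) { inblock = 1; buf = "" }    # opening tag: start a fresh buffer
        inblock         { buf = buf $0 "\n" }        # collect every line of the block
        inblock && index($0, ctag) {                 # closing tag: block is complete
            if (index(buf, needle))                  # the "if" part: fixed-string test
                printf "%s", buf
            inblock = 0
        }
    ' "$file"
}
```

It could be called as extract_blocks tag "some string" INFILE > OUTFILE. Unlike the sed versions in c), this keeps the <tag> and </tag> lines, since the question asked for the entire block. On the poster's Sun machine, setting AWK=nawk is probably needed, because the older default awk there is not expected to accept -v.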