Python Programming
 
  #1  
Old June 30th, 2004, 04:20 PM
rockets12345
How to remove blank-lines in a xml-file before parsing!

Hi,
I am trying to parse an XML file using the minidom.parse method.
  • I want to remove any blank lines at the beginning of the XML file before I start parsing, so that my root element is always on the first line.
  • Also, is there a way to remove all the blank lines in the XML file before parsing?
Currently, parsing works fine if there are no blank lines before the root element, but if I put a blank line before the first element, the parser fails.

Thanks

  #2  
Old July 1st, 2004, 01:56 PM
netytan
Probably the best way to do this would be to iterate through the file and write only the non-blank lines to a temporary file, then pass the name of this temporary file (or the file object itself) to the parse() method.

This can be done as easily as this:

Code:
#!/usr/bin/env python

import os, random

# Create a random name for the temp file from 5 characters
# sampled out of 'temporary_file'.
path = ''.join(random.sample('temporary_file', 5))
# Create a new temp file to write to.
temp = open(path, 'w')

for line in open('base.xml'):
    # Iterate over each line in the file and if the
    # line is not blank then write it to the temp
    # file.
    if line.strip():
        temp.write(line)
# Close the temp file.
temp.close()

#...
# Parse the temp file using 'path' as the file name.
#...

# Finally remove the temp file.
os.remove(path)


Note: This hasn't been tested and is here to illustrate the idea only, though it should work.

You can also create temporary files using the tempnam() function in the os module (Python 2 only; it was removed in Python 3), or by using the tempfile module, though it is just as simple to use random in this case.

http://www.python.org/doc/2.3.4/lib...e-tempfile.html
http://www.python.org/doc/2.3.4/lib/module-os.html
http://www.python.org/doc/2.3.4/lib/module-random.html

In fact, it would not be hard at all to create a temporary file object that could be incredibly easy to use.
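
As a rough sketch of the tempfile route mentioned above (not tested against the original poster's files), the copy-then-parse idea could look something like this. The function name parse_without_blank_lines is made up for illustration, and the code assumes the XML file is UTF-8 text:

```python
import os
import tempfile
from xml.dom import minidom


def parse_without_blank_lines(xml_path):
    """Copy the non-blank lines of xml_path to a temp file and parse that copy.

    Hypothetical helper; assumes the file is UTF-8 encoded.
    """
    # delete=False so the file can be reopened by name on every platform.
    with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False,
                                     encoding="utf-8") as temp:
        with open(xml_path, encoding="utf-8") as source:
            for line in source:
                # A line with only whitespace strips down to '' (falsy).
                if line.strip():
                    temp.write(line)
    try:
        return minidom.parse(temp.name)
    finally:
        # Remove the temp file whether or not parsing succeeded.
        os.remove(temp.name)
```

Calling parse_without_blank_lines('base.xml') would return the parsed document, and the temporary copy is cleaned up either way.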

Have fun,

Mark.

Last edited by netytan : July 1st, 2004 at 01:58 PM.

  #3  
Old July 2nd, 2004, 12:39 AM
gen_rec

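Try this:

import re
# Split on each newline plus any whitespace after it, then rejoin.
# Joining with '' removes every line break (and leading indentation), not only blank lines.
print(''.join(re.split(r'[\n]+\s*', open('base.xml').read())).strip())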
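
If the other line breaks should survive, a variation on the regex idea might look like the sketch below. It uses re.sub instead of re.split, and the helper name strip_blank_lines is invented for this example. Like the one-liner, it works on the whole text in memory:

```python
import re
from xml.dom import minidom


def strip_blank_lines(text):
    """Remove whitespace-only lines but keep every other line break.

    Hypothetical helper. A final whitespace-only line that has no
    trailing newline is left in place.
    """
    # (?m) makes ^ match at the start of each line; the pattern matches a
    # line made only of spaces/tabs, together with its line ending.
    return re.sub(r"(?m)^[ \t]*\r?\n", "", text)


def parse_text(text):
    """Parse XML text after dropping its blank lines."""
    return minidom.parseString(strip_blank_lines(text))
```

For example, parse_text(open('base.xml').read()) would hand minidom a string whose first line is the first non-blank line of the file.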

  #4  
Old July 2nd, 2004, 01:48 AM
netytan
The disadvantage of this is that you're reading the whole file into memory and performing an action on it. If the file is large then this isn't going to be a good thing. For small files it's fine, but use file iterators where possible. In this case you can also avoid the (slight) overhead of importing and using regex.
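
To illustrate the file-iterator point, here is a minimal sketch that never holds the raw file text in one string. Note that it swaps in the incremental XMLParser from xml.etree.ElementTree rather than minidom, since that parser can be fed one line at a time; the parsed tree itself still lives in memory. The function names are hypothetical, and reading in binary mode assumes an ASCII-compatible encoding such as UTF-8:

```python
import xml.etree.ElementTree as ET


def non_blank_lines(path):
    """Yield the non-blank lines of a file one at a time, as bytes."""
    with open(path, "rb") as source:
        for line in source:
            if line.strip():
                yield line


def parse_streaming(path):
    """Feed non-blank lines to an incremental parser; return the root element."""
    parser = ET.XMLParser()
    for line in non_blank_lines(path):
        parser.feed(line)
    # close() finishes parsing and returns the root Element.
    return parser.close()
```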

Why not just make changes to the XML file? Surely, if it's not being treated as valid XML then you want to make it so. You could write the non-blank lines back into the original file and be done with it.
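
One possible way to write the non-blank lines back, sketched under the assumption that it is acceptable to replace the file via a temporary copy in the same directory. The name remove_blank_lines_in_place is invented for this example:

```python
import os
import tempfile


def remove_blank_lines_in_place(path):
    """Rewrite path so that it contains only its non-blank lines."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory so that os.replace
    # stays on one filesystem.
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as temp, open(path, "rb") as source:
            for line in source:
                if line.strip():
                    temp.write(line)
        # Swap the cleaned copy into place only after it is fully written.
        os.replace(temp_path, path)
    except BaseException:
        # Something failed before the swap: drop the partial copy.
        os.remove(temp_path)
        raise
```

The original file is only replaced once the cleaned copy is complete, so a failure partway through leaves it untouched. One caveat: the replacement takes on the temp file's permissions, not the original's.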

Take care,

Mark.

Last edited by netytan : July 2nd, 2004 at 01:53 AM.

© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 2 hosted by Hostway