XML Programming
 
  #1  
October 22nd, 2012, 04:18 AM
pa7751
Multi threading and xml

Hi

I have an XML file that has to be written by multiple threads running in parallel. How can we ensure the structural integrity of the XML file, given that many threads writing to it at once can corrupt its structure? One way is to make the write method synchronized, but that is a coarse-grained approach in which only one thread may write at a time, so the other threads are ready with their data but cannot write until the lock is released. Is there a better way to do this?

  #2  
October 22nd, 2012, 12:13 PM
requinix
a) Don't use a file
b) Keep a master thread which is the only one that writes to (and possibly reads from) the file
c) Lock on the file but collect a few things to write at once, thus reducing how often the file needs to be used (assuming the work the threads do takes longer than the time needed to write the changes)
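Option (c) could look roughly like the Java sketch below (Java is assumed because the question mentions `synchronized`). The class name `BatchedStatusWriter`, the batch size, and the in-memory `flushed` list that stands in for real file writes are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class BatchedStatusWriter {
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    // Stands in for the file: each entry is one batch written under the lock.
    private final List<List<String>> flushed = new ArrayList<>();

    BatchedStatusWriter(int batchSize) {
        this.batchSize = batchSize;
    }

    // Threads queue an update; the "file" is only touched once a batch is full.
    synchronized void add(String update) {
        pending.add(update);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Writes whatever is pending as a single batch (no-op when nothing is pending).
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        flushed.add(new ArrayList<>(pending));
        pending.clear();
    }

    synchronized int writeCount() {
        return flushed.size();
    }

    public static void main(String[] args) {
        BatchedStatusWriter writer = new BatchedStatusWriter(4);
        for (int i = 0; i < 10; i++) {
            writer.add("task" + i + "=complete");
        }
        writer.flush(); // write the leftover partial batch
        System.out.println(writer.writeCount()); // prints 3
    }
}
```

One trade-off to keep in mind: updates that are still pending when the process dies are lost, which matters if the file is meant to be a resume log.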
  #3  
October 22nd, 2012, 01:36 PM
pa7751
OK, I guess a little more explanation of the scenario will help. There is a series of tasks that are nothing but Linux commands that need to be executed. Each of these tasks is a tag in the XML, e.g.
Code:
<task name="copy" command="scp src dest" user="root" host="machineIP" resumeable="true" />
These tasks are grouped into activities. The idea is to execute multiple such tasks (commands) in parallel and, in case of failure, to resume from the last executed task rather than from the beginning. For example, if the input commands.xml has 100 tags and a failure happens at the 20th task, and that task is resumable, then when the program is started again I start execution from the 20th step and not the 1st. So every task is recorded in resume.xml with its status (begin or complete). For the first 19 tasks, resume.xml will have status="complete"; the status of task #20 will be "begin", so I will resume execution from there. Hence for every task I need to record its status, and I would say that the time taken to write to resume.xml is more than the time taken to execute the task itself. Before starting execution I also have to find the first "begin". In short, I have to parse this file first, reach the restore point, and then start recording tasks with their statuses again as I progress. Hence there are many edits to the same file by threads running in parallel.
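The "find the restore point" step described above might be sketched in Java like this. The layout of resume.xml is not shown in the thread, so the `<resume>` root element, the `status` attribute, and the class name `ResumePoint` are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class ResumePoint {
    // Returns the index of the first <task> whose status is not "complete",
    // or the number of tasks if every recorded task has finished.
    static int firstUnfinished(InputStream resumeXml) throws Exception {
        NodeList tasks = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(resumeXml)
                .getElementsByTagName("task");
        for (int i = 0; i < tasks.getLength(); i++) {
            Element task = (Element) tasks.item(i);
            if (!"complete".equals(task.getAttribute("status"))) {
                return i;
            }
        }
        return tasks.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical resume.xml content: one finished task, one interrupted task.
        String xml = "<resume>"
                + "<task name=\"copy\" status=\"complete\"/>"
                + "<task name=\"unpack\" status=\"begin\"/>"
                + "</resume>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(firstUnfinished(in)); // prints 1
    }
}
```

This runs once, before any worker threads start, so it needs no locking of its own.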

  #4  
October 22nd, 2012, 01:59 PM
requinix
Then it's a typical... what's it called... worker-thread pattern? Thread pool?

Go with option b. Two variants: you have a number of child threads which communicate with the master daemon to get a task/activity to run and report the status when completed, or the daemon starts a child for each task and there's essentially just one thread of yours running at a time. Which one you choose depends on the nature of the "activities", like whether their tasks can be run independently or are related to each other.
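A minimal Java sketch of the first variant (a pool of workers that report to one writer) is below. It assumes a `BlockingQueue` as the communication channel, which the post does not specify; the class name `SingleWriterDemo`, the `POISON` stop marker, and the in-memory list standing in for resume.xml are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class SingleWriterDemo {
    // Sentinel that tells the writer thread to stop.
    private static final String POISON = "__stop__";

    // Runs taskCount dummy tasks on a pool; only the writer thread touches `out`.
    static List<String> runAll(int taskCount, int workers) throws InterruptedException {
        BlockingQueue<String> updates = new LinkedBlockingQueue<>();
        List<String> out = new ArrayList<>(); // stands in for resume.xml

        Thread writer = new Thread(() -> {
            try {
                for (String u = updates.take(); !u.equals(POISON); u = updates.take()) {
                    out.add(u); // a real writer would update the file here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < taskCount; i++) {
            final int id = i;
            pool.submit(() -> {
                updates.add("<task id=\"" + id + "\" status=\"begin\"/>");
                // ... the Linux command would run here ...
                updates.add("<task id=\"" + id + "\" status=\"complete\"/>");
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        updates.add(POISON); // all workers are done, so nothing follows this
        writer.join();       // join() makes the writer's additions visible here
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runAll(5, 2).size()); // prints 10
    }
}
```

The workers never block on file I/O; they only enqueue a message, and the file is written by exactly one thread, so no lock on the file is needed.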
  #5  
October 22nd, 2012, 05:55 PM
E-Oreo
An XML file is really not very appropriate for this sort of thing, but in addition to the recommendation requinix already made: if you change your statuses so they are all the same length, e.g.
Code:
begin
cmplt
pausd
etc.

then you can perform an in-place write rather than having to rewrite the entire file every time a change is made.
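As a rough Java illustration of such an in-place write, `RandomAccessFile` can overwrite the five status bytes at a known offset. How the offset of each status is tracked is not covered in the post; here it is simply looked up with `indexOf` on ASCII content, and the class name `InPlaceStatus` is made up.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class InPlaceStatus {
    static final int WIDTH = 5; // every status code is exactly 5 ASCII bytes

    // Overwrites the status bytes at `offset` without changing the file length.
    static void writeStatus(Path file, long offset, String status) throws IOException {
        byte[] bytes = status.getBytes(StandardCharsets.US_ASCII);
        if (bytes.length != WIDTH) {
            throw new IllegalArgumentException("status must be " + WIDTH + " bytes: " + status);
        }
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.seek(offset);
            raf.write(bytes);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("resume", ".xml");
        String xml = "<resume><task name=\"copy\" status=\"begin\"/></resume>";
        Files.write(file, xml.getBytes(StandardCharsets.US_ASCII));

        // ASCII-only content, so the character index equals the byte offset.
        long offset = xml.indexOf("begin");
        writeStatus(file, offset, "cmplt");

        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.US_ASCII));
        Files.delete(file);
    }
}
```

A real program would have to record each task's offset when the file is first written, and would still need to coordinate threads so that two of them never write the same task's status at once.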
  #6  
October 23rd, 2012, 12:31 AM
pa7751
Quote:
Originally Posted by requinix
Which one you choose depends on the nature of the "activities", like whether their tasks can be run independently or are related to each other.


The parallel tasks will be independent of each other and cannot share any data among themselves; that much is a given. I will also not have many threads running in parallel at a time, so I can assume an unlimited thread pool. Even so, consider a worst case of 100 parallel tasks: I am still unclear on how you suggest writing the statuses of these tasks to the output XML file in parallel. Could you please help me understand?

  #7  
October 23rd, 2012, 02:12 AM
requinix
Make one master thread do the writing and have the child threads communicate with it. All they really need to do is say that the task has completed, right?

  #8  
October 23rd, 2012, 04:27 AM
pa7751
Ya so basically there are 3 steps:
  1. Parse existing xml to resume from a state that failed previously
  2. Spawn threads
  3. Write new status to file for each task

So I guess the best that can be done with parallel threads is to do steps 1 and 3 in a synchronized block and step 2 in parallel.
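That plan could be sketched in Java as follows: the status updates (step 3) go through `synchronized` methods while the tasks themselves (step 2) run unsynchronized. The class name `SynchronizedStatusLog` and the `<resume>` layout are assumptions, the document is built as a string rather than written to disk, and task names are assumed to contain no XML special characters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SynchronizedStatusLog {
    // Task name -> latest status, in first-seen order.
    private final Map<String, String> statuses = new LinkedHashMap<>();

    // Step 3: only one thread at a time may change the recorded statuses.
    synchronized void record(String task, String status) {
        statuses.put(task, status);
    }

    // Builds the whole document from a consistent snapshot, so it is always well formed.
    synchronized String toXml() {
        StringBuilder sb = new StringBuilder("<resume>");
        for (Map.Entry<String, String> e : statuses.entrySet()) {
            sb.append("<task name=\"").append(e.getKey())
              .append("\" status=\"").append(e.getValue()).append("\"/>");
        }
        return sb.append("</resume>").toString();
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedStatusLog log = new SynchronizedStatusLog();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            String name = "task" + i;
            workers[i] = new Thread(() -> {
                log.record(name, "begin");
                // ... step 2: run the command here, outside any lock ...
                log.record(name, "complete");
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(log.toXml());
    }
}
```

The lock is held only for the short update, not while a command runs, so the waiting that worried the original question is limited to the time it takes to record one status.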

  #9  
November 2nd, 2012, 05:35 AM
Morningwalker
Groovy XmlSlurper is a nice tool to parse XML documents, mostly because of the elegant GPath dot-notation.

  #10  
November 2nd, 2012, 12:45 PM
requinix
Quote:
Originally Posted by Morningwalker
Groovy XmlSlurper is a nice tool to parse XML documents, mostly because of the elegant GPath dot-notation.

OP's problem is with the multithreadedness, not with parsing XML.
