Working with several processes

June 10th, 2004, 03:29 AM
Registered User
Join Date: May 2004
Posts: 21
I'm a bit lost here. I would like to run several scripts in parallel on a cluster, with each script sending back a float.
I've been searching the internet and found several things:
- multithreading (I don't really know what it does)
- the os module, whose popen and spawn functions let you run other scripts as new processes
Which one do you recommend? Do you have links to web sites explaining a little about how threading works? (I'm having some trouble because python.org and similar sites don't offer many example scripts.)
Thanks
erwin

June 10th, 2004, 05:01 AM
Registered User
Hi again,
I'm desperately looking for an efficient way of dealing with my problem, and I found this on a web site. It seems to imply that multithreading doesn't run several processes in parallel.
How would you run another script in parallel without the parent process stopping?
Quote: This doesn't mean that you can't make good use of Python on multi-CPU
machines! You just have to be creative with dividing the work up
between multiple *processes* rather than multiple *threads*.
thanks
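
A minimal sketch of the "multiple processes" idea from the quote, assuming Python 3 and the standard subprocess module (which postdates this thread). The worker here is a hypothetical one-line script that prints a float; a real script path would take its place. Popen returns immediately, so the parent process keeps running while the children work.

```python
import subprocess
import sys

# Hypothetical worker: a one-line script that prints a float.
# In practice this would be the path to a real script.
WORKER_CODE = "import sys; print(float(sys.argv[1]) * 1.5)"


def launch_workers(inputs):
    """Start one child process per input without waiting for any of them."""
    return [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_CODE, str(x)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for x in inputs
    ]


def collect_floats(procs):
    """Wait for each child and parse the float it printed."""
    results = []
    for proc in procs:
        out, _ = proc.communicate()  # blocks until this child exits
        results.append(float(out.strip()))
    return results


if __name__ == "__main__":
    procs = launch_workers([1, 2, 3, 4, 5])
    # The parent is free to do other work here while the children run.
    print(collect_floats(procs))
```

Because each child is a separate OS process, the operating system (or a cluster scheduler) is free to place it on another CPU.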

June 10th, 2004, 06:54 AM
Mini me.
Join Date: Nov 2003
Location: Cambridge, UK
You're correct: using the thread or threading module just runs some code in your script in a different thread of execution. It does not run another script.
http://forums.devshed.com/t122818/s...light=threading
The above example happens to use the threading module to run multiple pinger class objects. These objects also happen to call an external program.
I guess your problem is - regardless of the language - how do you launch multiple programs on multiple machines? What is the typical way of controlling this in the cluster you are using?
You could adapt the script to start your processes - I guess they use local storage in which case they could write local results files that could be read by the controlling process.
grim 
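
The result-file idea could look roughly like this in Python 3. It is only a sketch: the worker is a hypothetical inline script that writes one float to a file whose path it receives as an argument, and the file names are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical worker: writes one float to the file named in argv[2].
WORKER_CODE = (
    "import sys\n"
    "with open(sys.argv[2], 'w') as f:\n"
    "    f.write(str(float(sys.argv[1]) ** 2))\n"
)


def run_with_result_files(inputs, workdir):
    """Run one worker per input; each writes its float to its own file."""
    jobs = []
    for idx, x in enumerate(inputs):
        path = os.path.join(workdir, "result_%d.txt" % idx)
        proc = subprocess.Popen([sys.executable, "-c", WORKER_CODE, str(x), path])
        jobs.append((proc, path))

    results = []
    for proc, path in jobs:
        proc.wait()  # the file is only complete once the worker has exited
        with open(path) as f:
            results.append(float(f.read()))
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_with_result_files([1, 2, 3], tmp))
```

All workers are started before the controller waits on any of them, so they run side by side; the controller only reads a file after the process that writes it has exited.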

June 10th, 2004, 07:08 AM
Registered User
Not exactly. In fact, the cluster I'm working on sends processes to other CPUs by itself. When you connect to the cluster you connect to a virtual machine, and you don't even know which machine you are really working on.
My main problem is calling another script and running, let's say, 5 or 6 copies of the same script at the same time on the same computer, BUT the OS needs to see these as new processes in order to send them to other CPUs.
I have tried some of the threading examples, for instance this one:
Code:
import threading, time
def thread_task(name, n):
    time.sleep(1)
    for i in range(n): print name, i

for i in range(5):
    T = threading.Thread(target=thread_task, args=(str(i), i))
    T.run()
Each thread is run in order; they are not working at the same time in parallel...
And I don't know how to make my main process continue doing things, such as running another script, for example.
If you have an idea, I'd be more than grateful!
thanks
erwin

June 10th, 2004, 07:17 AM
Mini me.
Did you look at the link I posted? It uses threading to run multiple processes. While the threads may be time-sliced, the processes they manage (using os.system, for example) would not be.
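
As a rough Python 3 illustration of threads that each manage an external process (not the code from the linked post, which is not reproduced here): each thread blocks on its own child process, and the children run independently of the interpreter's thread scheduling. The worker command is a hypothetical stand-in for a real script.

```python
import subprocess
import sys
import threading

# Hypothetical worker command; stands in for a real script.
WORKER_CODE = "import sys; print(float(sys.argv[1]) / 2)"


def run_worker(x, results, index):
    """Thread body: block on one external process and store its float."""
    done = subprocess.run(
        [sys.executable, "-c", WORKER_CODE, str(x)],
        capture_output=True,
        text=True,
        check=True,
    )
    results[index] = float(done.stdout.strip())


def run_all(inputs):
    """Run one worker process per input, each supervised by its own thread."""
    results = [None] * len(inputs)
    threads = [
        threading.Thread(target=run_worker, args=(x, results, i))
        for i, x in enumerate(inputs)
    ]
    for t in threads:
        t.start()  # each thread launches its own child process
    for t in threads:
        t.join()   # wait until every child has finished
    return results


if __name__ == "__main__":
    print(run_all([1, 2, 3]))
```

Each thread writes to its own slot in the results list, so no extra locking is needed in this sketch.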

June 10th, 2004, 08:25 AM
Registered User
Yes, thanks a lot. It took some time because I'm not used to this, but I managed...
The thing is (as I understood it, but maybe I'm wrong) that the threading module doesn't execute threads in parallel whereas the thread module does (but it isn't as handy to use since it's low-level threading...)

June 10th, 2004, 08:53 AM
Mini me.
threading is just a higher-level wrapper around thread.
Threads are managed by Python itself and are effectively processed round-robin. Somewhere in the docs it mentions the number of bytecodes executed before moving on to the next thread (100, I think).
But why does it matter? You have launched other programs in each thread, which will occupy their own processes and run independently.

June 10th, 2004, 09:05 AM
Registered User
Yes, you're right, but I was quite intrigued by the example at the beginning, since I don't understand why the threads are executed one after the other and not at the same time. Is it because there is not enough bytecode in each thread, so that one thread finishes before the Python interpreter gives processor time to the next thread?

June 10th, 2004, 10:10 AM
Mini me.
I found the details in Section 8.1 of the Python C API reference manual.
Quote: The Python interpreter is not fully thread safe. In order to support multi-threaded Python programs, there's a global lock that must be held by the current thread before it can safely access Python objects. Without the lock, even the simplest operations could cause problems in a multi-threaded program: for example, when two threads simultaneously increment the reference count of the same object, the reference count could end up being incremented only once instead of twice.
Therefore, the rule exists that only the thread that has acquired the global interpreter lock may operate on Python objects or call Python/C API functions. In order to support multi-threaded Python programs, the interpreter regularly releases and reacquires the lock -- by default, every 100 bytecode instructions (this can be changed with sys.setcheckinterval()). The lock is also released and reacquired around potentially blocking I/O operations like reading or writing a file, so that other threads can run while the thread that requests the I/O is waiting for the I/O operation to complete.
The for loop in your example is small, and the print statement would be a good example of file I/O, I think. Here is the disassembly of the thread_task function:
Code:
  3           0 LOAD_GLOBAL              0 (time)
              3 LOAD_ATTR                1 (sleep)
              6 LOAD_CONST               1 (1)
              9 CALL_FUNCTION            1
             12 POP_TOP

  4          13 SETUP_LOOP              29 (to 45)
             16 LOAD_GLOBAL              2 (range)
             19 LOAD_FAST                1 (n)
             22 CALL_FUNCTION            1
             25 GET_ITER
        >>   26 FOR_ITER                15 (to 44)
             29 STORE_FAST               2 (i)
             32 LOAD_FAST                0 (name)
             35 PRINT_ITEM
             36 LOAD_FAST                2 (i)
             39 PRINT_ITEM
             40 PRINT_NEWLINE
             41 JUMP_ABSOLUTE           26
        >>   44 POP_BLOCK
        >>   45 LOAD_CONST               0 (None)
             48 RETURN_VALUE
If you want to interleave the threads more closely then you could sprinkle the function with time.sleep commands. I've seen suggestions that playing with sys.setcheckinterval is not a good idea.
Grim;
Last edited by Grim Archon : June 10th, 2004 at 10:12 AM.
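
A listing like the one above can be produced with the standard dis module. This is a Python 3 sketch, so the opcode names and offsets will not match the 2004 output. Also note that in Python 3.2 and later the interpreter switches threads on a time interval rather than a bytecode count, which sys.getswitchinterval() reports. The opnames helper is a made-up name for illustration.

```python
import dis
import sys
import time


def thread_task(name, n):
    time.sleep(1)
    for i in range(n):
        print(name, i)


def opnames(func):
    """Return the bytecode instruction names of func, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    dis.dis(thread_task)  # opcodes differ from the 2004 listing
    # Python 3.2+ switches threads on a time interval, not a bytecode count.
    print(sys.getswitchinterval())
```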

June 10th, 2004, 11:25 AM
Mini me.
Actually, there are several "bugs" in the example that prevent it from working properly.
You should call the start method and not the run method: it is the start method that actually launches a new thread.
The first loop should define the number of threads, not the number of times the worker loops (the original passed the loop index i as n).
Here is a working version showing things more separately, with the sleep moved to allow each thread a bite of the cherry:
Code:
import threading, time

def thread_task(name, n):
    for i in range(n):
        print name, i
        time.sleep(0.1)

T = {}
for i in range(5):
    T[i] = threading.Thread(target = thread_task, args = (str(i), 10))
for i in range(5):
    T[i].start()
Grim 
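
The difference between run and start can be checked directly by recording which thread executes the target. A small Python 3 sketch; the thread names and the report helper are made up for illustration.

```python
import threading


def report(names):
    # Record which thread is actually executing this function.
    names.append(threading.current_thread().name)


def demo():
    seen = []

    t1 = threading.Thread(target=report, args=(seen,), name="worker-run")
    t1.run()    # plain method call: executes in the calling thread

    t2 = threading.Thread(target=report, args=(seen,), name="worker-start")
    t2.start()  # creates a new thread, which then calls run()
    t2.join()
    return seen


if __name__ == "__main__":
    print(demo())
```

The first entry is the name of the caller's thread, not "worker-run", which is why the original example ran its tasks strictly one after another.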

June 10th, 2004, 11:34 AM
Mini me.
This version prevents threads from breaking up strings on screen so it is easier to view. The change is to print one string rather than three (name, number and newline):
Code:
import threading, time

def thread_task(name, n):
    time.sleep(1)
    for i in range(n):
        print "[%s %s] \n"%(name, i),
        time.sleep(0.1)

T = {}
for i in range(5):
    T[i] = threading.Thread(target = thread_task, args = ("thread "+str(i), 10))
for i in range(5):
    T[i].start()
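
Another common way to keep output lines intact is to guard the print with a lock. This is a different technique from the single-string trick above, sketched here in Python 3; the log list exists only so the result can be inspected afterwards.

```python
import threading
import time

print_lock = threading.Lock()


def thread_task(name, n, log):
    for i in range(n):
        line = "[%s %s]" % (name, i)
        with print_lock:  # only one thread writes at a time
            print(line)
            log.append(line)
        time.sleep(0.01)


def run(num_threads=5, n=10):
    """Start num_threads workers and return every line they printed."""
    log = []
    threads = [
        threading.Thread(target=thread_task, args=("thread %d" % k, n, log))
        for k in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    run()
```

The order of lines across threads is still not fixed; the lock only guarantees that each line is written whole.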

June 11th, 2004, 03:11 AM
Registered User
I tested your example and it does work perfectly thanks!
I still have one question now, is there some special reason for you to make two for loops or is it possible to put both in the same loop?

June 11th, 2004, 03:21 AM
Mini me.
Quote: Originally Posted by winwin: I tested your example and it does work perfectly thanks!
I still have one question now, is there some special reason for you to make two for loops or is it possible to put both in the same loop?
I put it in two for loops just to show clear separation between creating the object and starting/creating the thread.
You can of course combine them in one for loop.
grim 
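
A combined loop might look like this in Python 3 (a sketch; the out list and run_combined name are illustrative). Creating and starting can share a loop, but if the main program needs to wait for the threads, join should stay in a separate pass.

```python
import threading
import time


def thread_task(name, n, out):
    for i in range(n):
        out.append((name, i))  # list.append is thread-safe in CPython
        time.sleep(0.01)


def run_combined(num_threads=5, n=10):
    out = []
    threads = []
    for k in range(num_threads):
        t = threading.Thread(target=thread_task, args=(str(k), n, out))
        t.start()          # create and start in the same loop
        threads.append(t)
    for t in threads:      # joining still needs its own pass, otherwise
        t.join()           # each thread would finish before the next starts
    return out


if __name__ == "__main__":
    print(len(run_combined()))
```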