#1
  1. No Profile Picture
    Contributing User
    Devshed Loyal (3000 - 3499 posts)

    Join Date
    Jul 2003
    Posts
    3,378
    Rep Power
    594

    Parallel::ForkManager Timing Issue


    I have a large number of sub-tasks that I want to run in parallel, but I am having a timing problem. Some of the tasks are quite short while others are long (hours), so I set up a hash that contains the pid (the return value of $pm->start) and a name. When a task is started I set the values in the hash, then add that hash to an array. In theory the array should contain the information I need for each running task. When a task finishes, a 'run_on_finish' callback removes that task's hash from the array. The problem arises with short tasks, which apparently finish before their hash is added to the array. I'm looking for the best way to ensure that 'run_on_finish' does not try to remove a child process before it has been added. TIA.
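    One possible way to keep this bookkeeping in step with the children is to let the module do the registering. The sketch below assumes Parallel::ForkManager is installed and uses its 'run_on_start' and 'run_on_finish' callbacks, both of which run in the parent. The task names, the sleep times, and the %running and @finished variables are made up for illustration. As far as I can tell, the finish callback only fires while the parent is inside 'start' or 'wait_all_children', so the entry should already be registered by the time it is removed.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Hypothetical tasks: name => seconds of simulated work.
my %tasks = (short => 0, medium => 1, long => 2);

my %running;    # pid => task name, maintained only in the parent
my @finished;   # names of tasks whose children have been reaped

my $pm = Parallel::ForkManager->new(2);   # at most 2 children at a time

# Runs in the parent right after a successful fork.
$pm->run_on_start(sub {
    my ($pid, $ident) = @_;
    $running{$pid} = $ident;
});

# Runs in the parent when a child has been reaped.
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident) = @_;
    delete $running{$pid};
    push @finished, $ident;
});

for my $name (sort keys %tasks) {
    $pm->start($name) and next;   # parent: move on to the next task
    sleep $tasks{$name};          # child: simulated work
    $pm->finish(0);               # child exits here
}
$pm->wait_all_children;

print "still running: ", scalar(keys %running), "\n";
print "finished: @{[ sort @finished ]}\n";
```

    Passing the name to 'start' as the identifier means the callbacks receive it directly, so no separate object has to be filled in after the fork.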
    There are 10 kinds of people in the world. Those that understand binary and those that don't.
#2
    After playing with this for a while, I think my problem is that I don't understand how ForkManager works. I thought that, within my loop, the child executes everything between 'start and next' and 'finish', while the parent meanwhile continues executing everything after 'finish' to the end of the loop. Then the loop runs again, creating a new child, and the parent executes to the end of the loop again. Do I have this wrong? Here is my code:
    Code:
    foreach my $url (@urls) {
        unless ($seen{$url}++) {
            my $proc = process->new();
            $proc->pid($pm->start and next);
            chomp($url);
            trim($url);
            print "*************Processing $url*******************\n";
            system("$helper_utils/collect_user_metrics.pl", "$url", ">$tempoutput/collect_user_metrics.txt");
            system("$helper_utils/get_docroot_info.pl", "$pool", "$url", ">$tempoutput/get_docroot_info.txt");
            $pm->finish();
            print("Adding new child to array\n");
            $proc->name($url);
            push(@processed, $proc);
        }
    }
    My debug print (adding new child) is never output. Can someone clear up how this works for me? TIA.
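    To make the control flow concrete, here is a rough sketch in plain Perl, with no ForkManager involved. It assumes, as a simplification, that 'start' behaves like fork (returning the child's pid in the parent and 0 in the child) and that 'finish' behaves like exit in the child. The task names and @parent_log are invented for illustration. Under those assumptions, the lines after 'finish' are reached by neither process, which would explain a debug print that never appears.

```perl
use strict;
use warnings;

my @parent_log;   # filled in only by the parent process

for my $task (qw(a b c)) {
    my $pid = fork();   # simplified stand-in for $pm->start
    die "fork failed: $!" unless defined $pid;

    if ($pid) {
        # Parent: fork returned the child's pid (a true value), so
        # "start and next" jumps straight to the next iteration.
        # Any bookkeeping has to happen before that jump.
        push @parent_log, "started $task";
        next;
    }

    # Child: fork returned 0, so execution falls through to the task body.
    # ... the real work would go here ...
    exit 0;   # simplified stand-in for $pm->finish

    # Reached by neither process: the parent took "next" above and
    # the child has already exited.
    push @parent_log, "after finish for $task";
}

1 while wait() != -1;   # reap every child before reporting

print "$_\n" for @parent_log;
```

    Only the "started ..." entries are logged; the "after finish" push is dead code in both the parent and the child.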
