Gentoo Forums
I've developed a tiny tool for parallel execution...
Gentoo Forums Forum Index :: Gentoo Chat
numlock
n00b
Joined: 17 Sep 2004
Posts: 58

PostPosted: Sat Oct 02, 2004 5:58 pm    Post subject: I've developed a tiny tool for parallel execution...

Hi,

I've just made a neat little tool (in Python) for running several (similar) commands in parallel :D

We can use it, for example, to download a large set of N files using X simultaneous connections.

The neat touch is that every "thread" runs in a named "screen" session (accessible with screen -r screen_name). This way, the main screen can display just a short summary of which commands are in progress.

The neat advantage: jobs can be added while the thing is running 8)

I'll post the tool, along with a few explanations, in a short while.

Please test it and give me some advice (especially improvements to the source -- I'm a long-time programmer but a complete noob in Python) :wink:

Thanks !


Last edited by numlock on Sat Oct 02, 2004 6:09 pm; edited 1 time in total
numlock
n00b
Joined: 17 Sep 2004
Posts: 58

PostPosted: Sat Oct 02, 2004 6:06 pm

para.py
Code:
#!/usr/bin/python -O
# -*- coding: utf-8 -*-

# Parallel execution scheduler v0.01 (alpha)
# (c)2004 by J. Bourquard (numlock@freesurf.ch)
# Licensed under GPL-2

import sys, thread, commands, threading

def makeTermTitle(title):
   return "\x1b]1;\x07\x1b]2;" + title + "\x07"

def worker(f,n):
   global sleepingTasks
   while 1:
      filePositionLock.acquire()
      pos=f.tell()
      alarmClock.acquire()
      line=f.readline()
      filePositionLock.release()
      if line=='':
         sleepingTasks=sleepingTasks+1
         if sleepingTasks==nbrThreads:
            print '*** All tasks are DONE -- You can press ctrl-c or echo more orders into ' + fifoName
         alarmClock.wait()
         sleepingTasks=sleepingTasks-1
         alarmClock.release()
      else:
         alarmClock.release()
         command=cmdPrefix + line[1:-1]
         if line[0]=='-':
            print '[' + n + '] start "' + command + '"'
            sessionName=sessionNamePrefix+str(n)
            bashCmd='echo "' + makeTermTitle('screen \\"' + sessionName + '\\"' + ' [' + command + ']') + '" ; ' + command
            status = commands.getstatusoutput('screen -D -m -S '+sessionName+' bash -c \'' + bashCmd + '\'')[0]
            if status != 0:
               print '[' + n + '] ERROR "' + command + '" (status=' + str(status) + ')'
            else:
               print '[' + n + '] DONE  "' + command + '"'
               filePositionLock.acquire()
               fw.seek(pos)
               fw.write('*'+line[1:])
               filePositionLock.release()
         else:
            print '[' + n + '] SKIP  "' + command + '"'

jobFile=sys.argv[1]
f=open(jobFile, 'r')
fw=open(jobFile, 'r+')
jobName=f.readline()[:-1]
sys.stderr.write(makeTermTitle(jobName))
sys.stderr.flush()
sessionNamePrefix=f.readline()[:-1]
nbrThreads=int(f.readline())
fifoName=f.readline()[:-1]
cmdPrefix=f.readline()[1:-1]

print 'Title           = "' + jobName + '"'
print 'screen names    = ' + sessionNamePrefix + '1..' + sessionNamePrefix + str(nbrThreads)
print 'command history = ' + jobFile
print 'command fifo    = ' + fifoName
print 'max threads     = ' + str(nbrThreads)

alarmClock = threading.Condition()
filePositionLock = thread.allocate_lock()
sleepingTasks=0

for n in range(nbrThreads):
   threadId=str(n+1)
   thread.start_new_thread(worker, (f, threadId))

while 1:
   fifo=open(fifoName, 'r')
   line=fifo.readline()
   fifo.close()
   command=line[:-1]
   print '[M] adding command "' + command + '"'
   fa=open(jobFile, 'a')
   alarmClock.acquire()
   fa.write(line)
   alarmClock.notify()
   alarmClock.release()
   fa.close()
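For readers on current systems: the scheduling core of the script (N worker threads pulling shell commands, with jobs addable while the pool runs) can be sketched in modern Python 3 roughly as below. run_pool and all names in it are mine, and this sketch deliberately drops the screen / fifo / job-file machinery of the original.

```python
# Compressed Python 3 sketch of para.py's scheduling idea: N workers pull
# shell commands from a shared queue; new jobs can be queued at any time
# while the pool is running. Names are my own, not from the original.
import queue, subprocess, threading

def run_pool(commands, n_workers=2):
    jobs = queue.Queue()
    results = []                      # collected (command, exit status) pairs
    lock = threading.Lock()

    def worker():
        while True:
            cmd = jobs.get()          # blocks until a job (or sentinel) arrives
            if cmd is None:           # sentinel: shut this worker down
                return
            status = subprocess.call(cmd, shell=True)
            with lock:
                results.append((cmd, status))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for c in commands:                # jobs can be queued at any point
        jobs.put(c)
    for _ in threads:
        jobs.put(None)                # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

A long-running variant would keep the queue open and feed it from a fifo (as para.py does) instead of sending shutdown sentinels.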


Last edited by numlock on Sat Oct 02, 2004 8:05 pm; edited 1 time in total
numlock
n00b
Joined: 17 Sep 2004
Posts: 58

PostPosted: Sat Oct 02, 2004 6:17 pm

Step 1: mkfifo jobs.new
Step 2: nano -w jobs
Code:
Parallel job execution test
session
2
jobs.new
!wget http://
-www.google.com/hp0.gif ; sleep 10
-www.google.com/index.html ; sleep 10
-www.mysite.com/example_page_1.html ; sleep 10
-www.mysite.com/example_page_2.html ; sleep 10

Notes:
- the 1st line is the job title, the 2nd the screen session-name prefix.
- the 3rd line is the max number of simultaneous threads.
- the 4th line is the fifo used to add new jobs; the 5th (starting with "!") is the exact prefix prepended to every command.

Step 3: ./para.py jobs
Step 4: To add a job during execution:
Code:

echo "-www.mysite.com/example_page_2.html ; sleep 10" >jobs.new


Step 5: To view a job during execution:
Code:

screen -r session1
screen -r session2


Notes:
- if you use (for example) konsole, you'll see the executed command line in the window title :D

Results:
- The five URLs are fetched
- jobs now also contains the additional command (sent through jobs.new), and every completed line begins with a "*" (instead of a "-"), so it won't get run again by any upcoming invocation of para.py.

Isn't this neat? :D
killfire
l33t
Joined: 04 Oct 2003
Posts: 618

PostPosted: Sat Oct 02, 2004 7:26 pm    Post subject: Re: I've developed a tiny tool for parallel execution...

numlock wrote:
Hi,

I've just made a neat little tool (in Python) for running several (similar) commands in parallel :D

We can use it, for example, to download a large set of N files, using X simultaneous connections.

The neat touch is that every "thread" runs in a named "screen" session (accessible with screen -r screen_name). This way, the main screen can display just a short summary of which commands are in progress.

The neat advantage: jobs can be added while the thing is running 8)

I'll post the tool, along with a few explanations, in a short while.

Please test it and give me some advice (especially improvements on the source -- I'm a long-time programmer but complete noob in Python) :wink:

Thanks !


this question may be an ignorant one, and feel free to smack me with a fish and/or correct me, but couldn't you do this with just screen?

i guess your tool avoids the overhead of another instance of bash running, but is that helpful? with screen, you also get the benefits of bash, right?

sorry if this is degrading in any way.

killfire
_________________
my website, built in HAppS: http://dbpatterson.com
an art (oil painting) website I built a pure python backend for: http://www.lydiajohnston.com
numlock
n00b
Joined: 17 Sep 2004
Posts: 58

PostPosted: Sat Oct 02, 2004 8:00 pm    Post subject: Re: I've developed a tiny tool for parallel execution...

killfire wrote:
this question may be an ignorant one, and feel free to smack me with a fish and/or correct me, but couldn't you do this with just screen?

i guess your tool avoids the overhead of another instance of bash running, but is that helpful? with screen, you also get the benefits of bash, right?

sorry if this is degrading in any way.

killfire

Hi killfire,

I think you're partly correct (except the part about you being ignorant :lol: ). In fact, up to now I've been using just screen & bash to parallelize stuff. But after a while there were a few things I got really tired (and lazy!!) of doing manually:

- cumbersome to schedule things for the future (e.g. when away for a weekend)
- cumbersome to add commands after the processes are started
- must manually choose to which screen I add a command
- no display of the command names in the title bar (so I never knew which command was running in which terminal)
- difficult to monitor.

The overhead thing was not a concern (in fact, as you can see in my code, I invoke N instances of bash), so I can still make use of bash's power :)

While it's running, I can just echo a command into jobs.new and it gets scheduled automatically on the first worker that becomes free. I couldn't get this kind of functionality without writing a bash script similar to this code.

I also didn't find any similar tool on the net (I found mcurl, but it is a bit different and designed for curl only).

PS: Oh, just one thing about my examples: the sleep 10 command is not needed, of course. It is there only to ease testing :wink: