Welcome to PyTaskFarmer’s documentation!

PyTaskFarmer is a simple task farmer written in Python for parallelizing multiple serial jobs at NERSC. It is flexible enough to run on other systems. It is very loosely based on the concept of Shane Canon’s TaskFarmer.

More complex task list definitions and setup environments are implemented through the concepts of tasklist handers and runners.

Features

  • Per-task checkpointing.

  • Mulitple farmers running on the same tasklist.

  • Simple synchronization protocol using the file system.

  • Abstract definition of tasklists via tasklist handlers.

  • Automatic environment setup (ie: asetup or shifter).

  • Analysis of task packing using Perfetto.

Quick Start

PyTaskFarmer can be installed using pip.

pip install pytaskfarmer

Create a list of tasks that you want to process in parallel. In this simple example, a counter will be echoed.

for i in $(seq 0 10); do
  echo ${i} >> mywork.tasks
done

Run PyTaskFarmer. The progress will be stored inside the specified workdir.

pytaskfarmer.py --proc 8 --workdir myworkdir mywork.tasks

Usage

The executable script is:

usage: pytaskfarmer.py [-h] [--proc [Processes]] [--timeout TIMEOUT]
                        [--workdir WORKDIR] [--verbose VERB]
                        [--runner RUNNER] [--tasklist TASKLISTHANDLER]
                        tasklist

The tasklist argument is a simple text file with one task per line. The interpretation of the task is up to the TASKLISTHANDLER. By default, the task is treated as a command to run. It is not important how complex the command is.

The --verbose flag adds a bit more output (though not much) telling You what the script is doing as it runs.

The --timeout option allows you to set a timeout for the script, so that after some number of seconds the tasks will be automatically killed (none by default).

The --proc option tells the script how many parallel workers to create ( default 8).

The --workdir option tells the script where to store the progress (task status, log files..) of a single run (default is tasklist_workdir).

The --runner options indicates which runner to execute the command with. See the dedicated section on the available runners and how they work.

The --tasklist options indicates which tasklist handler to parse the tasklist with. See the dedicated section on the available runners and how they

What It Does (60 second version)

The basic behavior, with the default runner/handler, is as follows. Each access to a file is protected using a file locking mechanism.

  1. The tasklist is read and a toprocess file is created in the workdir with unprocessed tasks.

  2. A number of workers (multiprocessing.Pool) are constructed to run on the tasks.

  3. When some work is done, the command is placed into a finished or failed files, depending on the status code.

  4. Duration and start times of completed tasks (timeline) are saved into a timeline.json file. This can then be opened with Perfetto.

  5. The tasks are processed by the workers until 1) the work is completed; 2) the timeout is reached; or 3) a signal is intercepted.