COSMOS workflow management version 0.4.3

• Written in Python, which is easy to learn, powerful, and popular. A programmer with limited experience can begin writing COSMOS workflows right away.
• Powerful syntax and system for the creation of complex workflows.
• Keeps track of workflows, job information, resource utilization, and provenance in a SQL database.
• Visualizes all jobs and job dependencies in a convenient, dynamic, drill-down web environment.
• Lets users monitor and debug running workflows, and review a history of all workflows, via a web interface.

Intellectual Property Status: Patent(s) Pending


COSMOS is a Python-based workflow system designed for developing modular UNIX pipelines from applications that perform I/O through a POSIX file system. COSMOS provides an efficient programming platform for creating complex and highly parallelized workflows of command-line applications, described as a graph of job dependencies. COSMOS submits jobs through the DRMAA (Distributed Resource Management Application API) interface, which supports the most widely used DRMSs (Distributed Resource Management Systems) such as LSF, Grid Engine, SLURM, Torque, and Condor, giving the user many software options for managing cluster resources (instances, CPU cycles, and storage).

In addition, COSMOS creates a simple web interface for monitoring and debugging submitted jobs. The web interface provides a drill-down environment in which the user can examine job and queue status, visualize the job dependency graph, view aggregate and individual timings and resource statistics (wall time, CPU time, virtual RAM, resident RAM, I/O blocked time, page faults, etc.), search for jobs with particular attributes, and inspect log files. This makes it easier to monitor, debug, and analyze workflows, and to tune jobs so that they use cluster resources efficiently.

All job information and statistics are stored in a SQL database, with support for PostgreSQL, MySQL, Oracle, and SQLite. An ORM (Object Relational Mapper) provides a convenient, interactive API for querying the database from Python without manually constructing SQL queries.
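
To make the idea of a workflow "described as a graph of job dependencies" concrete, here is a minimal sketch using only the Python standard library (`graphlib` needs Python 3.9 or later). It is not the COSMOS API: the job names, the `echo` commands, the `job` table, and the helper functions `parallel_batches` and `record_batches` are all hypothetical, chosen for illustration. The sketch groups jobs into batches whose members could run in parallel, then records the plan in an in-memory SQLite database so it can be queried afterwards.

```python
import sqlite3
from graphlib import TopologicalSorter

# Hypothetical jobs: name -> (command line, set of jobs it depends on).
JOBS = {
    "align_a": ("echo align sample A", set()),
    "align_b": ("echo align sample B", set()),
    "merge": ("echo merge A and B", {"align_a", "align_b"}),
    "report": ("echo write report", {"merge"}),
}


def parallel_batches(jobs):
    """Group jobs into batches; each job's dependencies lie in earlier batches."""
    sorter = TopologicalSorter({name: deps for name, (_, deps) in jobs.items()})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs with no unfinished dependencies
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock its successors
    return batches


def record_batches(conn, jobs, batches):
    """Store each job with its command and batch index (illustrative schema)."""
    conn.execute(
        "CREATE TABLE job (name TEXT PRIMARY KEY, command TEXT, batch INTEGER)"
    )
    for index, batch in enumerate(batches):
        for name in batch:
            conn.execute(
                "INSERT INTO job VALUES (?, ?, ?)", (name, jobs[name][0], index)
            )
    conn.commit()


batches = parallel_batches(JOBS)
conn = sqlite3.connect(":memory:")
record_batches(conn, JOBS, batches)

# Query the stored plan: which jobs can start immediately?
first_wave = [
    row[0]
    for row in conn.execute("SELECT name FROM job WHERE batch = 0 ORDER BY name")
]
print(batches)
print(first_wave)
```

In a real cluster setting, each batch would be handed to a resource manager rather than merely recorded, and the table would also hold per-job statistics such as wall time and memory use. This sketch leaves both out.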