On the Black Art of Designing Computational Workflows

Authors: Yolanda Gil, Pedro A. González-Calero, and Ewa Deelman - USC and Madrid

Complete Citation

  • Yolanda Gil, Pedro A. González-Calero, and Ewa Deelman. On the black art of designing computational workflows. Proc. Workshop on Workflows in Support of Large-Scale Science, 2007.

Abstract

Computational workflows have recently emerged as an effective paradigm to manage large-scale distributed scientific computations. Workflow systems can automate many execution-level details and provide assistance in composing and validating workflows. However, there is still a significant effort involved in creating these workflows since they often represent collaborative and exploratory science experiments. Therefore, current practice is effective in producing results but not cost-effective for widespread adoption. Drawing from our previous research in computational workflows across scientific disciplines, this paper analyzes the tasks and overall process for designing these workflows. We discuss software engineering methodologies and their relevance to creating workflows as a unique kind of software artifact. We also discuss our ongoing work to make workflow applications more cost effective and lower the barriers for widespread adoption of workflow technologies.

Annotations

The authors describe observations from their work with scientific workflows and offer guidelines for constructing new ones. The paper considers the workflow creation process, focusing on practical problems encountered by scientific researchers looking for results and by engineers trying to make systems work.

The authors identify eight significant steps in the creation of a workflow:

  • Establishing roles;
  • Initial design;
  • Formalizing workflow;
  • Metadata creation;
  • Testing;
  • Scaling up;
  • Creating variants;
  • Generalizing workflow.

Each step has various substeps.

The authors observe that a robust workflow system is typically built only after ad hoc systems are already in place. Additionally, any workflow system provides benefits that come at the cost of a time investment. Thus, the costs and benefits of several outcomes of adopting a workflow system must be weighed, including:

  • A clear separation between application-specific concepts and execution details;
  • Result validation and documentation;
  • Acceleration of the experimental cycle;
  • Broader participation in the experimental cycle;
  • Scalability.
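The first item above, separating application-specific concepts from execution details, can be illustrated with a minimal Python sketch. It is not taken from the paper or from any workflow system: the step names, the `local_binding` table, and the `run` helper are all hypothetical, and a real system would dispatch steps to distributed resources rather than call local functions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Application-level description: step names and their dependencies only.
# (Hypothetical steps chosen for illustration.)
abstract_workflow = {
    "extract": set(),
    "transform": {"extract"},
    "summarize": {"transform"},
}

# Execution-level binding: how each step actually runs.
# Swapping this table changes execution without touching the description.
def extract(inputs):
    return [1, 2, 3]

def transform(inputs):
    return [x * 2 for x in inputs["extract"]]

def summarize(inputs):
    return sum(inputs["transform"])

local_binding = {
    "extract": extract,
    "transform": transform,
    "summarize": summarize,
}

def run(workflow, binding):
    """Run each step after its dependencies, passing their results by name."""
    results = {}
    for step in TopologicalSorter(workflow).static_order():
        inputs = {dep: results[dep] for dep in workflow[step]}
        results[step] = binding[step](inputs)
    return results

results = run(abstract_workflow, local_binding)
```

Because the workflow description holds only names and dependencies, the same description could in principle be rerun with a different binding (for example, one that submits jobs to a cluster), which is the kind of separation the list item refers to.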

Future problems in workflow development identified by the authors include:

  • Understanding the role of the workflow system;
  • Initial design issues;
  • Component modeling and versioning;
  • Scalability and profiling;
  • Workflow reuse (catalogs).

In general, the authors support treating workflow construction like any other software engineering problem, drawing on established practices such as UML.

Tags: Scientific workflows, computational workflows, software design, workflow design, workflow systems.

Related Work

  • Pegasus
  • Wings
  • Frakes W. and K. Kang, “Software Reuse Research: Status and Future”.

-- JustinWozniak - 22 Aug 2007

Topic revision: r3 - 26 Sep 2007 - 03:15:42 - JustinWozniak