Workflows

tip

This section is not a Developer Guide; if you want to learn how to define a workflow or how to run a workflow, please check the appropriate docs in our developer guide.

This section focuses on concepts.

In LittleHorse, the WfSpec object is a Metadata Object defining the blueprint for a WfRun, which is a running instance of a workflow.

A simple way of thinking about it is that a WfSpec is a directed graph consisting of Nodes and Edges, where a Node defines a "step" of the workflow process, and an Edge tells the workflow what Node to go to next.

Workflow Structure

A WfSpec diagram featuring a directed graph of nodes with a conditional that branches based on whether a user approved an IT Request or not.

Screenshot of a WfSpec in LH Dashboard

A WfSpec (Workflow Specification) is a blueprint that defines the control flow of your WfRuns (Workflow Runs). Before you can run a WfRun, you must first register a WfSpec in LittleHorse (for an example of how to do that, see here).

A WfSpec contains a set of ThreadSpecs, with one special ThreadSpec that is known as the entrypoint. When you run a WfSpec to create a WfRun, the first ThreadSpec that is run is the entrypoint.

info

You can see the exact structure of a WfSpec as a protobuf message in our api docs.

A WfRun, short for Workflow Run, is an instantiation of a WfSpec.

In the programming analogy, you could think of a WfRun as a process that is running your WfSpec program. A ThreadRun is a thread in that program.

A WfRun is created by the LittleHorse Server when a user requests the server to run a WfSpec, for example using the rpc RunWf.

Threads

A workflow consists of one or more threads. A thread in LittleHorse is analogous to a thread in programming: it has its own thread execution context (set of LH Variables) and it can execute one instruction (in LH, a Node) at a time.

A ThreadSpec is a blueprint for a single thread in a WfSpec. When a ThreadSpec is run, you get a ThreadRun. Logically, a ThreadSpec is a directed graph of Nodes and Edges, where a Node represents a "step" to execute in the ThreadRun, and the Edges tell LittleHorse what Node the ThreadRun should move to next.

In the LittleHorse Dashboard, when you click on a WfSpec you are shown the entrypoint ThreadSpec. In the picture you see, the circles and boxes are Nodes, and the arrows are Edges. Below is a screenshot of a ThreadSpec from the quickstart workflow.

A WfSpec Thread diagram featuring an entrypoint node, a greet TaskNode, and an exit node, connected in that order.

Screenshot of a Thread in LH Dashboard
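The "directed graph of Nodes and Edges" idea can be sketched in a few lines of plain Python. This is only an illustrative model: `ThreadSpecModel` and its `walk` method are hypothetical names invented for this example and are not part of any LittleHorse SDK. The sketch also assumes a graph without loops or branches, which real ThreadSpecs are not limited to.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadSpecModel:
    """Toy stand-in for a ThreadSpec: node names plus outgoing edges."""

    entrypoint: str
    # Maps a node name to the names of the nodes its edges point to.
    edges: dict[str, list[str]] = field(default_factory=dict)

    def walk(self) -> list[str]:
        """Follow the first outgoing edge of each node until none remain."""
        path = [self.entrypoint]
        while self.edges.get(path[-1]):
            path.append(self.edges[path[-1]][0])
        return path


# Mirrors the quickstart diagram: entrypoint -> greet task -> exit.
quickstart = ThreadSpecModel(
    entrypoint="entrypoint",
    edges={"entrypoint": ["greet"], "greet": ["exit"]},
)
print(quickstart.walk())  # ['entrypoint', 'greet', 'exit']
```

Each name visited by `walk` corresponds to one step that a ThreadRun would execute, one at a time.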

A ThreadRun can only execute one Node at a time. If you want a WfRun to execute multiple things at a time (either to parallelize TaskRuns for performance reasons, or to wait for two business events to happen in parallel, or any other reason), then you need your WfRun to start multiple ThreadRuns at a time. See the section on Child Threads below for how this works.

Highly curious readers can inspect the structure of a ThreadRun in our api docs here. At a high level, it contains a status and a pointer to the current NodeRun that's being executed. The real data is stored in the NodeRun, which you can retrieve from the API as a separate object.

Variables

Just as a program or function can store state in variables, a WfRun can store state in Variables as well. Variables in LittleHorse are defined in the ThreadSpec, and as such are scoped to a ThreadRun. Note that a child ThreadRun may access the variables of its parents.

A Variable is an object that you can fetch from the LittleHorse API. A Variable is uniquely identified by a VariableId, which has three fields:

  1. The wf_run_id, which is the ID of the associated WfRun.
  2. The thread_run_number, which is the ID of the associated ThreadRun (since a Variable lives within a specific ThreadRun).
  3. The name, which is the name of the Variable.

A Variable is created when a ThreadRun is created. Since it's possible to have multiple ThreadRuns created with the same ThreadSpec (for example, iterating over a list and launching a child thread to process each item), simply identifying a Variable by its name and wf_run_id is insufficient. That is why the VariableId also includes the thread_run_number: a Variable is uniquely identified by its name, workflow run id, and thread run number.
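To see why all three fields are needed, here is a small sketch in plain Python. `VariableIdModel` is a hypothetical stand-in that only copies the three field names described above; the IDs and values are made up for illustration, and the real object lives in the LittleHorse API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableIdModel:
    """Illustrative copy of the three VariableId fields described above."""

    wf_run_id: str
    thread_run_number: int
    name: str


# Two child ThreadRuns created from the same ThreadSpec each get their own
# "item" Variable inside the same WfRun.
store = {
    VariableIdModel("wf-123", 1, "item"): "apple",
    VariableIdModel("wf-123", 2, "item"): "banana",
}

# Looking up by (wf_run_id, name) alone is ambiguous: both entries match.
matches = [
    vid for vid in store if vid.wf_run_id == "wf-123" and vid.name == "item"
]
print(len(matches))  # 2

# Adding the thread_run_number pins down exactly one Variable.
print(store[VariableIdModel("wf-123", 2, "item")])  # banana
```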

On the dashboard, you can see all of the Variables from a ThreadRun, and their current values, in the tray below it.

Three variables from an `it-request` Workflow Run, showcasing the variable Names, Types, Visibility, and their values.

Screenshot of Variables in LH Dashboard

You can fetch Variables using rpc GetVariable, rpc SearchVariable, and rpc ListVariables.

Variables can be of certain types, which you can find in the VariableType enum.

Lastly, a Variable's value can be set when the thread is created, and the value can be mutated using a VariableMutation after the completion of a Node.

Nodes

A Node is a "step" in a ThreadRun. LittleHorse allows for many different types of Nodes in a WfSpec.

info

For a complete list of all of the available node types, check out the Node protobuf message.

A Node is not a fully-fledged object in the LittleHorse API. Rather, it is a sub-structure of the WfSpec object, which is an object in the LH API.

When a ThreadRun arrives at a Node, LittleHorse creates a NodeRun, which is an instance of a Node. In the case of a TASK Node, a TaskNodeRun is created (which also causes the creation of a TaskRun that is dispatched to a Task Worker).

In contrast to a Node, a NodeRun is an object in the LH API, which stores data about:

  • When the ThreadRun arrived at the Node
  • When the NodeRun was completed (if at all).
  • The status of the NodeRun.
  • A pointer to any related objects (for example, a Task NodeRun has a pointer to a TaskRun).

When you click on a NodeRun in the dashboard, that information is fetched and displayed on a screen. You can also retrieve information about a NodeRun via some lhctl commands:

  • lhctl list nodeRun <wfRunId>: shows all NodeRuns from a WfRun.
  • lhctl get nodeRun <wfRunId> <x> <y>: retrieves the yth NodeRun from the xth ThreadRun in the specified WfRun.

Threading Model

Just as a WfSpec is a blueprint for a WfRun (workflow), a ThreadSpec is a blueprint for a ThreadRun (thread). A ThreadSpec is a sub-structure of a WfSpec and a ThreadRun is a sub-structure of a WfRun; therefore, neither is a top-level object in the LittleHorse API.

Every workflow has one special thread called the Entrypoint Thread. If you consider a WfSpec as a program, then you could say that the Entrypoint ThreadSpec is like the main() method of the WfSpec/program.

Below is a screenshot of a WfSpec which does the following:

  1. Start a child thread.
  2. Execute the greet TaskRun.
  3. Wait for the child thread to complete.

Note that, above the diagram, you can select the ThreadSpec to show. Currently, the entrypoint ThreadSpec is selected.

A WfSpec diagram. There are two buttons above the diagram, each representing a different Thread available in that Workflow. The 'entrypoint' thread is highlighted since it is currently selected.

Screenshot of a WfSpec with a child thread in LH Dashboard

On the bottom of the page, you can click on a ThreadSpec to see the variables defined by that ThreadSpec.

When a WfSpec is run and a WfRun is created, the WfRun creates an Entrypoint ThreadRun which is an instance of the specified Entrypoint ThreadSpec.

For many workflows with only one thread (for example, our quickstarts), the Entrypoint Thread is the only thread in the workflow, and thus it's often simple to think of it as just the entire workflow.

Child Threads

In computer science, the main thread of a program can launch child threads within the same process. Child threads in programming run in the same memory address space and can share certain variables with the parent process.

Similarly, LittleHorse allows you to launch child threads. A child thread results in a new ThreadRun being created in the same WfRun.
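The programming side of this analogy can be sketched with Python's standard `threading` module. The snippet below is ordinary Python, not LittleHorse code; it simply mirrors the three steps of the example workflow shown earlier (start a child thread, run the greet step, wait for the child). The function and variable names are chosen for illustration only.

```python
import threading

events = []  # list.append is atomic in CPython, so no lock is needed here


def child_thread():
    # Stands in for the work done by the child ThreadRun.
    events.append("child finished")


child = threading.Thread(target=child_thread)
child.start()                    # 1. Start a child thread.
events.append("greet finished")  # 2. The parent runs its own step.
child.join()                     # 3. Wait for the child thread to complete.

# The two entries may arrive in either order, so sort before printing.
print(sorted(events))
```

As in LittleHorse, the parent keeps executing its own steps while the child runs, and only blocks at the point where it explicitly waits.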

Child Threads have many use-cases. A subset of those are:

  • Parallel Execution: A single ThreadRun can only execute one Node at a time. Child ThreadRuns allow you to execute multiple business process threads at once within a single workflow.
  • Error Handling Boundaries: You can attach Failure Handlers to a single Node (for example, a TaskRun), or to a whole ThreadRun by attaching it to the WaitForThreadsNode.
  • Workflow Decomposition: Using a Thread allows you to decompose your workflow into smaller logical chunks which makes for more understandable code and workflow diagrams.
  • Repeatable Functionality: Certain workflows may require executing the same business process with multiple inputs. For example, a workflow might require asking three different departments to approve a change. You could use the same ThreadSpec with different input variables, each running as a child ThreadRun sequentially or in parallel.

Variable Scoping

As described above, a Variable is scoped to the ThreadRun level. A Variable object is created in the LittleHorse API when a ThreadRun starts.

When a child ThreadRun of any type is started, it has read and write access to its own Variables, and all Variables that its parent has access to (including the parent's parent, and so on).

info

Since a ThreadRun can have multiple children, the parent does not have access to the variables of the children.
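The scoping rule can be modeled as a lookup that walks up the chain of parents. The sketch below is a plain-Python illustration under that assumption; `ThreadRunModel`, `get`, and `set` are hypothetical names and do not correspond to LittleHorse SDK calls.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ThreadRunModel:
    """Toy ThreadRun holding its own variables and a link to its parent."""

    number: int
    parent: Optional["ThreadRunModel"] = None
    variables: dict = field(default_factory=dict)

    def _owner(self, name: str) -> "ThreadRunModel":
        # Search this thread first, then the parent, the parent's parent, ...
        thread = self
        while thread is not None:
            if name in thread.variables:
                return thread
            thread = thread.parent
        raise KeyError(name)

    def get(self, name: str) -> Any:
        return self._owner(name).variables[name]

    def set(self, name: str, value: Any) -> None:
        self._owner(name).variables[name] = value


entrypoint = ThreadRunModel(0, variables={"request-id": "abc"})
child = ThreadRunModel(1, parent=entrypoint, variables={"approved": False})

# The child can read and write a Variable owned by its parent.
child.set("request-id", "xyz")
print(entrypoint.get("request-id"))  # xyz

# The parent cannot see Variables that belong to its child.
try:
    entrypoint.get("approved")
except KeyError:
    print("parent cannot access the child's variable")
```

The lookup only ever moves upward, which is why a parent with several children never has to decide which child's Variable a name refers to.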

Failure Handling

A Failure in LittleHorse is like an Exception in programming. It means that A Bad Thing® has happened. In a workflow engine like LittleHorse, there are two potential sources of Failure:

  1. A technical process, such as an external API call, fails.
  2. Something goes wrong at the business process level; for example, a credit card has insufficient funds.

caution

Exception Handling in LittleHorse is a fully separate concept from TaskRun retries.

ERRORs and EXCEPTIONs

A Failure that is a result of a technical problem, such as a variable casting error or a TaskRun timeout, is an ERROR in LittleHorse. All ERRORs are pre-defined by the LittleHorse System. You can find them at the LHErrorType documentation.

In contrast, a business process-level failure is an EXCEPTION. All EXCEPTIONs are defined by users of LittleHorse. You must explicitly throw an EXCEPTION with a specific name.

By convention, LittleHorse uses the following naming rules for ERRORs and EXCEPTIONs:

  • ERRORs are pre-defined in the LHErrorType enum and follow UPPER_UNDERSCORE_CASE.
  • EXCEPTION names are defined by users and follow kebab-case.

As per the Exception Handling Developer Guide, you may have different error handling logic for different Failures. For example, you can catch failures for a specific ERROR, any ERROR, a specific EXCEPTION, any EXCEPTION, or any Failure.
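The five ways of catching a Failure listed above can be sketched as a small matching function. This is a plain-Python model, not the LittleHorse API: `failure_kind` and `handler_matches` are hypothetical helpers, the regular expressions are only an assumed reading of the two naming conventions, and `EXAMPLE_TIMEOUT` and `insufficient-funds` are made-up failure names.

```python
import re
from typing import Optional

# Assumed patterns for the naming conventions described above.
ERROR_NAME = re.compile(r"^[A-Z]+(_[A-Z]+)*$")             # UPPER_UNDERSCORE_CASE
EXCEPTION_NAME = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")   # kebab-case


def failure_kind(name: str) -> str:
    """Classify a failure name as ERROR or EXCEPTION by its naming style."""
    if ERROR_NAME.match(name):
        return "ERROR"
    if EXCEPTION_NAME.match(name):
        return "EXCEPTION"
    raise ValueError(f"unrecognized failure name: {name}")


def handler_matches(
    failure_name: str,
    kind: Optional[str] = None,
    name: Optional[str] = None,
) -> bool:
    """Return True if a handler with the given filters catches the failure.

    name set            -> catches one specific ERROR or EXCEPTION
    only kind set       -> catches any ERROR or any EXCEPTION
    neither set         -> catches any Failure
    """
    if name is not None:
        return name == failure_name
    if kind is not None:
        return kind == failure_kind(failure_name)
    return True


print(handler_matches("insufficient-funds", name="insufficient-funds"))  # True
print(handler_matches("insufficient-funds", kind="EXCEPTION"))           # True
print(handler_matches("insufficient-funds", kind="ERROR"))               # False
print(handler_matches("EXAMPLE_TIMEOUT"))                                # True
```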

Interrupts

There are four types of ThreadRuns in LittleHorse.

For a description of INTERRUPT threads, please check out the External Event docs.

Lifecycle

The status of a WfRun is determined by looking at the status of the Entrypoint ThreadRun. A ThreadRun, and by extension a WfRun, can have one of the following statuses, determined by the LHStatus enum:

  • RUNNING
  • HALTING
  • HALTED
  • ERROR
  • EXCEPTION
  • COMPLETED

Halting a Workflow

A ThreadRun can be halted for any of the following reasons:

  • If a StopWfRun request is received (manual halt by system administrator).
  • When interrupted by an ExternalEvent which triggers an Interrupt Handler.
  • If the ThreadRun is a child thread, and the parent ThreadRun is HALTED.

Note that halting a parent ThreadRun causes all of the children of that ThreadRun to be halted as well.

When a ThreadRun is halted, it first moves to the HALTING status until the current NodeRun can be halted as well (for example, it's always possible to halt an ExternalEventNode but a TaskNode can't be halted while there is an in-flight TaskAttempt).

The criteria for halting a ThreadRun are as follows:

  • If the ThreadRun has any child threads, all children must be in the COMPLETED, ERROR, or HALTED state.
    • If this condition is not satisfied, then the runtime will halt all Children.
  • There can be no TaskRuns that have been dispatched to a Task Worker but not completed, failed, or timed out. In other words, no in-flight tasks.

If a WfRun is waiting at an EXTERNAL_EVENT, USER_TASK, WAIT_FOR_THREADS, or SLEEP Node, the second condition is automatically satisfied.
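The halting rules above can be approximated with a short recursive function. This is a simplified plain-Python model written for illustration, not the server's actual algorithm: `HaltableThread`, `can_move_to_halted`, and `halt` are hypothetical names, statuses are plain strings, and the sketch assumes that only RUNNING or HALTING threads react to a halt request.

```python
from dataclasses import dataclass, field

# Child statuses that satisfy the first halting criterion.
SETTLED = {"COMPLETED", "ERROR", "HALTED"}


@dataclass
class HaltableThread:
    status: str = "RUNNING"
    in_flight_tasks: int = 0
    children: list["HaltableThread"] = field(default_factory=list)


def can_move_to_halted(thread: HaltableThread) -> bool:
    """Both criteria: all children settled and no in-flight tasks."""
    children_settled = all(c.status in SETTLED for c in thread.children)
    return children_settled and thread.in_flight_tasks == 0


def halt(thread: HaltableThread) -> None:
    """Request a halt; the thread stays HALTING until the criteria hold."""
    if thread.status not in ("RUNNING", "HALTING"):
        return  # assumed: finished or already halted threads are left alone
    thread.status = "HALTING"
    for child in thread.children:
        halt(child)  # halting a parent halts its children as well
    if can_move_to_halted(thread):
        thread.status = "HALTED"


child = HaltableThread(in_flight_tasks=1)
parent = HaltableThread(children=[child])

halt(parent)
print(parent.status, child.status)  # HALTING HALTING

child.in_flight_tasks = 0  # the in-flight task finishes
halt(parent)
print(parent.status, child.status)  # HALTED HALTED
```

The first call leaves both threads in HALTING because the child still has a task in flight; once that task finishes, re-evaluating the criteria lets both move to HALTED.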