Exception Handling

In normal programming languages like Java, you can use Exceptions to handle exceptional cases that come up at runtime. Exceptions can be thrown intentionally by your own code, or they can arise organically from calls to other methods or libraries. Likewise, LittleHorse has the concepts of Failures and failure handling.

Concepts

A Failure in LittleHorse is like an Exception in programming, and it tells your WfSpec that A Bad Thing® has happened inside your WfRun.

caution

Exception Handling in LittleHorse is a separate concept from TaskRun retries.

Failure Types

Since LittleHorse is a distributed system wherein Task Workers perform network calls and talk to external systems, there are two types of Failures in LittleHorse: EXCEPTIONs and ERRORs. We use EXCEPTIONs when something goes wrong at the business process level (for example, when a credit card has insufficient funds), and we use ERRORs when something fails at the technical level (for example, when an API call fails during a TaskRun).

A Failure that is a result of a technical problem, such as a variable casting error or a TaskRun timeout, is an ERROR in LittleHorse. All ERRORs are pre-defined by the LittleHorse System. You can find them at the LHErrorType documentation.

In contrast, a business process-level failure is an EXCEPTION. All EXCEPTIONs are defined by users of LittleHorse. You must explicitly throw an EXCEPTION with a specific name.

By convention, LittleHorse uses the following naming rules for ERRORs and EXCEPTIONs:

  • ERRORs are pre-defined in the LHErrorType enum and follow UPPER_UNDERSCORE_CASE.
  • EXCEPTION names are defined by users and follow kebab-case.
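
The two naming conventions above are distinct enough that a name's style alone hints at the kind of Failure. The following is a minimal sketch of that idea in plain Java. The FailureNames class and its kindOf method are hypothetical helpers written for this page, not part of the LittleHorse SDK, and the names passed in are only examples of each style.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; it is NOT part of the LittleHorse SDK.
final class FailureNames {
    enum Kind { ERROR, EXCEPTION, UNKNOWN }

    // UPPER_UNDERSCORE_CASE, the style used for pre-defined ERROR names.
    private static final Pattern ERROR_STYLE =
            Pattern.compile("[A-Z][A-Z0-9]*(_[A-Z0-9]+)*");

    // kebab-case, the style used for user-defined EXCEPTION names.
    private static final Pattern EXCEPTION_STYLE =
            Pattern.compile("[a-z][a-z0-9]*(-[a-z0-9]+)*");

    // Guesses the failure kind from the style of the name alone.
    static Kind kindOf(String failureName) {
        if (ERROR_STYLE.matcher(failureName).matches()) {
            return Kind.ERROR;
        }
        if (EXCEPTION_STYLE.matcher(failureName).matches()) {
            return Kind.EXCEPTION;
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(kindOf("TIMEOUT"));            // ERROR
        System.out.println(kindOf("insufficient-funds")); // EXCEPTION
        System.out.println(kindOf("Not_A-Name"));         // UNKNOWN
    }
}
```

A check like this can be handy in code review or in a unit test that guards against EXCEPTION names that accidentally look like ERRORs.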

As per the Exception Handling Developer Guide, you may have different error handling logic for different Failures. For example, you can catch failures for a specific ERROR, any ERROR, a specific EXCEPTION, any EXCEPTION, or any Failure.
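
To make the five catch scopes concrete, here is a small plain-Java model of how a handler could be matched against a Failure. Everything in it (HandlerMatching, Handler, firstMatch) is a hypothetical stand-in written for this page and not an SDK type. The "first matching handler wins" rule is an assumption of the sketch, not a statement about how the LittleHorse server orders handlers.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model for illustration; these are NOT LittleHorse SDK classes.
final class HandlerMatching {
    enum Kind { ERROR, EXCEPTION }

    static final class Handler {
        final String label;
        final Kind kind;   // null means "any Failure"
        final String name; // null means "any failure of this kind"

        Handler(String label, Kind kind, String name) {
            this.label = label;
            this.kind = kind;
            this.name = name;
        }

        boolean catches(Kind failureKind, String failureName) {
            if (kind == null) {
                return true;
            }
            if (kind != failureKind) {
                return false;
            }
            return name == null || name.equals(failureName);
        }
    }

    // Returns the label of the first handler that catches the failure, or null.
    // "First match wins" is an assumption of this sketch.
    static String firstMatch(List<Handler> handlers, Kind kind, String name) {
        for (Handler h : handlers) {
            if (h.catches(kind, name)) {
                return h.label;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Handler> handlers = Arrays.asList(
                new Handler("specific EXCEPTION", Kind.EXCEPTION, "insufficient-funds"),
                new Handler("any ERROR", Kind.ERROR, null),
                new Handler("any Failure", null, null));

        System.out.println(firstMatch(handlers, Kind.EXCEPTION, "insufficient-funds"));
        System.out.println(firstMatch(handlers, Kind.ERROR, "TIMEOUT"));
        System.out.println(firstMatch(handlers, Kind.EXCEPTION, "card-expired"));
    }
}
```

Listing the most specific handler first and the catch-all last mirrors the way catch clauses are usually ordered in Java.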

Throwing Failures

You can explicitly choose to throw a business-level EXCEPTION in your WfSpec or in your Task Worker code.

In contrast, you cannot explicitly choose to throw a technical ERROR: they just occur when something goes wrong that is out of our control. The common causes of an ERROR are:

  • A Task Worker encounters an unexpected error while executing a TaskRun (for example, an external API returns a 500).
  • A network error or server crash causes a TaskRun to time out.
  • A runtime type error occurs because your WfSpec or a Task Worker expects data to be of a different type than it is, most commonly when working with JSON_OBJ and JSON_ARR variables.

info

If you use lhctl to inspect a NodeRun via lhctl get nodeRun <wfRunId> <threadRunNumber> <nodeRunPosition>, you can see any Failures thrown by that NodeRun on the protobuf itself in the failures field.

If you want to throw a Failure from within a WfSpec, you can do it with the WorkflowThread#fail() method. This will create an EXIT node that has a failure defined. When a ThreadRun arrives at that EXIT node, it moves to the EXCEPTION status and throws the defined failure. This looks like:

String failureName = "my-exception";
String failureMessage = "This is a Failure thrown from the WfSpec";
wf.fail(failureName, failureMessage);

You can throw a Failure from a Task Worker within the logic of your own Task Function by throwing a special error: the LHTaskException. For example, it would look like this:

@LHTaskMethod("charge-credit-card")
public void chargeCreditCard(String toCard, double amount) {
    if (amount > 10000) {
        throw new LHTaskException("amount-too-large", "Cannot charge more than $10,000");
    }
    // continue with your task as planned
}

Catching Failures

In Java, you can catch Exceptions with an Exception Handler. LittleHorse uses the concept of a FailureHandlerDef, which defines which failure to catch and which ThreadSpec to run in response. Every Failure in LittleHorse belongs to a specific NodeRun; as a corollary, every Failure Handler belongs to a Node.

When a ThreadRun catches a failure, it launches a Failure Handler ThreadRun. The Failure Handler is a child of the failed ThreadRun, meaning that the Failure Handler has read/write access to all of the variables in the scope of the failed parent.

If the Failure Handler ThreadRun (the child) successfully completes, then the failed parent ThreadRun will continue from where it left off, moving to the next Node that it would have gone to. If the Failure Handler fails, then the parent will fail with the originally-thrown Failure.
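
The rule in the previous paragraph has a close analogy in plain Java: a catch block that either recovers, so execution carries on after the try statement, or itself fails, so the original problem is reported. The sketch below models only that control flow. FailureHandlerAnalogy and run are hypothetical names, and nothing here calls LittleHorse.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java analogy for illustration; it does NOT use the LittleHorse SDK.
final class FailureHandlerAnalogy {

    // Runs a "node", then a "handler" if the node fails, and records what happened.
    static List<String> run(Runnable node, Runnable handler) {
        List<String> log = new ArrayList<>();
        try {
            node.run();
            log.add("node completed");
        } catch (RuntimeException original) {
            log.add("node failed: " + original.getMessage());
            try {
                handler.run();
                log.add("handler completed");
            } catch (RuntimeException handlerFailure) {
                // The handler failed, so the parent fails with the original failure.
                log.add("handler failed; parent fails with: " + original.getMessage());
                return log;
            }
        }
        // Reached when the node succeeded or the handler recovered.
        log.add("parent continues to next node");
        return log;
    }

    public static void main(String[] args) {
        Runnable failingNode = () -> {
            throw new IllegalStateException("insufficient-funds");
        };
        System.out.println(run(failingNode, () -> { }));
        System.out.println(run(failingNode, () -> {
            throw new IllegalStateException("handler broke");
        }));
    }
}
```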

When you are building your WfSpec, you can choose whether to handle a technical ERROR, a business EXCEPTION, or any Failure. You can also put different Failure Handlers on a single Node to handle different EXCEPTIONs and different ERRORs differently.

tip

If you want to catch a failure on a group of tasks rather than one specific task, you can wrap them in a Child ThreadRun and catch a failure on the WAIT_FOR_THREADS node.

When using this strategy (as opposed to putting a Failure Handler on each task), all tasks in the group after the task that failed will be skipped.
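
The skipping behavior is the same thing you get in Java when several statements share one try block. The following sketch is only an analogy under that assumption: GroupedTasks, runGroup, and the task names are hypothetical, and no LittleHorse API is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java analogy for illustration; it does NOT use the LittleHorse SDK.
final class GroupedTasks {

    // Runs the tasks in order inside one try block; failingTask simulates a failure.
    static List<String> runGroup(List<String> taskNames, String failingTask) {
        List<String> executed = new ArrayList<>();
        try {
            for (String task : taskNames) {
                if (task.equals(failingTask)) {
                    throw new IllegalStateException(task + " failed");
                }
                executed.add(task);
            }
        } catch (IllegalStateException e) {
            // One handler for the whole group; the remaining tasks never run.
            executed.add("handler for: " + e.getMessage());
        }
        return executed;
    }

    public static void main(String[] args) {
        List<String> group = Arrays.asList("task-a", "task-b", "task-c");
        // task-c is skipped because task-b fails first.
        System.out.println(runGroup(group, "task-b"));
    }
}
```

If each task needs its own recovery logic, or later tasks must still run after an earlier one fails, a Failure Handler per task is the better fit.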

In Practice

Let's take a concrete look at Failure Handling with a classic workflow: order processing. In the happy path, our fictitious workflow will:

  1. Charge the customer's credit card (using an external payments SaaS service).
  2. Ship the item using a logistics service like Active Omni.

Our first task (charging the credit card) can fail for multiple reasons:

  • The customer's credit card could have insufficient balance.
  • The call to the SaaS service could fail due to an intermittent network issue.

note

In production, we would recommend you use retries and idempotency on the "charge-credit-card" task to ensure that the transaction completes; however, for the purpose of this demo we will leave that part out.

The Task Worker

The Task Worker will throw Failures in two ways:

  1. Intentionally, through the LHTaskException utility, which throws a business EXCEPTION named insufficient-funds.
  2. Unintentionally, by "accidentally" (or sloppily) not catching runtime exceptions thrown by network calls to the third-party SaaS service.
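
The distinction between the two paths can be sketched as a simple classification of whatever the Task Function throws. This is a plain-Java model, not the SDK's implementation: BusinessException is a hypothetical stand-in for LHTaskException, and classify is a made-up helper that returns a label rather than reporting anything to a server.

```java
// Plain-Java model for illustration; it does NOT use the LittleHorse SDK.
final class WorkerFailureMapping {

    // Hypothetical stand-in for the SDK's LHTaskException.
    static final class BusinessException extends RuntimeException {
        final String name;

        BusinessException(String name, String message) {
            super(message);
            this.name = name;
        }
    }

    // Labels the outcome of a task function the way the text above describes.
    static String classify(Runnable taskFunction) {
        try {
            taskFunction.run();
            return "COMPLETED";
        } catch (BusinessException e) {
            // Thrown on purpose: a named business EXCEPTION.
            return "EXCEPTION:" + e.name;
        } catch (RuntimeException e) {
            // Anything uncaught: a technical ERROR.
            return "ERROR";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(() -> { }));
        System.out.println(classify(() -> {
            throw new BusinessException("insufficient-funds", "Not enough money");
        }));
        System.out.println(classify(() -> {
            throw new RuntimeException("Uh oh, network failure!");
        }));
    }
}
```

The more specific catch clause comes first, so the business exception is never swallowed by the generic RuntimeException branch.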

In addition to the interesting charge-credit-card Task, we will have three "boring" Tasks: ship-item, which ships the order; cancel-order-insufficient-funds, which notifies the customer that the order was canceled because the card had insufficient funds; and notify-order-failed, which notifies the customer that the order failed for technical reasons.


package io.littlehorse.quickstart;

import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

import io.littlehorse.sdk.common.config.LHConfig;
import io.littlehorse.sdk.common.exception.LHTaskException;
import io.littlehorse.sdk.worker.LHTaskMethod;
import io.littlehorse.sdk.worker.LHTaskWorker;

class ExampleTasks {

    @LHTaskMethod("charge-credit-card")
    public void chargeCreditCard(String userId, double amount) {
        if (amount > fetchAmount(userId)) {
            throw new LHTaskException("insufficient-funds", "User " + userId + " has insufficient funds");
        }

        // Simulate a random network failure
        if (new Random().nextBoolean()) {
            throw new RuntimeException("Uh oh, network failure!");
        }
        System.out.println("Successfully charged credit card of user " + userId);
    }

    private double fetchAmount(String userId) {
        // simulate fetching the current balance from a database
        return ThreadLocalRandom.current().nextDouble(0.0, 100.0);
    }

    @LHTaskMethod("ship-item")
    public void shipItem(String itemId, String userId) {
        System.out.println("Successfully shipped item " + itemId + " to user " + userId);
    }

    @LHTaskMethod("cancel-order-insufficient-funds")
    public void cancelOrderInsufficientFunds(String userId) {
        System.out.println("Notifying user " + userId + " that order was canceled due to insufficient funds on the card");
    }

    @LHTaskMethod("notify-order-failed")
    public void notifyOrderFailed(String userId) {
        System.out.println("Notifying user " + userId + " that order failed for technical reasons");
    }

}

public class Main {

    public static void main(String[] args) throws Exception {
        LHConfig config = new LHConfig();
        ExampleTasks taskFuncs = new ExampleTasks();

        LHTaskWorker chargeCreditCard = new LHTaskWorker(taskFuncs, "charge-credit-card", config);
        LHTaskWorker shipItem = new LHTaskWorker(taskFuncs, "ship-item", config);
        LHTaskWorker notifyOrderFailed = new LHTaskWorker(taskFuncs, "notify-order-failed", config);
        LHTaskWorker cancelOrder = new LHTaskWorker(taskFuncs, "cancel-order-insufficient-funds", config);

        chargeCreditCard.registerTaskDef();
        shipItem.registerTaskDef();
        notifyOrderFailed.registerTaskDef();
        cancelOrder.registerTaskDef();

        Runtime.getRuntime().addShutdownHook(new Thread(chargeCreditCard::close));
        Runtime.getRuntime().addShutdownHook(new Thread(shipItem::close));
        Runtime.getRuntime().addShutdownHook(new Thread(notifyOrderFailed::close));
        Runtime.getRuntime().addShutdownHook(new Thread(cancelOrder::close));

        chargeCreditCard.start();
        shipItem.start();
        notifyOrderFailed.start();
        cancelOrder.start();
    }
}

The WfSpec

Our WfSpec will have two different Failure Handlers: one for the insufficient-funds EXCEPTION and one for any ERROR. The WfSpec will look like this:

package io.littlehorse.quickstart;

import io.littlehorse.sdk.common.config.LHConfig;
import io.littlehorse.sdk.wfsdk.NodeOutput;
import io.littlehorse.sdk.wfsdk.WfRunVariable;
import io.littlehorse.sdk.wfsdk.Workflow;
import io.littlehorse.sdk.wfsdk.WorkflowThread;

public class Main {
    public static final String WF_NAME = "exception-example";

    public static void wfLogic(WorkflowThread wf) {
        WfRunVariable price = wf.declareDouble("price").required();
        WfRunVariable itemId = wf.declareStr("item").withDefault("lightsaber");
        WfRunVariable userId = wf.declareStr("user-id").withDefault("obiwan");

        NodeOutput chargeCreditCardHandle = wf.execute("charge-credit-card", userId, price);

        // Handle business exception
        wf.handleException(chargeCreditCardHandle, "insufficient-funds", handler -> {
            handler.execute("cancel-order-insufficient-funds", userId);
            handler.fail("insufficient-funds", "Credit card did not have sufficient funds");
        });

        // Handle any random technical failure
        wf.handleError(chargeCreditCardHandle, handler -> {
            handler.execute("notify-order-failed", userId);
            handler.fail("technical-failure", "Failed to charge credit card");
        });

        // Ship the item
        wf.execute("ship-item", itemId, userId);
    }

    public static void main(String[] args) throws Exception {
        LHConfig config = new LHConfig();
        Workflow wfGenerator = Workflow.newWorkflow(WF_NAME, Main::wfLogic);
        wfGenerator.registerWfSpec(config.getBlockingStub());
    }
}

At this point, you should see a WfSpec with two tasks:

[Figure: The Exception Example WfSpec. The diagram shows two tasks in sequence, with two extra ThreadSpecs visible on top.]

The WfSpec at first appears simple, with only two Nodes. However, if you look closely you can see that there are two exn-handler threads on top. Clicking on one of them shows the following:

[Figure: A Failure Handler. The same WfSpec as above with one of the Failure Handler ThreadSpecs shown; the EXIT node is red, denoting that it throws a failure.]

You'll notice that the last node in the ThreadSpec, or the EXIT node, is red. This denotes that it throws a failure, which we can see in our code when we call handler.fail("insufficient-funds", ...). This means that the ThreadRun fails and throws the insufficient-funds exception up.

Running the Workflow

Now let's run the workflow. The only variable you need to pass in is the price. Run it a few times, and you'll see different failure modes. Have fun!
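
As a guide to what to expect, the three possible endings can be worked out from the Task Worker code: the balance check happens first, and the simulated network failure is only reached if the check passes. The sketch below encodes just that ordering in plain Java. OrderOutcomes and its enum are hypothetical names for this page, and the balance and the network result are passed in explicitly instead of being random.

```java
// Plain-Java walkthrough for illustration; it does NOT use the LittleHorse SDK.
final class OrderOutcomes {
    enum Outcome { ITEM_SHIPPED, CANCELED_INSUFFICIENT_FUNDS, FAILED_TECHNICAL }

    // Mirrors the order of checks in the charge-credit-card task function.
    static Outcome outcome(double price, double balance, boolean networkFails) {
        if (price > balance) {
            // The insufficient-funds handler runs cancel-order-insufficient-funds.
            return Outcome.CANCELED_INSUFFICIENT_FUNDS;
        }
        if (networkFails) {
            // The ERROR handler runs notify-order-failed.
            return Outcome.FAILED_TECHNICAL;
        }
        // No failure, so the WfRun moves on to ship-item.
        return Outcome.ITEM_SHIPPED;
    }

    public static void main(String[] args) {
        System.out.println(outcome(150.0, 80.0, false)); // CANCELED_INSUFFICIENT_FUNDS
        System.out.println(outcome(50.0, 80.0, true));   // FAILED_TECHNICAL
        System.out.println(outcome(50.0, 80.0, false));  // ITEM_SHIPPED
    }
}
```

Because the simulated balance is drawn from the range 0.0 (inclusive) to 100.0 (exclusive), a price of 100 or more always ends in insufficient-funds, while a small price mostly alternates between the technical failure and a shipped item.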