Colt McNealy Speaks at Current 2024
On September 18, 2024, our Founder Colt McNealy spoke at Confluent's flagship conference in Austin about why LittleHorse decided to use Kafka Streams as a data store for a workflow engine.
At LittleHorse, we are building a high-performance workflow engine for microservice orchestration. We have a long list of performance-related requirements, including low latency, horizontal scalability for throughput, transactional semantics, and high availability. Long story short: we decided to build our own data store on top of Kafka Streams.
You might raise your eyebrows when someone suggests using Kafka as a database, and for very good reason. Apache Kafka is a distributed and replicated commit log, which in itself does not make it a "database". However, under the hood almost all databases have three core components (among others): a write-ahead log (WAL), an indexing layer, and a query layer. Stream processing systems can provide the scaffolding for an indexing layer—for example, Kafka Streams and Flink are both frameworks that use RocksDB to "index" data in your Kafka "WAL"—but significant work still remains to turn that into a full data store.
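To make the three components concrete, here is a deliberately tiny sketch—plain Python, not LittleHorse or Kafka Streams code—where an append-only list stands in for the Kafka "WAL", a dict stands in for the RocksDB index, and a lookup method is the query layer. All names here are illustrative inventions.

```python
class ToyStore:
    """Toy illustration of a log-structured store: WAL + index + query layer."""

    def __init__(self):
        self.wal = []    # "write-ahead log": ordered, append-only records
        self.index = {}  # "indexing layer": latest value per key

    def append(self, key, value):
        # Every write hits the log first, then updates the index,
        # mirroring how a stream processor consumes a topic into RocksDB.
        self.wal.append((key, value))
        self.index[key] = value

    def get(self, key):
        # "Query layer": serve point lookups from the index, not the log.
        return self.index.get(key)

    def rebuild(self):
        # The index is derived state: it can always be reconstructed by
        # replaying the log, analogous to restoring a Kafka Streams state
        # store from its changelog topic.
        self.index = {}
        for key, value in self.wal:
            self.index[key] = value


store = ToyStore()
store.append("wf-1", "RUNNING")
store.append("wf-1", "COMPLETED")
print(store.get("wf-1"))  # COMPLETED
```

A real data store additionally needs secondary indexes, a query API, transactions, and replication—which is exactly the "more work" the paragraph above alludes to.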
We will explain the reasoning behind the core architectural decision to build a homegrown data store on Kafka Streams. We will then discuss topics such as how we enable synchronous responses to POST requests, how we support secondary indexes, and why this architecture forces us to use the "Aggregate" pattern for our data structures.
Join us to learn:
- What data consistency guarantees are provided by Kafka Streams.
- How an "inside-out database" impacts available query patterns.
- How Kafka Streams provides fault tolerance and high availability.
Watch the recording here!