A counterintuitive property of streaming systems is that latency has no long-term impact on throughput. Increasing or decreasing latency produces a short-term change, but once the system settles back into its steady state, the throughput is the same as before.
How can latency and throughput, two important performance metrics, be unrelated?
Let’s define some terms
Latency is the amount of time between when a message is sent and when it is fully processed. This includes the time spent getting the message onto the stream, waiting in the queue, and being processed.
Throughput is the number of completions in a time period. It could be 1 million messages a second, 5 per hour, or anything else. Throughput doesn’t include processing time; that’s part of latency. The million messages/s could have taken 10ms or 10 minutes each to process; so long as 1 million of them finish every second, the throughput is 1 million/s.
Steady State is when the system is fully warmed up and taking on its full load. For a streaming system, this means it is consuming the full stream, it is producing its maximum output, and new work in progress arrives as fast as old work finishes.
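These definitions can be made concrete with a small sketch. The `Event` record and its field names here are illustrative, not part of any real streaming API:

```python
from dataclasses import dataclass

# Hypothetical event record; timestamps are in seconds.
@dataclass
class Event:
    sent_at: float       # when the message was sent
    finished_at: float   # when it was fully processed

def latency(e: Event) -> float:
    # Latency spans send to full completion: time on the stream,
    # time waiting in queue, and processing time.
    return e.finished_at - e.sent_at

def throughput(events: list[Event], window_s: float) -> float:
    # Throughput counts completions per unit time. It ignores how
    # long each individual event took -- that's latency's job.
    done = [e for e in events if e.finished_at <= window_s]
    return len(done) / window_s
```

Note that `throughput` only looks at `finished_at`: two events with very different latencies contribute equally to throughput as long as they both finish inside the window.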
Example
Imagine two systems that each receive 1 million events per second. In the first system, an event takes 5 seconds from arrival to completion; in the second, it takes 2 seconds.
The latency is different, the throughput is the same!
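The example can be checked with a tiny tick-based simulation (the function and its parameters are illustrative, not a real streaming API). Each tick, a fixed number of events arrive, and each event completes a fixed number of ticks after it arrived:

```python
from collections import deque

def simulate(arrival_rate: int, latency_ticks: int, ticks: int) -> list[int]:
    """Sketch of a constrained-input pipeline: arrival_rate events
    enter each tick and complete latency_ticks later. Returns the
    number of completions per tick."""
    in_flight = deque()          # (finish_tick, batch_size) pairs
    completed_per_tick = []
    for t in range(ticks):
        in_flight.append((t + latency_ticks, arrival_rate))
        done = 0
        while in_flight and in_flight[0][0] <= t:
            done += in_flight.popleft()[1]
        completed_per_tick.append(done)
    return completed_per_tick

slow = simulate(1_000_000, 5, 20)  # 5-tick latency
fast = simulate(1_000_000, 2, 20)  # 2-tick latency
```

The slow system emits nothing for its first 5 ticks and the fast one for its first 2, but once both are warmed up, each completes exactly 1 million events per tick: different latency, identical steady-state throughput.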
Implications beyond Latency and Throughput
Besides latency and throughput themselves, there are three other notable differences between the two systems.
- Higher latency means more events in flight. When it gets to steady state, the first system will be working on 5 million events at a time, the second system will only be working on 2 million. This usually means that the first system will require more resources – bigger queues, more workers, a higher degree of parallelism, etc.
- Higher latency means slower startup. It takes 5 seconds for events to start emerging from the first system, but only 2 seconds for the second system.
- Higher latency means slower shutdown. At the other end of the lifecycle, systems with higher latency take longer to drain and safely shut down than systems with lower latency.
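The in-flight figures in the first point follow Little's Law (work in progress = throughput × latency). A minimal sketch, with an illustrative helper name:

```python
def events_in_flight(throughput_per_s: float, latency_s: float) -> float:
    # Little's Law (L = lambda * W): average work in progress equals
    # throughput multiplied by latency.
    return throughput_per_s * latency_s

events_in_flight(1_000_000, 5)  # 5,000,000 events for the first system
events_in_flight(1_000_000, 2)  # 2,000,000 events for the second
```

This is why higher latency costs resources even when throughput is unchanged: every extra second of latency is another million events' worth of queues, workers, and memory held at once.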
Summary
Why doesn’t latency matter? Because streaming systems have constrained inputs. So long as the system has enough capacity to handle 100% of the inputs, latency doesn’t impact throughput.
Latency still controls the system requirements; slow is expensive!