My post about latency and throughput featured an extremely simplistic model to demonstrate that Latency and Throughput are independent. An astute reader called it a spherical cow: a model so oversimplified that it is a bit ridiculous.
So, let’s deflate the cow, just a bit, and see how things hold up. I hope you like tables and cow jokes!

(Keenan Crane; GIF by username:Nepluno, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons)
Chewing The Cud
The original model was a streaming system that receives 1 million messages a second. Perfectly spherical.
There were two systems, one with 5s latency, one with 2s latency.
We will leave our processors completely spherical – they each process 100,000 events simultaneously. Our pipelines then look like this:
5s Latency
| Time | New Events/s | Process Instances | Events Being Processed | Throughput | Extra Capacity (events) |
|---|---|---|---|---|---|
| 1 | 1,000,000 | 50 | 1,000,000 | 0 | 4,000,000 |
| 2 | 1,000,000 | 50 | 2,000,000 | 0 | 3,000,000 |
| 3 | 1,000,000 | 50 | 3,000,000 | 0 | 2,000,000 |
| 4 | 1,000,000 | 50 | 4,000,000 | 0 | 1,000,000 |
| 5 | 1,000,000 | 50 | 5,000,000 | 1,000,000 | 0 |
| 6 | 1,000,000 | 50 | 5,000,000 | 1,000,000 | 0 |
| 7 | 1,000,000 | 50 | 5,000,000 | 1,000,000 | 0 |
| 8 | 1,000,000 | 50 | 5,000,000 | 1,000,000 | 0 |
2s Latency
| Time | New Events/s | Process Instances | Events Being Processed | Throughput | Extra Capacity (events) |
|---|---|---|---|---|---|
| 1 | 1,000,000 | 20 | 1,000,000 | 0 | 1,000,000 |
| 2 | 1,000,000 | 20 | 2,000,000 | 1,000,000 | 0 |
| 3 | 1,000,000 | 20 | 2,000,000 | 1,000,000 | 0 |
| 4 | 1,000,000 | 20 | 2,000,000 | 1,000,000 | 0 |
| 5 | 1,000,000 | 20 | 2,000,000 | 1,000,000 | 0 |
| 6 | 1,000,000 | 20 | 2,000,000 | 1,000,000 | 0 |
| 7 | 1,000,000 | 20 | 2,000,000 | 1,000,000 | 0 |
| 8 | 1,000,000 | 20 | 2,000,000 | 1,000,000 | 0 |
Conclusion: Same Throughput
The Throughput of the two systems is the same.
The first system, with 5s of latency, takes longer to warm up and needs 2.5x more instances, but it still produces the same throughput. It just reaches it 3 seconds later.
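The two tables are easy to reproduce with a tiny simulation. This is a sketch of my reading of the model (a batch started in second t completes during second t + latency − 1, as in the tables), not production code:

```python
def throughput_per_second(latency_s, instances, seconds=8,
                          rate=1_000_000, per_instance=100_000):
    """Fixed fleet of processors, each holding up to 100,000 events at a
    time, every event taking latency_s seconds end to end."""
    capacity = instances * per_instance
    in_flight = []   # (finish_second, batch_size)
    throughput = []
    for t in range(1, seconds + 1):
        # batches started latency_s - 1 seconds ago complete this second
        done = sum(size for finish, size in in_flight if finish == t)
        in_flight = [(f, s) for f, s in in_flight if f != t]
        throughput.append(done)
        # start as many of this second's new events as capacity allows
        used = sum(s for _, s in in_flight)
        started = min(rate, max(capacity - used, 0))
        in_flight.append((t + latency_s - 1, started))
    return throughput

print(throughput_per_second(5, 50))  # 1,000,000/s from second 5 onward
print(throughput_per_second(2, 20))  # 1,000,000/s from second 2 onward
```

Both fleets settle at 1,000,000 events/s. The fleet sizes are just Little's law (events in flight = arrival rate × latency): 1,000,000 × 5 / 100,000 = 50 instances, and 1,000,000 × 2 / 100,000 = 20.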
What Happens If You Add Scaling?
Maybe that model is too simple. Let’s deflate the cow a little bit, vary the input and add auto-scaling.
Let’s make it an average of 1 million messages a second, with peaks and valleys between 500,000 and 1.5 million per second. A 20-second period means the rate changes by +/- 100,000 messages every second. But we’re only deflating the cow a little bit, so the changes will be step changes at the end of each second.
We will leave our processors completely spherical – they each process 100,000 events simultaneously. It takes 1 second to start a processor, and 1 second to shut down. The only difference between the two is that one takes 2s to process a message and the other takes 5s.
Now our input looks like this:
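In code, that step pattern is the following triangle wave; the phase (starting at 1,000,000 and rising) is read off the tables below:

```python
def input_rate(t):
    """Events arriving during second t: a 20-second triangle wave that
    steps +/- 100,000 each second between 500,000 and 1,500,000."""
    phase = (t - 1) % 20
    if phase <= 5:                                 # ramp 1.0M -> 1.5M
        return 1_000_000 + 100_000 * phase
    if phase <= 15:                                # ramp 1.5M -> 0.5M
        return 1_500_000 - 100_000 * (phase - 5)
    return 500_000 + 100_000 * (phase - 15)        # ramp 0.5M -> 0.9M

print([input_rate(t) for t in range(1, 8)])
# → [1000000, 1100000, 1200000, 1300000, 1400000, 1500000, 1400000]
```

Over a full 20-second period it averages exactly 1,000,000 messages a second.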

5s Latency
| Time | New Events/s | Process Instances | Events Being Processed | Events Waiting to be Processed | Throughput | Extra Capacity (instances) |
|---|---|---|---|---|---|---|
| 1 | 1,000,000 | 0 | 0 | 1,000,000 | 0 | 0 |
| 2 | 1,100,000 | 10 | 1,000,000 | 1,100,000 | 0 | 0 |
| 3 | 1,200,000 | 21 | 2,100,000 | 1,200,000 | 0 | 0 |
| 4 | 1,300,000 | 33 | 3,300,000 | 1,300,000 | 0 | 0 |
| 5 | 1,400,000 | 46 | 4,600,000 | 1,400,000 | 0 | 0 |
| 6 | 1,500,000 | 60 | 6,000,000 | 1,500,000 | 1,000,000 | 0 |
| 7 | 1,400,000 | 65 | 6,500,000 | 300,000 | 1,100,000 | 0 |
| 8 | 1,300,000 | 68 | 6,800,000 | 200,000 | 1,200,000 | 0 |
| 9 | 1,200,000 | 70 | 7,000,000 | 0 | 1,300,000 | 0 |
| 10 | 1,100,000 | 70 | 6,900,000 | 0 | 1,400,000 | 1 |
| 11 | 1,000,000 | 69 | 6,500,000 | 0 | 1,500,000 | 4 |
| 12 | 900,000 | 65 | 5,900,000 | 0 | 1,400,000 | 6 |
| 13 | 800,000 | 59 | 5,300,000 | 0 | 1,300,000 | 6 |
| 14 | 700,000 | 53 | 4,700,000 | 0 | 1,200,000 | 6 |
| 15 | 600,000 | 47 | 4,100,000 | 0 | 1,100,000 | 6 |
| 16 | 500,000 | 41 | 3,500,000 | 0 | 1,000,000 | 6 |
| 17 | 600,000 | 35 | 3,100,000 | 0 | 900,000 | 4 |
| 18 | 700,000 | 31 | 2,900,000 | 0 | 800,000 | 2 |
| 19 | 800,000 | 29 | 2,900,000 | 0 | 700,000 | 0 |
| 20 | 900,000 | 29 | 2,900,000 | 200,000 | 600,000 | 0 |
| 21 | 1,000,000 | 31 | 3,100,000 | 400,000 | 500,000 | 0 |
2s Latency
| Time | New Events/s | Process Instances | Events Being Processed | Events Waiting to be Processed | Throughput | Extra Capacity (instances) |
|---|---|---|---|---|---|---|
| 1 | 1,000,000 | 0 | 0 | 1,000,000 | 0 | 0 |
| 2 | 1,100,000 | 10 | 1,000,000 | 1,100,000 | 0 | 0 |
| 3 | 1,200,000 | 21 | 2,100,000 | 1,200,000 | 1,000,000 | 0 |
| 4 | 1,300,000 | 23 | 2,300,000 | 1,300,000 | 1,100,000 | 0 |
| 5 | 1,400,000 | 25 | 2,500,000 | 1,400,000 | 1,200,000 | 0 |
| 6 | 1,500,000 | 27 | 2,700,000 | 1,500,000 | 1,300,000 | 0 |
| 7 | 1,400,000 | 29 | 2,900,000 | 1,400,000 | 1,400,000 | 0 |
| 8 | 1,300,000 | 29 | 2,900,000 | 1,300,000 | 1,500,000 | 0 |
| 9 | 1,200,000 | 29 | 2,700,000 | 0 | 1,400,000 | 2 |
| 10 | 1,100,000 | 27 | 2,500,000 | 0 | 1,300,000 | 2 |
| 11 | 1,000,000 | 25 | 2,300,000 | 0 | 1,200,000 | 2 |
| 12 | 900,000 | 23 | 2,100,000 | 0 | 1,100,000 | 2 |
| 13 | 800,000 | 21 | 1,900,000 | 0 | 1,000,000 | 2 |
| 14 | 700,000 | 19 | 1,700,000 | 0 | 900,000 | 2 |
| 15 | 600,000 | 17 | 1,500,000 | 0 | 800,000 | 2 |
| 16 | 500,000 | 15 | 1,300,000 | 0 | 700,000 | 2 |
| 17 | 600,000 | 13 | 1,100,000 | 0 | 600,000 | 2 |
| 18 | 700,000 | 11 | 1,100,000 | 100,000 | 500,000 | 0 |
| 19 | 800,000 | 12 | 1,200,000 | 300,000 | 600,000 | 0 |
| 20 | 900,000 | 15 | 1,500,000 | 300,000 | 700,000 | 0 |
| 21 | 1,000,000 | 18 | 1,800,000 | 300,000 | 800,000 | 0 |
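The tables above were worked out by hand, and the scaling policy behind them isn’t fully spelled out, so here is a rough sketch under an assumed policy: scale up to cover everything currently in the system, new instances take 1 second to come online, and (for simplicity) we never scale down. The instance counts won’t match the tables row for row, but the headline result does.

```python
import math

def input_rate(t):
    """The triangle-wave input from above: 500,000 to 1,500,000/s."""
    phase = (t - 1) % 20
    if phase <= 5:
        return 1_000_000 + 100_000 * phase
    if phase <= 15:
        return 1_500_000 - 100_000 * (phase - 5)
    return 500_000 + 100_000 * (phase - 15)

def throughput_per_second(latency_s, seconds=60, per_instance=100_000):
    waiting = 0        # events queued, not yet picked up
    in_flight = []     # (finish_second, batch_size)
    instances = 0
    pending = 0        # instances started last second; live next second
    throughput = []
    for t in range(1, seconds + 1):
        instances += pending          # 1-second startup delay
        waiting += input_rate(t)
        done = sum(s for f, s in in_flight if f == t)
        in_flight = [(f, s) for f, s in in_flight if f != t]
        throughput.append(done)
        # start as much of the backlog as current capacity allows
        used = sum(s for _, s in in_flight)
        started = min(waiting, max(instances * per_instance - used, 0))
        waiting -= started
        if started:
            in_flight.append((t + latency_s - 1, started))
        # assumed reactive policy: request enough instances to cover
        # everything in the system (scale-down omitted)
        needed = math.ceil((used + started + waiting) / per_instance)
        pending = max(needed - instances, 0)
    return throughput

tp5 = throughput_per_second(5)
tp2 = throughput_per_second(2)
# over the last full 20s period, both complete exactly one period of input
print(sum(tp5[40:]), sum(tp2[40:]))
```

There can be a brief spike above 1.5 million/s while the startup backlog drains, but over any full 20-second period both systems complete exactly one period’s worth of input: sustained throughput is pinned to the input rate, whatever the latency.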
Result – Latency Does Not Impact Throughput
Our slightly less spherical model with perfect step changes produced the same fundamental result:
You can’t increase the throughput of a streaming system beyond its input.
Latency has a huge impact on the amount of resources required! The first system, with 5s latency, fluctuated between 29 and 70 instances. The second system, with 2s latency, fluctuated between 11 and 29.
The second system’s maximum scale-out (29 instances) was equal to the first system’s minimum.
And yet, neither system was able to get above 1.5 million events/s.
No matter how non-spherical the cow may be, you can’t sustain a throughput faster than the input.