Abandoned Wells And Other Dangers In Your Codebase

Giant, long-lived codebases are full of code that should never be used for anything new.  Not only shouldn’t the code be used for anything new, it shouldn’t be used at all; but if it ain’t broke, don’t fix it.

The brownfield codebase gets littered with mineshafts, wells, and other hazards from products that were discontinued or never made it into production.  Will the shaft collapse if you start digging?  Maybe.  Are those abandoned wells the right size to swallow new developers?  Absolutely.

How do you move forward when your codebase has become a hazardous environment?

When a developer hangs the UI for your biggest customers because a reporting function is slow, is that the UI developer’s fault for not testing well enough, the report developer’s fault for not adding a warning about the function’s performance, or a sign of a dangerous codebase?

Once your codebase becomes hazardous it is everyone’s responsibility to be cautious, and everyone’s responsibility to warn of danger.  Develop as a team and watch each other’s backs, otherwise you might end up stuck in a well.
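One lightweight way to warn of danger is to make the hazard announce itself at the call site.  The sketch below is only an illustration: the HazardWarning class, the hazardous decorator, and the build_usage_report function are all made-up names, not part of any real codebase or library.

```python
import functools
import warnings


class HazardWarning(UserWarning):
    """Hypothetical warning category for code that is risky to call."""


def hazardous(reason):
    """Mark a function as a known hazard; it issues a warning whenever it is called."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller, not to this wrapper.
            warnings.warn(f"{func.__name__}: {reason}", HazardWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorate


@hazardous("can take minutes on large accounts; do not call from a UI request")
def build_usage_report(account_id):
    # Placeholder body standing in for a slow legacy report.
    return {"account_id": account_id, "rows": []}
```

With Python's default warning filters the message is shown once per calling location, which is usually enough to make a new developer stop and ask before wiring the function into a page load.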

Cost To Serve For Staff+ SaaS Developers

Cost To Serve is a critical metric for Staff+ SaaS Developers to understand because it shapes your understanding of the business.  Extremely oversimplified, Cost To Serve is the cost of everything that goes into producing, running, and supporting software, divided by the number of customers.

Cost To Serve = [The cost of everything] / [Number of Customers]

This is extremely inaccurate, but it helps to point out two major concepts.  First, a large part of Cost To Serve is fixed.  Developer salaries, aka producing software, do not change based on the number of customers.  This leads to the second concept: SaaS pricing models work on leverage.  As the number of customers goes up, the fixed cost to serve each one goes down.
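To see the leverage in numbers, here is a minimal sketch.  Splitting the total into one fixed bucket and one flat per-customer variable cost is my simplifying assumption, and the dollar figures are invented for illustration.

```python
def cost_to_serve(fixed_costs, variable_cost_per_customer, customers):
    """Oversimplified Cost To Serve per customer for one period."""
    if customers <= 0:
        raise ValueError("customers must be positive")
    # The fixed bucket is shared by every customer; the variable part is not.
    return fixed_costs / customers + variable_cost_per_customer


# Invented numbers: $500,000 of fixed cost and $20 of variable cost per customer.
FIXED = 500_000
VARIABLE = 20

for customers in (100, 1_000, 10_000):
    print(customers, cost_to_serve(FIXED, VARIABLE, customers))
# prints:
# 100 5020.0
# 1000 520.0
# 10000 70.0
```

The fixed share per customer collapses from $5,000 to $50 as the customer count grows, while the variable $20 never moves.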

So, the path to success is to get as many customers as possible, right?  No, because not all costs are fixed, and not all customers are profitable.

Not All Customers Are Profitable

The first problem with the oversimplified view is that not all customers cost the SaaS the same amount of money.  This is why SaaS pricing includes feature and usage tiers.  The problem is that the features and tiers don’t exactly align with usage and cost.

Pricing based on Contacts is standard for CRMs; charging for imports is rare.  A customer who imports 1 million contacts once has a lower cost than one who reimports a million-contact list every week.

Webhooks never make it into pricing models, but their costs add up.

There are dozens of legitimate usage patterns that can make the difference between profitable and unprofitable customers.  You won’t find them on the pricing page, but chances are your finance department has at least a vague model.  It is expected that some customers will be more profitable than others; and that some will even be unprofitable.

You Need A Model For Cost To Serve

It doesn’t matter what your company’s Cost To Serve is and you don’t need to know the fixed costs.  What you need is a model to evaluate customer usage.  You need to be able to explain which usage patterns are expensive and estimate the potential costs of new features.

If your SaaS is complex, try breaking it down like Big-O notation.  What costs scale linearly, which ones are exponential?
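A first model can be as small as a table of unit costs multiplied by usage.  The sketch below assumes every cost scales linearly, and the usage fields and unit costs are hypothetical placeholders rather than real figures; costs are kept in millionths of a dollar so the arithmetic stays in integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlyUsage:
    contacts_stored: int
    contacts_imported: int
    webhook_calls: int


# Hypothetical unit costs, in millionths of a dollar per unit of usage.
UNIT_COST_MICRODOLLARS = {
    "contacts_stored": 100,
    "contacts_imported": 500,
    "webhook_calls": 20,
}


def estimated_monthly_cost(usage):
    """Linear estimate of what one customer's usage costs, in dollars."""
    total = sum(
        getattr(usage, name) * rate
        for name, rate in UNIT_COST_MICRODOLLARS.items()
    )
    return total / 1_000_000


# Same plan, same contact count, very different cost.
imported_once = MonthlyUsage(contacts_stored=1_000_000, contacts_imported=0, webhook_calls=0)
weekly_reimport = MonthlyUsage(contacts_stored=1_000_000, contacts_imported=4_000_000, webhook_calls=0)

print(estimated_monthly_cost(imported_once))    # 100.0
print(estimated_monthly_cost(weekly_reimport))  # 2100.0
```

Costs that grow faster than linearly would need their own terms; the point of the sketch is only that two customers on the same pricing tier can land far apart.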

Your model can improve over time; the important thing is to start today.  Cost To Serve is a critical metric for SaaS companies; as a Staff+ Developer you need to understand it, and how it differs from your company’s pricing model.

Skipping Tests To Deliver Faster

Managers with looming deadlines often tell developers to skip writing tests in order to deliver code faster.  This only makes sense if you don’t believe that unit tests pay off in initial development, or you view the impact of bugs as an externality.

I have encountered extremists who don’t believe that unit tests ever provide value; that’s not the case here.  Managers who want to skip writing tests to deliver faster are less extreme.  They are stating that unit tests pay for themselves over time and that they won’t provide a net benefit until after the first release.

This is not as crazy as it sounds.  Tests become more valuable over time as they allow future developers to refactor with confidence.  You can reason backwards and say that if the test is more valuable in the future, it must be less valuable today.  Right now, the test could even conceivably be worth less than the cost of writing it.  And when you need to ship, you cut things that don’t add value.

Which brings forth the second part of the argument: bugs are an externality.

If you deliver a bug-ridden project on time, have you succeeded?  Sadly, the answer is usually yes.  Managers get evaluated on delivering on time; developers get evaluated on quality.  Managers can believe that tests add value to the code, but they don’t add value to the manager.

By their definition deadlines prioritize short term thinking.  Deadlines encourage managers to make short term tradeoffs at the expense of long term value creation.  When managers push to skip tests, the people who suffer most are customers who have to use the software.  The people who suffer least are the managers who traded tests for time.

The Purpose Of A Demo Is Feedback

Internal tech demos exist so that stakeholders can see and comment on the work being delivered.  Hopefully the demo produces smiles, cheers, and high fives.  Often, the stakeholders will ask questions and make objections that take everything off of the rails.  These can be painful moments; they are also extremely valuable.

Disastrous demos give you at least one of two great pieces of information:

You learn that you were building the wrong thing.  

Having bug free, highly performant code doesn’t matter if it is correctly doing something other than what stakeholders want.  The longer it takes to learn that you’re building the wrong thing, the more time and money you’re wasting.

It sucks to hear you spent a week or two building the wrong thing and need to scrap some work.  It sucks much harder when you’ve spent 6 months heading in the wrong direction.

You learn that you can’t speak to the business value

How you demo reflects your understanding of the software.  I’ve seen many demos where the developers have delivered the right thing, but they don’t understand why.  Software development is knowledge work; “I built what I was told” is the wrong answer.

You need to know why stakeholders want the software, and speak to it in your demo.  When you have the value wrong the stakeholders will tear the demo apart looking for their business value.

Remember, demos are for feedback

Disastrous demos suck.  Disastrous demos are also successes - you learned something critical about the project that you didn’t know.

Don’t let disappointment distract from learning and improving.

Collaborative Breakdown: Estimating Full Projects

Have you ever been asked to fully estimate a full project so that someone else can decide if it is worth pursuing?  Did the request set off alarm bells in your head?  

It should!  Estimating full projects is a sign that the collaborative process has already broken down!  

Estimating full projects is a trap that prevents developers from bringing their most important skills to bear.  Instead of collaboration towards a common goal, estimation pushes toxic all-or-nothing demands:

  • The project has a value, but it isn’t being shared with you.  Instead the project owner is asking for an estimate; and it needs to be less than the project’s value.  If your estimate is above the line, you’ll get pressure to revise the estimate down.  Worse, your estimates will often be ignored and timelines will be dictated.  All so that the project will hit numbers you’ve never seen.
  • You are a professional software developer and you have to accept their diagnosis about the software solution.  Is this project the best way to pursue the opportunity?  Could you do it faster and cheaper some other way?  Doesn’t matter; estimate this project.  Your expertise and creative inputs have been rejected.
  • The project will be a big bang deliverable.  You’re given a scope of work and asked how long it will take.  The asker wants the full project.  You’ll have to fight to iterate, make small releases, de-risk, or even prove the concept.
  • The project scope will always miss some requirements; the larger the project, the larger the miss.  The misses will blow out the timeline, and you will get blamed for missing the estimate.

The alternative is a collaborative process!

Instead of pressure for your estimate to come in below an unknown ceiling, you can scope last.

Instead of pressure for you to accept a project, you can work together to shape the project.

Instead of pressure for you to deliver a giant project in one perfect step, you can work together to deliver iteratively.  

You even save all of the time spent creating a project plan, estimating the plan, and deciding whether the plan is worthwhile!

Pushing back will be uncomfortable the first few times.  The first time you have a conversation that starts with “I see this opportunity, let’s talk about how we can seize it”, it will all have been worth it.

Musketeering Makes Problems Intractable

Musketeering is lumping multiple difficult problems together to present a giant, intractable, disaster.

The name comes from the famous slogan: All for one, and one for all!  Each musketeer supports the group, and the group supports each musketeer.  When multiple problems form as one, they become impossible to defeat.

Most developers have faced a classic Three Musketeer problem with legacy code:

  1. The code is full of bugs
  2. Unit testing is nearly impossible
  3. Touching anything can have unknown side effects

Each of these issues is fixable on its own; together they bring development to a halt.

Why can’t you fix the bugs?  Because testing is nearly impossible and everything you touch has side effects.

Why can’t you write tests?  Because the code is tightly coupled, which produces side effects.  Also, it is full of bugs so we don’t know what the correct functionality is.

Why can’t you reduce side effects?  Because the code is buggy and there are no tests.  If you can’t separate the concerns, you can’t make progress.

Could you do a cold restart?

Years ago I worked at a mortgage company that bought a bank’s mortgage division.  The deal was mostly for sales people, but it also included custom software and developers.  To ensure that the handover was clean, we were only given the source code.

We had DB Schemas, but no seed data.

This was at the very dawn of Infrastructure-as-code; we didn’t get any.

There were docs about deploying, and there were docs about building servers; they were wildly out of date.

18 months and millions of dollars in salary and opportunity cost later, the project was shut down.  We never got the system fully functional.  We never got close.

You probably won’t be sold to a competitor, but there’s a decent chance your production environment will get compromised by hackers.  

If you lost all running instances of your software and had to rebuild from whatever you had in source control, could you do it?

How long would it take?

Writing A Run Book Can Be Your First Iterative Step

Writing a Run Book can be your first iterative step towards mitigating recurring problems.  Recurring problems can cause massive productivity problems, but don’t get fixed because the immediate issue is elsewhere.

For example, background worker systems rarely fail on their own.  Instead, some unique situation will cause the workers to get stuck, the controller to get confused, or the queue to be poisoned.  Each time, there are really two issues: the bespoke issue that broke the background processes, and the recovery of the background worker system.

Since each new failure is unique, there is a tendency to treat the background system recovery as a unique problem too.  This increases recovery time and prevents you from learning from past mistakes.  Because the bug isn’t in the background system itself, there is often no motivation to spend time on the code.  Fix the bug, restart the system, and move on with your day.

Enter the run book.

Write down the steps needed to mitigate the problem.  This is for humans, so it can be an open-ended description of what to look for; it won’t be very programmatic.

Once you have it, keep iterating.  Add code snippets, descriptions, and flow logic.

As you iterate, you will notice that some parts of the process can be scripted, or even automated.

Iteration after iteration, more and more of the run book will become code, which makes it easier to code up the remaining pieces.
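The progression from prose to code can be sketched as a run book where each step is either a human instruction or a callable.  Everything here is hypothetical: the Step class, the example steps, and the count_stuck_jobs placeholder stand in for whatever your own system needs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    description: str
    # None means a human still performs this step by hand.
    action: Optional[Callable[[], str]] = None


def run(steps):
    """Walk the run book in order and return one log line per step."""
    log = []
    for number, step in enumerate(steps, start=1):
        if step.action is None:
            log.append(f"{number}. MANUAL: {step.description}")
        else:
            log.append(f"{number}. AUTO: {step.description} -> {step.action()}")
    return log


def count_stuck_jobs():
    # Placeholder: a real version would query your job store.
    return "0 stuck jobs"


RUN_BOOK = [
    Step("Check the queue dashboard for a poisoned message"),
    Step("Count jobs stuck in 'running'", action=count_stuck_jobs),
    Step("Restart the workers once the bad message is removed"),
]

for line in run(RUN_BOOK):
    print(line)
```

Each iteration replaces one MANUAL line with an action, so the document and the automation stay in one place and nobody has to guess which steps are already scripted.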

Will you be able to iterate the recurring problem out of existence?  Maybe, maybe not.  But with a run book and a plan, you will make progress and not be waiting for the next outage to wreck your day.

Falsehoods Programmers Believe About Projects

Years ago, Patrick McKenzie wrote an article titled Falsehoods Programmers Believe About Names.  The article inspired a short-lived burst of other programming falsehoods.  This is a very late entry in the genre, covering incorrect assumptions about software projects.

All of these assumptions are wrong.  Try to make fewer of these assumptions when working on projects.  Since you are always working on projects at work, always be questioning your assumptions about projects.

  1. Projects have defined beginnings.
  2. Maybe not formal beginnings, but there is a point when you are supposed to start work.
  3. Your manager knows when you should start working.
  4. You can use your priorities to determine when you should start working.
  5. You should be working on the project because you were asked.
  6. You should not be working on the project because you weren’t asked.
  7. Projects have defined endings.
  8. Successful projects have endings.
  9. Failed projects have endings.
  10. The project will solve the problem.
  11. The project will solve a problem.
  12. The project won’t make the problem worse.
  13. Everyone on the project agrees on what problems the project is supposed to solve.
  14. Everyone agrees about what solving the problems means.
  15. Solving the problem will make the project a success.
  16. Not solving the problem will make the project a failure. 
  17. There is a relationship between the project’s success and the status of the problem.
  18. The software you are asked to write will solve the problem.
  19. The software you are asked to write will make the project a success.
  20. Writing the software you are asked to write means you are doing a good job.

At best, projects are best guesses by well-intentioned people.  At their worst, projects can become meaningless busywork that is completely unrelated to any problems or desires at a company.  The fewer false assumptions you buy into, the more effective you will be.

Tech Debt is a Big, But Not Expensive, Problem

A common mistake among developers and line managers is to mistake Tech Debt for an Expensive Problem when it is only a Big Problem.  For tech debt to be an Expensive Problem the CTO and VPs of Development have to believe that the value the company will get from paying down tech debt is greater than the cost.

When leadership rejects pitches for rewrites and other major initiatives to address tech debt, it is not because they doubt that Tech Debt is a Big Problem.  The initiatives get rejected because leadership doesn’t believe that the results will justify the cost.

There are two ways to get around the Big vs Expensive problem.

The first is to use data to prove that the problem is Expensive.  

Find a way to measure the costs of tech debt: lost developer time spent fixing bugs, increased developer time building new features, and above-average customer churn.  Then, find a way to estimate how much better things will be after the big initiative.  Finally, estimate how long the transformation will take.  If you can credibly show that the improvements are greater than the costs, you should have no problem getting your initiative approved.

Remember though, that you’re estimating the value of the improvements and length of time.  Credibility is as much about leadership’s faith in your delivery as it is in the numbers.
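The arithmetic behind that pitch can be sketched as a payback calculation.  The function name, the inputs, and the numbers are all invented for illustration, and the sketch assumes savings only begin once the initiative has fully shipped.

```python
def payback_months(monthly_cost_now, monthly_cost_after, initiative_cost, months_to_deliver):
    """Months from kickoff until cumulative savings cover the initiative's cost.

    Returns None when the initiative never pays for itself.
    """
    monthly_savings = monthly_cost_now - monthly_cost_after
    if monthly_savings <= 0:
        return None
    # Assumption: no savings arrive during delivery, only afterwards.
    return months_to_deliver + initiative_cost / monthly_savings


# Invented example: tech debt costs $60,000 a month today and an estimated
# $35,000 a month afterwards; the initiative costs $300,000 over 6 months.
print(payback_months(60_000, 35_000, 300_000, 6))  # 18.0
```

Every input is an estimate, so a slipped delivery date or a smaller improvement moves the answer quickly; that sensitivity is why leadership weighs your track record alongside the spreadsheet.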

The second way to get around the Big vs Expensive problem is TheeSeeShipping, aka Iterative Delivery.  

Make small improvements every release and show that the cost of the problem is shrinking over time.  Less time spent fixing bugs, faster feature development, maybe even a reduction in churn.  Demonstrate that Tech Debt is an Expensive Problem by fixing it and providing more value than cost.

You’ll find that you won’t have any trouble getting approval, because you won’t need any approval at all.  You just need to start.
