Shifting Your User Defined Field Pattern

My last three posts laid out a brief history of User Defined Field Patterns: past, present, and future.  This post lays out a framework for migrating.  If your CRM is using the Relational or NoSQL pattern and you’re ready to move to the more efficient, cheaper, and simpler future, this is the post for you!

Migration Philosophy

Before going into how to migrate, a reminder of my philosophy:

  • Minimize risk by taking small incremental steps
  • Focus on providing value to the customer

There are many ways to migrate from one pattern to another.  This strategy will minimize risk and maximize customer value.

Step 1 - Extend Your Relational Database

The Relational and NoSQL patterns both make use of a relational database.

Step 1 is to add a JSON column to your existing contacts table.

Your new schema should look like one of these two models.

That’s it - deploy an additive schema update to your database.  Since there’s no code accessing the new column, there’s no coordinated deployment to manage.  Just the regular, minimal risk of updating your database.
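
For illustration, the additive change might look like this.  It’s a minimal sketch using SQLite, with hypothetical table and column names (contacts, custom_fields); on Postgres or MySQL you would use their native JSON/JSONB column types instead of a TEXT column.

    import sqlite3

    conn = sqlite3.connect("crm.db")

    # Additive change only: a nullable JSON column on the existing contacts table.
    # SQLite stores JSON as TEXT; Postgres or MySQL would use a JSON/JSONB type here.
    conn.execute("ALTER TABLE contacts ADD COLUMN custom_fields TEXT")
    conn.commit()
    conn.close()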

Step 2 - Query The New Schema

Now that the new schema is in production, it is time to extend your query code.

Add a new code path that checks to see if any data is present in the new schema.  If there is data available, execute the query using the new JSON column.  When there’s no data, use the original query method.

You will need to develop this code hand-in-hand with the code for migrating the data from your original system to the new schema.  The important piece is that you should always be deploying with the READER code on, WRITER code off.

When you deploy this code, there won’t be any data in the JSON column.  The new code will be available, but unused.

Since the new code won’t be used, this step is also extremely low risk.
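
To make the reader path concrete, here is a sketch.  The table, column, and fallback helper names (contacts, custom_fields, query_from_field_tables) are hypothetical; the shape is what matters: prefer the JSON column when it is populated, otherwise run the original query.

    import json
    import sqlite3

    def get_custom_fields(conn: sqlite3.Connection, contact_id: int) -> dict:
        # New path: use the JSON column whenever it has data.
        row = conn.execute(
            "SELECT custom_fields FROM contacts WHERE id = ?", (contact_id,)
        ).fetchone()
        if row and row[0]:
            return json.loads(row[0])
        # Original path: fall back to the existing relational/NoSQL read
        # (hypothetical helper standing in for your current query code).
        return query_from_field_tables(conn, contact_id)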

Step 3 - Double Write

At this point your system will use the new schema whenever data is present.  

This gives you a single switch to flip - to use the new system, start writing to the new column IN ADDITION to the original method.

Mistakes at this step are the most likely to cause customer impact!  It is also the most expensive step in time and resources because you are writing the data twice.

However, this also gives you a very quick fallback path.  The original writing process is untouched!  

If there’s a problem, turn off the double write and delete the data in the new column.  Thanks to the work in Step 2, you’ll instantly fall back with NO DATA LOSS.

Migrations are hard!  Preventing data loss minimizes the risk.
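
Here is a sketch of what the double write can look like, again with hypothetical names.  The feature flag is the single switch to flip, and the rollback helper is the “data deleter” that Step 4 warns you to confiscate.

    import json

    DOUBLE_WRITE_ENABLED = True  # the single switch to flip

    def save_custom_fields(conn, contact_id: int, fields: dict) -> None:
        # The original write path stays untouched (hypothetical helper).
        write_to_field_tables(conn, contact_id, fields)
        if DOUBLE_WRITE_ENABLED:
            # New write path: the same data, serialized into the JSON column.
            conn.execute(
                "UPDATE contacts SET custom_fields = ? WHERE id = ?",
                (json.dumps(fields), contact_id),
            )

    def delete_new_column_data(conn) -> None:
        # The fallback: clear the JSON column after turning the flag off.
        conn.execute("UPDATE contacts SET custom_fields = NULL")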

Step 4 - Only Hybrid Write

The final step is to stop writing to the original data store.  This ends your ability to fall back, so make sure to confiscate any copies of the data deleter from Step 3!

Ending the double write should be low risk because you were only doing it as a fallback at this point.  You should see an immediate bump in performance and drop in costs.  This trend will continue as the data migrates from the old system to the new.

Step 5 - Clean Up

At some point you’ll be ready to shut down the old system.

The last step is to decide what to do with the unmigrated data.  Depending on how long you’ve waited you’re looking at customer data that hasn’t been accessed in months.  Look at your retention promises; maybe you don’t have to migrate the data at all.

Either way, clean it up and shut down the old system at your leisure.

Conclusion

You can migrate User Defined Field code to the latest pattern with very little risk by using the 5 step strategy laid out in this article.

The Hybrid Solution offers excellent scalability and performance for reasonable costs.  If your CRM is using one of the earlier patterns, it is time to start migrating.

Take control of the process with small, low-risk steps and never rewrite!

User Defined Field Patterns 3 – Hybrid Relations

Part 2 covers how NoSQL emerged as an improvement over the classic relational database solution for User Defined Fields.  NoSQL delivers speed and scalability, but at the cost of being expensive and fragile.  In part 3 I’m going to cover the emerging Hybrid Database solution for User Defined Fields.

Hybrid Databases allow you to combine the best aspects of the relational and NoSQL models, while avoiding most of the downsides.

A hybrid implementation looks like this:

The hybrid model brings the data back to a single server, but without the Contact->Field relation.  Instead the field data is stored as a JSON object in the Contact table itself.
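
Concretely, the table and the queries against it might look something like this.  The sketch uses SQLite’s JSON1 functions and hypothetical names; Postgres and MySQL offer equivalent JSON column types, operators, and indexes.

    import sqlite3

    conn = sqlite3.connect("crm.db")

    # Custom field data lives as a JSON document on the contact row itself.
    conn.execute("""
        CREATE TABLE IF NOT EXISTS contacts (
            id INTEGER PRIMARY KEY,
            name TEXT,
            custom_fields TEXT  -- JSON, e.g. {"favorite_color": "teal"}
        )
    """)

    # Query the JSON directly; no meta programming, no self-joins.
    teal_fans = conn.execute(
        "SELECT id, name FROM contacts "
        "WHERE json_extract(custom_fields, '$.favorite_color') = ?",
        ("teal",),
    ).fetchall()

    # Index the JSON expression so the engine can optimize the lookup.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_contacts_favorite_color "
        "ON contacts (json_extract(custom_fields, '$.favorite_color'))"
    )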

Pros:

  • No meta programming and no filters; everything is back to SQL.  Hybrid databases allow you to directly query JSON fields as if they were regular columns.
  • You can create indexes on the JSON data.  This is an improvement over both the classic and NoSQL models.  It can significantly improve performance by allowing the database engine to optimize queries based on usage.
  • Having a single system makes things simple to set up and easier to maintain.
  • The database will enforce valid JSON structures, which makes it difficult to poison your data.

Cons:

  • There’s no enforced relationship between the JSON data and your User Defined Fields.  This means that data can get lost because your system no longer knows to display or delete it.
  • While Hybrid Databases should scale far beyond the needs of your SaaS, the scaling isn’t quite as open ended as the NoSQL model.  If you out-scale the Hybrid model, congratulations, your company’s services are in high demand!

Conclusion

If your SaaS is implementing User Defined Fields from scratch today, go with the Hybrid model.  If you already have the classic or NoSQL pattern in place, it’s a good time to start thinking about how to evolve towards a hybrid solution.

I’ll cover how to evolve your existing solution in Part 4.

User Defined Field Patterns 2 – NoSql Relations

In part 1, I covered the classic solution for User Defined Fields: simple but unscalable.

NoSQL emerged in the late 2000s as a solution to the scaling problems of relational User Defined Fields.  Instead of having a meta table defining fields in a relational database, the User Defined data would live in NoSQL.

The structure would look like this:

This model eliminates the meta programming and the repeated joining of the same table against itself.  The major new headache this model creates is the difficulty of maintaining the integrity of the field data.

Pros:

  • No complicated meta programming.  Instead you write a filter/match function to run against the data in the Collection Of Fields (a rough sketch of one follows this list).
  • No more repeated matching against the same table.  Adding additional search criteria has minimal cost.
  • Open ended/internet level scaling.  For a CRM or SaaS, the limiting factor will be the cost of storing data, not a hard limit of the technology.
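
A rough sketch of such a filter/match function, assuming a hypothetical document shape with a list of name/value fields on each contact:

    def matches(contact_doc: dict, field_name: str, expected) -> bool:
        # Hypothetical document shape:
        # {"id": 42, "fields": [{"name": "favorite_color", "value": "teal"}]}
        for field in contact_doc.get("fields", []):
            if field.get("name") == field_name and field.get("value") == expected:
                return True
        return False

    # Applied in application code against documents fetched from the NoSQL store
    # (fetched_docs is a stand-in for the result of that fetch):
    teal_fans = [d for d in fetched_docs if matches(d, "favorite_color", "teal")]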

Cons:

  • Much more complicated to set up and maintain.  Even with managed services, supporting two database technologies doubles the difficulty of CRUD.  Multiple inserts, multiple deletes, tons of ways for things to go wrong.
  • Without a relational database enforcing the data structure, poisoned or unreadable data is common.  Being able to store arbitrary data collections means you’ll invariably store buggy data.  You’ll miss some records during upgrades and have to support multiple deserializers.  You will lose customer data in the name of expediency and cost control.
  • It’s more expensive.  You’ll pay for your relational database, NoSQL database, and software to map between the two.  

Conclusion

NoSQL systems solve the scaling problems of setting up User Defined Fields in a relational database.  That scaling comes at a high price in complexity, fragility, and cost.

Reducing the complexity, fragility, and costs leads to the upcoming 3rd shift, covered in part 3.

Acceptable Beer Bellies in your codebase

How do Beer Bellies begin?

The panel is fully functional: when you push the button, a light turns on and the elevator comes.  It is also obviously wrong - the top button is flush with the mount, and the bottom button sticks out.

I found this sad Beer Belly Elevator Panel at a high-end resort and wondered how it happened.

Certainly whoever installed the mismatched button knew it was wrong.  Did the tech not care?  Was using the wrong button the only way to get the panel repaired?  Was the plan to come back and fix it when the right parts came in?

The hotel maintenance staff had to sign off on it.  Did they care about the quality of the repair?  Were they only able to give a binary assessment of “working” or “not working”?

Did the hotel manager not care?  Were they told to keep costs down?  It isn’t broken now; it would be a waste to fix something that isn’t broken.

Quality vs Letting Your Gut Hang Out

Employees at the hotel see the mismatched panel every day.  It is a constant reminder that letting things slide, just a little, is acceptable at this hotel.

When you let consistency and quality slide because something works, you’re creating beer bellies in your codebase.

One small button at a time until everyone sees that this is acceptable here.

So long as a light turns on when you hit the button, does it matter if the light is green, red, or blue?  Does it matter if the light is in the center or on the edge?

But I’m running a SaaS, not a Hotel

Your SaaS may not maintain elevator panels, but your codebase is probably full of beer bellies.

“It works, we’ll clean it up on the next release” bellies.

“This is a hack” bellies.

“This is the legacy version, we’re migrating off of it” bellies.

When you let sad little beer bellies into your codebase, your employees see exactly what you find acceptable.

Chipping Away

You have a goal, you know what it means, and what it implies.

You also know what’s blocking your progress.

It’s time to iterate!

Iterating Against the Blockers

Ask yourself:

  • What’s the first step?
  • What if all I needed was a tool?
  • How will I know if it’s working?

These three primary questions will help you chip away at the Blockers.

You want progress, not victory.  When you keep iterating away at the Blockers, eventually achieving your goals becomes easy.

Besides the questions, the main constraint to keep in mind is that each iteration needs to leave the system in a state where you can walk away from the project.

Unused functionality?  Totally fine.  

Refactored code with no new usage?  Great!

Tools that work but don’t do anything useful?  One day.

Nothing that requires keeping two pieces of code in sync.

Nothing that would prevent other developers from evolving existing code.

Whatever you do, it must go into the system with each iteration.

Example: Continuing from the Async Processing Blockers

I want to make my API asynchronous so that client size doesn’t impact the API’s responsiveness.  But, I can’t make the API asynchronous because:

  • I don’t have a system to queue the requests.
  • I don’t have a system to process the requests off of a queue.
  • I don’t have a way to make the processing visible to the client.

Attempting to do all of this work in one giant step is a recipe for a project that gets delivered 6 months to a year late.

I’m going to hand wave and declare that we are using SQS, AWS’s native queuing system.  This makes setting up the queue trivial and reliable.

I don’t have a system to queue the requests.

What’s the first step?

Write a data model and serializer.  What am I even going to write onto this queue?

What if all I needed was a tool?

Instead of worrying about a system, create a command line tool in your existing codebase to push data to SQS.  It won’t be reliable, it won’t have logging, and it won’t have visibility.

But you’re the only one using it, so that’s fine.
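
As a sketch, the whole tool can be a page of code.  The data model and queue URL here are hypothetical, and it assumes boto3 with AWS credentials already configured:

    import dataclasses
    import json

    import boto3  # assumes AWS credentials are already configured

    # A first cut at the data model; it will be wrong, and that's fine.
    @dataclasses.dataclass
    class ApiRequestEvent:
        client_id: str
        action: str
        payload: dict

    def push(event: ApiRequestEvent, queue_url: str) -> str:
        sqs = boto3.client("sqs")
        response = sqs.send_message(
            QueueUrl=queue_url,
            MessageBody=json.dumps(dataclasses.asdict(event)),
        )
        return response["MessageId"]

    if __name__ == "__main__":
        # Run by hand from your dev machine; the queue URL is a placeholder.
        message_id = push(
            ApiRequestEvent(client_id="demo", action="import", payload={"rows": 3}),
            "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue",
        )
        print("sent", message_id)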

How will I know it’s working?

Manually!  AWS has great observability.  You don’t need to do anything.

Combining your first step (a data model), your tool, and AWS observability, you’ll be able to push data onto a queue and view what got sent.

The data model will be wrong and the tool will not be production ready!

That’s ok because no existing functionality is blocked or broken.  Getting interrupted doesn’t create risk, which means you can work even if you only have a little time.

I don’t have a system to process the requests off of a queue.

What’s the first step?  

Write a data model and deserializer.  What data do I need to be on the queue in order to recreate the event I need to process?

What if all I needed was a tool?

Create a tool to pull the message off the queue and deserialize it.  Send the result to a data validator.  (You’re accepting customer requests from an API; you’d better have a data validator.)
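
A matching sketch for the pull side, with the same hypothetical queue URL and a stand-in validator:

    import json

    import boto3

    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

    def validate(event: dict) -> None:
        # Stand-in for your real data validator.
        for key in ("client_id", "action", "payload"):
            if key not in event:
                raise ValueError(f"missing field: {key}")

    def pull_one() -> dict | None:
        sqs = boto3.client("sqs")
        response = sqs.receive_message(
            QueueUrl=QUEUE_URL, MaxNumberOfMessages=1, WaitTimeSeconds=5
        )
        messages = response.get("Messages", [])
        if not messages:
            return None
        event = json.loads(messages[0]["Body"])  # deserialize
        validate(event)                          # validate before trusting it
        sqs.delete_message(
            QueueUrl=QUEUE_URL, ReceiptHandle=messages[0]["ReceiptHandle"]
        )
        return event

    if __name__ == "__main__":
        print(pull_one())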

How will I know it’s working?

Manually!  AWS has great observability.  You don’t need to do anything.

Combining the three gets you the ability to manually pull data off the queue, deserialize it, and validate it.

You can do this before, after or during your work to get the data onto a queue.  It’s not production ready, but it also doesn’t create risk.

I don’t have a way to make the processing visible to the client.

What’s the first step? 

What does visibility look like to the client?  Where does the data go in your UI?  What would you want to know?

What if all I needed was a tool?

Make an endpoint that calls AWS and returns the data you think you need.
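
A sketch of that endpoint’s core, with the web framework left out; wire it into whatever your API already uses.  The queue URL is a placeholder, and the attribute names are standard SQS queue attributes:

    import boto3

    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

    def queue_status() -> dict:
        # Returns what you think the client needs; compare it against the AWS console.
        sqs = boto3.client("sqs")
        attrs = sqs.get_queue_attributes(
            QueueUrl=QUEUE_URL,
            AttributeNames=[
                "ApproximateNumberOfMessages",
                "ApproximateNumberOfMessagesNotVisible",
            ],
        )["Attributes"]
        return {
            "queued": int(attrs["ApproximateNumberOfMessages"]),
            "in_flight": int(attrs["ApproximateNumberOfMessagesNotVisible"]),
        }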

How will I know it’s working?

Manually!  Compare what your endpoint tells you with what AWS tells you.  Don’t start until you have tools for adding and removing events from the queue.

Combining the three gets you an endpoint that tells you about the queue.

The endpoint should be safe to deploy to production.  Since nothing is writing to the queue yet, it will always be empty.

Conclusion

Iterating allows you to chip away at your blockers until there’s nothing stopping you.

Apply the three questions:

  • What’s the first step?
  • What if all I needed was a tool?
  • How will I know if it’s working?

Always keep the system in a state where you can walk away from the project.

Keep iterating against your blockers, and you’ll be amazed at how soon you’ll achieve your goals!

The Implications Of Your Characteristics

Part Three of my series on iterative delivery 

You have a goal, you have characteristics, now it’s time to ponder the implications.

The implications are things that would have to be true in order for your characteristic to be achieved.  They are levers you can pull in order to make progress.

Let’s work through an example

In part 2 I suggested that a Characteristic of a system that can support clients of any size is that API endpoints respond to requests in the same amount of time for clients of any size.

What would need to happen to an existing SaaS in order to make that true?

  • The API couldn’t do direct processing in a synchronous manner.  It would have to queue the client request for async processing.  Adding a message to a queue should be consistent regardless of how large the client, or the queue, becomes.
  • For data requests, endpoints would need to be well indexed and have constrained results.
  • Offer async requests for large, open-ended data.  An async request works by quickly returning a token, which can then be used to poll for the results (see the sketch after this list).
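
A minimal sketch of that token-and-poll shape, with an in-memory dictionary standing in for whatever actually tracks the async work:

    import uuid

    # Stand-in for wherever async work is really tracked (a table, Redis, etc.).
    REQUESTS: dict[str, dict] = {}

    def submit_async_request(query: dict) -> str:
        # Fast for clients of any size: record the work and hand back a token.
        token = str(uuid.uuid4())
        REQUESTS[token] = {"status": "pending", "query": query, "result": None}
        return token

    def poll_async_request(token: str) -> dict:
        # Clients poll with the token until a background worker fills in the result.
        return REQUESTS.get(token, {"status": "unknown"})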

Implications are measurable

How much processing can be done asynchronously today?  How much could be done last month?

How many data endpoints allow requests that will perform slowly?  Is that number going up or down?

How robust is the async data request system?  Does it exist at all?

Implications are Levers

Progress on implications pushes your service towards its goal.  Sometimes a little progress will result in lots of movement, sometimes a lot of progress will barely be noticeable.

Speaking to Implications

It is important that you can speak to how the implications drive progress towards your goal.

Asynchronous processing lets your service remain responsive.  It doesn’t mean you can process the data in a timely manner yet.  It sets the stage for parallel processing and other methods of handling large loads.

Next Steps

Before continuing on, try to come up with 3 implications for your most important characteristics.

You’ll want a good selection of implications for the next part - blockers.

We will explore what’s preventing you from moving your system in the direction it needs to go.

Getting Started With Iterative Delivery

The last four posts have been trying to convince you that iterative, baby-step delivery is better for your clients than moonshot, giant-step delivery.

But how do you get started?  How do you shorten your stride from shooting the moon to one small step?

The next series of posts is going to lay out my scaling iterative delivery framework.  This site is about scaling SaaS software, and this framework works best if you want an order of magnitude more of what you already offer your clients.  This isn’t a general framework, and it certainly isn’t the only way to get started with iterative delivery.

Work your way through these steps:

  1. Pick a goal - 1 sentence, highly aspirational and self explanatory.
  2. Define the characteristics of your goal - What measurable characteristics does your system need in order to achieve your goal?
  3. What are the implications? - What technical things would have to be true in order for your system to have all the characteristics you need?
  4. What are the blockers? - What is stopping you from making the implications true?
  5. What can you do to weaken the blockers? - Set aside the goal, characteristics and implications; what can you do to weaken the blockers?

Weakening the blockers is where you start delivering iteratively.  As the blockers disappear, your system becomes better for your clients and easier for you to implement your technical needs.

We will explore each step in depth in the following posts.

The Opposite of Iterative Delivery

Iterative Delivery is a uniquely powerful method for adding value to a SaaS.  Other than iterative, there are no respectably named ways to deliver features, reskins, updates, and bug fixes.  Big bangs, waterfalls, and quarterly releases fill your customers’ hearts with dread.

Look at 5 antonyms for Iterative Delivery:

  • Erratic Delivery
  • Infrequent Delivery
  • Irregular Delivery
  • Overwhelming Delivery
  • Sporadic Delivery

If your customers used these terms to describe updates to your SaaS, would they be wrong?

Iterative Delivery is about delivering small pieces of value to your customers so often that they know you’re improving the Service, but so small that they barely notice the changes.

Don’t be overwhelming, erratic or infrequent - be iterative and delight your customers.

The Chestburster Antipattern

The Chestburster is an antipattern that occurs when transitioning from a monolith to services. 

The team sees an opportunity to extract a small piece of functionality from the monolith into a new service, but the monolith is the only place that handles security, permissions, and composition.

Because the new service can’t face clients directly, the Chestburster hides behind the monolith, hoping to burst through at some later point.

The Chestburster begins as the inverse of the Strangler pattern, with the monolith delegating to the new service instead of the new service delegating to the monolith.

Why it’s appealing

The Chestburster’s appeal is that it gets the New Service up and running quickly.  This looks like progress!  The legacy code is extracted, possibly rewritten, and maybe better.

Why it fails

There is no business case for building the functionality the new service needs to burst through the monolith.  The functionality has been rewritten.  It's been rewritten into a new service.  How do you go back now and ask for time to address security and the other missing pieces?  Worse, the missing pieces are usually outside of the team’s control; security is one area you want to leave to the experts.

Even if you get past all the problems on your side, you’ve created new composition complexities for the client.  Now the client has to create a new connection to the Chestburster and handle routing themselves.  Can you make your clients update?  Should you?

Remember The Strangler

If you want to break apart a monolith, it’s always a good idea to start with a Strangler. If you can’t set up a strangle on your existing monolith, you aren’t ready to start breaking it apart.
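
In practice the strangle usually lives at a load balancer or API gateway, but the idea fits in a few lines.  This is a toy sketch with hypothetical URLs and paths: the client-facing entry point stays put, and routes move to the new service one prefix at a time.

    MONOLITH_URL = "https://monolith.internal"        # hypothetical
    NEW_SERVICE_URL = "https://new-service.internal"  # hypothetical

    # Paths the new service has taken over; everything else stays with the monolith.
    STRANGLED_PREFIXES = ("/api/v2/contacts",)

    def route(path: str) -> str:
        # The client never changes its integration; only this routing table changes.
        if path.startswith(STRANGLED_PREFIXES):
            return NEW_SERVICE_URL + path
        return MONOLITH_URL + path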

That doesn’t mean you’re stuck with the current functionality!

If you have the time and resources to extract the code into a new service, you have the time and resources to decouple the code inside of the monolith.  When the time comes to decompose into services, you’ll be ready.

Conclusion

The Chestburster gives the illusion of quick progress, but it quickly stalls as the team runs into problems they can’t control.  Overcoming the technical hurdles doesn’t guarantee that clients will ever update their integration.

Success in legacy system replacement comes from integrating first and moving functionality second.  With the Chestburster you move functionality first and probably never burst through.

Building Your Way Out Of A Monolith – Create A Seam

Why Build Outside The Monolith

When you have a creaky monolith, the obvious first step is to build new functionality outside the monolith.  Working on a greenfield, without the monolith’s constraining design, bugs, and even programming language, is highly appealing.

There is a tendency to wander those verdant green fields for months on end and forget that you need to connect that new functionality back to the monolith’s muddy brown field.

Eventually, management loses patience with the project and pushes the team to wrap up.  Integration at this point can take months!  Worse, because the new project wasn’t talking to the monolith, most of the work tends to be a duplication of what’s in the monolith.  Written much better, to be sure!  But without value to the client.

Integration is where greenfield projects die.  You have to bring two systems together: the monolith, which is difficult to work with, and the greenfield, which is intentionally unlike the monolith.  And you have to bring them together under pressure while delivering value.

Questions to Ask

When I start working with a team building outside their monolith, integration is the number one issue on my mind.

I push the team to deliver new functionality for the client as early as possible.  Here are 3 starting questions I typically ask:

  1. What new functionality are you building?  Not what functionality do you need to build; which parts of it are new for the client?
  2. How are you going to integrate the new feature into the monolith’s existing workflows?
  3. What features do you need to duplicate from the monolith?  Can you change the monolith instead?  You have to work in the monolith sooner or later.

First Create the Seam

I don’t look for the smallest or easiest feature.  I look for the smallest seam in the monolith.

For the feature to get used, the monolith must use it.  The biggest blocker, the most important thing, is creating a seam in the monolith for the new feature!

A seam is where your feature will be inserted into the workflow.  It might be a new function in a procedural straightaway, an adapter in your OOP code, or even a strangler at your load balancer.

The important part is knowing where and how your feature will fit into the seam. 
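
A sketch of what a seam can look like inside the monolith, with hypothetical names.  The workflow calls one function; at first that function simply wraps the existing code, and later it can delegate to the service you’re building outside.

    USE_EXTERNAL_SERVICE = False  # hypothetical feature flag

    def send_notification(event):
        # The seam: the monolith's workflow now goes through this one function.
        if USE_EXTERNAL_SERVICE:
            # Later: delegate to the new service (hypothetical client for it).
            return notification_client.send(event)
        # Today: the existing monolith behaviour, unchanged, just behind the seam.
        return legacy_send_notification(event)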

Second Change The Monolith

Once you have a seam, you have a place to start modifying the monolith to support the feature.  This is critical to prevent spending time recreating existing functionality.

Instead of recreating functionality, refactor the seam to provide it to your new service.

Finally Build Outside the monolith

Now that the monolith has a spot for your feature in its workflow, and it can support the external service, building the feature is easy.  Drop it right in!

Now, the moment your external service can say “Hello World!”, it is talking to the monolith.  It is in production, and even if you don’t finish it 100%, the parts you do finish will still be adding value.  Odds are, since your team is delivering, management will be happy to let you go right on adding features and delivering value.

Conclusion

Starting with a seam lets you develop outside the monolith while still being in production with the first release.  No working in a silo for months at a time.  No recreating functionality.

It delivers faster, partially by doing less work, partially by enabling iterations.
