User Defined Field Patterns 2 – NoSql Relations

In part 1 - I covered the classic solution for User Defined Fields; simple but unscalable.

NoSQL emerged as a solution to relational fields in the late 2000s.  Instead of having a meta table defining fields in a relational database, the User Defined data would live in NoSQL.

The structure would look like this:

This model eliminates the meta programming and joining the same table against itself.  The major new headache that this model creates is difficulty in maintaining the integrity of the field data.

Pros:

  • No complicated meta programming.  Instead you write a filter/match function to run against the data in the Collection Of Fields.
  • No more repeated matching against the same table.  Adding additional search criteria has minimal cost.
  • Open ended/internet level scaling.  For a CRM or SaaS, the limiting factor will be the cost of storing data, not a hard limit of the technology.

Cons:

  • Much more complicated to set up and maintain.  Even with managed services supporting two database technologies doubles the difficulty of CRUD.  Multiple inserts, multiple deletes, tons of ways for things to go wrong.
  • Without a relational database enforcing the data structure, poisoned or unreadable data is common.  Being able to store arbitrary data collections means you’ll invariably store buggy data.  You’ll miss some records during upgrades and have to support multiple deserializers.  You will lose customer data in the name of expediency and cost control.
  • It’s more expensive.  You’ll pay for your relational database, NoSQL database, and software to map between the two.  

Conclusion

NoSQL systems solve the scaling problems with setting up User Defined Fields in a relational database.  The scaling comes with high costs in terms of complexity, fragility and costs.

Reducing the complexity, fragility, and costs leads to the upcoming 3rd shift, covered in part 3.

User Defined Field Implementations For CRMs

This series covers a brief history of the 2 historic patterns for implementing User Defined Fields in a CRM, the upcoming hybrid solution that provides the best of both worlds, and how to evolve your existing CRM to the latest pattern.  If you care about CRM performance, scaling, or cost, this series is for you!

What are User Defined Field Patterns?

Every CRM provides a basic fields for defining a customer.  Every CRM’s basic field set is different depending on the CRM’s focus.  So, every user of a CRM needs to expand the basic definition in some way.  Birthdays, purchase history, and interests are three very common additions.

The trick is allowing users to define their own fields in ways that don’t break your CRM.

The Three Patterns

At a high level, there have been three major architectures for implementing Custom Fields.  Most of the design is driven by the strengths and weaknesses of the underlying database architecture.

Pattern 1, generalized columns in a database, spanned the dawn of time until the rise of NoSQL around 2010.

Pattern 2, NoSQL, began around 2010 and continues to today.

Pattern 3, JSON in a relational database, began in the late 2010s and combines the best of the two approaches

Pattern 1 - All in a Relational Database

Before the rise of NoSql there was pretty much one way to build generic user defined fields.

The setup is simple, just 3 tables.  A table of field definitions, a table for contacts, and a relational table with the 2 ids and the value for that contact’s custom field.

The Pros

  • This design is extremely simple and can be implemented by a single developer very quickly.
  • Basic CRUD operations are easy and efficient.

The Cons

  • Building search queries requires complicated techniques like metaprogramming.
  • Every search criteria results in a join against the ContactFields table.  This results in an exponential explosion in query times.
  • The lack of defined table columns handicaps the database’s query optimization strategies.

Conclusion

The classic relational database pattern is easy to set up, but has terrible scaling.  This super simple example would bog down by 1,000 contacts and 50 fields.  

There are lots of ways to redesign for scale, but this is a SHORT history.  Suffice it to say that it takes extremely complex and finicky systems to scale past 100,000 contacts and 1,000 fields.

The solutions to the classic pattern’s scaling led to the NoSQL revolution, covered in part 2.

Cheerfully Going To Fail In Production

This weekend I cheerfully went to lose at a fencing tournament.

That’s not negativity, I’ve been fencing off and on for 30 years and I just started up again after a multi-year hiatus.  I fenced terribly, which I already knew from practice.  What I didn’t know, what practice couldn’t tell me, is where I would break down when I fenced for real.

I was just competent enough at the things I knew to watch for, that my opponents destroyed me in ways I didn’t even remember to practice.

What does this have to do with software and my mantra of Never Rewrite?

After years of not doing maintenance, I am a legacy system.  Practice is literally a testing environment.  If I had spent months working on my known problems, when I finally got to production, I would still have been destroyed by my blind spots.

You can choose to spend months rewriting only to have your system fail when you finally make it to production.

Or, you can choose to cheerfully go to production with what you’ve got so that you can see what breaks and iterate.

Acceptable Beer Bellies in your codebase

How do Beer Bellies begin?

The panel is fully functional, when you push the button a light turns on and the elevator comes.  It is also obviously wrong - the top button is flush with the mount, and the bottom button sticks out.

I found this sad, Beer Belly Elevator Panel, at a high end resort and wondered how it happened.

Certainly whomever installed the mismatched button knew it was wrong.  Did the tech not care?  Was using the wrong button the only way to get the panel repaired?  Was the plan to come back and fix it when the right parts came in?

The hotel maintenance staff had to sign off on it.  Did they care about the quality of the repair?  Were they only able to give a binary assessment of “working” or “not working”?

Did the hotel manager not care?  Were they told to keep costs down?  It isn’t broken now, it would be a waste to fix something that wasn’t broken.

Quality vs Letting Your Gut Hang Out

Employees at the hotel see the mismatched panel every day.  It is a constant reminder that letting things slide, just a little, is acceptable at this hotel.

When you let consistency and quality slide because something works, you’re creating beer bellies in your codebase.

One small button at a time until everyone sees that this is acceptable here.

So long as a light turns on when you hit the button does it matter if the light is green, red or blue?  Does it matter if the light is in the center or on the edge?

But I’m running a SaaS, not a Hotel

Your SaaS may not maintain elevator panels, but your codebase is probably full of beer bellies.

“It works, we’ll clean it up on the next release” bellies.

“This is a hack” bellies.

“This is the legacy version, we’re migrating off of it” bellies.

When you let sad little beer bellies into your codebase, your employees see exactly what you find acceptable.

Is This A Quality Panel?

I recently stepped into an elevator and saw this panel:

The panel was clean, full of high quality materials, and everything worked.

Quality is about more than the functionality, does this look like a quality panel?

Everything works!

Push a button and it lights up!

Sure, it might light up red, green or blue. 
And the light might be around the edge or in the center. 
And some of the buttons are flush with the mount, while others extend out; but that doesn't impact the light turning on.

There are 12 possible button implementations, and 5 of them appear randomly.

But when you push the button, the light turns on!

What does that have to do with SaaS Scaling?

No matter how excellent any individual endpoint implementation is, having an API with endpoints that work differently decreases the overall quality of your product.

Having a UI with mismatched widgets and styles increases the user’s cognitive load and decreases quality, even when the differences don’t change any functionality.

Consistency during the scaleup period can be difficult as multiple new teams spin up, but it’s critically important if you want a quality product.

2022 Year In Review

It’s the time of year to look back, reflect, and highlight some interesting pieces you might have missed.

By The Numbers

I set a personal goal of publishing every week, or at least 50 articles for 2022.
All told, I published 32 articles - 64% of my goal.

I may not have hit my goal, but I'm very proud of the pieces I did write!

Top 3 Articles From 2022

These were the most popular articles that I wrote in 2022:

  1. My series of articles on SaaS Tenancy Models
  2. The Opposite Of Iterative Delivery
  3. The Chestburster Anti-Pattern

Top 3 Articles In 2022

These were the three most popular articles in 2022 that I wrote in past years:

  1. Links VS Tags, A Rabbit Hole
  2. The 5 Ws For Developers
  3. Your Database is Not A Queue

Two Published Articles

I became a published author this year, with 2 pieces on leaddev.com

  1. The Dangers Of Pulling Rank as a Staff Engineer
  2. How to Break the "Get Me Everything Cycle"

Goals For 2023

I'm going to spend more time writing in 2023. My goal is to publish 100 pieces, approximately 2 per week.

Happy Holidays and I'll be back in the new year!

Do You Punish Customers For Loyalty?

Does your Customer’s experience with your service get better over time?

Does it get worse?

SaaS software often punishes long term clients in subtle and frustrating ways.

Do your CRM customer screens show a decade of buying history?

How many emails can a contact open before you can’t open the contact?

Do marketing campaigns, contact lists, and tags accumulate over the years?

Do database inserts slow down as you write the 10 millionth row into a log table?

There are countless ways to punish customers for staying with you for years.  It’s not a startup problem, it sneaks in as you become a scaleup.  The flood of new customers blinds you to the slow leak as your most loyal customers churn.

When your longest customers complain about performance more than your largest, chances are your software is punishing them for being loyal.

Chipping Away

You have a goal, you know what it means, and what it implies.

You also know what’s blocking your progress.

It’s time to iterate!

Iterating Against the Blockers

Ask yourself:

  • What’s the first step?
  • What if all I needed was a tool?
  • How will I know if it’s working?

These three primary questions will help you chip away at the Blockers.

You want progress, not victory.  When you keep iterating away at the Blockers, eventually achieving your goals becomes easy.

Besides the questions, the main constraint to keep in mind is that each iteration needs to leave the system in a state where you can walk away from the project.

Unused functionality?  Totally fine.  

Refactored code with no new usage?  Great!

Tools that work but don’t do anything useful?  One day.

Nothing that requires keeping two pieces of code in sync.

Nothing that would prevent other developers from evolving existing code.

Whatever you do, it must go into the system with each iteration.

Example: Continuing from the Async Processing Blockers

I want to make my API asynchronous so that client size doesn’t impact the API’s responsiveness.  But, I can’t make the API asynchronous because:

  • I don’t have a system to queue the requests.
  • I don’t have a system to process the requests off of a queue.
  • I don’t have a way to make the processing visible to the client.

Attempting to do all of this work in one giant step is a recipe for a project that gets delivered 6 months to a year late.

I’m going to hand wave and declare that we are using SQS, AWS’s native queuing system.  This makes setting up the queue trivial and reliable.

I don’t have a system to queue the requests.

What’s the first step?

Write a data model and serializer.  What am I even going to write onto this queue?

What if all I needed was a tool?

Instead of worrying about a system, create a command line tool in your existing codebase to push data to SQS.  It won’t be reliable, it won’t have logging, and it won’t have vizability.

But you’re the only one using it, so that’s fine.

How will I know it’s working?

Manually!  AWS has great observability.  You don’t need to do anything.

Combining your first step, a data model, your tool and AWS observability you’ll be able to push data onto a queue and view what got sent.

The data model will be wrong and the tool will not be production ready!

That’s ok because no existing functionality is blocked or broken.  Getting interrupted doesn’t create risk, which means you can work even if you only have a little time.

I don’t have a system to process the requests off of a queue.

What’s the first step?  

Write a data model and deserializer.  What data do I need to be on the queue in order to recreate the event I need to process?

What if all I needed was a tool?

Create a tool to pull the message off the queue and deserialize.  Send the result to a data validator.  (You’re accepting customer requests from an API, you’d better have a data validator)

How will I know it’s working?

Manually!  AWS has great observability.  You don’t need to do anything.

Combining the three gets the ability to manually pull data off the queue, deserialize and validate.

You can do this before, after or during your work to get the data onto a queue.  It’s not production ready, but it also doesn’t create risk.

I don’t have a way to make the processing visible to the client.

What’s the first step? 

What does visibility look like to the client?  Where does the data go in your UI?  What would you want to know?

What if all I needed was a tool?

Make an endpoint that calls AWS and returns the data you think you need.

How will I know it’s working?

Manually!  Compare what your endpoint tells you with what AWS tells you.  Don’t start until you have tools for adding and removing events from the queue.

Combining the three gets you an endpoint that tells you about the queue.

The endpoint should be safe to deploy to production.  The queue is always empty.

Conclusion

Iterating allows you to chip away at your blockers until there’s nothing stopping you.

Apply the three questions:

  • What’s the first step?
  • What if all I needed was a tool?
  • How will I know if it’s working?

Always keep the system in a state where you can walk away from the project.

Keep iterating against your blockers, and you’ll be amazed at how soon you’ll achieve your goals!

The Blockers

Implications, defining what has to be true for your goal to succeed, are the hardest step.

Step 4, The Blockers, are relatively easy.

What’s Stopping You?

What is stopping you and your team from making your implications true?

I’m not working on it, and there aren’t enough hours in the day, aren’t valid answers.

The Implications are too big, too complex and too scary to tackle head on.  If they weren’t, you wouldn’t be stuck.

Defining the blockers is about naming the steps you can’t take; the chasims you can’t cross.

Continuing with the Async Processing Example

Continuing with the example of using asynchronous processing to make API response time consistent for customers of any size.

Why can’t I make the API asynchronous? 

Three big reasons:

  1. I don’t have a system to queue the requests.
  2. I don’t have a system to process the requests off of a queue.
  3. I don’t have a way to make the processing visible to the client.

The reasons create a giant requirement loop.  I can’t build a job system until I have a queue to read off of, and I can’t queue the work until I can process it.  I also need endpoints and UI to keep the customer informed of how the processing is going, if anything got rejected, and a way to remediate failures.

That’s a lot of work!

Even dropping customer observability, adding queueing and processing is a big step.

Big steps are scary, and not iterative.

Next Steps

Before going on to the next step, pick your 2-3 most important Implications, and write down 3 Blockers for each.

I recommend you don’t write down more than 9 Blockers before moving on to Step 5 - Weakening The Blockers

Site Footer