Test-Driven Design (TDD) with dbt: Test First, SQL Later
If you’ve spent more than three days as an analytics engineer, you’ve probably had that moment: your dbt model runs, but the numbers look… off. Not broken, just weird enough to make you question your life choices (or at least your SQL). The code works, but the data? Not so much. And let’s be honest, most of these “surprises” start with the classic: “Let me just quickly fix this.” Fast forward a week, and you’re untangling a mess that would make spaghetti jealous. Ensuring data model quality isn’t just about passing tests or making your code run. It’s about making sure your data is actually useful, trustworthy, and won’t get you a passive-aggressive Slack message from your stakeholders.
The beauty of dbt is that it already has testing built in. You’ve got your not_null tests, your unique tests, and even some fancy unit tests. But here’s where it gets interesting: what if you wrote those tests before you wrote your model? What if you started by defining what “good” looks like for your data, then built your transformation to meet those expectations? That’s TDD for dbt. Instead of building a model and hoping it works, you define the rules first, then build the model that follows them.
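To make that concrete, here’s a minimal sketch of what “rules first” looks like in a schema file, written before the model itself exists; stg_customers and customer_id are hypothetical names:

models:
  - name: stg_customers
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null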
What is Test-Driven Design (TDD)?
Test-Driven Design (TDD) is the software engineering equivalent of writing the exam questions before you teach the class. The idea is simple: you start by writing a test that describes what your code should do, watch it fail (because you haven’t written the code yet), and then write just enough code to make the test pass. Rinse and repeat. It’s a bit like setting up booby traps for your future self—except instead of falling into a pit, you catch bugs before they escape into production. TDD forces you to think about requirements up front, keeps you honest, and gives you a warm fuzzy feeling (or at least fewer cold sweats) when you refactor your code or add new features.
If you’re thinking, “Wait, this sounds a lot like what I should be doing in dbt,” you’re absolutely right. Analytics engineering is just software engineering with more SQL and fewer memes about JavaScript frameworks. In both worlds, you want to catch problems early, automate your guardrails, and make sure your work doesn’t break the moment someone else touches it. TDD in dbt is about writing the tests that define “good data” before you build the model, so you’re not left guessing when it’s time to deploy.
In short: TDD for analytics engineers means you stop hoping your models work and start knowing they do. You write the tests that define success, then build your dbt models to pass those tests. It’s a little more work up front, but it pays off every time you make a change and your tests catch a problem before your boss (or worse, your client) does.
Testing in dbt: Overview
We know of course that dbt has tests built in. But sprinkling a few dbt tests onto a finished model is not what I mean by test-driven design (TDD). See, dbt’s tests are most often data tests, meaning they validate your data row by row. What we want to test is the logic of your models. Not too long ago dbt released a unit testing feature, which does at least some of what we need to enable a test-driven workflow. But we need more than that. Let’s take a look at all the types of tests we might need to create a development workflow based on testing.
It’s easy to get lost in dbt-specific terms like ‘generic’ or ‘custom’ or ‘data test’, but it’s good to have a conceptual understanding of what you are actually testing and why before you go into how you’re testing it. I like to roughly distinguish these areas of tests: single-column row-by-row, multi-column row-by-row, full-model / aggregated, model unit tests, and macro unit tests. Not all of those are easily supported out of the box, but with a little bit of magic dust we can make it work.
Single-column row-by-row tests
Single-column row-by-row tests help you understand if something is wrong in any of the values of a single column. This is every test you put on a specific column in your model’s YAML file. It’s your basic out-of-the-box dbt tests like unique and not_null, but also more advanced tests from other packages. For example, dbt_utils.not_empty_string allows you to test that the column has no empty string values (e.g. ''). Of course you can also create your own custom singular tests by adding them to the tests folder and making sure the query returns a row for every record with an error.
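As a sketch of such a singular test: it’s just a SQL file in the tests folder that selects the bad rows. The orders model and amount column here are hypothetical:

-- tests/assert_no_negative_amounts.sql
-- Fails if any order has a negative amount
select *
from {{ ref('orders') }}
where amount < 0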
Multi-column row-by-row tests
These tests help you understand if something is wrong in any of the rows based on a combination of columns. One of the most used tests for this scenario is dbt_utils.unique_combination_of_columns, which makes it easy to check for duplicates across columns, for example when a table contains the number of sales per week. But there are many other options. For example, dbt_utils.expression_is_true allows you to test the validity of a SQL expression over multiple columns, like a numeric calculation or a text function.
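A minimal sketch of both tests in a schema file, with hypothetical model and column names:

models:
  - name: weekly_sales
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - week
            - product_id
      - dbt_utils.expression_is_true:
          expression: "revenue = price * quantity"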
Full model tests and aggregation tests
Full model tests and aggregation tests allow you to check if something is wrong across different models or over a set of rows in a model. This is different from the row-by-row tests in the sense that it uses multiple rows at the same time in one test. The dbt_utils.cardinality_equality test is an example of this: it compares the distinct values in one column to the distinct values of another column in a different model. An aggregation test checks your assumptions about a model based on a calculation over multiple rows, like a minimum or maximum row count, or asserting that normalized values over a year add up to 1. The aggregation test is very powerful, but often underutilized in dbt projects.
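The normalized-values example could look like this as a singular test; normalized_sales, year, and share are hypothetical names, and the small tolerance guards against floating-point noise:

-- tests/assert_yearly_shares_sum_to_one.sql
-- Fails for every year whose shares don't sum to 1
select year, sum(share) as total_share
from {{ ref('normalized_sales') }}
group by year
having abs(sum(share) - 1) > 0.001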
Model unit tests
Model unit tests allow you to verify that the (business) logic in one specific model is correct by making sure that a given input returns a given output. Unit tests are great because they allow you to mock the input of multiple references and assert the exact rows you expect back, which makes it possible to test all kinds of operations in your model.
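Here’s a minimal sketch using dbt’s unit test syntax (available since dbt 1.8); the model, input, and columns are hypothetical:

unit_tests:
  - name: test_problem_aggregation
    model: daily_problems_summary
    given:
      - input: ref('stg_complaints')
        rows:
          - {complaint_id: 1, problem: 'screen broken', batch: 2}
          - {complaint_id: 2, problem: 'screen broken', batch: 2}
    expect:
      rows:
        - {problem: 'screen broken', complaints: 2}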
Macro unit tests
Unit tests for macros are a way to verify that a given macro input returns an expected output. Since macros are often used across models, being able to test their consistency is crucial to sleeping well at night, eh, I mean, to ensuring your data pipeline doesn’t break when you least expect it. Macros can’t be unit tested out of the box, but there are simple workarounds to unit test your dbt macros.
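One simple workaround is a singular test that feeds the macro a known input and fails when the output is unexpected; cents_to_dollars is a hypothetical macro here:

-- tests/unit_test_cents_to_dollars.sql
-- Fails when the macro doesn't convert 130 cents to 1.3 dollars
with result as (
    select {{ cents_to_dollars('130') }} as actual
)
select * from result
where actual != 1.3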
Model Contracts
And let’s not forget model contracts—dbt’s way of specifying exactly what your models should output. Think of them as the fine print in your data agreements. They let you define the expected columns, types, and constraints, so if someone tries to sneak in a breaking change, the contract throws the book at them (or at least fails your build). But here’s where model contracts get really interesting: they don’t just validate your data after it’s built like regular tests do. Instead, they enforce constraints at the database level during the build process itself. When you define a model contract with enforced: true, dbt will include column names, data types, and constraints in the DDL statements it submits to your data platform. This means the database itself enforces these rules, not just your dbt project.
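A minimal contract sketch for the model we’ll build later in this post; the column list is hypothetical:

models:
  - name: daily_problems_summary
    config:
      contract:
        enforced: true
    columns:
      - name: problem
        data_type: string
        constraints:
          - type: not_null
      - name: users
        data_type: int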
For TDD workflows, this is a game-changer. Instead of writing your model first and then hoping your tests catch issues, you define the contract upfront. The model won’t even build if it doesn’t match the expected structure. This forces you to think through your data requirements before writing a single line of SQL, which is exactly what TDD is all about. Plus, since the structure is checked at build time, a breaking change fails immediately instead of waiting for a separate test query, and you get stronger guarantees that your data integrity rules are actually being followed. Just note that which constraints are truly enforced varies by data platform; many warehouses only enforce not_null, for example.
Applying TDD Principles in dbt
So how do you actually apply TDD to your dbt workflow? The key is to flip your usual process on its head. Instead of writing a model and then adding tests to validate it, you start by defining what success looks like. This might sound counterintuitive at first—after all, how can you test something that doesn’t exist yet? But that’s exactly the point. You’re not testing the model; you’re defining the requirements for what the model should produce.
Start with the end in mind. What does “good” data look like for your use case? Should certain columns never be null? Should there be exactly one row per customer? Should the sum of all sales equal the total revenue? These are your test requirements. Write them down first, before you write a single line of SQL. This forces you to think through your business logic and data quality requirements upfront, which is exactly what TDD is designed to do.
Example: Row Count Test Before Model Build
Let’s say you’re building a model that aggregates data about user complaints, and suppose you also don’t want to look at problems related to the first batch of your product. Before you write the model, you can define what “reasonable” looks like. The dbt-expectations package provides a great set of tools for implementing this kind of aggregation test.
Here’s how you might define your test requirements first:
models:
  - name: daily_problems_summary
    tests:
      - dbt_expectations.expect_table_row_count_to_be_between:
          min_value: 1
          max_value: 99
          group_by: [problems]
          row_condition: "batch != 1"
You write this test before you build the model. When you run it, it will fail (because the model doesn’t exist yet), but that’s exactly what you want in TDD. The test failure tells you what you need to build. The beauty of this approach is that it forces you to think through your business assumptions upfront. Why between 1 and 99 rows? What happens if you get 0 rows (no problems)? What happens if you get 100+ rows (maybe a data quality issue)? If you find that you need to apply different constraints in different environments, just make the value depend on the target: min_value: "{{ 99 if target.name != 'prod' else 100000 }}".
The point is that by writing the test first, you’re forced to make these decisions explicit before you start coding.
Of course, you don’t have to use existing tests. The dbt_utils and dbt_expectations packages have plenty of helper macros to get you started; for example, you could build an expect_sum_between test.
models:
  - name: daily_problems_summary
    tests:
      - expect_sum_between:
          sum_column: users
          min_value: 1
          max_value: 99
          group_by_columns: [problems]
          row_condition: "batch != 1"
This custom test can use the existing dbt_expectations.expression_between macro with some added sugar.
{% test expect_sum_between(model, sum_column, min_value=None, max_value=None, group_by_columns=None, row_condition=None, include_boundaries=true) %}
    {# strictly means exclusive bounds, so it is the inverse of include_boundaries #}
    {{ dbt_expectations.expression_between(model, 'sum(' ~ sum_column ~ ')', min_value, max_value, group_by_columns, row_condition, strictly=not include_boundaries) }}
{% endtest %}
The key insight here is that you’re not limited to dbt’s built-in tests. You can build custom tests that encapsulate your business logic and reuse them across models. This is where TDD really shines—you define your testing framework based on your specific needs, then build models that satisfy those tests.
Best Practices & Tips
Now that you understand the theory and have seen some examples, let’s talk about the practical side of implementing TDD in your dbt workflow. When should you run your tests, and how do you avoid the common pitfalls that can make your testing strategy more trouble than it’s worth?
When to Run Tests
The short answer, like a good consultant: it depends. But here’s a practical approach that balances thoroughness with pragmatism.
Local development: Run unit tests and model contracts every time you make a change. These are fast and catch logic errors immediately. Data tests can be expensive, so run them selectively—maybe just on the models you’re actively working on or on a subset of data.
CI/CD pipeline: Run all tests, but be smart about it. Unit tests and contracts should always run. For data tests, consider using tags to run different sets based on the environment. Maybe run heavy aggregation tests only in production, or use different thresholds for different environments.
Production: This is where you want the full battery of data tests, but timing matters. Don’t run unit tests in production (they’re not needed and just waste compute). Not every test needs to run every time on all data: be smart about when to run data tests (weekly or monthly is often fine) and use tags to enable or disable tests as needed. Focus on the data tests and contracts that validate actual data quality to the degree production requires. And remember, just because a test passes in development doesn’t mean it will pass in production—real data has a way of surprising you.
Organizing Tests and Test Data
As your dbt project grows, so will your test suite. Here’s how to keep it manageable:
Use tags to organize tests by purpose: Use the --exclude-resource-type unit_test flag to skip unit tests in production, and create tags like critical, slow, or business_logic to run subsets of tests based on what you’re working on (see the sketch after this list). For example, dbt test --select tag:business_logic,new_feature runs all business_logic tests in only the ‘new_feature’ folder, giving you fast feedback during development.
Group related tests together, but not too much: Keep tests that validate the same business concept close to each other. If you have multiple tests for customer data quality, put them in the same YAML file or use consistent naming patterns. At the same time, you’re not charged per YAML file, so don’t be afraid to split up tests for convenience.
Create test data for unit tests: Don’t rely on production data for unit tests. Create small, controlled datasets that represent the edge cases you want to test. This makes your tests faster, more reliable, and easier to debug. These are called ‘fixtures’ and can be stored either directly in the unit test in a YAML file, or as a CSV or SQL file in the fixtures folder.
Document your test assumptions: When you write a test, document why it exists and what business rule it’s enforcing, preferably in the description field. Future you (and your teammates) will thank you when they need to understand why a test is failing.
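To illustrate the tagging idea from the first point above, a generic test takes tags through its config block; the model and tag names here are illustrative:

models:
  - name: dim_customers
    columns:
      - name: customer_id
        tests:
          - not_null:
              config:
                tags: ['critical', 'business_logic']

Running dbt test --select tag:critical then gives you a quick, focused pass over just the must-not-break assertions.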
Conclusion
Test-driven design in dbt isn’t just about writing tests—it’s about changing your mindset from “build first, test later” to “define success first, then build.” The benefits are clear: you catch problems early, you’re forced to think through your business logic upfront, and you end up with more reliable, maintainable data models.
Just like in software development, TDD isn’t a silver bullet and doesn’t need to be followed dogmatically. You don’t need 100% test coverage of every model, and sometimes writing the model first makes more sense—especially for exploratory work or when you’re still figuring out what the data should look like. The goal isn’t to be a TDD purist; it’s to use testing as a tool that helps you build better data models. Some models might need extensive testing, others might be fine with just a few key assertions. The key is to be intentional about what you test and why.
Start small: pick one model, write a simple test that defines what “good” looks like, then build the model to pass that test. Once you see how it works, expand from there. Use the tools dbt provides—unit tests, model contracts, custom tests—but don’t let perfect be the enemy of good. A few well-placed tests that catch real problems are worth more than a comprehensive test suite that never gets run.
Remember, the goal isn’t to test everything—it’s to test the things that matter. Focus on the business logic that would cause real problems if it broke, and let your tests guide your development process. Your future self (and your stakeholders) will thank you.