Quality assurance in analytics is not a phase at the end of a project. It is a habit that runs through the way you define metrics, structure data models, and review code. Teams that learn this early spend far less time firefighting faulty dashboards and more time asking useful questions. Teams that learn it late, usually after a painful quarter of misreported revenue or conflicting KPIs, end up rebuilding trust before they can build anything else.
Over the years I have seen the same pattern repeat: the data pipeline looks fine, the tech stack is modern, the visuals are pretty, yet executives argue because two dashboards show different numbers for the same metric. Nine times out of ten, the root cause is logical, not technical. Someone applied a common rule in an uncommon way, or vice versa. That is where a standard for logic comes in.
I call the approach (un)Common Logic. It is a way to separate logic everyone must follow from logic that is specific to a business unit, channel, or edge case. The distinction sounds simple. Practiced consistently, it is one of the fastest ways to raise the quality bar in analytics.
Why logic, not just data, fails analytics
A pipeline can be robust, every table can be up to date, and still, the output misleads. The failure vectors are familiar.
A definition shifts quietly. Marketing decides a signup is valid once a confirmation email is sent, while Finance still treats it as valid once the first invoice posts. Engineering implements a new event with a subtly different property name. A regional team stores VAT-inclusive amounts while the global model expects VAT-exclusive. None of these break the data platform, yet each one breaks a critical metric.
The technical instinct is to add more unit tests on columns and constraints. Useful, but incomplete. Column-level quality tells you whether the data is shaped as expected. Logic-level quality tells you whether the numbers answer the right question. Analytics QA has to do both.
The idea behind (un)Common Logic
Common logic is what the business uses everywhere. If you change it, everyone needs to agree. Uncommon logic is valid only within a clear boundary, such as a market, channel, or product tier. A healthy analytics environment keeps these apart, versioned, and testable.
Think of it as a contract. Common logic defines the canonical metrics, dimension hierarchies, and filters that any dashboard can rely on. Uncommon logic allows for the justified deviations that real life demands. For example, return windows vary by region due to consumer law. That is uncommon logic, scoped to geography. Counting a paying customer as one with at least one posted invoice in the last 30 days, not simply any billing profile created, is common logic that should not change per team.
A practical definition helps:
- Common logic is governed, named, documented, and stable for six months or more.
- Common logic lives in shared models and semantic layers that are versioned, and it is test-covered and monitored.
- Uncommon logic is explicit in its scope and justification.
- Uncommon logic lives on top of common models, not inside them, and it is easy to audit or retire.
If the distinction is not visible in your models and dashboards, you do not have standards, you have good intentions.
A brief cautionary tale
A subscription company reported monthly recurring revenue that grew 7 percent quarter over quarter. Executives planned hiring around that number. Weeks later, Finance flagged a shortfall. The growth was closer to 2 percent. The culprit was not a data outage or a broken join. It was an uncommon logic rule sneaking into a common model.
The analytics team refactored churn to exclude customers who churned due to fraud investigations. This made sense for the Risk dashboard. It did not belong in the company-wide MRR metric. Risk motivated the change, wrote a solid PR, and shipped. The MRR model imported the churn table, unaware of the exception, and the growth rate inflated.
The fix was not a reversion of code. It was a standard: fraud-related churn became an uncommon filter, applied only in Risk views. The common churn definition returned to the base model, with tests to prevent exceptions from leaking back in. A small change to where logic lived prevented a big change to the story leadership heard.
What good looks like, structurally
Logic lives in layers. A clear separation reduces accidental coupling.
- Raw or staging models, named consistently per source, with only structural changes like renaming, type casting, and deduplication. No business decisions here.
- Core business models that encode common logic, such as canonical customer, product, order, subscription, payment, and event models. These hold the standard keys, status rules, and time handling.
- Marts or feature models that add uncommon logic on top, scoped by audience, channel, or geography, and always pointing back to the common model lineage.
When every layer knows its responsibility, QA fits naturally. Type and shape tests dominate staging. Semantic and referential checks dominate core. Scenario and expectation tests dominate marts.
Data contracts, but enforceable
Any standard built on hand-waving will break under pressure. The practical data contract for analytics needs to be both human and machine enforceable.
Write it down as a short spec per common model, limited to what QA and development can check:
- The purpose, with a short plain-language description, a list of the key entities, and the queries this model must answer reliably.
- The inputs, with field-level notes on meaning, units, and time zones, plus allowed ranges and nullability.
- The outputs, with the same field-level notes and references to canonical keys.
- The invariants, such as uniqueness, one-to-one or one-to-many expectations, slowly changing dimension behavior, and allowed status transitions.
- The versions, with a change log that states whether each change is compatible or breaking, and a deprecation plan for consumers.
I have seen teams shrink incident counts by half within two quarters after adopting contracts like this, not because the documents themselves prevent bugs, but because the act of agreeing on invariants forces difficult conversations before code is written.
Time, status, and joining: where bugs hide
Every team has its own graveyard of time bugs. If you want to improve QA for analytics, start by tightening your approach to time, status, and joins.
Time. Choose a canonical model for timestamps. Store as UTC where possible, snap to daily or weekly grains with clear rounding rules, and annotate calendars with business closures and regional holidays if those matter. If your business spans time zones, define whether a day rolls over by user local time or by company time. Document which models use which rule, and test conversions. Half of the metric disputes I have mediated came down to a day boundary difference.
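As a minimal sketch of why the day-boundary rule matters, the same UTC timestamp can land on different business days depending on which rule a model applies. The timestamp and zones here are illustrative:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def business_day(ts_utc, tz_name):
    """Assign an event to a calendar day under a given time zone rule."""
    return ts_utc.astimezone(ZoneInfo(tz_name)).date().isoformat()

# An event just after midnight UTC on March 1, 2024 (a leap year).
event_utc = datetime(2024, 3, 1, 2, 30, tzinfo=timezone.utc)

company_day = business_day(event_utc, "UTC")            # company-time rule
user_day = business_day(event_utc, "America/New_York")  # user-local rule

# The two rules disagree: under user-local time the event belongs to
# February 29, so daily metrics built on each rule will not match.
```

A conversion test like this, run per model, makes the documented day-rollover rule checkable rather than tribal knowledge.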
Status. Treat status as state machines, not booleans. A customer is not simply active or inactive. They progress through created, trialing, active, delinquent, suspended, canceled. Each transition has a trigger. Encode those triggers as common logic, with a single source of truth. Then write scenario tests against event sequences. When a suspension lifts and a payment posts, what status do we expect that day, and the next?
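A sketch of that state machine, using the statuses from the text. The allowed transitions here are assumptions for illustration, not a canonical model:

```python
# Allowed status transitions, encoded once as common logic.
ALLOWED = {
    "created": {"trialing", "active"},
    "trialing": {"active", "canceled"},
    "active": {"delinquent", "canceled"},
    "delinquent": {"active", "suspended", "canceled"},
    "suspended": {"active", "canceled"},
    "canceled": set(),
}

def replay(events, start="created"):
    """Apply a sequence of target states, rejecting illegal transitions."""
    state = start
    for target in events:
        if target not in ALLOWED[state]:
            raise ValueError(f"illegal transition {state} -> {target}")
        state = target
    return state

# Scenario test: a suspension lifts when a payment posts.
assert replay(["active", "delinquent", "suspended", "active"]) == "active"
```

Scenario tests then become short event sequences with an expected end state, which reads like the business question it answers.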
Joins. Most data platforms make it easy to write an inner join that looks plausible but erases history. Keys that are stable in your head drift in the real world. Email addresses change. Device IDs reset. Sales territories move. Treat primary keys as contract fields with collision and change rules. When you must choose between left join and inner join, document the reason in code and review. If non-matching records are legitimate, maintain an unmatched row counter and alert when the rate exceeds a threshold. Joins are not just a technical step, they are a logical assertion about identity and scope.
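An unmatched-row counter can be sketched in a few lines. Plain dictionaries keep the idea library-agnostic; the rows and threshold are illustrative:

```python
orders = [
    {"order_id": 1, "customer_id": "a"},
    {"order_id": 2, "customer_id": "b"},
    {"order_id": 3, "customer_id": "zz"},  # key drifted upstream
]
customers = {"a": "Acme", "b": "Beta"}

# Count how many left-side rows fail to find a match.
matched = [o for o in orders if o["customer_id"] in customers]
unmatched_rate = 1 - len(matched) / len(orders)

# Alert when the mismatch rate exceeds the documented threshold.
THRESHOLD = 0.05  # an assumed acceptable rate for this join
should_alert = unmatched_rate > THRESHOLD
```

The same check expressed against a warehouse table turns the join's identity assumption into something monitored rather than assumed.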
The testing pyramid for analytics, adapted
Software teams borrow the idea of a testing pyramid. It translates well to analytics, with a few adjustments.
At the base, column and table checks, enforced by tools or SQL. Uniqueness, not null, accepted values, numeric ranges, freshness. These are cheap and fast. They catch malformed inputs and schema drifts.
In the middle, relation and semantic checks. Referential integrity, one-to-one expectations across keys, slowly changing dimension conformance, revenue components summing to totals. These require models to be understood as a set, not just as isolated tables.
At the top, scenario and metric assertions. State transitions, weekly cohort retention curves, MRR movement buckets reconciling to net change, revenue recognition timelines, lagged windows that match a finance ledger within a tolerance. These tests are slower and require fixtures, but they pay dividends when KPIs are on the line.
Treat failure modes differently by level. A base check failing on a staging model should fail the build. A semantic drift in a rarely used dimension can create a warning and a ticket. A metric assertion failure on a canonical KPI should block every downstream publish until investigated.
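The base-level column and table checks can be sketched as plain predicates. Real projects would express these as dbt tests or warehouse constraints; the rows and rules here are illustrative:

```python
def check_unique(rows, key):
    """Every value of `key` appears exactly once."""
    values = [r[key] for r in rows]
    return len(values) == len(set(values))

def check_not_null(rows, key):
    """No row is missing a value for `key`."""
    return all(r.get(key) is not None for r in rows)

def check_accepted(rows, key, allowed):
    """Every value of `key` comes from an agreed enum."""
    return all(r[key] in allowed for r in rows)

rows = [{"id": 1, "status": "active"}, {"id": 2, "status": "canceled"}]
assert check_unique(rows, "id")
assert check_not_null(rows, "status")
assert check_accepted(rows, "status", {"active", "canceled", "trialing"})
```

Checks like these are cheap enough to run on every build, which is exactly why a failure at this level should block the build outright.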
Instrumentation and monitoring that matters
QA does not stop at merge. It continues in production with monitoring that notices silent shifts. Too many teams alert on row counts or freshness alone. Better to watch the distribution of key fields and the ratios that express business behavior.
For example, track the share of orders with zero tax, the fraction of events with missing user IDs, the percentage of subscriptions that churn within the first 7 days, the ratio of refunds to gross sales, and the proportion of sessions tagged by a parser as bots. These ratios are stable within a band for most businesses. When they move, a logic change or an upstream behavior change likely occurred. An alert within an hour beats a dashboard correction two weeks later.
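A simple band check makes this concrete. The history window, the refund-ratio values, and the three-sigma tolerance are all assumptions you would tune per metric:

```python
def ratio_alert(history, today, k=3.0):
    """Alert when today's ratio sits more than k standard deviations
    from the recent historical mean."""
    n = len(history)
    mean = sum(history) / n
    std = (sum((x - mean) ** 2 for x in history) / n) ** 0.5
    return abs(today - mean) > k * std

# Illustrative daily refunds-to-gross-sales ratios for the past week.
refund_ratio_history = [0.021, 0.019, 0.020, 0.022, 0.018, 0.020, 0.021]

assert not ratio_alert(refund_ratio_history, 0.021)  # within the band
assert ratio_alert(refund_ratio_history, 0.060)      # behavior shifted
```

Production monitors would use a longer window and seasonality-aware bands, but the shape of the check is the same.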
Monitor lineage as well. If a dependency graph changes shape, especially at the core model layer, notify owners. A critical source added to canonical customer should prompt a review of join logic and invariants. Silent lineage growth is a common cause of accidental logic coupling.
An approach to definitions that resist drift
Every team agrees to define metrics. Fewer teams agree to define them as code and tests, not just in documentation tools.
The healthiest pattern I have used places metric definitions in a semantic layer or view that sits on common models. The definition includes a base filter, a grain, a time attribute, a measurement expression, and dimensions allowed for slicing. Each definition has unit tests that compute the metric on a known fixture dataset where edge cases are present: leap days, refunds after cancellation, free trials converting mid-period, partial period proration, and currency changes.
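A stripped-down sketch of a definition-as-code, with an assumed metric and fixture. Real semantic layers express this declaratively, but the parts are the same:

```python
# A metric definition: base filter, grain, time attribute, measure,
# and allowed slicing dimensions. Names are hypothetical.
ACTIVE_CUSTOMERS = {
    "base_filter": lambda r: r["status"] == "active",
    "grain": "customer",
    "time_attribute": "as_of_date",
    "measure": lambda rows: len({r["customer_id"] for r in rows}),
    "allowed_dims": {"region", "plan"},
}

def compute(metric, rows):
    """Apply the base filter, then the measure expression."""
    return metric["measure"]([r for r in rows if metric["base_filter"](r)])

# Fixture with a deliberate edge case: a duplicated customer row.
fixture = [
    {"customer_id": "a", "status": "active", "as_of_date": "2024-02-29"},
    {"customer_id": "a", "status": "active", "as_of_date": "2024-02-29"},
    {"customer_id": "b", "status": "canceled", "as_of_date": "2024-02-29"},
]
assert compute(ACTIVE_CUSTOMERS, fixture) == 1  # deduped at customer grain
```

The fixture assertion is the point: anyone who changes the filter or the measure breaks a test that names the business rule.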
When new product launches or pricing changes occur, create small fixture datasets that mimic the new behavior. Wire them into the unit tests before the launch. You will catch misalignments early. I have watched teams catch VAT-inclusive bugs that would have caused a 4 to 6 percent revenue overstatement in EMEA because the fixture made the inclusive amounts obvious compared to the expected outputs.
Handling ambiguity without stalling
Perfect definitions are rare. What matters is how you move when ambiguity appears.

Treat ambiguous logic as uncommon by default. Place it in a mart or a view scoped to the stakeholder who needs it. Mark it experimental with a sunset date, say 90 days out. Require that a permanent place for it be reviewed in a standards meeting before that date. This keeps work moving while signaling that the logic should not leak into common models.
Also, track questions asked more than twice about a metric. If your support channel sees repeat confusion over a dimension like active user, the problem is with the definition or its communication. Set aside time every two weeks to refine those hot spots. The hours invested here save days of churn later.
A compact checklist for (un)Common Logic in practice
- Separate models by purpose: staging, common core, and scoped marts, with clear contracts at each boundary.
- Treat definitions as code, with fixtures and tests that capture edge cases and business rules, not just schema constraints.
- Classify logic deliberately. Common logic is governed and versioned. Uncommon logic is explicit, scoped, and reversible.
- Monitor ratios and semantic distributions, not just freshness and counts, and alert on lineage changes in core models.
- Review time, status, and joins as first-class logic decisions, with documented reasons and thresholds for acceptable mismatch.
Tooling that helps without owning your brain
Tools do not create standards. They can enforce and encourage them. Teams find success with:
dbt or a similar build tool to encode model dependencies and tests. Write custom tests when needed. A generic unique test catches a duplicate, but a revenue composition test that reconciles line items to invoice totals prevents subtle revenue leakage.
Great Expectations, Soda, or native warehouse checks to codify expectations. Keep expectations small and meaningful. I have audited projects with thousands of checks that added noise. A few hundred well-chosen assertions on the core layer outperform a blizzard of shallow checks.
A semantic layer or metric store where definitions live. Whether that is a purpose-built platform or a thin modeling layer in your BI tool, the key is versioned definitions and test hooks. Metrics defined only in dashboard filters will drift.
Data contracts or schemas at the ingestion layer. Even a JSON schema with allowed enums for event types and property names avoids hundreds of downstream cleanups. Put rejections on a dead letter queue and report on them weekly.
A lineage-aware catalog. Not for vanity, but to make responsibility visible. Every core model should have an owner and a maximum acceptable time to investigate an alert, stated in hours, not days.
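The ingestion-layer contract described above can be sketched in a few lines: allowed event types, required properties, and rejects routed to a dead letter queue. The enum values and property names are illustrative:

```python
ALLOWED_EVENTS = {"signup", "purchase", "refund"}
REQUIRED_PROPS = {"event_type", "user_id", "ts"}

def ingest(events):
    """Accept events that satisfy the schema; route the rest to a DLQ."""
    accepted, dead_letter = [], []
    for e in events:
        if REQUIRED_PROPS <= e.keys() and e["event_type"] in ALLOWED_EVENTS:
            accepted.append(e)
        else:
            dead_letter.append(e)  # report on these weekly
    return accepted, dead_letter

ok, dlq = ingest([
    {"event_type": "signup", "user_id": "u1", "ts": "2024-03-01T00:00:00Z"},
    {"event_type": "sign_up", "user_id": "u2", "ts": "2024-03-01T00:00:01Z"},
])
# The misspelled event type lands in the dead letter queue instead of
# silently polluting downstream models.
```

Even this small a gate, placed before the warehouse, replaces hundreds of downstream cleanups with one weekly DLQ report.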
Edge cases that separate mature teams from aspiring ones
Multi-currency revenue. Decide where conversion happens, at what rate, and when. Convert at the line item or invoice level, not at report time. Keep both the original and converted amounts, with the rate used. If finance uses a period-end rate for reporting but product analytics wants purchase-time rates, separate the common and uncommon logic and test both against fixtures.
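A sketch of line-item conversion that keeps the original amount and the rate used. The rate table is illustrative; a real system would key rates by date and source:

```python
RATES_TO_USD = {"EUR": 1.08, "USD": 1.0}  # assumed purchase-time rates

def convert_line(item):
    """Convert one line item, preserving the original amount and rate."""
    rate = RATES_TO_USD[item["currency"]]
    return {
        **item,
        "amount_usd": round(item["amount"] * rate, 2),
        "fx_rate": rate,  # stored so the conversion is auditable
    }

line = convert_line({"invoice_id": 7, "amount": 100.0, "currency": "EUR"})
# Both the original and converted amounts survive, so Finance can later
# restate with a period-end rate without losing the purchase-time view.
```

Because the rate travels with the row, the uncommon period-end restatement can be layered on top without touching the common model.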
Refunds and chargebacks. Do not subtract refunds from gross revenue in a way that hides return behavior. Keep refund counts and amounts separate, tie them to the original transaction, and include the refund date and reason code. Reconcile net revenue movement with explicit refund and chargeback buckets.
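The reconciliation reads naturally as a metric assertion. The amounts and tolerance below are illustrative:

```python
def reconciles(reported_net, gross, refunds, chargebacks, tol=0.01):
    """Net revenue must equal gross minus explicit refund and
    chargeback buckets, within a small tolerance."""
    return abs(reported_net - (gross - refunds - chargebacks)) <= tol

gross_sales = 10_000.00
refunds = 450.00
chargebacks = 50.00

assert reconciles(9_500.00, gross_sales, refunds, chargebacks)
```

When this check fails, either a bucket is missing a transaction or a model netted refunds away silently; both are worth knowing before the board deck.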
User identity. Build a durable user key that is not dependent on login status or cookies alone. Reconciliation between anonymous events and authenticated sessions should happen in common logic, with explicit matching rules. Measure how many sessions stitch to a user and alert on drops. Marketing campaigns rely on this number, and nothing erodes trust faster than a sudden unexplained change in attributed conversions.
Late arriving facts. Warehouses make it easy to rebuild yesterday. Business reality means long-tail updates arrive days later. Document acceptable late arrival windows per model and create backfill jobs as part of the standard, not a manual fix. Flag metrics sensitive to backfills with a confidence score for the last N days. Executives appreciate a number that says 93 percent confidence today, 99.7 percent in three days.
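A confidence score of that kind can be as simple as a lookup of the share of late updates that typically arrive by each age. The arrival curve below is an assumption you would fit from your own backfill history:

```python
# Fraction of eventual records observed by day-age, fit from history
# (illustrative values matching the example in the text).
ARRIVED_BY_AGE = {0: 0.93, 1: 0.97, 2: 0.99, 3: 0.997}

def confidence(days_old):
    """Confidence that a metric for a day this old is complete."""
    return ARRIVED_BY_AGE.get(min(days_old, 3), 1.0)

# Today's number ships with 93 percent confidence; by day three it
# reaches 99.7 percent, matching the documented late-arrival window.
```

Surfacing the score next to the metric turns "the number changed" from an incident into an expectation.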
Privacy and deletion. Design deletion as a first-class event. If a user requests deletion, your common user model must reflect removal while preserving aggregates where allowed. QA should include tests that confirm aggregated metrics remain stable within expected tolerances after deletions, and that sensitive attributes disappear across all layers.
A sample workflow that keeps quality high without slowing delivery
- Write or update the contract for any affected common model. Keep it to one page, focused on purpose, invariants, and changes.
- Build staging models with minimal logic and add base checks. Validate freshness and shape before proceeding.
- Add or adjust core models to encode common logic. Write semantic tests and at least one scenario test using a small fixture that exercises expected edge cases.
- Layer marts for uncommon logic, scoped and documented. Keep diffs small. Require reviewers to check scoping language in code and documentation.
- Ship with monitoring hooks on ratios and lineage. Define alert routes by model owner and expected response time.
Tight loops win. The process above can run in a day for small changes and a week for larger launches. The first time you apply it, it will feel heavy. By the third iteration, it feels like a seatbelt, not a harness.
How to arbitrate disagreements without politics
Disagreements over logic will happen. The goal is to resolve them quickly and keep the decision visible.
Set up a small standards group, three to five people, with representation from analytics engineering, a business stakeholder like Finance or Product, and someone close to the data sources. Grant them decision rights on common logic and versioning. Record decisions in the contracts and require migration plans for incompatible changes. Hold a weekly 30-minute session that reviews proposed changes and incidents. Keep a backlog of contested points and timebox debates. If consensus cannot be reached within two meetings, choose a default, label it temporary, and set a date to revisit with new data.
People trust processes that produce predictable outcomes. Process, here, means the path from discovery to decision to code to tests to monitoring.
An anecdote on speed vs quality
A growth team once asked for a new definition of activated user, to be used in a campaign that launched in three days. Their proposal counted any user who clicked a certain feature within seven days of signup. Product analytics argued that activation required both the click and a successful completion of a workflow. Time was short. The traffic was large. The team wanted to move.
We used the (un)Common Logic lens. The existing common definition remained intact. We created an uncommon view called growth_activation_v1 for the campaign, with code and tests scoped to the growth mart, and set a sunset date 60 days out. Monitoring tracked the ratio between the new definition and the common one. The campaign launched on time. Two weeks later, the monitoring showed the growth definition overstated activation by 18 to 22 percent compared to common. The team adjusted targeting. Two months later, we aligned on a shared definition that preserved speed and accuracy. No dashboards broke, and no one argued over a phantom drop in activation.
Speed and quality are not enemies. Undefined logic is the enemy.
Measuring the impact of logic standards
Executives will ask how you know the standards help. Treat QA like any other product and measure outcomes.
Incident counts and time to detection, broken down by stage in the pipeline and by common vs uncommon logic. If most incidents arise from common models, you need stronger contracts and tests. If most arise from marts, you may be scoping too much as uncommon or duplicating logic across teams.
Metric volatility bands, especially for canonical KPIs. A tighter band after introducing standards indicates fewer unplanned logic changes.
PR review times and rework rates. If review times spike and rework is high, standards may be too rigid or unclear. If review times drop while incidents hold steady or decline, you found a productive balance.
Trust signals, informal but telling. Fewer Slack threads asking why two dashboards disagree. Fewer last-minute reconciliations before board meetings. These are hard to quantify but easy to feel when they shift.
The quiet power of naming
Names influence behavior. A common anti-pattern is a model named customers that mixes real customers with prospects, test accounts, and churned records. Rename it to customer_universe and create customer_active as a separate model with a clear status machine. Sudden clarity follows. Another is metrics named revenue that mix gross and net. Rename them to revenue_gross and revenue_net, and make conversions explicit. Teams step into fewer traps when names state the logic openly.
The same applies to uncommon logic. Prefix or suffix with the scope, such as mrr_by_region_apac_rules or activation_growth_v1. In code review, these names act like road signs that warn you when a local rule tries to sneak onto the highway.
Where teams stumble when adopting standards
The first stumble is overreach. A team tries to define every metric at once, writes thick documents, and stalls delivery. Start with the three to five KPIs that appear in leadership decks. Harden them with contracts, tests, and monitoring. Expand from there.
The second is neglecting migration. Changing common logic without a clear deprecation path leaves consumers stranded. Provide a parallel run window, migration guides, and decommission dates. Expose both old and new definitions, with warnings on the old, for a defined period.
The third is tool-chasing. New tools promise to solve semantics. They help, but without clear responsibility and a habit of writing tests that encode business rules, you will wrap old problems in new wrappers.
The fourth is culture. If analysts feel that raising a logic question delays them or earns a reprimand, they will route around standards. Celebrate catches. Publicize near-misses and the fixes. Make it safe to say, this rule seems uncommon, can we scope it?
Bringing it together
QA for analytics lives or dies on logic. Data quality matters, but it is table stakes. What separates a team that hits its stride from one caught in loops of reconciliation is a deliberate, disciplined approach to logic that acknowledges reality. Business rules do vary by region and channel. Definitions do change as products evolve. The trick is to keep common and uncommon apart, versioned, and visible, then test them as if they were code, because they are.
The (un)Common Logic frame gives you a vocabulary and a structure. Build core models that everyone can trust. Layer specific rules on top, with scope and sunsets. Write tests that read like stories the business would recognize. Monitor the ratios that tell you when behavior shifts. Keep names honest. Decide together, with a process that is faster than debate.
Do this for a quarter and you will notice something quiet but profound. Meetings focus on what to do, not whose number is right. Analysts spend more time exploring and less time reconciling. Engineers fix the right problems. The data stack fades into the background, as it should, and the logic earns the trust.
