Data Quality Standards at (un)Common Logic

Data is the raw material of every decision we make for clients, from budget reallocations to forecasting next quarter’s pipeline. What often gets overlooked is that data quality is not a single dimension or a one-time setup. It is a living standard, a set of practices that has to work on ugly days as well as pretty ones. At (un)Common Logic, we treat data quality as a product with its own lifecycle, owners, service levels, and continuous improvement loop. That approach makes our analysis clearer, our testing faster, and our recommendations more reliable in the boardroom.

What we mean by “quality” in real operations

Ask ten teams to define data quality and you will hear ten answers: accuracy, completeness, timeliness, and so on. All true, yet on their own they do not help a performance marketer or analytics manager decide whether to launch a campaign or pause it. Our bar is pragmatic. Data must be accurate enough to change a decision, fast enough to be acted on, and explainable enough that a skeptical CFO will trust the number after two questions.

That principle turns into standards that guide daily work. We set numeric thresholds, document business rules, and attach owners to checks. When a platform API breaks or cookies expire early or a developer pushes an event schema change without notice, the system still catches discrepancies, flags what is safe to use, and provides a path to recovery.

The dimensions we measure and the thresholds we enforce

Quality is multidimensional, and different analyses deserve different tolerances. A same-day budget decision needs a timely directional signal, while a board deck needs reconciled, audit-ready figures. Here are the core dimensions we track and the baselines we communicate to stakeholders.

    Accuracy: Directional accuracy for intra-week optimization must stay within a 1 to 2 percent variance of the platform of record. Quarter-end revenue or lead counts must reconcile within 0.5 to 1.0 percent of source systems.

    Completeness: Key fields such as campaign ID, date, channel, device, and primary conversion must be populated in 99 percent of rows in our analytics layer. If a new channel launches, the coverage rule extends within two weeks of first spend.

    Timeliness: Ingest and transform windows are documented per system. Most ad platforms load hourly and are available in dashboards within two hours. CRM and billing systems often run nightly and publish before 7 a.m. local time.

    Consistency: Business rules like channel taxonomy, currency conversion, and attribution windows are versioned, tested, and applied uniformly. Breaking changes require change control and explicit approvals.

    Lineage and traceability: Every number on a client-facing dashboard links back to a documented query, data source, and timestamp. We preserve source identifiers and hashes so sampling or deduping steps are explainable.

These baselines are not hand-waving. They are codified as unit tests in our transformation layer, assertions in orchestration, and alerts in our monitoring. When a dataset deviates, it does not casually make its way into a presentation.
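To make the completeness baseline concrete, an assertion like the 99 percent rule can be expressed in a few lines. This is a minimal sketch, not our actual test suite; the field names and row format are hypothetical:

```python
REQUIRED_FIELDS = ["campaign_id", "date", "channel", "device", "primary_conversion"]
COMPLETENESS_THRESHOLD = 0.99  # the 99 percent baseline described above

def completeness_failures(rows, required=REQUIRED_FIELDS, threshold=COMPLETENESS_THRESHOLD):
    """Return the required fields whose populated-row share falls below the threshold."""
    total = len(rows)
    failures = []
    for field in required:
        populated = sum(1 for row in rows if row.get(field) not in (None, ""))
        if total and populated / total < threshold:
            failures.append(field)
    return failures
```

In practice a check like this runs after each load, and any non-empty result opens a ticket rather than letting the table flow into dashboards.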

From click to decision, the quality lifecycle

The lifecycle of quality inside (un)Common Logic maps to how data moves. This is less glamorous than algorithms, but it is where trust comes from.

First, collection. Most projects start with client system inventories. We pull a list of everything that generates spend or leads, then score those systems for maturity and reliability. A paid social account with clean UTM governance ranks higher than a one-off affiliate program with manual reporting. During implementation, we create tracking plans that declare event names, property types, and ownership. Engineers hate ambiguity, and so do we. If a client’s dev team manages analytics tagging, we give them exact payload examples and acceptance tests, then we document what can reasonably be captured on day one versus phase two.

Next, ingestion. We prefer official connectors and documented APIs that handle backfills, rate limiting, and schema drift. If a connector says it will support a backfill of 13 months, we test it with a constrained range first, check for pagination issues, then run the full backfill after hours. For brittle or bespoke sources, we wrap ingestion with idempotent jobs and maintain source-side logs. When an upstream platform changes a column name or a data type without warning, our schema validation prevents the whole pipeline from silently failing forward.
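The schema validation step can be as simple as comparing an incoming batch against a declared contract before loading it. A sketch, with a hypothetical spend-feed contract:

```python
EXPECTED_SCHEMA = {  # hypothetical contract for a spend feed
    "campaign_id": str,
    "date": str,
    "spend": float,
}

def schema_drift(batch, expected=EXPECTED_SCHEMA):
    """Compare an incoming batch's columns and types against the contract.
    Returns (missing_columns, unexpected_columns, type_mismatches)."""
    if not batch:
        return set(expected), set(), {}
    sample = batch[0]
    missing = set(expected) - set(sample)
    unexpected = set(sample) - set(expected)
    mismatches = {
        col: type(sample[col]).__name__
        for col, expected_type in expected.items()
        if col in sample and not isinstance(sample[col], expected_type)
    }
    return missing, unexpected, mismatches
```

If an upstream platform silently renames "spend" to "cost", this kind of check reports a missing column and an unexpected one, and the load halts instead of failing forward.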

Then, transformation. Business logic lives here, and this is also where bugs like to hide. We treat transformations like software. Every rule change, even a seemingly harmless currency mapping, runs through code review, unit tests, and sample data checks. If we introduce a new attribution rule, we version it, create a comparison model so analysts can see the delta before and after, and we annotate dashboards with the effective date of the rule. It sounds fussy. It saves projects.

After that, storage and modeling. We design models for use, not for elegance. Performance marketers need grain that aligns with spend and conversion decisions. That usually means a daily by channel, campaign, ad set or ad group, and device view, plus a separate, slower moving model for lifecycle outcomes like SQLs and revenue. We mark every table with freshness metadata and row counts. When a model becomes deprecated, we hide it from default search and schedule a retirement date.

Finally, activation and reporting. No number goes live without at least two sets of human eyes on the first release. We include help text inside dashboards that states attribution definitions, time windows, and known caveats. If a platform like Google Ads reports modeled conversions separately from observed ones, we display both, with context baked into the viz.

What the checks look like in practice

Checks only work if they are practical. We do not run a thousand brittle assertions that fire every morning. The goal is to catch real problems, not cry wolf. Our base suite for a multi-channel performance account includes the following:

    Source freshness checks that compare the last ingested_at timestamp to the scheduled frequency, with tolerances for known maintenance windows.

    Volume anomaly detection that compares yesterday’s spend and conversions to a trailing baseline. For a stable account, we set an alert at 3 standard deviations for spend and 2 for conversions, then we tune it over time.

    Referential integrity checks that ensure every spend row maps to a known channel taxonomy and that every conversion has a recognized event type.

    Field-level completeness checks for required identifiers and date fields, with thresholds that trigger incident escalation if nulls exceed 1 percent for more than one day.

    Reconciliation checks that compare platform totals to our consolidated warehouse totals for key periods.
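The trailing-baseline alert amounts to a simple z-score test. A simplified sketch of the idea (our production checks also account for seasonality, which this omits):

```python
from statistics import mean, stdev

def is_anomalous(today, trailing, sigma=3.0):
    """Flag today's value if it sits more than `sigma` standard deviations
    from the trailing baseline (e.g. sigma=3 for spend, 2 for conversions)."""
    if len(trailing) < 2:
        return False  # not enough history to form a baseline
    mu = mean(trailing)
    sd = stdev(trailing)
    if sd == 0:
        return today != mu  # flat baseline: any change is notable
    return abs(today - mu) > sigma * sd
```

Tuning means widening sigma for noisy accounts and tightening it for stable ones, so the alert stays useful rather than becoming background noise.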

When a check fails, it creates a ticket with context. The on-call analyst or data engineer has a runbook for triage. If the failure is upstream and outside our control, such as a Meta API outage, we still log the incident, update the dashboard banner to warn users, and provide a best-available snapshot.

Governance that matches the stakes

Process makes quality repeatable. We map data products to owners. Analysts own metric definitions. Data engineers own pipelines and models. Account leads own client alignment on business rules. Changes to metric definitions require sign off from the account lead and a short impact analysis. Pipeline changes require code review and a rollback plan.

We keep a light but strict change control. Every pull request references a ticket. Tickets reference a client or internal need, not just a desire to polish. When time pressure collides with process, we scale the level of ceremony to the risk. A cosmetic label change can merge same day. A new deduplication rule that could drop 5 percent of conversions waits for a scheduled window, and we tell the client in advance.

Documentation is the scaffolding. We do not write novels. We keep living specs for tracking plans, metric definitions, and data models. A definition of “Marketing Qualified Lead” is only useful if it tells an analyst which field or event in which system encodes it, which filters apply, and who to contact when the meaning changes.


Handling messy reality without losing the plot

Real systems drift. A few patterns repeat enough to prepare for them.

Attribution changes create discontinuities. If we move from platform-based last click to a blended model with a 7-day click and 1-day view window, yesterday and tomorrow will not match. We backfill, publish side-by-side views for at least two weeks, and freeze major spend decisions for 48 hours while trends stabilize.

Sampling and modeling can mislead. Some platforms show sampled data for large date ranges, others switch to modeled conversions by default. We label sampled periods in charts so trend lines do not look artificially smooth, and we store both modeled and observed conversions where possible. When we forecast, we choose one series consistently and document why.

Human entry errors creep in. Sales teams rename stages, marketers add new UTM mediums without telling anyone, finance changes product SKUs mid quarter. Our taxonomies accept a limited set of new values each month with an approval process. If a new value appears heavily and unexpectedly, we route an alert to the account lead. It is amazing how many problems a 15 minute conversation can prevent.

Data availability varies by market. Some regions have stricter privacy rules and less rich identifiers. We build region specific expectations. EMEA retargeting counts will diverge from North America. APAC currency conversions require more frequent rate updates. One size fits nobody.


Incident response that prioritizes decisions

Not every alert deserves the same reaction. The response framework we use is short and operational.

    If decision risk is high, such as a large spend spike or conversion drop that could trigger a bad pause or overinvestment, we engage immediately, post a dashboard banner, and share a safe-to-use interim metric if available.

    If the impact is limited to historical backfills or minor attributes, we log it, schedule fix windows, and keep stakeholders informed during regular updates.

    If the fault is upstream and acknowledged by the vendor, we track the vendor’s status feed and set our next steps according to their ETA. We do not over-promise.

Our internal SLA for client-facing incidents is to acknowledge within one business hour during business hours, provide a preliminary assessment by the second hour, and propose a resolution plan within four. Those times shrink for critical accounts with same-day spend of six figures or more.

Tooling that helps but does not overreach

We use a combination of warehouse-native tests, orchestration checks, and lightweight custom scripts. The test itself matters less than how it fits into the pipeline and whether a human sees the signal soon enough. For small to mid-sized clients, most issues surface through 15 to 30 assertions per data product, not hundreds. For enterprise accounts with dozens of sources, we scale the checks but keep them grouped by decision impact, so on-call staff can triage quickly.

Version control is not optional. Every transformation is in git, and every release is tagged. If a client asks why leads dropped 3 percent starting last Thursday, we can show the exact set of changes that went live and the validation we performed. That level of traceability has won debates with both vendors and internal teams when fingers started pointing.

Costs, trade offs, and knowing when good enough is good enough

Quality has a price. Hardening every edge can starve a project of momentum. We make trade offs visible and conscious.

Real time data is alluring, but hourly is often sufficient. A search campaign usually does not need minute by minute updates to optimize bids. The cost difference between a streaming pipeline and a reliable hourly pull can be significant. We choose the slower option unless there is a clear business case.

Perfect coverage is not always needed. If an affiliate network provides CSVs with a two day lag and partial fields, we do not force that data into the same freshness SLA as paid search. We mark it directional and use it for trend validation rather than daily budget decisions.

Schema lock in is dangerous. If a client’s product catalog is mid replatform and field names will change twice in the next quarter, we design an abstraction layer that isolates business friendly fields from the volatile source. It is not the fastest path, but it avoids weeks of rework later.

A brief story from the trenches

A B2B SaaS client asked us to investigate why reported trial sign-ups had risen 18 percent month over month in their product analytics tool, while sign-ups attributed to paid media were flat. Sales also complained that demo requests had slowed. Two plausible stories existed: either organic traffic had surged from a successful product launch, or the attribution model was crediting the wrong source.

Our checks showed a normal range of new visitors and constant spend. The outlier appeared in a field level completeness check. A recently deployed frontend update started sending the “utm_medium” as “Email” for users who clicked an in app prompt to extend their trial. Not a paid channel, not a net new user, but it inflated the top of funnel while masking what mattered. The root cause was a default value in a script that tagged internal prompts the same way as email campaigns. We fixed the mapping, backfilled two weeks, and updated the dashboard notes. The client adjusted comms priorities the same day. It was not a flashy machine learning win, just good hygiene saving real money.

Metrics that keep us honest

You cannot manage what you do not measure. We track operational quality metrics and review them monthly.

    Percentage of successful scheduled loads by source and environment, with targets at or above 99.5 percent.

    Mean time to detect and mean time to resolve incidents, reported by severity. We aim for detection within 15 minutes for automated checks and under one business hour for analyst-noticed anomalies.

    Reconciliation variance by platform and period, with explanations attached for authorized differences such as currency conversion timing or known modeled conversions.

    Backfill coverage achieved after vendor outages or schema changes, with notes on any permanently lost data.

    Stakeholder confidence surveys twice per year, short and direct, asking whether the numbers help them make faster, better decisions.

What gets measured improves. What gets ignored decays until it surprises you.

Working with vendors and partners without losing control

We rarely own every system. Agencies, internal teams, martech vendors, and platforms all touch the same data. The way to keep standards intact is to define the seams.

We ask for and provide clear contracts at the data interface. If a partner owns a web analytics property, we request access to the raw event schema and plan changes together. If a vendor manages the CRM, we agree on stage names and the fields that indicate lifecycle transitions. Ambiguity invites drift. Clarity tends to stick.

When vendors are opaque, we adapt. Some ad platforms do not document how their modeled conversions adjust over time. In those cases, we snapshot daily values and analyze the degree of revision over a 14 day lookback. If the revision window is large, we add a stability flag to dashboard tiles so users understand whether a number is likely to move tomorrow.
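The snapshot-and-compare logic behind the stability flag is straightforward. A sketch, assuming snapshots is a list of the same reported value captured daily, oldest first; the 5 percent threshold is illustrative, not a fixed standard:

```python
def max_revision_pct(snapshots):
    """Largest absolute day-over-day revision, as a percent of the prior snapshot."""
    worst = 0.0
    for prev, curr in zip(snapshots, snapshots[1:]):
        if prev:
            worst = max(worst, abs(curr - prev) / prev * 100)
    return worst

def stability_flag(snapshots, threshold_pct=5.0):
    """'stable' if every revision stayed under the threshold, else 'likely to move'."""
    return "stable" if max_revision_pct(snapshots) <= threshold_pct else "likely to move"
```

A tile flagged "likely to move" tells a user not to treat yesterday's modeled conversions as final.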

Training and culture matter more than tools

Procedures catch errors, people prevent them. We train analysts to ask annoying questions like a forensic accountant, not to accept a perfect chart at face value. That includes looking for impossible combinations, such as high conversions with near zero clicks, or a sudden drop in direct traffic that coincides with a tracking pixel change. It also means pairing new hires with veterans on early releases, so instincts transfer.

We keep blameless postmortems for significant incidents. The goal is not to pin the fault on a person, but to adjust a check, a runbook, or a communication pattern. A runaway client spend incident years ago drove the creation of our spend anomaly alert, with a lower detection threshold and explicit pause authority for the on-call analyst. Since then, a half dozen similar spikes have been caught early.

Privacy, compliance, and the quality connection

Privacy rules are not only legal boundaries; they affect data quality. When consent drops, identifiers fragment and retargeting pools shrink, and metrics shift accordingly. We treat consent rate as a first-class metric. If consent falls from 85 percent to 70 percent after a banner redesign, we expect attribution to move, and we model the effect rather than chalk it up to channel performance.

We also separate personal data from performance data wherever possible. Aggregations at campaign or cohort level limit risk and reduce the blast radius of any single field’s error. For clients under stricter regimes, we apply differential privacy or thresholding to reporting, and we document what that means for precision.
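Thresholding in reporting can be as plain as suppressing small cohorts before they reach a dashboard. A sketch, with a hypothetical minimum cohort size; real thresholds are set per client and regime:

```python
MIN_COHORT_SIZE = 10  # hypothetical reporting threshold, set per client regime

def threshold_report(cohorts, min_size=MIN_COHORT_SIZE):
    """Suppress cohorts below min_size, rolling their counts into one bucket
    so small groups cannot be singled out in client-facing reports."""
    reported, suppressed = {}, 0
    for name, count in cohorts.items():
        if count >= min_size:
            reported[name] = count
        else:
            suppressed += count
    if suppressed:
        reported["other (suppressed)"] = suppressed
    return reported
```

The documented trade-off is precision: suppressed rows still count toward totals, but they are no longer attributable to a named segment.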

What clients see and why they trust it

Trust is not a feeling, it is a series of experiences. When a client logs into a dashboard at 7:30 a.m., they see up to date figures, a note if a source is delayed, and a consistent taxonomy even if an upstream platform changed a label overnight. When quarterly reporting approaches, they receive a short recon report that shows warehouse totals against platform totals and against finance where relevant, with any variances explained. When they ask a gnarly question about why paid search leads dipped on a specific day, an analyst can pull up the lineage, show the queries, and walk through the checks. The answers are crisp and fast because the groundwork exists.

That is what our data quality standards deliver at (un)Common Logic. Not perfection, not bureaucracy, but numbers that hold up under pressure and a process that bends without breaking when the unexpected happens. The reward is better decisions made with less drama, fewer fire drills, and more confidence that marketing dollars are working as hard as they can.

(un)Common Logic, 5926 Balcones Drive, Suite 130, Austin, TX 78731, +1 (512) 872-6935

About (un)Common Logic: (un)Common Logic, widely recognized as a top Ecommerce PPC Agency, delivers exceptional performance marketing results through a data-driven approach. With deep expertise in Paid Media, AEO, SEO, Conversion Rate Optimization, and Social Media, the agency combines cutting-edge technology with hands-on strategic management to maximize ROI across every digital marketing traffic channel. Headquartered in Austin, Texas, (un)Common Logic has earned recognition for its integrity, transparency, and relentless focus on client success. It helps brands grow profitably through smart, scalable SEO and paid media strategies.