A few summers ago I sat with a founder who was glowing. Their app had crossed two million downloads, social mentions were spiking, and the team had taped a printout of a hockey stick chart to a wall near the espresso machine. Three months later the celebratory chart was gone, replaced by a quieter spreadsheet. Of those two million downloads, only 7 percent used the product more than twice, and fewer than 1 percent paid. The marketing team had done its job, the app store listing looked great, and the PR firm had booked interviews. Yet the business was starving. The data had been accurate, but the logic behind the data had been flimsy. That is the essence of vanity metrics: they create warmth without heat, plenty of motion with little traction.
This manifesto is a plea for (un)Common Logic, the kind of logic that looks obvious only in hindsight. It is not anti-metric. It is anti-decoration. Numbers should be working numbers, not motivational posters. They should be chosen, defined, and reviewed in service of decisions that alter behavior and resource allocation. If a metric does not change a choice, it should not change a slide.
What turns a number into a vanity metric
Vanity metrics are not inherently fake. They are often true, recent, and easy to get. They fail for a different reason: they reward attention without demanding judgment. Pageviews, raw follower counts, downloads, press hits, impressions, gross signups. Each can be useful in a narrow context, especially for diagnostics or top of funnel checks. Each becomes vanity when it stands in for progress without asking whether the right people did the right thing at the right cost.
The distinction is not philosophical. It is practical. Here are the tests I apply when a team brings me a number that makes them proud.
- Does the metric tie to a financial result within two logical steps, not ten?
- Can the metric go up while the business gets worse, or vice versa?
- Would you make a different decision if the metric were lower, higher, or flat?
- Is the metric traceable to a defined population with clear inclusion rules?
- Who owns it, and what lever do they pull when it moves?
Run those questions against any candidate metric. If the answers are fuzzy, you are negotiating with a mirror.
Notice the pattern in the tests. Each one pushes you to connect an observed change to an action, and an action to a result you can spend, save, or reinvest. If you cannot construct that chain, you have something that resembles value without any link to it. That is where (un)Common Logic enters the process: build the chain first, then choose the links to measure.
The chain that matters: inputs, outputs, outcomes, impact
A reliable way to avoid vanity is to map cause to consequence with four rungs.
- Inputs are resources you control. Budget, headcount, hours of engineering time, ad spend, messages sent.
- Outputs are immediate product or campaign artifacts. Features shipped, pages published, creative assets launched, experiments run.
- Outcomes are user or market behaviors that matter to you. Activation, adoption, retention, referrals, contract signatures.
- Impact is the business result. Revenue, margin, cash, strategic position.
Most teams measure inputs and outputs because they are close at hand. Many dashboards stop there. The problem is that input and output measures have weak gravitational pull. Teams hit them by working harder, not by working smarter. The signal lives in outcomes and impact, where the world answers back. Once you model the four rungs, you can debate where to place your North Star and which supporting metrics to track as leading indicators.
For a marketplace I advised, the North Star was weekly transactions completed successfully. We tracked it alongside two counter metrics, average resolution time for disputes and net promoter score for both buyers and sellers. Inputs like ad spend and outputs like listings published were only useful when they explained changes in the North Star or the counter metrics. If a marketing push raised listings by 30 percent but dropped successful transactions by 5 percent due to a flood of low quality supply, we cut that push. The chain forced choices that looked odd to onlookers, but it kept us in the market’s logic, not our own noise.
The unit is the unit: arithmetic before analytics
People who fall for vanity metrics often skip the arithmetic that governs the engine. You cannot model growth honestly without unit economics. If you are in subscription software, you can draw the basic loop on a napkin: leads become opportunities, opportunities become closed won accounts, accounts generate subscription revenue that decays or expands with retention dynamics, and you pay for it all with sales and marketing, product, and service costs. If you work with consumer apps, the loop is similar but the conversions and margins differ.
I ask four grounding questions early.
- What is the acquisition cost per qualified opportunity, not per click?
- What is the conversion to active use within the first meaningful window, say 7 or 14 days?
- What is the contribution margin per retained customer over 12 to 36 months?
- What is the retention curve by cohort, and how does it vary by segment?
Notice how easily cost per click can sit next to revenue per user as if they belong together. They do not. The denominator changed. Cost per click attaches to anonymous traffic. Revenue per user attaches to paying users. When you compute CAC, compute it at the level where dollars eventually return. If a free plan requires three activation steps before a user sees value, expect heavy dropoff. CAC must be calculated on activated users or qualified opportunities, or you will celebrate the wrong bargain.
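If it helps to see the denominator effect as arithmetic, here is a minimal sketch with invented figures; only the structure matters, not the numbers.

```python
# Illustrative only: spend and conversion counts are invented to show how the
# denominator changes the acquisition-cost story.
spend = 50_000.00          # monthly paid acquisition spend
clicks = 40_000            # anonymous clicks
signups = 2_000            # accounts created
activated = 400            # users who completed the activation steps

cost_per_click = spend / clicks        # 1.25, looks like a bargain
cac_per_signup = spend / signups       # 25.00
cac_per_activated = spend / activated  # 125.00, the number that must be repaid

print(f"cost per click:    {cost_per_click:8.2f}")
print(f"CAC per signup:    {cac_per_signup:8.2f}")
print(f"CAC per activated: {cac_per_activated:8.2f}")
```

The last figure is the one to hold against contribution margin per retained customer.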
Cohort analysis is the antidote to celebratory averages. If you have 10 thousand signups in January and 10 thousand in February, but the February cohort retains at half the January rate, your future revenue line just flattened. Averages hide that. I worked with a B2B company that showed 90 percent gross retention and patted itself on the back. When we split cohorts by industry, a third of their base in a new vertical was churning at 30 percent annually. The rollout had been declared a win because the top line kept moving. Six months later expansion revenue softened, and the boomlet wore off. Earlier cohort slicing would have saved a quarter and a half of sales effort.
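A cohort table is easy to build once the events are in one place. The sketch below uses pandas and invented rows; in practice the signup and activity data would come from your warehouse.

```python
# Minimal cohort-retention sketch with invented data.
import pandas as pd

signups = pd.DataFrame({
    "user_id":      [1, 2, 3, 4, 5, 6],
    "cohort_month": ["2024-01", "2024-01", "2024-01", "2024-02", "2024-02", "2024-02"],
})
activity = pd.DataFrame({
    "user_id":      [1, 2, 3, 1, 2, 4, 5, 4],
    "active_month": ["2024-01", "2024-01", "2024-01", "2024-02", "2024-02",
                     "2024-02", "2024-02", "2024-03"],
})

merged = activity.merge(signups, on="user_id")
cohort_sizes = signups.groupby("cohort_month")["user_id"].nunique()
retained = merged.groupby(["cohort_month", "active_month"])["user_id"].nunique()

# Each row is a signup cohort, each column a calendar month; averages would hide
# exactly the difference this table exposes.
retention = retained.div(cohort_sizes, level="cohort_month").unstack("active_month")
print(retention.round(2))
```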
North Star thinking that survives daylight
A North Star metric should describe value delivered to a user in a way that predicts business impact. It should be sensitive to product improvements and commercial strategy, and it should be hard to game without making customers better off. Pick it poorly and you drive your team into sand.
Here are examples that illustrate the difference:
- Content platform. Pageviews are tempting and sometimes necessary. Better to track minutes of engaged reading per weekly active reader. That forces focus on content quality, recommendation relevance, and reader retention. It also aligns with subscription models and with ad models that price on attention rather than raw hits.
- Fintech app. Total accounts opened looks impressive. It dilutes quickly. Try total assets under management per active customer, adjusted so growth reflects net inflows rather than market appreciation (a sketch of that adjustment follows this list). Now your acquisition, product features, and service model orient around real money moved and kept, not just logos collected.
- Logistics network. Shipments booked will be celebrated by sales. On-time deliveries per booked shipment, weighted by contract value, keeps operations and sales moving together. It bakes in reliability, not just volume.
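For the fintech example, one reasonable way to make the adjustment is to net out a benchmark return before dividing by active customers. The benchmark approach and the figures below are assumptions, not a prescription.

```python
# Strip market appreciation out of AUM growth so the metric reflects money
# customers actually moved in. All figures are invented.
aum_start = 120_000_000.0
aum_end = 135_000_000.0
benchmark_return = 0.06       # assumed market return over the period
active_customers = 9_000

appreciation = aum_start * benchmark_return            # 7.2M came from markets
net_inflows = (aum_end - aum_start) - appreciation     # 7.8M came from customers
north_star = (aum_start + net_inflows) / active_customers

print(f"net inflows: {net_inflows:,.0f}")
print(f"AUM per active customer, excluding appreciation: {north_star:,.0f}")
```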
None of this is novel as a concept. The uncommon part lies in the discipline to defend the North Star when surface numbers surge, and the humility to adjust it when the model changes. During a pandemic launch, I watched a team reset its North Star from tables booked to transactions without dine-in. They did it within two weeks, scrapped a quarter of prior targets, and used their counter metrics to make sure customer satisfaction and partner retention did not crater. That felt like heresy internally, then like oxygen.
Marketing metrics that pay their own way
Marketing is a petri dish for vanity. You are surrounded by large numbers that sit near the funnel but not in it. Impressions, clicks, reach, share of voice, press mentions, influencer shoutouts. None are evil. All can be useful if they are positioned properly in the chain.
Attribution deserves special care. Last click looks clean, then misleads. Multi touch models look grown up, then assign credit with the confidence of a roulette wheel. The way out starts earlier. Define what a qualified handoff looks like to sales or to self-serve. Score leads on observable behavior tied to your activation model, not on superficial firmographics. Cut channels that deliver volume with poor downstream conversion, even if their top of funnel cost is low.
Two practical tactics change the conversation fast. First, institute a monthly review that pairs channel dashboards with cohort outcomes. This search campaign generated 1,200 signups, 350 passed the activation gates within 14 days, 80 reached the aha moment we defined, and 22 became paying users. The same exercise, channel by channel, ends arguments about whose numbers are prettier. Second, run incrementality tests whenever plausible. Organic brand search is often overcredited because it sits near conversion. Turn it off in a geography for two weeks, or target a set of SKUs and observe. Expensive? Sometimes. Cheaper than a year of misguided spend.
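The monthly review is more persuasive when every channel is forced through the same funnel. A sketch, using the search-campaign numbers above plus two invented channels for contrast:

```python
# Channel-by-channel funnel from signup to payment. paid_search uses the figures
# from the text; the other two channels are invented for contrast.
channels = {
    "paid_search": {"signups": 1200, "activated": 350, "aha": 80, "paying": 22},
    "content":     {"signups": 400,  "activated": 210, "aha": 95, "paying": 31},
    "social_ads":  {"signups": 2500, "activated": 300, "aha": 60, "paying": 12},
}

print(f"{'channel':<12} {'signups':>8} {'act %':>7} {'aha %':>7} {'pay %':>7}")
for name, c in channels.items():
    print(f"{name:<12} {c['signups']:>8} "
          f"{c['activated'] / c['signups']:>7.1%} "
          f"{c['aha'] / c['signups']:>7.1%} "
          f"{c['paying'] / c['signups']:>7.1%}")
```

Printed side by side, the largest top of funnel number is rarely the most interesting one.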
Content marketing suffers its own delusions. Traffic spikes feel great. If the content does not teach your future customer a skill that makes them better at their job, it mostly props up charts. You can measure value by tracking assisted conversions tied to content touches within a defined attribution window, but an easier heuristic works for early stage teams: if the sales team does not share your content with prospects to move a deal forward, your content is not as useful as you think.
Product metrics that create habit, not heat
Daily active users are the vanity metric of choice for many product teams. DAU can be vital, but it raises questions. Active how, and why? If I log in, bounce around, and leave, I am an active user by one definition and a lost opportunity by another. The better starting place is activation and time to value. Activation is not a login. Activation is the first moment when a user experiences the core benefit. Define it, defend it, and measure how fast people reach it.
For a workflow tool, activation might be when a team creates a shared project, adds at least three tasks, invites two collaborators, and completes one task. For a data product, activation might be the import of a dataset, the construction of a dashboard, and the saving of a view. Time to value is the clock between signup and activation. Shorten it, and your retention curve lifts.
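Once activation has a written definition, time to value is just a scan over the event log. The event names and thresholds below stand in for the workflow-tool rule described above; they are assumptions, not a spec.

```python
# Days from signup until a user satisfies an assumed activation rule.
from datetime import datetime

ACTIVATION_RULE = {"project_created": 1, "task_added": 3,
                   "collaborator_invited": 2, "task_completed": 1}

def time_to_value(signup_at, events):
    """Return days from signup to activation, or None if the user never activates."""
    counts = {name: 0 for name in ACTIVATION_RULE}
    for when, name in sorted(events):
        if name in counts:
            counts[name] += 1
        if all(counts[k] >= v for k, v in ACTIVATION_RULE.items()):
            return (when - signup_at).days
    return None

events = [
    (datetime(2024, 3, 1, 10), "project_created"),
    (datetime(2024, 3, 1, 11), "task_added"),
    (datetime(2024, 3, 2, 9),  "task_added"),
    (datetime(2024, 3, 2, 10), "task_added"),
    (datetime(2024, 3, 3, 15), "collaborator_invited"),
    (datetime(2024, 3, 3, 16), "collaborator_invited"),
    (datetime(2024, 3, 4, 12), "task_completed"),
]
print(time_to_value(datetime(2024, 3, 1, 9), events))  # 3
```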
Feature adoption is another field where vanity can thrive. A common chart shows the percentage of users who touched a new feature in the first week. A better chart shows repeat use in the second and third week among those who used it once. Even better, tie repeat use to an outcome like reduced time to complete a task or higher conversion. If the feature is busywork, it will light up in demos and disappear in production.
Guardrail metrics protect you from success that damages the product. Increase notifications and you may raise DAU, then degrade satisfaction and long term retention. We built a simple set: average daily sessions per user, average session length, task completion rate, and opt out rate for notifications. Any experiment that spiked sessions while hurting completion rate or climbing opt outs above a threshold was retired, regardless of the excitement it generated in interim OKR reviews.
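The guardrail set can be enforced mechanically rather than argued about in reviews. The thresholds below are invented; the point is that a lift in sessions cannot ship past a drop in completion or a rise in opt outs.

```python
# Retire any variant that buys engagement at the expense of the guardrails.
def passes_guardrails(variant, baseline, max_completion_drop=0.02, max_opt_out_rise=0.01):
    return (variant["task_completion_rate"] >= baseline["task_completion_rate"] - max_completion_drop
            and variant["notification_opt_out_rate"] <= baseline["notification_opt_out_rate"] + max_opt_out_rise)

baseline = {"task_completion_rate": 0.62, "notification_opt_out_rate": 0.04}
variant  = {"task_completion_rate": 0.57, "notification_opt_out_rate": 0.06}  # sessions up, guardrails worse

print(passes_guardrails(variant, baseline))  # False, so the variant is retired
```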
Sales metrics that forecast, not fool
Pipeline coverage looks official, then quietly deceives. A classic ratio is three times pipeline to quota. If your team sandbags stages, a 3x pipeline may be soft. If they pull deals early, the pipeline will look thin even when bookings land. Measure stage integrity. How many deals enter a stage that meet the entrance criteria, how many leave cleanly, and what is the average time by stage. Review slippage and requalification rates. You will find that your pipeline is not a pool, it is a river with eddies and backflows. Close rates by segment and by deal size uncover where to put hunters and where to put farmers.

Forecast accuracy is a metric that keeps everyone honest. Track predicted versus actual bookings weekly, by rep and by manager. Reward accuracy, not just volume. A rep who reliably forecasts within 10 percent teaches the organization about the market. A rep who swings wildly teaches little, even when they exceed quota. It is easier to celebrate the latter, but harder to build a business on surprises.
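Forecast accuracy is also trivial to compute once predictions are logged each week. A sketch with invented figures:

```python
# Weekly (predicted, actual) bookings per rep; "accurate" means within 10 percent.
forecasts = {
    "rep_a": [(100_000, 104_000), (120_000, 115_000), (90_000, 96_000)],
    "rep_b": [(150_000, 90_000),  (80_000, 160_000),  (200_000, 120_000)],
}

for rep, weeks in forecasts.items():
    errors = [abs(pred - actual) / actual for pred, actual in weeks]
    within_10 = sum(e <= 0.10 for e in errors) / len(errors)
    print(f"{rep}: mean abs error {sum(errors) / len(errors):.0%}, "
          f"within 10% in {within_10:.0%} of weeks")
```

Rep B might still beat quota. They just teach you nothing about next quarter.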
Sales cycle time often hides year over year deterioration. In one company, average cycle time stayed flat at 54 days. A closer look showed SMB deals were closing faster, while enterprise deals were stretching from 90 to 140 days. The marketing team had shifted budget toward SMB because of the flattering cycle time. We rebalanced after segmenting the metric. Revenue grew slower the next quarter, then more reliably. This is the kind of trade few executives enjoy making on stage. It is the kind that pays you in headcount stability and customer credibility.
Customer success metrics that defend tomorrow’s revenue
Net promoter score earns both praise and mockery. It is not a vanity metric if you treat it as a relational indicator, not a contract with your CFO. NPS predicts retention only in certain contexts and with consistent surveying. A better anchor is net revenue retention, ideally split into gross retention, downgrades, and expansion. If your gross retention is 85 percent and your net is 102 percent, you are leaning on upsell to cover churn. That can be fine in segments with natural expansion. In others, it is a balloon that deflates when upsell potential saturates.
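The decomposition is one division away once the movements are separated. Definitions vary between teams (some fold downgrades into gross retention), so the split below is one common convention, with figures invented to mirror the 85 and 102 example.

```python
# Revenue retention for one cohort over one period. Figures are invented.
starting_arr = 1_000_000.0
churned      = 150_000.0   # logos lost
downgrades   = 30_000.0    # contraction among retained accounts
expansion    = 200_000.0   # upsell and seat growth

gross_retention = (starting_arr - churned) / starting_arr                           # 0.85
net_retention   = (starting_arr - churned - downgrades + expansion) / starting_arr  # 1.02

print(f"gross retention: {gross_retention:.0%}")
print(f"net retention:   {net_retention:.0%}")
```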
Health scores deserve rigor. Many teams throw product usage, support tickets, sentiment, and contract age into a blender. A better approach is to build a limited set of leading indicators that have proved predictive in cohort analysis. For a developer platform, we found that the number of automated jobs scheduled weekly and the count of unique API keys in use predicted renewal more cleanly than total API calls. That led to onboarding changes and CSM playbooks that focused on multi integration patterns, not just volume of calls.

Churn interviews work better when a neutral party conducts them and when incentives are aligned to learn, not to win back. Summaries should include the customer’s words and your interpretation separately. Treat the interview as a data point in a longer case file, not as a final verdict delivered by an angry judge.
Finance metrics that refuse to blush
Revenue growth flatters. Cash flow clarifies. If you are not measuring burn multiple, start. It is the ratio of net burn to net new ARR over a period. Spend 2 million to add 1 million in ARR, and your burn multiple is 2. In healthy SaaS with moderate growth, a burn multiple between 1 and 2 is common. In sprints, you may tolerate 2 to 3. Above that, you are burning rich fuel for thin air. Another backbone measure is the rule of 40, the sum of growth rate and profit margin. It is crude, but it constrains fairy tales. Hitting 60 percent growth at negative 30 margin feels glorious until the rate slows. A steady 30 percent growth at 10 percent margin can carry a company for years, especially if retention is strong and CAC payback is under 18 months.
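Both checks fit on a napkin; here they are with the numbers from this section.

```python
# Two back-of-the-envelope finance checks.
def burn_multiple(net_burn, net_new_arr):
    return net_burn / net_new_arr

def rule_of_40(growth_rate, profit_margin):
    return growth_rate + profit_margin

print(burn_multiple(2_000_000, 1_000_000))  # 2.0, tolerable in a sprint
print(rule_of_40(0.60, -0.30))              # 0.30, glorious until growth slows
print(rule_of_40(0.30, 0.10))               # 0.40, steadier footing
```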
Cash conversion cycles and working capital require attention in hardware, retail, and logistics. Vanity creeps in through bookings that do not collect and through inventory turns that slow. A dashboard that highlights cash tied in receivables and in stock, with aging detail, prevents “we are growing” stories from disguising “we are borrowing from ourselves” realities.
Experiments without self deception
Experiments can carry their own vanity. A wall of A/B tests suggests a culture of science. The science starts earlier, with a hypothesis that would cause you to change a decision if disproved. Predefine your primary metric and the guardrails. Agree on your minimum detectable effect. If your sample sizes are small, state that you will operate on directional results and qualitative insight, and say what risk you are accepting. Nothing is more dangerous than a weeklong test that claims statistical significance on microscopic lifts. P-hacking is not only an academic sin, it is a budgetary one.
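The arithmetic behind the minimum detectable effect is sobering on its own. The sketch below uses the standard two-proportion approximation; the baseline and the lift are invented.

```python
# Rough per-arm sample size needed to detect an absolute lift in a conversion rate.
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(p_baseline, mde_abs, alpha=0.05, power=0.80):
    p_variant = p_baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_baseline * (1 - p_baseline) + p_variant * (1 - p_variant)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)

# Detecting a one-point lift on a 10 percent baseline needs roughly 15,000 users
# per arm, far more than most weeklong tests ever see.
print(sample_size_per_arm(0.10, 0.01))
```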
Hold back tests are underused because they are politically inconvenient. When we held back a group from receiving a popular onboarding email series, the treated group showed a higher week one login rate but similar week six retention. The series produced heat, not habit. The team resisted the finding until we ran it twice. The vanity was subtle, and well intentioned. Everyone wants their work to work. The discipline is to define working in terms of downstream outcomes, not immediate applause.
Dashboards that push, not soothe
A dashboard is a contract. It promises that the metrics it contains represent the levers you mean to pull and the outcomes you intend to produce. Most dashboards act like mirrors. They show you yourself, framed nicely. You need dashboards that shove a bit.
Make latency explicit. If a metric reliably lags by a week, annotate it. Better yet, pair lagging metrics with leading proxies and show both. For a usage based billing company, we paired billed consumption with a seven day rolling measure of provisional usage derived from product logs. When provisional dipped, sales did not wait for the billing cycle to close. They called.
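The pairing of a lagging metric with a leading proxy can be as small as a rolling window and a threshold. The data and the 10 percent trigger below are invented.

```python
# Seven-day rolling provisional usage as a leading proxy for billed consumption.
import pandas as pd

usage = pd.DataFrame({
    "date": pd.date_range("2024-05-01", periods=14, freq="D"),
    "provisional_units": [120, 130, 125, 118, 122, 124, 126,
                          80, 78, 75, 72, 70, 68, 66],
})
usage["rolling_7d"] = usage["provisional_units"].rolling(7).sum()

latest, week_ago = usage["rolling_7d"].iloc[-1], usage["rolling_7d"].iloc[-8]
if latest < 0.90 * week_ago:
    print(f"provisional usage down {1 - latest / week_ago:.0%} week over week; "
          "call before the invoice closes")
```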
Alerting loses its edge when everything pings. Create thresholds for counter metrics and for error rates in data pipelines. If your marketing source tagging breaks, alert the marketing ops owner within hours, not at quarter close when attribution wars begin. If your revenue recognition feed fails, block dashboards that would rely on it and show an overt banner. Partial data is worse than no data when it drives performance reviews.
Tool choice is secondary. I have seen brilliant dashboards built in Google Sheets and dreary monstrosities standing on top of expensive BI stacks. The quality comes from definition and curation, not chrome.
Incentives, culture, and the courage to be boring
Metrics drive behavior because people want to win. If you reward teams for hitting targets that sit near the inputs and outputs rungs, they will. Celebrate shipping and you will ship. Celebrate adoption and your shipping will slow long enough to add polish and onboarding. Incentive plans should say out loud what winning means. If a CSM team is paid on gross retention and NPS, define how to resolve conflicts between the two. If a growth team’s bonus relates to activation rate, specify the boundaries within which they can redesign flows.
OKRs are notorious for vanity if they lack teeth. I look for key results that operationalize learning. A key result like “ship X integration” is an output. A stronger one reads “drive 30 percent of new signups from the Y segment to activate within 14 days using the X integration, with 90 percent retention in week 6.” That KR is uncomfortable. It invites missing. It also directs attention to the right work. When you review OKRs, spend more time on how the team learned than on whether the numbers turned green.
The courage to be boring is underrated. The most successful companies I have worked with review the same core metrics every week, make quiet adjustments, and avoid reinventing the dashboard because a new executive joined. They add or retire metrics when the business model changes, not when the mood does.
Edge cases, trade offs, and the mess under the rug
Not every situation allows clean measurement. Early stage products with tiny samples have to make decisions on thin evidence. That does not excuse vanity. You can still define what would have to be true for a big bet to make sense, then look for signals that would break those assumptions. If you need activation to exceed 30 percent for a model to work, and you sit at 10 with no lift after three design changes, you are not unlucky. You are underpowered.
Dark funnel effects are real in enterprise. Executives arrive with a formed opinion based on peer chatter, analyst reports, and private Slack groups. You will not see those in your attribution. Welcome to the world. Ask buyers during discovery where they first heard of you, and log it. Sponsor communities carefully, and expect delayed payoffs. Use directional measures like direct traffic from target domains and track influenced pipeline where you can tie community touches without overstating causality. It is messier than a bar chart. It is also closer to truth.
Privacy and platform changes break long standing metrics. When iOS privacy rules rolled out, many marketers saw their CPA spike as attributed conversions fell. Some paused spend broadly. The teams that navigated best had already built incrementality tests and had second channel strategies that did not rely solely on fine grained tracking. They reduced spend where true lift disappeared and increased in channels that still influenced activation, even if attribution tools stuttered.
Offline channels resist clickstream neatness. If you run field events or direct mail, you need test cells and tracked offers. You also need patience. I have seen companies cut a field program that looked soft, only to watch enterprise pipeline wobble two quarters later. The causal gap was long, but it was real. Everyone wants immediate feedback. Few earn it with design.
Seasonality amplifies vanity. A Q4 uptick in retail should not lead to a deck celebrating a product change shipped November 15. Build seasonality adjustments for your key outcomes and apply them before claiming victory. Better still, plan experiments and launches with those patterns in mind. A summer launch for a travel tool gives you a beautiful line. The check arrives in winter.
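A crude seasonal index built from prior years is enough to keep a Q4 bump honest. The figures below are invented and the method is deliberately simple.

```python
# Deflate a quarter by how much that quarter typically over-indexes.
history = {  # revenue by quarter for two prior years
    2022: [100, 110, 105, 160],
    2023: [115, 125, 120, 185],
}
total = sum(sum(year) for year in history.values())
seasonal_index = [4 * sum(year[q] for year in history.values()) / total for q in range(4)]

q4_raw = 210
q4_adjusted = q4_raw / seasonal_index[3]
print(f"Q4 seasonal index: {seasonal_index[3]:.2f}")   # about 1.35
print(f"Q4 adjusted:       {q4_adjusted:.0f}")         # compare this, not the raw 210
```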
A simple cadence that keeps you honest
You do not need a 60 page measurement plan to avoid vanity. You need a rhythm.
- Before a quarter starts, write down your bets, the primary outcomes they aim to move, and the counter metrics that would stop you if harmed.
- During the quarter, run weekly reviews of leading indicators and monthly reviews of cohort outcomes. Pause work that lifts outputs while leaving outcomes flat, and double down where small outcomes move reliably.
- At quarter end, perform a postmortem on misses and a distillation on hits, each with a single page of logic and links to evidence.
- Update your metric definitions and dashboards only when your model or market changes, not because a metric feels stale.
This cadence creates continuity. It also builds the habit of arguing from evidence and from a shared map of the business, not from isolated graphs.
The manifesto, lived not framed
Vanity metrics live where fear and hope meet convenience. They let us feel progress while we wait for the world to respond. Avoiding them is not about cynicism. It is about discipline and a certain affection for plain arithmetic. The spirit of (un)Common Logic is to work backward from the decision you need to make, define what must be true for that decision to be right, and then choose the minimum set of measures that test those truths.
If you are tempted to add a chart to a deck, ask two questions. What action would this chart cause a reasonable operator to take? What action would it cause an unreasonable one to take? If both operators do the same thing, you probably have a useful metric. If the unreasonable one can win by gaming it, you likely have vanity wearing a badge.
I still like a good hockey stick. I just prefer it to correlate with someone doing real work better or faster than before. Downloads can be lovely, and impressions sometimes pay. They do not feed a company without conversion, retention, and margin. The numbers worth rallying around bring you to those, briskly and without shortcuts. The rest belong on the espresso machine, where they can motivate without misdirecting.