Your team doesn’t trust the numbers they present to you. You can usually spot it in the first 30 seconds of a meeting. Every chart comes with a caveat. Every result is followed by a quiet “but it might not be fully accurate”. That behaviour is not caution. It is a signal. When people stop standing behind the numbers, the system has already broken. Decisions slow down, accountability fades, and performance becomes a debate instead of a direction. I see this a lot in £1–5m teams. Multiple platforms, slightly different attribution models, and no agreed version of reality. So people hedge. They protect themselves instead of committing to a position. It reminds me of an “apology report”. A document built to avoid being wrong rather than to drive a decision. Safe, but not useful. The fix is not more data. It is less. Pick one source of truth for your core metrics and make it the reference point for decisions. It will not be perfect, but it will be consistent, and consistency is what creates trust. What would happen in your business if everyone had to stand behind one number? #DigitalMarketing #B2B #leadership #saas #future
Why trust in data is fragile and how to fix it
Summary
Trust in data is easily damaged when inconsistent, incomplete, or unreliable information creeps into reports and dashboards, leading teams to hesitate in making decisions. Data trust means believing that the numbers you use reflect reality, and it depends on both technical systems and clear, agreed-upon standards.
- Establish clear standards: Choose one reliable source for your core metrics and agree on definitions across your team, so everyone refers to the same numbers when making decisions.
- Monitor for quality: Regularly check your data for freshness, accuracy, and completeness to catch early signs of errors or gaps before they impact your business.
- Document and own: Make sure data processes, changes, and fixes are well-documented and assign clear ownership, so troubleshooting becomes quick and confidence in your data grows.
-
Data quality isn’t a luxury; it’s the seatbelt in your analytics car—skip it, and the crash is inevitable. Why We Actually Care (Factors) → Business decisions: Executives trust your dashboards. Don't let them down. → ML models: Garbage in, garbage out. Your model is only as good as your data. → Pipelines: One bad field breaks everything downstream. Fix it early. → Compliance: Auditors don't accept "oops." Neither does GDPR. → Cost: Bad data means reruns, fixes, and late nights. Good data saves money. The Six Dimensions (Your Quality Checklist) → Accuracy: Does it reflect reality? → Completeness: No missing pieces. → Consistency: Same story everywhere. → Timeliness: Fresh, not yesterday’s leftovers. → Validity: Fits the rules, like a puzzle piece. → Uniqueness: No duplicates—because one identity crisis is enough! How We Actually Do It (Process) → Input validation: Stop bad data at the door. Always. → Constraints & rules: If age > 150, something's wrong. → Data profiling: Know your data before you trust it. → SLAs & SLOs: Set expectations. Measure reality. → Monitoring & alerts: Catch issues before users do. → Lineage tracking: When things break, trace it back. → Triage & RCA: Fix the bug. Fix the system. Document it. The Tools That Help (Frameworks) → Great Expectations: Write tests for your data like you test code. → Deequ: Amazon's gift to data quality. Scales beautifully. → Monte Carlo: Observability for data pipelines. Sleep better. → dbt tests: Test your transformations. Trust your models. 𝗤𝘂𝗮𝗹𝗶𝘁𝘆 𝗶𝘀𝗻'𝘁 𝗮 𝗼𝗻𝗲-𝘁𝗶𝗺𝗲 𝗽𝗿𝗼𝗷𝗲𝗰𝘁. 𝗜𝘁'𝘀 𝗮 𝗱𝗮𝗶𝗹𝘆 𝗽𝗿𝗮𝗰𝘁𝗶𝗰𝗲. Data without quality is like coffee without beans—pointless. As data engineers, we’re not just pipeline plumbers; we’re the guardians of trust. Build systems that catch issues early and keep the flavor of truth intact. 𝘉𝘶𝘪𝘭𝘵 𝘣𝘺 𝘥𝘢𝘵𝘢 𝘦𝘯𝘨𝘪𝘯𝘦𝘦𝘳𝘴, 𝘧𝘰𝘳 𝘥𝘢𝘵𝘢 𝘦𝘯𝘨𝘪𝘯𝘦𝘦𝘳𝘴. 𝘕𝘰 𝘧𝘭𝘶𝘧𝘧, 𝘫𝘶𝘴𝘵 𝘳𝘦𝘢𝘭𝘪𝘵𝘺.
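The "constraints & rules" step and three of the six dimensions (completeness, validity, uniqueness) can be sketched as row-level checks. This is a minimal illustration using only the standard library, not how any of the tools named above work internally. The column names `customer_id` and `age`, the `check_rows` function, and the `QualityReport` class are hypothetical; the age limit of 150 comes from the post's own example.

```python
from dataclasses import dataclass, field


@dataclass
class QualityReport:
    """Collects rule violations as (row_index, rule_name) pairs."""
    violations: list = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return not self.violations


def check_rows(rows, required=("customer_id", "age"), max_age=150):
    """Run completeness, validity, and uniqueness checks over a list of dict rows."""
    report = QualityReport()
    seen_ids = set()
    for i, row in enumerate(rows):
        # Completeness: every required column must be present and non-null.
        for col in required:
            if row.get(col) is None:
                report.violations.append((i, f"missing:{col}"))
        # Validity: age must fall inside a plausible range.
        age = row.get("age")
        if age is not None and not (0 <= age <= max_age):
            report.violations.append((i, "invalid:age"))
        # Uniqueness: the same customer_id must not appear twice.
        cid = row.get("customer_id")
        if cid is not None:
            if cid in seen_ids:
                report.violations.append((i, "duplicate:customer_id"))
            seen_ids.add(cid)
    return report


rows = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 151},   # breaks the age rule
    {"customer_id": 1, "age": 40},    # duplicate id
    {"customer_id": 3, "age": None},  # incomplete
]
report = check_rows(rows)
print(report.passed, report.violations)
```

Running checks like these at the point of ingestion is what "stop bad data at the door" amounts to in practice: a failing report can block the load or raise an alert before anything downstream consumes the batch.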
-
If You Can't Trust Your Data, You Can't Trust Your Decisions. 𝗣𝗼𝗼𝗿 𝗱𝗮𝘁𝗮 𝗾𝘂𝗮𝗹𝗶𝘁𝘆 𝗶𝘀 𝗺𝗼𝗿𝗲 𝗰𝗼𝗺𝗺𝗼𝗻 𝘁𝗵𝗮𝗻 𝘄𝗲 𝘁𝗵𝗶𝗻𝗸—𝗮𝗻𝗱 𝗶𝘁 𝗰𝗮𝗻 𝗯𝗲 𝗰𝗼𝘀𝘁𝗹𝘆. Yet, many businesses don't realise the damage until too late. 🔴 𝗙𝗹𝗮𝘄𝗲𝗱 𝗳𝗶𝗻𝗮𝗻𝗰𝗶𝗮𝗹 𝗿𝗲𝗽𝗼𝗿𝘁𝘀? Expect dire forecasts and wasted budgets. 🔴 𝗗𝘂𝗽𝗹𝗶𝗰𝗮𝘁𝗲 𝗰𝘂𝘀𝘁𝗼𝗺𝗲𝗿 𝗿𝗲𝗰𝗼𝗿𝗱𝘀? Say goodbye to personalisation and marketing ROI. 🔴 𝗜𝗻𝗰𝗼𝗺𝗽𝗹𝗲𝘁𝗲 𝘀𝘂𝗽𝗽𝗹𝘆 𝗰𝗵𝗮𝗶𝗻 𝗱𝗮𝘁𝗮? Prepare for delays, inefficiencies, and lost revenue. 𝘗𝘰𝘰𝘳 𝘥𝘢𝘵𝘢 𝘲𝘶𝘢𝘭𝘪𝘵𝘺 𝘪𝘴𝘯'𝘵 𝘫𝘶𝘴𝘵 𝘢𝘯 𝘐𝘛 𝘪𝘴𝘴𝘶𝘦—𝘪𝘵'𝘴 𝘢 𝘣𝘶𝘴𝘪𝘯𝘦𝘴𝘴 𝘱𝘳𝘰𝘣𝘭𝘦𝘮. ❯ 𝑻𝒉𝒆 𝑺𝒊𝒙 𝑫𝒊𝒎𝒆𝒏𝒔𝒊𝒐𝒏𝒔 𝒐𝒇 𝑫𝒂𝒕𝒂 𝑸𝒖𝒂𝒍𝒊𝒕𝒚 To drive real impact, businesses must ensure their data is: ✓ 𝗔𝗰𝗰𝘂𝗿𝗮𝘁𝗲 – Reflects reality to prevent bad decisions. ✓ 𝗖𝗼𝗺𝗽𝗹𝗲𝘁𝗲 – No missing values that disrupt operations. ✓ 𝗖𝗼𝗻𝘀𝗶𝘀𝘁𝗲𝗻𝘁 – Uniform across systems for reliable insights. ✓ 𝗧𝗶𝗺𝗲𝗹𝘆 – Up to date when you need it most. ✓ 𝗩𝗮𝗹𝗶𝗱 – Follows required formats, reducing compliance risks. ✓ 𝗨𝗻𝗶𝗾𝘂𝗲 – No duplicates or redundant records that waste resources. ❯ 𝑯𝒐𝒘 𝒕𝒐 𝑻𝒖𝒓𝒏 𝑫𝒂𝒕𝒂 𝑸𝒖𝒂𝒍𝒊𝒕𝒚 𝒊𝒏𝒕𝒐 𝒂 𝑪𝒐𝒎𝒑𝒆𝒕𝒊𝒕𝒊𝒗𝒆 𝑨𝒅𝒗𝒂𝒏𝒕𝒂𝒈𝒆 Rather than fixing insufficient data after the fact, organisations must 𝗽𝗿𝗲𝘃𝗲𝗻𝘁 it: ✓ 𝗠𝗮𝗸𝗲 𝗘𝘃𝗲𝗿𝘆 𝗧𝗲𝗮𝗺 𝗔𝗰𝗰𝗼𝘂𝗻𝘁𝗮𝗯𝗹𝗲 – Data quality isn't just IT's job. ✓ 𝗔𝘂𝘁𝗼𝗺𝗮𝘁𝗲 𝗚𝗼𝘃𝗲𝗿𝗻𝗮𝗻𝗰𝗲 – Proactive monitoring and correction reduce costly errors. ✓ 𝗣𝗿𝗶𝗼𝗿𝗶𝘁𝗶𝘀𝗲 𝗗𝗮𝘁𝗮 𝗢𝗯𝘀𝗲𝗿𝘃𝗮𝗯𝗶𝗹𝗶𝘁𝘆 – Identify issues before they impact operations. ✓ 𝗧𝗶𝗲 𝗗𝗮𝘁𝗮 𝘁𝗼 𝗕𝘂𝘀𝗶𝗻𝗲𝘀𝘀 𝗢𝘂𝘁𝗰𝗼𝗺𝗲𝘀 – Measure the impact on revenue, cost, and risk. ✓ 𝗘𝗺𝗯𝗲𝗱 𝗮 𝗖𝘂𝗹𝘁𝘂𝗿𝗲 𝗼𝗳 𝗗𝗮𝘁𝗮 𝗘𝘅𝗰𝗲𝗹𝗹𝗲𝗻𝗰𝗲 – Treat quality as a mindset, not a project. ❯ 𝑯𝒐𝒘 𝑫𝒐 𝒀𝒐𝒖 𝑴𝒆𝒂𝒔𝒖𝒓𝒆 𝑺𝒖𝒄𝒄𝒆𝒔𝒔? The true test of data quality lies in outcomes: ✓ 𝗙𝗲𝘄𝗲𝗿 𝗲𝗿𝗿𝗼𝗿𝘀 → Higher operational efficiency ✓ 𝗙𝗮𝘀𝘁𝗲𝗿 𝗱𝗲𝗰𝗶𝘀𝗶𝗼𝗻-𝗺𝗮𝗸𝗶𝗻𝗴 → Reduced delays and disruptions ✓ 𝗟𝗼𝘄𝗲𝗿 𝗰𝗼𝘀𝘁𝘀 → Savings from automated data quality checks ✓ 𝗛𝗮𝗽𝗽𝗶𝗲𝗿 𝗰𝘂𝘀𝘁𝗼𝗺𝗲𝗿𝘀 → Higher CSAT & NPS scores ✓ 𝗦𝘁𝗿𝗼𝗻𝗴𝗲𝗿 𝗰𝗼𝗺𝗽𝗹𝗶𝗮𝗻𝗰𝗲 → Lower regulatory risks 𝗤𝘂𝗮𝗹𝗶𝘁𝘆 𝗱𝗮𝘁𝗮 𝗱𝗿𝗶𝘃𝗲𝘀 𝗯𝗲𝘁𝘁𝗲𝗿 𝗱𝗲𝗰𝗶𝘀𝗶𝗼𝗻𝘀. 𝗣𝗼𝗼𝗿 𝗱𝗮𝘁𝗮 𝗱𝗲𝘀𝘁𝗿𝗼𝘆𝘀 𝘁𝗵𝗲𝗺.
-
📌 The Modern Data Quality Framework for BI Every company wants better dashboards, better insights, better AI. But very few stop to ask the one question that actually matters: Can we trust the data we’re using in the first place? Because the hard truth is this: Most data issues don’t come from tools. They come from unreliable foundations that nobody notices until something breaks in production. When I look at the teams that consistently ship trustworthy data, there’s always the same pattern behind the scenes. Let me walk you through my reasoning. 1️⃣ 𝐓𝐡𝐞 5 𝐏𝐢𝐥𝐥𝐚𝐫𝐬 𝐀𝐫𝐞 𝐒𝐭𝐢𝐥𝐥 𝐭𝐡𝐞 𝐒𝐭𝐚𝐫𝐭𝐢𝐧𝐠 𝐏𝐨𝐢𝐧𝐭 Accuracy, completeness, consistency, timeliness, and validity. We all know them. But most teams still treat these as “definitions.” On the other hand, the best teams treat them as operational targets. It’s a completely different mindset. Accuracy isn’t “nice to have.” It’s whether your revenue aligns with reality. Completeness isn’t a rule. It’s whether you trust the KPI enough to act on it. Everything changes once you start thinking this way. 2️⃣ 𝐓𝐞𝐜𝐡𝐧𝐢𝐜𝐚𝐥 𝐂𝐡𝐞𝐜𝐤𝐬 𝐌𝐚𝐤𝐞 𝐨𝐫 𝐁𝐫𝐞𝐚𝐤 𝐑𝐞𝐥𝐢𝐚𝐛𝐢𝐥𝐢𝐭𝐲 This is where issues hide. I can’t count the number of times I’ve seen dashboards fail not because the model was wrong but because nobody noticed: → A column changed type → A pipeline skipped 2% of rows → A source table silently dropped a field → A null explosion went undetected for weeks This layer is invisible to most of the business, yet it’s the one that protects trust. If you don’t have anomaly detection or CI/CD tests, you’re relying on luck. And luck is not a data strategy. 3️⃣ 𝐆𝐨𝐯𝐞𝐫𝐧𝐚𝐧𝐜𝐞 𝐌𝐚𝐤𝐞𝐬 𝐄𝐯𝐞𝐫𝐲𝐭𝐡𝐢𝐧𝐠 𝐖𝐨𝐫𝐤 Data catalogs, lineage, ownership, contracts. People talk about them like buzzwords, but the impact is very real. Lineage isn’t a diagram. It’s how you debug issues in minutes instead of days. Contracts aren’t bureaucracy. They’re how producers guarantee stability for downstream teams. Stewardship isn’t a title. It’s accountability. 
What I’ve learned from my experience is simple: When governance is strong, you don’t spend your life firefighting. 4️⃣ 𝐀𝐭 𝐭𝐡𝐞 𝐂𝐞𝐧𝐭𝐞𝐫 𝐨𝐟 𝐄𝐯𝐞𝐫𝐲𝐭𝐡𝐢𝐧𝐠: 𝐃𝐚𝐭𝐚 𝐓𝐫𝐮𝐬𝐭 This is the part people underestimate. Trust is not something you “announce” on a slide. It’s something you earn, build, and protect over time. It shows up in adoption. It shows up in business confidence. It shows up in how quickly you can respond when an anomaly hits. Trust is the real KPI. And when it’s strong, everything else becomes easier. Executives stop asking "where did this number come from." Why does this matter so much? Because a lot of companies are scaling GenAI without first fixing data quality. And when AI learns from unreliable data, it becomes unreliable itself. If you want to improve decision-making, data quality is not a side topic. Everything else is built on top of it.
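The silent failures listed under the technical checks (a changed column type, skipped rows, a dropped field, a null explosion) can be caught by profiling each batch and comparing it with the previous one. The following is a rough sketch under stated assumptions, not a substitute for a real anomaly-detection or CI/CD setup: `profile` and `compare` are hypothetical helpers, rows are plain dicts, and the thresholds (1% row drop, 5-point null-rate increase) are arbitrary illustrative values.

```python
def profile(rows):
    """Summarise a batch: row count, plus type names and null rate per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            col = columns.setdefault(name, {"types": set(), "nulls": 0})
            if value is None:
                col["nulls"] += 1
            else:
                col["types"].add(type(value).__name__)
    n = len(rows)
    return {
        "row_count": n,
        "columns": {
            name: {
                "types": sorted(c["types"]),
                "null_rate": c["nulls"] / n if n else 0.0,
            }
            for name, c in columns.items()
        },
    }


def compare(previous, current, max_row_drop=0.01, max_null_increase=0.05):
    """Return human-readable issues found when moving from previous to current."""
    issues = []
    prev_n, cur_n = previous["row_count"], current["row_count"]
    if prev_n and (prev_n - cur_n) / prev_n > max_row_drop:
        issues.append(f"row count fell from {prev_n} to {cur_n}")
    for name, prev_col in previous["columns"].items():
        cur_col = current["columns"].get(name)
        if cur_col is None:
            issues.append(f"column dropped: {name}")
            continue
        if cur_col["types"] != prev_col["types"]:
            issues.append(
                f"type changed: {name} {prev_col['types']} -> {cur_col['types']}"
            )
        if cur_col["null_rate"] - prev_col["null_rate"] > max_null_increase:
            issues.append(f"null rate rose: {name}")
    return issues


yesterday = [{"id": i, "amount": 10.0, "region": "EU"} for i in range(100)]
today = [{"id": str(i), "amount": None if i % 2 else 10.0} for i in range(98)]
issues = compare(profile(yesterday), profile(today))
for issue in issues:
    print(issue)
```

Comparing against the previous batch is only one possible baseline. A count taken at the source system, or a rolling average over several runs, would serve the same purpose with fewer false alarms on naturally variable data.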
-
Everyone celebrates the AI skyline. Almost no one wants to invest in the foundation. That foundation is data governance. Not as a policy exercise, but as an operating discipline. When governance is weak, AI looks impressive at first: fast demos, clever outputs, early wins. Then reality shows up: inconsistent answers, hidden bias, teams arguing over whose data is “right”, leaders quietly losing trust in the system. That’s not an AI failure. It’s a foundation failure. Here’s the practical playbook I’ve helped organizations use to fix it: 1) Assign real ownership, not committees. Every critical data domain needs a clear owner with actual decision rights. If no one owns the data, the model ends up guessing. → Leader question: Who is accountable when this data misleads a decision? 2) Define “good data” in business terms. Quality only matters in context. Accuracy, timeliness, and completeness must be tied to how the data is used, not how it’s stored. → Leader question: What decision breaks if this data is wrong or late? 3) Design guardrails before scale. Not every dataset should feed every model. Governance is about boundaries: what AI can see, what it can influence, what it can automate. → Leader question: Where must humans stay in the loop, no matter how good the model gets? 4) Treat data pipelines like production systems. Monitoring, lineage, versioning, and rollback aren’t optional. If you can’t trace an output back to its source, you can’t trust it. → Leader question: Could we explain this answer six months from now? 5) Build governance where work actually happens. Policies on slides don’t scale. Embedded checks in workflows do. → Leader question: Is governance preventing rework later, or just slowing teams down today? AI doesn’t fail because it’s too advanced. It fails because the groundwork was never finished. If you want a skyline that lasts, build where no one is looking.
📌 Save this if AI reliability is now a leadership issue 🔁 Repost to shift the conversation from demos to durability 👤 Follow Gabriel Millien for grounded insight on Enterprise AI and transformation
-
Your data problem didn't start in your warehouse. It started in that free-text 'Region' field in your ERP. Spending $1M modernizing your stack won't fix your data. Everyone wants accurate data. But when you dig, you realize their processes were never built to produce good data. They’re trying to analyze chaos. A few months ago, we were talking to a finance company. They’d just spent 14 months modernizing their stack. They hired the data engineers. Millions spent. Hundreds of dashboards. And yet: “Revenue” in Salesforce included refunds. “Customer” in Marketing meant prospects too. Operations had 15 different “regions” spelled 8 different ways. The tech wasn’t broken. The process was. Their CRM, ERP, and sales systems were designed for convenience, not for data. Every time a sales rep skips a CRM field, you create a leak in your data foundation, until your warehouse is garbage. If your processes weren’t designed with data in mind, nothing will save you. Here is how to go about stopping bad data: 1. Design Every Process as if Data Were the End Goal. If you’re setting up a CRM, ERP, or even a Google Form, build it like a data engineer would, even if you don't need the data yet. Develop the process with data in mind, because you will need that data down the line, and capturing it now beats waiting 3-5 months to get it. Replace free-text fields with controlled dropdowns. Enforce mandatory fields that align with business-critical metrics. Executives say they want clean data but approve workflows that guarantee mess. In my opinion, data should be clean from the source. Because if it's not, managing pipelines and modelling becomes a nightmare, and even that can't save it. 2. Treat Metrics Like Products. Agreeing on definitions is not easy at all. People change roles or leave. Two VPs can't agree, so each creates their own spreadsheet. Every metric you report on should have an owner, a version history, a use case, and a single definition across the company.
If you find yourself in a situation where people can't agree, ask: "What does finding this information enable you to do?" If they can't answer, archive the metric. If you still can't agree on one metric, separate it into two and define a clear use case for each. 3. Assign Owners & Build Feedback Loops. Bad data comes from the frontlines: reps skipping CRM fields or creating custom objects in Salesforce. Assign owners for the metrics. Answer: Who owns the data? Who manages the inputs? Who's keeping operational systems clean (the data stewards)? If no one is accountable or owns it, how do you think it will get fixed? Tie accuracy to incentives. 4. Enforce Standards, Not Opinions. Everyone uses their own definition of “good data”. Define how data should look: formats, naming, validation rules. If “Region” is free-text in CRM, you’ve built chaos by design. 5. Data Quality Isn’t a Project or a One-Time Thing. Start where it's most important. Track exceptions, expose results, fix patterns. Embed it in the system, so it's proactive rather than reactive.
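Two of the ideas in this post translate directly into code: replacing a free-text "Region" with a controlled list, and giving each metric an owner, a version, and a single definition. The sketch below is one possible shape, assuming a small fixed set of regions. The `Region` values, the alias table, `parse_region`, `MetricDefinition`, and `MetricRegistry` are all hypothetical names invented for illustration; they are not features of any CRM or ERP.

```python
from dataclasses import dataclass
from enum import Enum


class Region(Enum):
    """The controlled list that replaces the free-text field."""
    NORTH_AMERICA = "North America"
    EMEA = "EMEA"
    APAC = "APAC"


# Known legacy spellings mapped onto the controlled list (illustrative only).
_ALIASES = {
    "north america": Region.NORTH_AMERICA,
    "n. america": Region.NORTH_AMERICA,
    "na": Region.NORTH_AMERICA,
    "emea": Region.EMEA,
    "apac": Region.APAC,
    "asia pacific": Region.APAC,
}


def parse_region(raw: str) -> Region:
    """Normalise case and whitespace, then reject anything not on the list."""
    key = " ".join(raw.strip().lower().split())
    try:
        return _ALIASES[key]
    except KeyError:
        raise ValueError(f"unknown region {raw!r}; fix it at the source") from None


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    owner: str
    version: int
    definition: str
    use_case: str


class MetricRegistry:
    """Keeps exactly one current definition per metric name."""

    def __init__(self):
        self._metrics = {}

    def register(self, metric: MetricDefinition) -> None:
        current = self._metrics.get(metric.name)
        # A changed definition must arrive as a new, higher version.
        if current is not None and metric.version <= current.version:
            raise ValueError(f"{metric.name} already defined at v{current.version}")
        self._metrics[metric.name] = metric

    def get(self, name: str) -> MetricDefinition:
        return self._metrics[name]


registry = MetricRegistry()
registry.register(MetricDefinition(
    name="revenue", owner="finance", version=1,
    definition="Invoiced amount excluding refunds",
    use_case="Board reporting",
))
print(parse_region("  North   america "), registry.get("revenue").owner)
```

Raising an error on an unknown region, instead of guessing, is a deliberate choice here: it pushes the correction back to whoever owns the input, which is the feedback loop the post argues for.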
-
My AI was ‘perfect’ until bad data turned it into my worst nightmare. 📉 By the numbers: a widely repeated figure, usually attributed to Gartner, says 85% of AI projects fail due to poor data quality. Another common estimate is that data scientists spend 80% of their time fixing bad data instead of building models. 📊 What’s driving the disconnect? Incomplete or outdated datasets, duplicate or inconsistent records, and noise from irrelevant or poorly labeled data. The result? Faulty predictions, bad decisions, and a loss of trust in AI. Without addressing the root cause, data quality, your AI ambitions will never reach their full potential. Building Data Muscle: AI-Ready Data Done Right. Preparing data for AI isn’t just about cleaning up a few errors; it’s about creating a robust, scalable pipeline. Here’s how: 1️⃣ Audit Your Data: Identify gaps, inconsistencies, and irrelevance in your datasets. 2️⃣ Automate Data Cleaning: Use advanced tools to deduplicate, normalize, and enrich your data. 3️⃣ Prioritize Relevance: Not all data is useful. Focus on high-quality, contextually relevant data. 4️⃣ Monitor Continuously: Build systems to detect and fix bad data after deployment. These steps lay the foundation for successful, reliable AI systems. Why It Matters: Bad #data doesn’t just hinder #AI; it amplifies its flaws. Even the most sophisticated models can’t overcome the challenges of poor-quality data. To unlock AI’s potential, you need to invest in a data-first approach. 💡 What’s Next? It’s time to ask yourself: Is your data AI-ready? The key to avoiding AI failure lies in your preparation. What strategies are you using to ensure your data is up to the task? Let’s learn from each other. #innovation #machinelearning
-
"But we don't trust the data." This is hard to hear when discussing AI with banks and credit unions. Because in the same breath, those teams will tell you they rely on that very data set to make high-stakes decisions every single day. So why is it untrustworthy to use with AI? What I find is usually "we don't trust the data" is a polite mask for three much messier realities: "We don't have the data." We never collected it, we just started, or it’s buried in a legacy system no one has the password for. When we look for the data you are asking for, we stare into the void. "Our people don’t agree on what the data means." We have the numbers, but when we put them in a room, teams run in circles. If the people can’t agree on the "so what," we’re terrified of what a machine might conclude. "We don't want to know what the data will tell us." This is the scariest one, because it is the most sinister. Operating on the dark side is safe. Shining a light on the data might reveal a can of worms that would prove we should have acted months or even years ago. Trust in data isn't something you get before you start. It's something you earn through utility - you need to use the data to make decisions to build trust. If you just collect data for the sake of having it, you will never trust it. Even in its imperfect form, using your data for AI provides the roadmap you’ve been missing - in a way even your dashboards couldn't reveal. It forces you to see where the gaps are, where you have true event-driven data, what’s actually quantifiable, and where your processes are broken. Stop waiting for the perfect dataset. This is a trap - it doesn't exist. Start building with the data you have now, and let the results guide you to a better data future.
-
One of the open secrets in our industry is data quality. Not in the abstract, but very specifically whether the people answering our surveys are real, paying attention, and actually representative or whether we’re quietly building insights on top of bad data. This is getting worse. Fraud is more sophisticated. Bots are harder to spot. Professional respondents are better at gaming systems, increasingly with the help of AI. And the incentives to look the other way have not disappeared. If you want a quick gut check on whether a supplier actually takes data quality seriously, here are a few questions that matter far more than any marketing claim: 1. Do they evaluate quality at the individual respondent level, or only at the dataset level? Averages hide problems. Real quality control happens response by response. 2. Can they explain why a response was removed not just that it failed a rule? If quality control is a black box, you’re outsourcing judgment without accountability. 3. Do they rely on multiple, independent signals, or just speed checks and attention traps? Bad actors can pass simple checks. Consistency and pattern analysis are much harder to fake. 4. Is quality assessed during fieldwork, or only after the fact? Post hoc cleaning makes reports cleaner. It doesn’t protect decisions made mid-stream. Once bad data is normalized, it becomes invisible. Perfect data doesn’t exist. But there’s a big difference between managed uncertainty and unknown contamination. If you trust your data, you should be able to explain how that trust is earned. We spend a ton of time at Outward Intelligence thinking about this from first principles… how to assess respondent quality at the individual level without over cleaning or introducing hidden bias. In practice, that means AI-based validations that run before, during, and after the survey, with fraud removed in real time and ambiguous cases reviewed by humans to minimize both missed fraud and collateral damage.
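The four supplier questions describe a pattern that can be sketched in a few lines: score each respondent individually, combine several independent signals, record the reason for every flag, and route ambiguous cases to a human. This is an illustrative toy, not Outward Intelligence's method. The `Response` fields, the `assess` function, the three signals, and every threshold are assumptions chosen only to show the shape of the logic.

```python
from dataclasses import dataclass


@dataclass
class Response:
    respondent_id: str
    seconds_taken: float
    passed_attention_check: bool
    grid_answers: list  # numeric answers to a grid of rating questions


def assess(response: Response, min_seconds: float = 60.0):
    """Return (decision, reasons) for a single respondent."""
    reasons = []
    # Signal 1: implausibly fast completion.
    if response.seconds_taken < min_seconds:
        reasons.append("completed too fast")
    # Signal 2: failed an explicit attention trap.
    if not response.passed_attention_check:
        reasons.append("failed attention check")
    # Signal 3: identical answer on every row of a grid (straight-lining).
    answers = response.grid_answers
    if len(answers) >= 5 and len(set(answers)) == 1:
        reasons.append("straight-lined grid answers")

    # One signal alone is ambiguous, so it goes to human review instead
    # of automatic removal; two or more independent signals remove it.
    if len(reasons) >= 2:
        decision = "remove"
    elif len(reasons) == 1:
        decision = "review"
    else:
        decision = "keep"
    return decision, reasons


responses = [
    Response("r1", 240.0, True, [4, 2, 5, 3, 4]),
    Response("r2", 35.0, True, [3, 4, 2, 5, 1]),
    Response("r3", 20.0, False, [3, 3, 3, 3, 3]),
]
results = {r.respondent_id: assess(r) for r in responses}
for rid, (decision, reasons) in results.items():
    print(rid, decision, reasons)
```

Because every decision carries its list of reasons, a removed response can always be explained, and because the function works on one response at a time, it can run during fieldwork as well as afterwards.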