Secludy AI

Technology, Information and Internet

San Francisco, CA 1,495 followers

Privacy-guaranteed synthetic data for training AI models.

About us

Privacy-guaranteed synthetic data generation for training AI models

Website
www.secludy.com
Industry
Technology, Information and Internet
Company size
2-10 employees
Headquarters
San Francisco, CA
Type
Privately Held

Updates

  • Secludy AI reposted this

    Excited to share that we're launching Secludy AI with $4M in seed funding. Impression Ventures led the round. LAUNCH and The Syndicate, a venture firm and angel investing group led by Jason Calacanis, also joined, along with Wedbush Ventures, Precursor Ventures, Hustle Fund, Script Capital, Mana Ventures, Chispa VC, and an amazing group of angel investors. This started with a problem we couldn't stop thinking about. AI labs have already exhausted the majority of publicly available data. To improve model performance, AI/ML teams need to tap into their proprietary data to train domain-specific models. But that data, such as transactions, support tickets, and account histories, is often too sensitive to use directly. Legal won't sign off, customer data is restricted, and PII/IP can leak straight through model outputs. The more conversations we had, the more obvious it became. This was the same wall hitting one team after another. So my co-founder Mingze He, Ph.D. and I got to work. Secludy is self-hosted and lets AI/ML teams put their most sensitive data to work without ever exposing it. Keep the performance of your raw data when training models, running vendor evaluations, sharing datasets across teams/borders, testing in lower environments, and monetizing your most valuable data. We're starting with fintech. Banks, life sciences, insurance, and healthcare are next. Thank you to our early design partners, investors, and friends who helped us get here. And a special thank you to Soso Sazesh for your early and steady support. If your AI roadmap has been stuck in legal review for months, my DMs are open. Read more about our news today (link in comments). Christian Lassonde, Maor Amar, Petra Griffith, Jason Calacanis, Charles Hudson, Haley Bryant, Domingo Guerra, AJ S., Soso Sazesh, Amr Al-Shihabi

  • Secludy AI reposted this

    10 billion intimate conversations were used to train AI. Users found that Scatter Lab's AI chatbot, Lee Luda, was exposing sensitive details from conversations between romantic partners, including real names and addresses. Then came the fallout.
    - major fine from South Korea’s privacy regulator
    - original AI chatbot shut down
    - users sued and won (now under appeal)
    - lasting loss of user trust
    Training models on raw customer conversations, support logs, or internal records creates significant legal risk, damages trust, and hurts your brand. The companies that win in AI will be the ones that can train fast, stay compliant, and protect trust. We're building Secludy AI to make that possible.

  • Secludy AI reposted this

    Larry Ellison on where AI goes next. "For AI models to reach their peak value, you need to not just train them on publicly available data, but you need to make privately owned data available to those models as well." Larry gets it. Private data is next. He called out using private data for reasoning. OpenAI, Anthropic, and Google are hitting a data wall. Nearly every valuable corner of the public web has already been trained on. 99% of the world's data is private and not accessible from the public web (off-limits to AI labs). It lives inside banks, hospitals, insurers, factories, call centers, supply chains, and internal company systems. Prompt engineering and RAG are now table stakes. The real AI moat comes from training on your proprietary data. Enterprises are realizing they don't need GPT-5 scale to win on domain-specific tasks. Small, fine-tuned models trained on proprietary data run circles around frontier models that have never seen your industry, your customers, or your data. At a fraction of the cost too. But the problem is that GenAI models regurgitate sensitive training data. Whatever you train on can surface in outputs. So training on raw customer records, patient files, claims data, or internal documents creates real risk. That is why the right path is differentially private synthetic data. It carries privacy guarantees and keeps the signal that matters for training. We’re building Secludy AI to turn sensitive private data into privacy-guaranteed synthetic data that keeps the training signal while protecting PII and proprietary IP. DM me if you are trying to figure out how to safely train on private data without exposing PII or IP.

  • Secludy AI reposted this

    Another day, another AI agent leaks sensitive data. Last week, security researchers showed how a customer service AI agent was hijacked (now patched). With a simple prompt injection, the entire knowledge base and CRM were exfiltrated. No exploits. No phishing. No malware. Just… words. This case was about RAG. The knowledge base was the target. Guardrails failed. But the same principle applies to training data. Fine-tune a model on raw tickets, chats, or CRM data and those facts live in the weights. This is why we are seeing a big movement towards anonymizing data before AI model training. Differentially private synthetic data replaces sensitive records with statistically faithful, privacy-guaranteed replicas. You keep the utility, lose the risk. Assume compromise. Then pick data that turns an exfiltration into a non-event. Send me a DM if you want to see how privacy-guaranteed synthetic data works in practice and how it can stop your models from becoming your next data breach headline.

  • Secludy AI reposted this

    Small beats big in agentic AI. Just finished the new NVIDIA research paper on Small Language Models (link in comments). For LLM-to-SLM Agent Conversion, NVIDIA calls out the need to anonymize your data as the first step. The paper's core argument: SLMs (<10B parameters) are the future of agentic AI because:
    - 10-30x cheaper to deploy than LLMs
    - Perfect for repetitive, specialized tasks (which is 90% of agent work)
    - Can be fine-tuned overnight vs weeks for LLMs
    Requiring anonymization as the first step is exactly why differentially private synthetic data is more important than ever to prevent model memorization of PII and your customers' IP. We're seeing this shift in real time on the ground. To unlock your most sensitive datasets for agents, you need realistic, safe data for model fine-tuning without privacy or customer IP restrictions. With the shift from "one giant model for everything" to "many small models for specific tasks," differentially private synthetic data has become the critical enabler for safe, rapid innovation. The future of AI agents is modular, efficient, and privacy-first. Send me a DM if you would like to learn how anonymized synthetic data can unlock your high-value, sensitive datasets for agentic workflows.

  • Secludy AI reposted this

    Most people think their ChatGPT chats are private. They are not. In a recent podcast, Sam Altman just said it himself. There is no legal confidentiality when you treat ChatGPT like a therapist. Whether you use ChatGPT as a therapist. Or a life coach. Or to help you through relationship problems. That chat can be legally accessed. Right now, there is no doctor-patient privilege, no HIPAA, no confidentiality protection. → If OpenAI gets subpoenaed, they have to hand it over. → If a court asks for it, you do not get a say. → If you are not on ChatGPT Enterprise, your chats are not even excluded from some of these legal fights. This is not only a policy failure. It is a technology gap, too. We already have ways to protect privacy. Differentially private synthetic data is one example. → It lets AI learn from user patterns without copying or leaking real conversations. → It is built to protect identities even under legal pressure. → It gives insights without exposing real people. Altman said, "We have not figured this out yet." We need to. This is not an abstract risk. People are using ChatGPT as a therapist today. Most do not know their deepest thoughts could end up in a courtroom. We need clear laws. We need better technology. Differential privacy. Synthetic data. We have the tools. We need the urgency to use them.

  • Secludy AI reposted this

    Nobody thinks about core banking tech until something breaks. But if you work in financial services, you know who really keeps things running. → The vendors behind the scenes. → Powering hundreds of smaller banks and credit unions. → Quietly moving billions of dollars every day. These companies are the infrastructure layer no one sees. And they’re now facing a brutal challenge. They’re expected to help all their clients adopt new AI tools fast. But they don’t own the data. They need to build on top of other people’s customer records. They use outsourced dev teams. And they’re under pressure to move faster than ever. That’s a hard place to innovate safely. We’re already seeing signs of where this goes. → Early fintechs are using synthetic data to balance speed and privacy. → Now, infrastructure vendors are starting to follow. Why? Because they have to test and train AI models. But they can’t risk exposure to sensitive client data. Synthetic data changes the equation. → No more waiting on legal approvals. → No risk of leaking personal info. → No sharing of PII with overseas teams. → Retain statistical properties of the original data. → Faster experiments, safer rollouts. And for core banking platforms, it solves the real problem: How do you help 200+ banks adopt AI without slowing everything down? If you’re building or deploying AI in financial services, ask yourself this: Are your teams using the right data to move fast and stay compliant? I’d love to hear how others are tackling this. We’re helping infra vendors and fintechs generate safe, production-grade synthetic data for faster testing and deployment. DM me if this is a challenge for you. Happy to share what we’ve learned. (Or check out what we’re building at Secludy AI)

  • Secludy AI reposted this

    The recent WeChat data leak exposed billions of PII records. Over 4 billion records, including 805 million from WeChat, were left exposed online. Alipay data was part of the leak too. Names, addresses, phone numbers, and financial information were all included. That is everything needed for fraud, identity theft, or worse. It is a brutal example of what happens when real data ends up in the wrong environments. That is why synthetic data is being used not only to train AI models but also for safe testing in lower environments, which is a growing use-case across our customer base. For example, we are increasingly seeing teams replace real user data with synthetic data in lower environments to test fraud detection models without exposing real customer data. Unlike old mock data, modern synthetic data is:
    - Statistically accurate. It preserves the structure, patterns, and correlations of real data.
    - PII-free by design. Even if systems are compromised, there is no sensitive data to leak.
    - Production-grade. It is realistic enough for QA, analytics, and model development.
    If your lower environments still use real user data, it is time to rethink.

Funding

Secludy AI 1 total round

Last Round

Pre seed