Stop building single-agent GenAI apps. That era is over.

Most GenAI products today look like this:
- One prompt
- One model
- One output

But when things break, here's what you hear:
- "It forgot context."
- "It hallucinated."
- "It's too slow, too dumb, too fragile."

That's not the model's fault. That's the architecture's fault. Let me explain.

What breaks in single-agent apps?
- Context overload: LLMs don't need more information; they need relevant information. Dumping the entire history into one context window isn't memory. It's noise.
- No role separation: a single agent trying to do research, analysis, reasoning, and response formatting? That's like asking one employee to be your assistant, lawyer, analyst, and social media manager.
- Zero observability: there's no traceability of why the model failed. No logs, no fallback logic, no task routing.

What works instead? Orchestration. Here's how a real system works:
- Agents have roles
- Tasks are passed, not re-prompted
- Tools are securely invoked
- Humans can override any step
- Everything is observable, auditable, and retrainable

You move from prompt engineering to protocol engineering. Orchestration isn't about complexity. It's about coordination.

Here's what we've built to solve this. We open-sourced an orchestration protocol that lets you:
- Register any LLM or external agent
- Assign tasks based on roles + memory
- Coordinate flows with zero-code YAML or a full API
- Trace every interaction with observability tools
- Add human-in-the-loop interventions at any node

No fancy wrappers. No black-box magic. Just a robust foundation for multi-agent GenAI systems.

Example use cases we've deployed:
- A pre-sales agent team (research + pricing + objection handling)
- A co-pilot for onboarding new employees across tools
- An internal policy bot that cites source docs and lets HR approve or reject in real time
- Legal summarizers that escalate to humans if confidence < 80%

These aren't PoCs. They're production-grade AI systems.

Want to see how it works? Comment "Agents" and I will send an invitation to the closed pioneer group for access to the code and tutorials.

#AgenticAI #GenAI #artificialintelligence #innovation
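The ideas above (role registration, task passing, tracing, human override) can be sketched in a few lines. This is a minimal illustration, not the open-sourced protocol's actual interface; the class name, registry, and hook signature are all hypothetical.

```python
# Minimal sketch of role-based orchestration with observability and a
# human-in-the-loop override hook. All names here are illustrative.
class Orchestrator:
    def __init__(self):
        self.agents = {}  # role -> handler (any LLM or external agent)
        self.trace = []   # observability: every hop is logged

    def register(self, role, handler):
        self.agents[role] = handler

    def run(self, task, pipeline, human_gate=None):
        """Pass the task through a pipeline of roles, not re-prompts."""
        result = task
        for role in pipeline:
            result = self.agents[role](result)
            self.trace.append((role, result))
            # Human-in-the-loop: a human can reject any node's output.
            if human_gate and not human_gate(role, result):
                raise RuntimeError(f"human rejected output of {role}")
        return result
```

Usage follows the pre-sales example: register a research agent and a pricing agent, then run a lead through both roles, with every hop recorded in `trace`.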
LLM Security Management
-
Few Lessons from Deploying and Using LLMs in Production

Deploying LLMs can feel like hiring a hyperactive genius intern: they dazzle users while potentially draining your API budget. Here are some insights I've gathered:

1. "Cheap" is a lie you tell yourself: cloud costs per call may seem low, but the overall expense of an LLM-based system can skyrocket. Fixes:
- Cache repetitive queries: users ask the same thing at least 100x/day.
- Gatekeep: use cheap classifiers (e.g., BERT) to filter "easy" requests. Let LLMs handle only the complex 10% and your current systems handle the remaining 90%.
- Quantize your models: shrink LLMs to run on cheaper hardware without massive accuracy drops.
- Build your caches asynchronously: pre-generate common responses before they're requested, or fail gracefully the first time a query arrives and cache it for the next time.

2. Guard against model hallucinations: sometimes models express answers with such confidence that distinguishing fact from fiction becomes challenging, even for human reviewers. Fixes:
- Use RAG: a fancy way of saying "provide the model the knowledge it needs in the prompt itself" by querying a database for semantic matches with the query.
- Guardrails: validate outputs using regex or cross-encoders to establish a clear decision boundary between the query and the LLM's response.

3. The best LLM is often a discriminative model: you don't always need a full LLM. Consider knowledge distillation: use a large LLM to label your data, then train a smaller discriminative model that performs similarly at a much lower cost.

4. It's not about the model, it's about the data it's trained on: a smaller LLM might struggle with specialized domain data; that's normal. Fine-tune your model on your specific dataset, starting with parameter-efficient methods (like LoRA or adapters) and using synthetic data generation to bootstrap training.

5. Prompts are the new features: version them, run A/B tests, and continuously refine using online experiments. Consider bandit algorithms to automatically promote the best-performing variants.

What do you think? Have I missed anything? I'd love to hear your "I survived LLM prod" stories in the comments!
-
You don't need a 2-trillion-parameter model to tell you the capital of France is Paris. Be smart and route between a panel of models according to query difficulty and model specialty!

A new paper proposes a framework to train a router that sends each query to the appropriate LLM to optimize the trade-off between cost and performance.

Overview: model inference cost varies significantly. Per one million output tokens: Llama-3-70b ($1) vs. GPT-4-0613 ($60); Haiku ($1.25) vs. Opus ($75).

The RouteLLM paper proposes a router-training framework based on human preference data and augmentation techniques, demonstrating over 2x cost savings on widely used benchmarks. They frame the problem as choosing between two classes of models:
(1) strong models, which produce high-quality responses at a high cost (GPT-4o, Claude 3.5)
(2) weak models, with relatively lower quality and lower cost (Mixtral 8x7B, Llama3-8b)

A good router requires a deep understanding of the question's complexity as well as the strengths and weaknesses of the available LLMs. The paper explores different routing approaches:
- Similarity-weighted (SW) ranking
- Matrix factorization
- BERT query classifier
- Causal LLM query classifier

Neat ideas to build from:
- Users can collect a small amount of in-domain data to improve performance for their specific use cases via dataset augmentation.
- Expand from routing between a strong and a weak LLM to multiclass routing across specialist models (language-vision model, function-calling model, etc.).
- A larger framework controlled by a router: imagine a system of 15-20 tuned small models, with the router as the (n+1)'th model responsible for picking the LLM that handles a particular query at inference time.
- MoA architectures: routing to different architectures of a Mixture of Agents would be a cool idea as well. Depending on the query, you decide how many proposers there should be, how many layers in the mixture, what the aggregator models should be, etc.
- Route-based caching: if you get redundant queries that are slightly different, route the query plus the previous answer to a small model for light rewriting instead of regenerating the answer.
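The strong/weak routing decision reduces to a threshold on a difficulty score. Below is a toy sketch (not RouteLLM's actual API): the `difficulty` stub stands in for a trained router such as the BERT query classifier, and the per-token costs mirror the numbers quoted above.

```python
# Cost-aware routing sketch. The difficulty scorer is a crude stub;
# a real router would be trained on preference data.
STRONG_COST = 60.0  # $/1M output tokens, GPT-4-class (illustrative)
WEAK_COST = 1.0     # $/1M output tokens, Llama-3-70b-class

def difficulty(query: str) -> float:
    # Stand-in for a trained router (BERT classifier, matrix
    # factorization, ...); here, a length-based proxy in [0, 1].
    return min(len(query.split()) / 50.0, 1.0)

def route(query: str, threshold: float = 0.5) -> str:
    return "strong" if difficulty(query) >= threshold else "weak"

def expected_cost(queries: list[str], threshold: float = 0.5) -> float:
    """Average per-query cost (in $/1M output tokens) under this policy."""
    return sum(
        STRONG_COST if route(q, threshold) == "strong" else WEAK_COST
        for q in queries
    ) / len(queries)
```

Sweeping `threshold` against a quality metric on held-out queries traces out the cost/performance trade-off curve the paper optimizes.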
-
If you're building anything with LLMs, your system architecture matters more than your prompts. Most people stop at "call the model, get the output." But LLM-native systems need workflows: blueprints that define how multiple LLM calls interact, and how routing, evaluation, memory, tools, or chaining come into play.

Here's a breakdown of 6 core LLM workflows I see in production:

1. LLM Augmentation: the classic RAG + tools setup. The model augments its own capabilities using:
- Retrieval (e.g., from vector DBs)
- Tool use (e.g., calculators, APIs)
- Memory (short-term or long-term context)

2. Prompt Chaining Workflow: sequential reasoning across steps. Each output is validated (pass/fail) and passed to the next model. Great for multi-stage tasks like reasoning, summarizing, translating, and evaluating.

3. LLM Routing Workflow: input is routed to different models (or prompts) based on the type of task. Example: classification, Q&A, and summarization each handled by a different call path.

4. LLM Parallelization Workflow (Aggregator): run multiple models/tasks in parallel, then aggregate the outputs. Useful for ensembling or sourcing multiple perspectives.

5. LLM Parallelization Workflow (Synthesizer): a more orchestrated version with a control layer. Think multi-agent systems with a conductor plus a synthesizer to harmonize responses.

6. Evaluator-Optimizer Workflow: the most underrated architecture. One LLM generates; another evaluates (pass/fail + feedback). The loop continues until quality thresholds are met.

If you're an AI engineer, don't just build for single-shot inference. Design workflows that scale, self-correct, and adapt.

Save this visual for your next project architecture review.

Follow me (Aishwarya Srinivasan) for more AI insights, and subscribe to my Substack for in-depth blogs and weekly AI updates: https://lnkd.in/dpBNr6Jg
-
AI agents are not yet safe for unsupervised use in enterprise environments.

The German Federal Office for Information Security (BSI) and France's ANSSI have just released updated guidance on the secure integration of Large Language Models (LLMs). Their key message? Fully autonomous AI systems without human oversight are a security risk and should be avoided.

As LLMs evolve into agentic systems capable of autonomous decision-making, the risks grow exponentially. From prompt injection attacks to unauthorized data access, the threats are real and increasingly sophisticated.

The updated framework introduces Zero Trust principles tailored for LLMs:
1) No implicit trust: every interaction must be verified.
2) Strict authentication and least-privilege access: even internal components must earn their permissions.
3) Continuous monitoring: not just outputs but also inputs must be validated and sanitized.
4) Sandboxing and session isolation: to prevent cross-session data leaks and persistent attacks.
5) Human-in-the-loop: critical decisions must remain under human control.

Whether you're deploying chatbots, AI agents, or multimodal LLMs, this guidance is a must-read. It's not just about compliance but about building trustworthy AI that respects privacy, integrity, and security.

Bottom line: AI agents are not yet safe for unsupervised use in enterprise environments. If you're working with LLMs, it's time to rethink your architecture.
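Principles 1, 2, and 5 above can be combined in a single tool-invocation gate. This is a sketch under stated assumptions: the agent names, permission sets, and approval callback are invented for illustration, not taken from the BSI/ANSSI guidance.

```python
# Zero Trust tool gate sketch: explicit grants (no implicit trust,
# least privilege) plus a mandatory human decision for sensitive
# actions. All names are illustrative.
SENSITIVE_TOOLS = {"delete_records", "send_payment"}

PERMISSIONS = {
    "research_agent": {"web_search", "read_docs"},
    "finance_agent": {"read_docs", "send_payment"},
}

def invoke_tool(agent: str, tool: str, approve) -> str:
    # No implicit trust: the agent must hold an explicit grant.
    if tool not in PERMISSIONS.get(agent, set()):
        return "denied: no permission"
    # Human-in-the-loop: critical actions need explicit approval.
    if tool in SENSITIVE_TOOLS and not approve(agent, tool):
        return "denied: human rejected"
    return f"ok: {agent} ran {tool}"
```

A real deployment would also log every decision (continuous monitoring) and run each tool in an isolated sandbox per session.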
-
Excited about our new paper: AI Agent Traps

AI agents inherit every vulnerability of the LLMs they're built on, but their autonomy, persistence, and access to tools create an entirely new attack surface: the information environment itself. The web pages, emails, APIs, and databases agents interact with can all be weaponised against them. We introduce a taxonomy of six classes of adversarial threats, from prompt injections hidden in web pages to systemic attacks on multi-agent networks.

1. Content Injection Traps (Perception): what a human sees on a web page is not what an agent parses. Attackers can embed malicious instructions in HTML comments, hidden CSS, image metadata, or accessibility tags. These are invisible to users but processed directly by the agent.

2. Semantic Manipulation Traps (Reasoning): these attacks corrupt how the agent thinks. Sentiment-laden or authoritative-sounding content skews synthesis and conclusions. LLMs are susceptible to the same framing effects and anchoring biases as humans: logically equivalent problems phrased differently produce systematically different outputs.

3. Cognitive State Traps (Memory & Learning): persistent agents accumulate memory across sessions, and that memory becomes an attack surface. Poisoning a handful of documents in a RAG knowledge base reliably manipulates outputs for targeted queries.

4. Behavioural Control Traps (Action): these traps hijack what the agent does. A single crafted email caused an agent to bypass safety classifiers and exfiltrate its entire privileged context.

5. Systemic Traps (Multi-Agent Dynamics): the most dangerous attacks may not target individual agents at all. A fabricated financial report could trigger synchronised sell-offs across trading agents: a digital flash crash. Compositional fragment traps distribute a payload across multiple benign-looking sources; each passes safety filters alone, but when agents aggregate them, the full attack reconstitutes.

6. Human-in-the-Loop Traps: the final class uses the agent as a vector to attack the human. A compromised agent can generate outputs that induce approval fatigue, present misleading but technical-sounding summaries, or exploit automation bias.

These aren't theoretical. Every type of trap has documented proof-of-concept attacks. And the attack surface is combinatorial: traps can be chained, layered, or distributed across multi-agent systems.

Authors: Nenad Tomašev, Joel Leibo, Julian Jacobs, Simon Osindero

Read here: https://lnkd.in/eTTZsPNG
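A first-line defence against content-injection traps is to show the agent only what a human would see. The sketch below, using Python's standard-library `html.parser`, drops HTML comments and elements hidden via inline `display:none`/`visibility:hidden` styles before the text reaches the agent. It is deliberately simplified: a real pipeline would also handle external stylesheets, image metadata, accessibility attributes, and void elements.

```python
# Perception-layer sanitizer sketch: extract only human-visible text.
from html.parser import HTMLParser

class VisibleTextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hidden_depth = 0  # >0 while inside a hidden subtree
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        style = dict(attrs).get("style", "").replace(" ", "").lower()
        hidden = "display:none" in style or "visibility:hidden" in style
        if self.hidden_depth or hidden:
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if not self.hidden_depth and data.strip():
            self.chunks.append(data.strip())
    # HTML comments go to handle_comment, which we never collect,
    # so comment-embedded instructions are dropped automatically.

def visible_text(html: str) -> str:
    parser = VisibleTextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)
```

Note this only narrows the perception-layer attack surface; it does nothing against the reasoning, memory, or multi-agent traps above.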
-
Isabel Barberá: "This document provides practical guidance and tools for developers and users of Large Language Model (LLM) based systems to manage privacy risks associated with these technologies. The risk management methodology outlined in this document is designed to help developers and users systematically identify, assess, and mitigate privacy and data protection risks, supporting the responsible development and deployment of LLM systems.

This guidance also supports the requirements of GDPR Article 25 (Data protection by design and by default) and Article 32 (Security of processing) by offering technical and organizational measures to help ensure an appropriate level of security and data protection. However, the guidance is not intended to replace a Data Protection Impact Assessment (DPIA) as required under Article 35 of the GDPR. Instead, it complements the DPIA process by addressing privacy risks specific to LLM systems, thereby enhancing the robustness of such assessments.

Guidance for readers:
> For Developers: use this guidance to integrate privacy risk management into the development lifecycle and deployment of your LLM-based systems, from understanding data flows to implementing risk identification and mitigation measures.
> For Users: refer to this document to evaluate the privacy risks associated with LLM systems you plan to deploy and use, helping you adopt responsible practices and protect individuals' privacy.
> For Decision-makers: the structured methodology and use case examples will help you assess the compliance of LLM systems and make informed, risk-based decisions."

European Data Protection Board
-
AI matched specialist responses on real-world consults at Stanford, but doctors couldn't agree whether they preferred the AI or the human response.

New peer-reviewed study (with David JH Wu, Fateme (Fatima) Nateghi, Vishnu Ravi, MD, Saloni Maharaj, Stephen Ma, Jonathan H. Chen et al.): "Automated Evaluation of Large Language Model Response Concordance with Human Specialist Responses on eConsult Cases"

The study:
- 40 physician-to-physician eConsults used to test how well AI replies matched board-certified specialists (cardiology, endocrinology, ID, neurology, heme/onc, etc.)
- Each case included real notes + labs
- GPT-4.1 generated consult replies to actual clinical queries
- Real human specialists' replies were used for comparison
- Doctors graded AI outputs ("concordant" vs. "discordant")

Evaluation methods:
1. LLM-as-Judge (LaJ): an LLM grader compared doctor and AI replies.
2. Decompose-then-Verify (DtV): an LLM broke replies into atomic facts (e.g., "start metformin") before the grader checked each fact for agreement.
Three internal-medicine doctors reviewed all 40 cases to benchmark how reliably each automated method matched expert judgment.

Key findings:
1. AI replies were similar to actual specialist responses.
- GPT-4.1 answers were as consistent with human specialists as doctors were with each other
- κ = 0.75, F1 = 0.89 (AI-doctor concordance)
- Inter-physician κ = 0.69-0.90 (doctor-doctor concordance)
2. An AI "judge" can rate AI responses with near-human expert reliability.
- Best evaluator: DeepSeek R1 (F1 0.89, κ 0.75); Gemini 2.5 Pro (F1 0.86, κ 0.70)
- LLM-as-Judge outperformed DtV, but DtV was more explainable since its grading could be compared against atomic facts.
3. Doctors disagreed on whether they preferred the AI or the human specialist's consult.
- One preferred the AI in 82% of cases (clarity, organization).
- Another preferred the human in 88% of cases (nuance, tone, and contextual awareness, e.g., insurance considerations).

Takeaways: AI can produce consults that are concordant with how a real human specialist might respond, and LLMs can evaluate AI outputs with near-human expert reliability.

Limitations:
- Small sample: 40 eConsults and 3 physician reviewers
- Conducted at a single health system (Stanford)
- Evaluated concordance, not clinical accuracy

Why this matters: human evaluation is a big (and costly) bottleneck in clinical AI. These results show automated evaluation can reach human-expert-level reliability, perhaps enabling scalable validation across real-time specialty consults.

Free online Stanford BMIR Colloquium: "Applied Intelligence: Integrating AI Technologies Into Medical Education"
Speaker: Laurah Turner, PhD
Thursday, Oct 9 | 12-1 PM PT
Tapao Hall (3180 Porter Dr) + Zoom live stream
Dr. Turner's talk will explore when to trust AI autonomously, when human oversight is essential, and when to avoid AI entirely.
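For readers unfamiliar with the metrics above, here is how Cohen's kappa and F1 are computed from paired binary concordant/discordant labels, implemented directly from their definitions. The label lists in the usage below are made-up illustrative data, not the study's.

```python
# Agreement metrics for binary labels (1 = concordant, 0 = discordant).
def cohens_kappa(a: list[int], b: list[int]) -> float:
    """Observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    p_both_yes = (sum(a) / n) * (sum(b) / n)
    p_both_no = (1 - sum(a) / n) * (1 - sum(b) / n)
    p_chance = p_both_yes + p_both_no
    return (p_observed - p_chance) / (1 - p_chance)

def f1(pred: list[int], truth: list[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum(not p and t for p, t in zip(pred, truth))
    return 2 * tp / (2 * tp + fp + fn)
```

Kappa of 1.0 means perfect agreement and 0.0 means agreement no better than chance, which is why the study's κ = 0.75 between AI and doctors is comparable to the 0.69-0.90 range seen between doctors themselves.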
-
As an MLOps platform, we started by helping organizations implement responsible AI governance for traditional machine learning models. With principles of transparency, accountability, and oversight, our guardrails enabled smooth model development. However, governing large language models (LLMs) like ChatGPT requires a fundamentally different approach. LLMs aren't narrow systems designed for specific tasks; they can generate nuanced text on virtually any topic imaginable. This presents a whole new set of challenges for governance.

Here are some key components for evolving AI governance frameworks to effectively oversee LLMs:

1. Usage-focused governance: focus governance efforts on real-world LLM usage (the workflows, inputs, and outputs) rather than just the technical architecture. Continuously assess risks posed by different use cases.
2. Dynamic risk assessment: identify unique risks presented by LLMs, such as bias amplification, and develop flexible frameworks to proactively address emerging issues.
3. Customized integrations: invest in tailored solutions to integrate complex LLMs with existing systems in alignment with governance goals.
4. Advanced monitoring: utilize state-of-the-art tools to monitor LLMs in real time across metrics like outputs, bias indicators, misuse prevention, and more.
5. Continuous accuracy tracking: implement ongoing processes to detect subtle accuracy drifts or inconsistencies in LLM outputs before they escalate.
6. Agile oversight: adopt agile, iterative governance processes to manage frequent LLM updates and retraining in line with the rapid evolution of models.
7. Enhanced transparency: incorporate methodologies to audit LLMs, trace outputs back to training data/prompts, and pinpoint root causes of issues to enhance accountability.

In conclusion, while the rise of LLMs has disrupted traditional governance models, we at Katonic AI are working hard to understand the nuances of LLM-centric governance and aim to provide effective solutions that help organizations harness the power of LLMs responsibly and efficiently.

#LLMGovernance #ResponsibleLLMs #LLMrisks #LLMethics #LLMpolicy #LLMregulation #LLMbias #LLMtransparency #LLMaccountability
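Continuous accuracy tracking (point 5) can be as simple as a rolling window over pass/fail evaluation results with a drift threshold. A minimal sketch, assuming illustrative choices for the baseline, window size, and tolerance:

```python
# Rolling accuracy drift monitor sketch: flag when windowed accuracy
# falls more than `tolerance` below the established baseline.
from collections import deque

class AccuracyDriftMonitor:
    def __init__(self, baseline: float, window: int = 100,
                 tolerance: float = 0.05):
        self.baseline = baseline
        self.tolerance = tolerance
        self.results = deque(maxlen=window)

    def record(self, passed: bool) -> bool:
        """Record one eval result; return True if drift is detected."""
        self.results.append(passed)
        if len(self.results) < self.results.maxlen:
            return False  # not enough data for a stable estimate yet
        accuracy = sum(self.results) / len(self.results)
        return accuracy < self.baseline - self.tolerance
```

In practice each `record` call would come from an automated evaluation of a sampled production output, and a drift flag would page the governance team before the regression escalates.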
-
Vulnerabilities in MCP (Model Context Protocol)

I was hired to audit integrations of an LLM with MCP, used with data management tools, log collection, and automated routines. Here are some problems I found, shared so that those of you who want to implement MCP in your products can start thinking about security at the beginning of the development cycle. It is worth mentioning that there are still not many effective defenses, despite some vendors selling "LLM firewalls"; I would like to test and validate their effectiveness. Anyway, to the points:

1) Lack of HTTPS in API integrations was a frequent problem. The traffic between the LLM and the MCP APIs that executed commands and returned responses was exposed: I could view the requests and responses, validated with Wireshark.

2) Inadequate permission management allowed me to access data from other clients without any tenant isolation, all via prompt injection, using Burp Suite to analyze requests and perform basic manipulations.

3) Abuse of automations and unrestricted resource consumption allowed me to trigger multiple parallel routines from a single prompt, or to send different prompts causing the server to fire routines all at once, with no proper queue management. I used Burp Suite's Intruder with a list of at least 50 different prompts sharing the same context. In addition, there was no rate limiting on the APIs.

4) SQL injection via prompt: making requests in plain human language, for example "what columns does the users table have?", resulted in queries being executed directly and leaking information; it seems the integration exposed the database schema (odd). The real problem is that the backend built the query and processed it as raw SQL. I used Burp Suite here to analyze the responses.

5) Hardcoded secrets in the MCP code: API tokens, database credentials, and endpoints were found directly in the MCP integration scripts. Obvious as it is, being in the backend does not mean they must be hardcoded. I was, however, unable to extract secrets via prompt injection or obtain an RCE.

6) Broad context allowing full control of the application: although I did not obtain the application secrets, giving the LLM broad context gave it full control over the integrated systems, executing tasks that should be admin-only, since the configured keys had excessive permissions enabling numerous functions.

In short, these are flaws that a developer trained in application security could resolve, but many who start integrating solutions with AI do not think about shift-left security.

#mcp #AI #redteam #cybersecurity #AISecurity #mcpsecurity #pentest #llmpentest
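Findings 2 and 4 share one fix: the model should never author SQL or choose the tenant. A sketch of an allowlisted, parameterized MCP tool (using `sqlite3` so it runs standalone; the table, query IDs, and tenants are invented for illustration):

```python
# Tenant-scoped, parameterized query tool sketch: the LLM picks a
# query_id from an allowlist and never supplies raw SQL; the tenant id
# comes from the authenticated session, never from the prompt.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (tenant_id TEXT, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("t1", "alice"), ("t1", "bob"), ("t2", "carol")])

# Allowlist of statements the MCP tool may execute.
QUERIES = {
    "list_users": "SELECT name FROM users WHERE tenant_id = ? ORDER BY name",
}

def run_tool(query_id: str, tenant_id: str) -> list[str]:
    if query_id not in QUERIES:
        raise PermissionError(f"query not allowed: {query_id}")
    # Parameter binding prevents injection; the mandatory tenant_id
    # predicate enforces isolation on every allowed query.
    rows = conn.execute(QUERIES[query_id], (tenant_id,)).fetchall()
    return [row[0] for row in rows]
```

The same pattern also helps with finding 6: the allowlist is the place to keep admin-only statements out of the LLM's reach entirely.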