VLM Run

VLM Run · 2026-02-08T22:06:41.366Z

🏈 Super Bowl hits different when it's less than a mile from your office. We couldn’t help but generate this 4K poster with chat.vlm.run combining two star quarterbacks into a single face-off. You’ll likely see it while walking to the stadium. Wishing both teams the best of luck tonight. #SuperBowl #SBLX #VLMRun #SeattleSeahawks #NewEnglandPatriots

Technology, Information and Internet

Palo Alto, CA 3,963 followers

The Unified Gateway for Visual AI - Extract JSON from images, videos, and PDFs with our Vision Language Models.

View all 6 employees

About us

Unified Gateway for Visual Intelligence.

Website: https://vlm.run
Industry: Technology, Information and Internet
Company size: 2-10 employees
Headquarters: Palo Alto, CA
Type: Privately Held

Locations

Primary

Palo Alto, CA 94301, US

2445 Augustine Dr

Spaces - Santa Clara, Suite 103

Santa Clara, CA 95054, US


Updates

VLM Run

3,963 followers
1w
Excited to share that mm-ctx is now live on Hugging Face Spaces! Try it in the browser via an interactive terminal without installing anything: https://lnkd.in/g_jhKyk8

mm-ctx – fast, multimodal context for agents.

LLM-based agents handle text fine, but as soon as a directory contains images, videos, or PDFs with visual content, they struggle to understand the full context.

mm-ctx is meant to feel familiar: the Unix tools we already love (find/cat/grep/wc), rebuilt for file types LLMs can't read natively and designed to work with agents via the CLI.

- mm grep "invoice #1234" ~/Downloads searches across PDFs and returns line-numbered matches
- mm cat <document>.pdf returns a metadata description of the file
- mm cat <photo>.jpg returns a caption of the photo
- mm cat <video>.mp4 returns a caption of the video

A few things we obsessed over:

⚡ Speed: Rust core for the hot paths
🏠 Local-first, BYO model: Uses any OpenAI-compatible endpoint: Ollama, vLLM/SGLang, LMStudio with any multimodal LLM (Gemma4, Qwen3.5, GLM-4.6V).
🔗 Composable: stdin + structured outputs
🤖 Drops into any agent via mm-cli-skills: Claude Code, Codex, Gemini CLI, OpenClaw.

We’d love to hear your feedback! Especially on the CLI and what file types and workflows you would like to see next.
5 Comments

VLM Run reposted this
Jeremy Park, PhD
4w
I made a rock climbing tool using computer vision!

I prompted VLM Run’s visual agent Orion to segment all of the blue bouldering holds, and it did a good job! It is interesting that now we can prompt VLMs to segment all of the holds, rather than creating a new dataset from scratch to train a model.

With holds detection + pose estimation, I can show how each hold gets activated as a hand or foot uses it. Once we touch the final hold with both hands, the route is completed, and I show the overall path of my torso midpoint.

A tool like this could help climbers understand their movement better. I’m still very much a beginner at bouldering, so I could use all the help I can get 🤣

There are definitely things to improve, but overall I’m encouraged by this first demo 🙂 Let me know what you think in the comments!

Models used:
- VLM Run’s Orion for segmentation
- ViTPose+ Huge for pose estimation (via Hugging Face 🤗)
- RT-DETR for person detection (via Hugging Face 🤗)

Shoutout to Daniel Reiff and his bouldering + computer vision project for the inspiration!

#ai #machinelearning #computervision #vlmrun #huggingface #rockclimbing #bouldering
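
The hold-activation logic described in the post above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the author's implementation: it assumes each hold has already been reduced to a center point and a touch radius (for example from a segmentation mask), and that each video frame has already been reduced to a dictionary of named keypoints in pixel coordinates. The `Hold` class, the keypoint names, the `radius` threshold, and the `track_route` function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from math import hypot
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Frame = Dict[str, Point]  # keypoint name -> (x, y) in pixels for one video frame

# Hypothetical keypoint names; a real pose estimator may label them differently.
LIMBS = ("left_wrist", "right_wrist", "left_ankle", "right_ankle")
HANDS = ("left_wrist", "right_wrist")
TORSO = ("left_shoulder", "right_shoulder", "left_hip", "right_hip")


@dataclass
class Hold:
    name: str
    x: float       # hold center, e.g. the centroid of its segmentation mask
    y: float
    radius: float  # assumed touch threshold in pixels


def touching(hold: Hold, point: Point) -> bool:
    """True if the keypoint lies within the hold's touch radius."""
    return hypot(point[0] - hold.x, point[1] - hold.y) <= hold.radius


def torso_midpoint(frame: Frame) -> Optional[Point]:
    """Average of both shoulders and both hips, or None if any is missing."""
    if any(k not in frame for k in TORSO):
        return None
    pts = [frame[k] for k in TORSO]
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))


def track_route(holds: List[Hold], final_name: str, frames: List[Frame]):
    """Return (holds in order of first touch, completing frame index or None, torso path)."""
    final = next(h for h in holds if h.name == final_name)
    activated: List[str] = []
    path: List[Point] = []
    for index, frame in enumerate(frames):
        mid = torso_midpoint(frame)
        if mid is not None:
            path.append(mid)
        # A hold is activated the first time any hand or foot touches it.
        for hold in holds:
            if hold.name not in activated and any(
                touching(hold, frame[limb]) for limb in LIMBS if limb in frame
            ):
                activated.append(hold.name)
        # The route is complete once both hands are on the final hold.
        if all(limb in frame and touching(final, frame[limb]) for limb in HANDS):
            return activated, index, path
    return activated, None, path


holds = [Hold("start", 100, 400, 20), Hold("mid", 150, 300, 20), Hold("top", 200, 200, 20)]
frames = [
    {"left_wrist": (100, 400), "right_wrist": (105, 395)},
    {"left_wrist": (150, 300), "right_wrist": (100, 400)},
    {"left_wrist": (200, 200), "right_wrist": (150, 300)},
    {"left_wrist": (200, 200), "right_wrist": (205, 205)},
]
activated, completed_at, path = track_route(holds, "top", frames)
print(activated, completed_at)  # ['start', 'mid', 'top'] 3
```

A fixed pixel radius is the simplest possible touch test; checking whether the keypoint falls inside the hold's mask would likely be more robust, at the cost of carrying the masks around.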

13 Comments

VLM Run

3,963 followers
1mo
Manually parsing handwritten intake forms can be slow and prone to error, while VLM Run's HIPAA-ready API allows you to extract the same details in seconds. In this tutorial by Jeremy Park, PhD, learn how to use VLM Run to extract structured JSON from handwritten healthcare documents at scale.

Through this walkthrough, you will learn how to:
- Upload documents in the Requests tab and run them against your saved skills
- Enable confidence scores and grounding to see exactly where each field came from in the original document
- Edit incorrect extractions and provide feedback to improve extraction over time
- Run the same workflow programmatically via the VLM Run API as shown in Google Colab

3 Comments

VLM Run

3,963 followers
1mo
Announcing Orion Skills! 🚀 Rather than rewriting prompts every time you want to define a specific task, you can now package all of that knowledge into a reusable skill.

Why skills?
- Reusable: Create a skill once, reference it from any endpoint (image, document, video, audio, agent)
- Versionable: Pin a specific skill version for reproducible results, or use "latest" to always get the newest revision
- Composable: Pass multiple skills in a single request, or combine them with custom schemas

Unlike purely text-based skills, we have reimagined what skills mean for visual agents and how to codify visual workflows into skills. Try skills in chat today! And check out this skills creation tutorial by Jeremy Park, PhD 👇

2 Comments

VLM Run

3,963 followers
1mo
The AI agent skills conversation focuses mainly on text. But for many tasks, words fall short. We need visual skills: providing images and videos as context, not just text. It's one thing to describe to a robot how to fold a t-shirt. It's another to show it a video. At VLM Run, that's what we're building: visual agents that understand visual data and act on it.

VLM Run

3,963 followers
2mo Edited
Visual AI agents for identifying the most delicious blueberries?! 🫐 Jeremy Park, PhD recently shared his PhD research on computer vision for blueberries and new results using VLM Run's visual agent Orion for segmentation, detection, and metadata tagging. Link in the comments! #agtech #visualanalytics #computervision #blueberry

2 Comments

VLM Run

3,963 followers
2mo Edited
What if visual AI agents could help give feedback on exercise? Jeremy Park, PhD recently reviewed our Orion visual AI agent for providing deadlift feedback. He raises the question: what if visual intelligence could be made accessible for applications in exercise, all through a chat interface? Read the Substack blog here: https://lnkd.in/gx_krdsM

7 Comments

VLM Run

3,963 followers
2mo Edited
Healthcare documents come fragmented across PDFs, images, emails, and faxed scans. OCR fails because real-world documents require visual reasoning of layout and context – not just plain text extraction.

Scan.com processes high volumes of documents and images where both speed and accuracy matter. They needed automation that could handle the diversity and complexity of healthcare documents.

We built it together with Orion. In a single call:
• Classifies multi-page document bundles
• Extracts data from emails and attachments
• Understands checkboxes, handwriting, and layout
• Visually verifies for high confidence

The result: faster processing, reduced manual QA, reliable structured data.

Document automation isn't a text problem. It's a visual reasoning problem.

Read more: https://lnkd.in/gd93-xUY
1 Comment

VLM Run

3,963 followers
2mo
We're hiring our first infra engineer (senior/staff) at VLM Run! We're processing tens of millions of VLM requests per month and scaling fast; we're looking for a founding Infrastructure Engineer to serve and operationalize our GPU workloads (custom runtimes on vLLM/Hugging Face transformers, orchestrated with Ray/Modal). The work is technically challenging, the learning curve is steep (in the best way), and you'll be joining a stellar ML team building the go-to visual intelligence platform for enterprises. In-person ONLY. Apply here → https://lnkd.in/gc8GJGVu Tag someone who'd crush this 👇 #hiring #llm #vlm #ai #computervision #infra #k8s

1 Comment

Funding

VLM Run 1 total round

Last Round

Seed Feb 1, 2023


Employees at VLM Run

Sudeep Pillai

Dinesh Reddy N

Shahrear Bin Amin

Jeremy Park, PhD
