google-ai-edge/LiteRT-LM

LiteRT-LM

LiteRT-LM is Google's production-ready, high-performance, open-source inference framework for deploying Large Language Models on edge devices.

🔗 Product Website

🔥 What's New: v0.12.0

  • Swift APIs: Natively integrate LiteRT-LM into iOS applications with Metal GPU acceleration. See the Swift Guide.
  • Web JavaScript APIs: Run models inside web browsers with high performance via web GPU/CPU. See the JavaScript Guide.
  • LiteRT-LM CLI Update: The command-line interface now supports the NPU backend, in addition to the CPU and GPU backends, across Linux, macOS, and Windows. See the CLI Guide.
  • Community-Maintained Flutter APIs: Build cross-platform Flutter applications using the community flutter_gemma package. See the Flutter Guide.

👉 Try Gemma4-E4B with Multi-token Prediction (MTP) on Linux, macOS, Windows, or Raspberry Pi with the LiteRT-LM CLI:

litert-lm run \
   --from-huggingface-repo=litert-community/gemma-4-E4B-it-litert-lm \
   gemma-4-E4B-it.litertlm \
   --backend=gpu \
   --enable-speculative-decoding=true \
   --prompt="What is the capital of France?"
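
The invocation above has the same shape for any backend or prompt. As a minimal sketch, the helper below prints (but does not run) that command for a given backend and prompt, using only the flags shown above. The function name `litert_cmd` is hypothetical, and `gpu` is the only backend value confirmed by this README; any other value you pass is an assumption to check against the CLI Guide.

```shell
# Hypothetical helper: prints the litert-lm command line for a backend and prompt.
# It only assembles the flags shown in this README; it does not run the model.
# $1: backend value (e.g. gpu), $2: prompt text (must not contain double quotes)
litert_cmd() {
  printf 'litert-lm run --from-huggingface-repo=%s %s --backend=%s --enable-speculative-decoding=true --prompt="%s"\n' \
    "litert-community/gemma-4-E4B-it-litert-lm" \
    "gemma-4-E4B-it.litertlm" \
    "$1" \
    "$2"
}

# Dry run: print the command so it can be reviewed before pasting into a terminal.
litert_cmd gpu "What is the capital of France?"
```

Printing the command instead of executing it keeps the sketch safe to run on a machine where the model has not been downloaded yet.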

🌟 Key Features

  • 📱 Cross-Platform Support: Android, iOS, Web, Desktop, and IoT (e.g. Raspberry Pi).
  • 🚀 Hardware Acceleration: Peak performance via GPU and NPU accelerators.
  • 👁️ Multi-Modality: Support for vision and audio inputs.
  • 🔧 Tool Use: Function calling support for agentic workflows.
  • 📚 Broad Model Support: Gemma, Llama, Phi-4, Qwen, and more.


🚀 Production-Ready for Google's Products

LiteRT-LM powers on-device GenAI experiences in Chrome, Chromebook Plus, Pixel Watch, and more.

You can also try the Google AI Edge Gallery app to run models immediately on your device.

Get it on Google Play or download it on the App Store.

📰 Blogs & Announcements

| Link | Description |
| --- | --- |
| Bring state-of-the-art agentic skills to the edge with Gemma 4 | Deploy Gemma 4 in-app and across a broader range of devices with stellar performance and broad reach using LiteRT-LM. |
| On-device GenAI in Chrome, Chromebook Plus and Pixel Watch | Deploy language models on wearables and browser-based platforms using LiteRT-LM at scale. |
| On-device Function Calling in Google AI Edge Gallery | Explore how to fine-tune FunctionGemma and enable function calling capabilities powered by LiteRT-LM Tool Use APIs. |
| Google AI Edge small language models, multimodality, and function calling | Latest insights on RAG, multimodality, and function calling for edge language models. |

🏃 Quick Start

🔗 Key Links

⚡ Quick Try (No Code)

Try LiteRT-LM immediately from your terminal without writing a single line of code using uv:

uv tool install litert-lm

litert-lm run \
  --from-huggingface-repo=google/gemma-3n-E2B-it-litert-lm \
  gemma-3n-E2B-it-int4 \
  --prompt="What is the capital of France?"
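
The quick try above depends on `uv` being installed. A small preflight check, sketched here under the assumption of a POSIX shell, can confirm that before you run the install step. The helper name `have_cmd` is hypothetical; `command -v` is the standard shell built-in it relies on.

```shell
# Hypothetical helper: returns 0 if the named command is on PATH, 1 otherwise.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report whether the prerequisite for the quick try is present.
if have_cmd uv; then
  echo "uv found: run 'uv tool install litert-lm' next"
else
  echo "uv not found: install uv before running the quick try"
fi
```

The same check works for `litert-lm` itself once the tool is installed.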

📚 Supported Language APIs

Ready to get started? Explore our language-specific guides and setup instructions.

| Language | Status | Best For... | Documentation |
| --- | --- | --- | --- |
| Python | ✅ Stable | Prototyping & Scripting | Python Guide |
| Kotlin | ✅ Stable | Android apps & JVM | Kotlin Guide |
| Swift | 🚀 Early Preview | Native iOS & macOS | Swift Guide |
| JavaScript (web) | 🚀 Early Preview | Browser environments | JavaScript Guide |
| Flutter | 🚀 Community | Cross-platform mobile | Flutter Guide |
| C++ | ✅ Stable | High-performance native | C++ Guide |

🏗️ Build From Source

The build guide shows how to compile LiteRT-LM from source. When building from source, check out the tag of the latest stable release rather than the tip of the main branch.
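
As a hedged sketch of picking a release tag in a local clone, the function below lists tags that start with `v` and prints the highest one by version order. The function name `latest_release_tag` and the `./LiteRT-LM` clone path are hypothetical, and the `v*` pattern assumes that tags follow the release names listed in this README. Pre-release tags, if any exist, may sort above the final release, so confirm the result against the GitHub Releases page.

```shell
# Hypothetical helper: prints the highest version-sorted tag matching v*
# in the git repository at path $1.
latest_release_tag() {
  git -C "$1" tag --list 'v*' --sort=-v:refname | head -n 1
}

# Usage, assuming the repository was cloned into ./LiteRT-LM:
#   tag="$(latest_release_tag ./LiteRT-LM)"
#   git -C ./LiteRT-LM checkout "$tag"
```

Version-aware sorting matters here because a plain alphabetical sort would place v0.9.0 above v0.12.0.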


📦 Releases

  • v0.12.0: Added early preview of Swift and Web JavaScript APIs, and community Flutter support. Updated LiteRT-LM CLI to have full CPU and GPU backend support across Linux, macOS, and Windows.
  • v0.11.0: Support Single Position Multi-token Prediction (MTP) for Gemma 4. Expand LiteRT-LM CLI to run natively on Windows with CPU and GPU backends.
  • v0.10.1: Deploy Gemma 4 with stellar performance (blog) and introduce LiteRT-LM CLI.
  • v0.9.0: Improvements to function calling capabilities, and better app performance and stability.
  • v0.8.0: Desktop GPU support and Multi-Modality.
  • v0.7.0: NPU acceleration for Gemma models.

For a full list of releases, see GitHub Releases.

