rag

package
v0.7.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 13, 2026 License: Apache-2.0 Imports: 19 Imported by: 0

Documentation

Overview

Package rag provides embedding generation, vector storage, and semantic retrieval for driftlessAF agents.

This package enables agents to learn from historical data by storing embeddings of past outcomes (fixes, advisories, build configurations) and retrieving semantically similar records at inference time.

Architecture

The package has three core components:

  • Embedder: Generates vector embeddings from text using Vertex AI (gemini-embedding-001).
  • Store: Persists embeddings with metadata in a vector index (Vertex AI Matching Engine).
  • Retriever: Searches the vector index for similar embeddings.

Each component is defined as an interface (Store, Retriever) with Vertex AI Matching Engine implementations provided. The Client type wraps all three for convenience.

Basic Usage

Store a document embedding:

client, err := rag.NewClient(ctx, rag.ClientConfig{
	Project:         "my-project",
	Location:        "us-east5",
	EmbeddingModel:  "gemini-embedding-001",
	IndexName:       "projects/my-project/locations/us-east5/indexes/12345",
	IndexEndpointID: "67890",
	DeployedIndexID: "my_deployed_index",
})
if err != nil {
	log.Fatal(err)
}
defer client.Close()

// Store a past fix
err = client.EmbedAndStore(ctx, "fix-123", "dependency version conflict in go.mod",
	rag.TaskTypeRetrievalDocument,
	map[string]string{
		"pr_url": "https://github.com/org/repo/pull/456",
		"patch":  "--- a/go.mod\n+++ b/go.mod\n...",
	})

Search for similar past fixes:

// Start with no threshold to examine raw distances, then set one for your corpus.
results, err := client.EmbedAndSearch(ctx, "cannot find module providing package foo/bar",
	rag.SearchOptions{TopK: 3})
for _, r := range results {
	fmt.Printf("Similar fix (distance %.3f): %s\n", r.Distance, r.Metadata["pr_url"])
}

MCP Integration

This package is designed to power a RAG MCP server that exposes search/retrieve tools to any driftlessAF agent via the Model Context Protocol. See the rag-mcp-server service for the MCP integration layer.

Index

Examples

Constants

View Source
const (
	// DefaultTopK is the default number of results returned by a search.
	DefaultTopK = 5

	// MetadataKeySourceText is the metadata key used to store the original
	// source text alongside embeddings, enabling re-embedding with newer models.
	MetadataKeySourceText = "_source_text"

	// MetadataKeyStoredAt is the metadata key for the timestamp when a
	// datapoint was stored.
	MetadataKeyStoredAt = "_stored_at"
)
View Source
const DefaultDimensions = 3072

DefaultDimensions is the output dimensionality for gemini-embedding-001. 3072 is the maximum supported and provides the best recall quality.

Variables

This section is empty.

Functions

This section is empty.

Types

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client wraps Embedder, Store, and Retriever for common RAG workflows.

func NewClient

func NewClient(ctx context.Context, cfg ClientConfig) (*Client, error)

NewClient creates a fully configured RAG client with embedding, storage, and retrieval.

func (*Client) Close

func (c *Client) Close() error

Close releases all resources held by the client.

func (*Client) EmbedAndSearch

func (c *Client) EmbedAndSearch(ctx context.Context, queryText string, opts SearchOptions) ([]Result, error)

EmbedAndSearch generates an embedding for the query text and searches for similar vectors. Queries are embedded with TaskTypeRetrievalQuery, the counterpart to TaskTypeRetrievalDocument used during ingestion.

func (*Client) EmbedAndStore

func (c *Client) EmbedAndStore(ctx context.Context, id, text string, taskType TaskType, metadata map[string]string, opts ...UpsertOption) error

EmbedAndStore generates an embedding for the text and stores it with the given metadata. The source text is automatically included in metadata under MetadataKeySourceText so that embeddings can be regenerated when upgrading models.

Pass UpsertOption values (e.g. WithRestricts) to attach index-level restrict tags that callers can later filter on via SearchOptions.Restricts.

The caller's metadata map is not modified; a copy is made before adding the source text.

func (*Client) Embedder

func (c *Client) Embedder() *Embedder

Embedder returns the client's embedder for direct use.

func (*Client) Retriever

func (c *Client) Retriever() Retriever

Retriever returns the client's retriever for direct use.

func (*Client) Store

func (c *Client) Store() Store

Store returns the client's store for direct use.

type ClientConfig

type ClientConfig struct {
	// GCP project ID.
	Project string

	// GCP region (e.g., "us-east5").
	Location string

	// Embedding model name (e.g., "gemini-embedding-001").
	EmbeddingModel string

	// Vertex AI Matching Engine index resource name.
	// Format: projects/{project}/locations/{location}/indexes/{index}
	IndexName string

	// Vertex AI index endpoint resource ID.
	IndexEndpointID string

	// Deployed index ID within the endpoint.
	DeployedIndexID string

	// PublicDomainName is the public endpoint domain for the index endpoint
	// (e.g., "1234.us-central1-5678.vdb.vertexai.goog").
	// Required for public endpoints. Leave empty for private (VPC) endpoints.
	PublicDomainName string

	// GCSBucket enables durable persistence of embeddings to GCS.
	// When set, stores write to both GCS and Matching Engine.
	// Records in GCS can be re-embedded when upgrading models.
	// Optional — leave empty for Matching Engine only.
	GCSBucket string

	// GCSPrefix is an optional path prefix within the GCS bucket.
	GCSPrefix string

	// Dimensions specifies the output dimensionality for embeddings.
	// If 0, defaults to DefaultDimensions (768).
	// Note: different models support different ranges (e.g., text-embedding-005: 1-768).
	Dimensions int
}

ClientConfig configures a RAG Client with all three components.

Example
package main

import (
	"fmt"

	"chainguard.dev/driftlessaf/agents/rag"
)

func main() {
	// ClientConfig wires up all three RAG components.
	cfg := rag.ClientConfig{
		Project:          "my-project",
		Location:         "us-central1",
		EmbeddingModel:   "gemini-embedding-001",
		IndexName:        "projects/my-project/locations/us-central1/indexes/12345",
		IndexEndpointID:  "67890",
		DeployedIndexID:  "my_deployed_index",
		PublicDomainName: "1234.us-central1-5678.vdb.vertexai.goog",
		// Optional: enable GCS dual-write for durability / re-embedding.
		GCSBucket: "my-embeddings-bucket",
		GCSPrefix: "build-failures",
	}

	fmt.Println("project:", cfg.Project)
	fmt.Println("model:", cfg.EmbeddingModel)
	fmt.Println("gcs:", cfg.GCSBucket)
}
Output:
project: my-project
model: gemini-embedding-001
gcs: my-embeddings-bucket

type Embedder

type Embedder struct {
	// contains filtered or unexported fields
}

Embedder generates vector embeddings from text using Vertex AI.

func NewEmbedder

func NewEmbedder(ctx context.Context, project, location, model string, dimensions ...int) (*Embedder, error)

NewEmbedder creates a new embedding generator.

The model should be a Google embedding model name (e.g., "gemini-embedding-001"). Project and location identify the GCP project and region for Vertex AI. Dimensions defaults to DefaultDimensions (3072) for best recall quality. Pass a non-zero dimensions to override.

func (*Embedder) Close

func (e *Embedder) Close() error

Close releases resources held by the embedder.

func (*Embedder) Embed

func (e *Embedder) Embed(ctx context.Context, text string, taskType TaskType) ([]float32, error)

Embed generates a vector embedding for the given text.

The taskType affects how the embedding is optimized. Use TaskTypeRetrievalDocument when storing documents and TaskTypeRetrievalQuery when searching. Use TaskTypeSemanticSimilarity when comparing texts directly.

type GCSStore

type GCSStore struct {
	// contains filtered or unexported fields
}

GCSStore implements Store by persisting embedding records to Google Cloud Storage. Each datapoint is written as a JSON file at {prefix}/{id}.json.

GCSStore is designed for durability, not search. Pair it with MatchingEngineStore via MultiStore for both durable persistence and real-time search.

func NewGCSStore

func NewGCSStore(ctx context.Context, bucket, prefix string, opts ...option.ClientOption) (*GCSStore, error)

NewGCSStore creates a store that persists records to GCS.

Records are written to gs://{bucket}/{prefix}/{id}.json. The prefix is optional — pass "" to write to the bucket root.

func (*GCSStore) Close

func (s *GCSStore) Close() error

Close releases the storage client.

func (*GCSStore) Upsert

func (s *GCSStore) Upsert(ctx context.Context, id string, vector []float32, metadata map[string]string, opts ...UpsertOption) error

Upsert writes a record to GCS. The source text used to generate the embedding should be passed in metadata under MetadataKeySourceText so it can be re-embedded when upgrading models. If not present, the vector is still stored but re-embedding will require the original source data from elsewhere.

Restricts attached via WithRestricts are persisted in the JSON record so re-index scripts can re-attach them when re-upserting into Matching Engine.

type MatchingEngineRetriever

type MatchingEngineRetriever struct {
	// contains filtered or unexported fields
}

MatchingEngineRetriever implements Retriever using the Vertex AI MatchClient gRPC API. It connects directly to the index endpoint's public domain for FindNeighbors calls.

func NewMatchingEngineRetriever

func NewMatchingEngineRetriever(ctx context.Context, project, location, indexEndpointID, deployedIndexID, publicDomainName string) (*MatchingEngineRetriever, error)

NewMatchingEngineRetriever creates a retriever backed by a deployed Vertex AI Matching Engine index using gRPC.

Parameters:

  • project, location: GCP project and region
  • indexEndpointID: the index endpoint resource ID
  • deployedIndexID: the ID of the deployed index within the endpoint
  • publicDomainName: the public endpoint domain (e.g., "1234.us-central1-5678.vdb.vertexai.goog"). Required for public endpoints. For private (VPC) endpoints, pass "".

func (*MatchingEngineRetriever) Close

func (r *MatchingEngineRetriever) Close() error

Close releases the gRPC connection.

func (*MatchingEngineRetriever) Search

func (r *MatchingEngineRetriever) Search(ctx context.Context, query []float32, opts SearchOptions) ([]Result, error)

Search finds vectors similar to the query vector using gRPC FindNeighbors.

type MatchingEngineStore

type MatchingEngineStore struct {
	// contains filtered or unexported fields
}

MatchingEngineStore implements Store using Vertex AI Matching Engine with stream updates (real-time upsert, no batch import needed).

func NewMatchingEngineStore

func NewMatchingEngineStore(ctx context.Context, location, indexName string) (*MatchingEngineStore, error)

NewMatchingEngineStore creates a store backed by Vertex AI Matching Engine.

The indexName should be the full resource name: projects/{project}/locations/{location}/indexes/{index}

func (*MatchingEngineStore) Close

func (s *MatchingEngineStore) Close() error

Close releases the index client connection.

func (*MatchingEngineStore) Upsert

func (s *MatchingEngineStore) Upsert(ctx context.Context, id string, vector []float32, metadata map[string]string, opts ...UpsertOption) error

Upsert inserts or updates a datapoint with its embedding and metadata.

type MultiStore

type MultiStore struct {
	// contains filtered or unexported fields
}

MultiStore writes to multiple Store backends simultaneously. Use this to write to both GCS (durability) and Matching Engine (search).

func NewMultiStore

func NewMultiStore(stores ...Store) *MultiStore

NewMultiStore creates a store that fans out writes to all provided stores. At least one store must be provided; panics if called with zero stores.

Example
package main

import (
	"fmt"
)

func main() {
	// MultiStore fans out writes to multiple Store backends.
	// Typical usage: GCSStore (durability) + MatchingEngineStore (search).
	//
	//   gcsStore, _ := rag.NewGCSStore(ctx, "my-bucket", "prefix")
	//   meStore, _ := rag.NewMatchingEngineStore(ctx, "us-central1", indexName)
	//   store := rag.NewMultiStore(gcsStore, meStore)
	//
	// All stores are attempted regardless of individual failures.
	// Errors are collected via errors.Join.
	fmt.Println("MultiStore fans out writes to all stores")
}
Output:
MultiStore fans out writes to all stores

func (*MultiStore) Close

func (m *MultiStore) Close() error

Close closes all stores, collecting errors from each.

func (*MultiStore) Upsert

func (m *MultiStore) Upsert(ctx context.Context, id string, vector []float32, metadata map[string]string, opts ...UpsertOption) error

Upsert writes to all stores. All stores are attempted regardless of individual failures; errors are collected and returned via errors.Join. Options (e.g. WithRestricts) are forwarded to every backing store, so each one decides how to honour them — Matching Engine attaches them as query-filterable restricts; GCS persists them in the JSON record.

type Result

type Result struct {
	// ID is the unique identifier of the matched datapoint.
	ID string

	// Distance is the cosine distance from the query vector.
	// Lower values indicate higher similarity (0 = identical, 2 = opposite).
	Distance float64

	// Metadata contains the key-value pairs stored with the datapoint.
	Metadata map[string]string
}

Result represents a single vector search result.

type Retriever

type Retriever interface {
	// Search finds the nearest neighbors to the query vector.
	// Results are ordered by distance (most similar first).
	// Returns an empty (non-nil) slice when no results match.
	Search(ctx context.Context, query []float32, opts SearchOptions) ([]Result, error)

	// Close releases resources held by the retriever.
	Close() error
}

Retriever searches a vector index for similar embeddings.

type SearchOptions

type SearchOptions struct {
	// TopK is the maximum number of results to return.
	// Defaults to DefaultTopK (5) when zero.
	TopK int

	// DistanceThreshold is the maximum cosine distance for results.
	// Lower values mean stricter matching (higher similarity required).
	//
	// When zero (the default), no threshold filtering is applied and all
	// TopK results are returned. This lets you examine raw distances to
	// calibrate the right threshold for your corpus.
	//
	// Set to a positive value (e.g., 0.4) to filter out results with
	// distance greater than that value.
	//
	// Examples:
	//   SearchOptions{TopK: 5}                       // no filtering, return all 5
	//   SearchOptions{TopK: 10, DistanceThreshold: 0.3} // strict: only very similar
	//   SearchOptions{TopK: 10, DistanceThreshold: 0.6} // moderate: related content
	DistanceThreshold float64

	// Restricts narrows results to datapoints whose stored restricts overlap
	// the supplied allow lists. The map is keyed by namespace; each value
	// is the set of allowed values for that namespace. Restricts are
	// AND-ed across namespaces (a result must match every namespace) and
	// OR-ed within a namespace (any value matches).
	//
	// Datapoints carry restricts via WithRestricts at write time. A
	// datapoint with no value for a queried namespace is excluded — there
	// is no implicit "untagged" group.
	//
	// Use restricts to partition a single index into logical sub-corpora
	// (e.g., by tenant, language, or domain) without standing up a
	// separate index per partition. Leave nil/empty to search across the
	// whole corpus.
	//
	// Example — return only fixes from the "package-build" domain:
	//
	//   SearchOptions{
	//     TopK: 5,
	//     Restricts: map[string][]string{"domain": {"package-build"}},
	//   }
	Restricts map[string][]string
}

SearchOptions configures a vector similarity search.

Distance threshold

DistanceThreshold controls which results are returned based on their cosine distance from the query vector. Cosine distance ranges from 0 (identical) to 2 (opposite), with lower values indicating higher similarity.

By default, no threshold filtering is applied — all TopK results are returned regardless of distance. This is intentional: the right threshold depends entirely on your corpus, embedding model, and use case. We strongly recommend examining your result distances before setting a threshold.

To find the right threshold for your corpus:

  1. Run searches with no threshold (the default) and examine the distances
  2. Note the distance range for results you consider "good" vs "irrelevant"
  3. Set DistanceThreshold to a value that separates the two groups

Typical ranges (these vary by corpus — always verify with your own data):

  • 0.0–0.3: Very similar (near-duplicate content, same error with minor variations)
  • 0.3–0.5: Moderately similar (same category of problem, related issues)
  • 0.5–0.8: Loosely related (same domain but different specifics)
  • 0.8+: Weak or no meaningful similarity
Example
package main

import (
	"fmt"

	"chainguard.dev/driftlessaf/agents/rag"
)

func main() {
	// Default: TopK=5, no distance filtering. All results returned so you
	// can examine distances and calibrate a threshold for your corpus.
	opts := rag.SearchOptions{}
	fmt.Println("default TopK:", opts.TopK, "-> defaults to", rag.DefaultTopK)

	// After examining your results, set a threshold. Typical ranges:
	//   0.0–0.3: very similar (near-duplicates)
	//   0.3–0.5: moderately similar (same category)
	//   0.5–0.8: loosely related
	strict := rag.SearchOptions{TopK: 10, DistanceThreshold: 0.3}
	fmt.Printf("strict: TopK=%d, threshold=%.1f\n", strict.TopK, strict.DistanceThreshold)

	// More permissive — include related but not identical content.
	moderate := rag.SearchOptions{TopK: 10, DistanceThreshold: 0.6}
	fmt.Printf("moderate: TopK=%d, threshold=%.1f\n", moderate.TopK, moderate.DistanceThreshold)

}
Output:
default TopK: 0 -> defaults to 5
strict: TopK=10, threshold=0.3
moderate: TopK=10, threshold=0.6

type Store

type Store interface {
	// Upsert inserts or updates a datapoint in the vector index.
	// The id must be non-empty and unique within the index. Metadata values
	// are stored alongside the vector for retrieval; pass UpsertOption
	// values (e.g. WithRestricts) to attach index-level restrict tags
	// usable by SearchOptions.Restricts.
	Upsert(ctx context.Context, id string, vector []float32, metadata map[string]string, opts ...UpsertOption) error

	// Close releases resources held by the store.
	Close() error
}

Store persists vector embeddings with metadata for later retrieval. Store is write-only by design; reads go through Retriever.

type TaskType

type TaskType string

TaskType defines the purpose for which an embedding is optimized. Different task types affect how vectors are positioned in the embedding space.

const (
	// TaskTypeSemanticSimilarity optimizes embeddings for measuring
	// semantic similarity between texts.
	TaskTypeSemanticSimilarity TaskType = "SEMANTIC_SIMILARITY"

	// TaskTypeRetrievalDocument optimizes embeddings for document representation
	// in retrieval systems. Use this when storing documents; pair with
	// TaskTypeRetrievalQuery when searching.
	TaskTypeRetrievalDocument TaskType = "RETRIEVAL_DOCUMENT"

	// TaskTypeRetrievalQuery optimizes embeddings for query representation
	// in retrieval systems (counterpart to TaskTypeRetrievalDocument).
	TaskTypeRetrievalQuery TaskType = "RETRIEVAL_QUERY"

	// TaskTypeQuestionAnswering optimizes embeddings for matching queries
	// with relevant answers.
	TaskTypeQuestionAnswering TaskType = "QUESTION_ANSWERING"

	// TaskTypeClassification optimizes embeddings for text classification.
	TaskTypeClassification TaskType = "CLASSIFICATION"

	// TaskTypeClustering optimizes embeddings for grouping similar texts.
	TaskTypeClustering TaskType = "CLUSTERING"
)

type UpsertOption added in v0.7.0

type UpsertOption func(*upsertConfig)

UpsertOption configures a write to a Store. Options are applied in order; later options override earlier ones for the same field.

func WithRestricts added in v0.7.0

func WithRestricts(restricts map[string][]string) UpsertOption

WithRestricts attaches restrict tags to a datapoint at write time. The datapoint becomes selectable via SearchOptions.Restricts using matching namespace/value pairs.

The restricts map is keyed by namespace; each value is the set of allow values stored for that namespace on this datapoint. A query whose SearchOptions.Restricts intersects on every queried namespace will match this datapoint.

Pass nil or an empty map to write a datapoint with no restricts (the pre-existing default — searchable only by queries that don't restrict).

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL