Index
- ImageClassificationPredictionInstance (message)
- ImageObjectDetectionPredictionInstance (message)
- ImageSegmentationPredictionInstance (message)
- TextClassificationPredictionInstance (message)
- TextEmbeddingPredictionInstance (message)
- TextEmbeddingPredictionInstance.TaskType (enum)
- TextExtractionPredictionInstance (message)
- TextSentimentPredictionInstance (message)
- VideoActionRecognitionPredictionInstance (message)
- VideoClassificationPredictionInstance (message)
- VideoGenerationModelInstance (message)
- VideoGenerationModelInstance.Image (message)
- VideoGenerationModelInstance.Mask (message)
- VideoGenerationModelInstance.ReferenceImage (message)
- VideoGenerationModelInstance.Video (message)
- VideoObjectTrackingPredictionInstance (message)
- VisionEmbeddingModelInstance (message)
- VisionEmbeddingModelInstance.Image (message)
- VisionEmbeddingModelInstance.Video (message)
- VisionEmbeddingModelInstance.Video.VideoSegmentConfig (message)
ImageClassificationPredictionInstance
Prediction input format for Image Classification.
| Fields | |
|---|---|
| content | The image bytes or Cloud Storage URI to make the prediction on. |
| mime_type | The MIME type of the image content. Only the following MIME types are supported: image/jpeg, image/gif, image/png, image/webp, image/bmp, image/tiff, image/vnd.microsoft.icon. |
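As an illustration of the two fields above, the following is a minimal sketch of building one instance from raw image bytes. The helper name `build_image_instance` is hypothetical, and the sketch assumes that raw bytes are base64-encoded before being placed in `content` for JSON transport and that the proto field names are used as JSON keys; neither is stated in this reference.

```python
import base64
import json

# MIME types listed for ImageClassificationPredictionInstance.
SUPPORTED_IMAGE_MIME_TYPES = {
    "image/jpeg", "image/gif", "image/png", "image/webp",
    "image/bmp", "image/tiff", "image/vnd.microsoft.icon",
}


def build_image_instance(image_bytes: bytes, mime_type: str) -> dict:
    """Hypothetical helper: return a dict with `content` and `mime_type`."""
    if mime_type not in SUPPORTED_IMAGE_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    # Assumption: bytes are base64-encoded so they can travel inside JSON.
    content = base64.b64encode(image_bytes).decode("ascii")
    return {"content": content, "mime_type": mime_type}


if __name__ == "__main__":
    instance = build_image_instance(b"\x89PNG\r\n\x1a\n", "image/png")
    print(json.dumps(instance))
```

Rejecting unsupported MIME types on the client side gives an earlier, clearer error than waiting for the service to reject the request.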
ImageObjectDetectionPredictionInstance
Prediction input format for Image Object Detection.
| Fields | |
|---|---|
| content | The image bytes or Cloud Storage URI to make the prediction on. |
| mime_type | The MIME type of the image content. Only the following MIME types are supported: image/jpeg, image/gif, image/png, image/webp, image/bmp, image/tiff, image/vnd.microsoft.icon. |
ImageSegmentationPredictionInstance
Prediction input format for Image Segmentation.
| Fields | |
|---|---|
| content | The image bytes to make the predictions on. |
| mime_type | The MIME type of the image content. Only the following MIME types are supported: image/jpeg, image/png. |
TextClassificationPredictionInstance
Prediction input format for Text Classification.
| Fields | |
|---|---|
| content | The text snippet to make the predictions on. |
| mime_type | The MIME type of the text snippet. Supported MIME type: text/plain. |
TextEmbeddingPredictionInstance
Prediction input format for Text Embedding. An embedding is a numerical representation (a vector of floating-point numbers) of text that captures its semantic meaning, enabling tasks like semantic search, classification, and clustering.
| Fields | |
|---|---|
| content | The text passage to generate the embedding for. This can be a phrase, sentence, or document. |
| title | The title for the text passage in the |
| task_type | The task that the generated embeddings will be used for. The model uses this hint to generate embeddings that are optimized for your use case. If not set, the default task type is used. |
TaskType
Represents the downstream task the embeddings will be used for. Specifying the task type helps the model generate embeddings that are optimized for your use case. New task types may be added in the future.
| Enums | |
|---|---|
| DEFAULT | Task type not specified. The model will use a default value suitable for semantic similarity and retrieval. |
| RETRIEVAL_QUERY | Specifies that the given text is a query in a search/retrieval setting. Use this task type for embeddings that will be used to search a corpus of documents. |
| RETRIEVAL_DOCUMENT | Specifies that the given text is a document from the corpus being searched. Use this task type for texts that are part of a corpus that will be searched. |
| SEMANTIC_SIMILARITY | Use this task type for Semantic Textual Similarity (STS). These embeddings will be compared with each other to measure semantic similarity. |
| CLASSIFICATION | Specifies that the given text will be classified. Use this task type to generate embeddings that will be used as input to a classification model (e.g., classifying text as spam or not spam). |
| CLUSTERING | Specifies that the embeddings will be used for clustering. Use this task type to generate embeddings that will be used to group similar texts together. |
| QUESTION_ANSWERING | Specifies that the embeddings will be used for question answering. Use this task type to generate embeddings for texts that are part of a question answering system. |
| FACT_VERIFICATION | Specifies that the embeddings will be used for fact verification. Use this task type to generate embeddings for texts that are part of a fact verification system. |
| CODE_RETRIEVAL_QUERY | Specifies that the embeddings will be used for code retrieval. Use this task type for embeddings that will be used to search a corpus of code. |
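The pairing of RETRIEVAL_QUERY and RETRIEVAL_DOCUMENT described above can be sketched as follows. This is a minimal illustration, not a client library: the `TaskType` class simply mirrors the enum names listed here, `build_text_embedding_instance` is a hypothetical helper, and passing the enum as its string name under the key `task_type` is an assumption about the JSON encoding.

```python
from enum import Enum


class TaskType(Enum):
    """Mirror of the TaskType enum names listed in this reference."""
    DEFAULT = "DEFAULT"
    RETRIEVAL_QUERY = "RETRIEVAL_QUERY"
    RETRIEVAL_DOCUMENT = "RETRIEVAL_DOCUMENT"
    SEMANTIC_SIMILARITY = "SEMANTIC_SIMILARITY"
    CLASSIFICATION = "CLASSIFICATION"
    CLUSTERING = "CLUSTERING"
    QUESTION_ANSWERING = "QUESTION_ANSWERING"
    FACT_VERIFICATION = "FACT_VERIFICATION"
    CODE_RETRIEVAL_QUERY = "CODE_RETRIEVAL_QUERY"


def build_text_embedding_instance(
    content: str, task_type: TaskType = TaskType.DEFAULT
) -> dict:
    """Hypothetical helper: build one text embedding instance as a dict."""
    if not content:
        raise ValueError("content must be a non-empty string")
    # Assumption: the enum is sent as its string name.
    return {"content": content, "task_type": task_type.value}


if __name__ == "__main__":
    # Documents in the corpus and the user's query get different task types.
    doc = build_text_embedding_instance(
        "To reset your password, open Settings and choose Security.",
        TaskType.RETRIEVAL_DOCUMENT,
    )
    query = build_text_embedding_instance(
        "how do I reset my password?", TaskType.RETRIEVAL_QUERY
    )
    print(doc)
    print(query)
```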
TextExtractionPredictionInstance
Prediction input format for Text Extraction.
| Fields | |
|---|---|
| content | The text snippet to make the predictions on. |
| mime_type | The MIME type of the text snippet. Supported MIME type: text/plain. |
| key | This field is only used for batch prediction. If a key is provided, the batch prediction result will be mapped to this key. If omitted, the batch prediction result will contain the entire input instance. Agent Platform does not check whether keys in the request are duplicated, so it is up to the caller to ensure that the keys are unique. |
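Because duplicate keys are not checked by the service, a caller may want to check them before submitting a batch. The sketch below shows one way to do that; `find_duplicate_keys` is a hypothetical helper, and the instances are assumed to be plain dicts that use the field names from the table above.

```python
from collections import Counter


def find_duplicate_keys(instances: list) -> list:
    """Hypothetical helper: return the sorted keys that occur more than once.

    Instances without a `key` field are ignored, since for those the batch
    result contains the entire input instance instead.
    """
    counts = Counter(inst["key"] for inst in instances if "key" in inst)
    return sorted(key for key, count in counts.items() if count > 1)


if __name__ == "__main__":
    batch = [
        {"content": "First note.", "mime_type": "text/plain", "key": "note-1"},
        {"content": "Second note.", "mime_type": "text/plain", "key": "note-2"},
        {"content": "Third note.", "mime_type": "text/plain", "key": "note-1"},
        {"content": "No key here.", "mime_type": "text/plain"},
    ]
    duplicates = find_duplicate_keys(batch)
    if duplicates:
        print("duplicate keys:", duplicates)
```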
TextSentimentPredictionInstance
Prediction input format for Text Sentiment.
| Fields | |
|---|---|
| content | The text snippet to make the predictions on. |
| mime_type | The MIME type of the text snippet. Supported MIME type: text/plain. |
VideoActionRecognitionPredictionInstance
Prediction input format for Video Action Recognition.
| Fields | |
|---|---|
| content | The Google Cloud Storage location of the video on which to perform the prediction. |
| mime_type | The MIME type of the video content. Only the following are supported: video/mp4, video/avi, video/quicktime. |
| time_segment_start | The beginning, inclusive, of the video's time segment on which to perform the prediction. Expressed as a number of seconds measured from the start of the video, with "s" appended at the end. Fractions are allowed, up to microsecond precision. |
| time_segment_end | The end, exclusive, of the video's time segment on which to perform the prediction. Expressed as a number of seconds measured from the start of the video, with "s" appended at the end. Fractions are allowed, up to microsecond precision. "inf" or "Infinity" is also allowed and means the end of the video. |
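The time segment format described above (seconds with a trailing "s", at most microsecond precision, "inf" for the end of the video) can be sketched as follows. Both function names are hypothetical. The sketch assumes that "inf" is written without a trailing "s" and that the start must be strictly less than the end; the text above does not spell out either point.

```python
import math


def format_time_offset(seconds: float) -> str:
    """Format an offset as seconds with "s" appended, e.g. 12.5 -> "12.5s"."""
    if math.isnan(seconds) or seconds < 0:
        raise ValueError("offset must be a non-negative number")
    if math.isinf(seconds):
        # Assumption: infinity is sent as the bare string "inf".
        return "inf"
    # Round to microsecond precision, then drop trailing zeros.
    text = f"{seconds:.6f}".rstrip("0").rstrip(".")
    return f"{text}s"


def build_video_instance(
    gcs_uri: str, mime_type: str, start: float = 0.0, end: float = math.inf
) -> dict:
    """Hypothetical helper: build a video prediction instance as a dict."""
    if mime_type not in {"video/mp4", "video/avi", "video/quicktime"}:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    # Assumption: the segment [start, end) must be non-empty.
    if not start < end:
        raise ValueError("time_segment_start must be before time_segment_end")
    return {
        "content": gcs_uri,
        "mime_type": mime_type,
        "time_segment_start": format_time_offset(start),
        "time_segment_end": format_time_offset(end),
    }


if __name__ == "__main__":
    # The bucket and object names are placeholders.
    print(build_video_instance("gs://my-bucket/clip.mp4", "video/mp4", 5, 20.25))
```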
VideoClassificationPredictionInstance
Prediction input format for Video Classification.
| Fields | |
|---|---|
| content | The Google Cloud Storage location of the video on which to perform the prediction. |
| mime_type | The MIME type of the video content. Only the following are supported: video/mp4, video/avi, video/quicktime. |
| time_segment_start | The beginning, inclusive, of the video's time segment on which to perform the prediction. Expressed as a number of seconds measured from the start of the video, with "s" appended at the end. Fractions are allowed, up to microsecond precision. |
| time_segment_end | The end, exclusive, of the video's time segment on which to perform the prediction. Expressed as a number of seconds measured from the start of the video, with "s" appended at the end. Fractions are allowed, up to microsecond precision. "inf" or "Infinity" is also allowed and means the end of the video. |
VideoGenerationModelInstance
An instance for a video generation prediction request.
| Fields | |
|---|---|
| prompt | A text description of the video you want to generate. The prompt should specify the subject, style, and any specific elements or actions that should appear in the video. |
| image | An optional image to use as a starting point for video generation, used as the first frame of the generated video. If |
| video | An optional input video to use as a starting point for video generation. If |
| last_frame | Image to use as the last frame of the generated video. This field can only be used if |
| camera_control | The camera motion to apply to the generated video. This field can only be used if |
| mask | An optional mask to apply to the input |
| reference_images[] | Optional reference images to guide video generation. If |
Image
Defines the input image format.
| Fields | |
|---|---|
| mime_type | The MIME type of the image. Supported MIME types: image/jpeg, image/png. |
| Union field data | The image data. The image can be provided as either base64-encoded bytes or a Google Cloud Storage URI. data can be only one of the following: |
| bytes_base64_encoded | The image bytes encoded in base64. |
| gcs_uri | A Google Cloud Storage URI pointing to the image file. |
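The union field rule above (exactly one of `bytes_base64_encoded` or `gcs_uri`) can be enforced on the client before a request is sent. The sketch below is illustrative only: `build_generation_image` is a hypothetical helper, and using the proto field names directly as dict keys is an assumption about the JSON encoding.

```python
import base64
from typing import Optional


def build_generation_image(
    mime_type: str,
    *,
    image_bytes: Optional[bytes] = None,
    gcs_uri: Optional[str] = None,
) -> dict:
    """Hypothetical helper: build a VideoGenerationModelInstance.Image dict."""
    if mime_type not in {"image/jpeg", "image/png"}:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    # The union field `data` allows exactly one of the two sources.
    if (image_bytes is None) == (gcs_uri is None):
        raise ValueError("provide exactly one of image_bytes or gcs_uri")
    image = {"mime_type": mime_type}
    if image_bytes is not None:
        image["bytes_base64_encoded"] = base64.b64encode(image_bytes).decode("ascii")
    else:
        image["gcs_uri"] = gcs_uri
    return image


if __name__ == "__main__":
    # The bucket and object names are placeholders.
    first_frame = build_generation_image(
        "image/png", gcs_uri="gs://my-bucket/first-frame.png"
    )
    instance = {
        "prompt": "A paper boat drifting down a rain-soaked street.",
        "image": first_frame,
    }
    print(instance)
```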
Mask
Defines the input mask format for video editing. A mask specifies regions of a video or image to modify or preserve.
| Fields | |
|---|---|
| mime_type | The MIME type of the mask. Supported MIME types: image/png, image/jpeg, image/webp, video/mov, video/mpeg, video/mp4, video/mpg, video/avi, video/wmv, video/mpegps, video/flv. |
| mask_mode | Specifies how the mask is applied to the input video for editing. For |
| Union field data | The mask data. The mask can be provided as either base64-encoded bytes or a Google Cloud Storage URI. data can be only one of the following: |
| bytes_base64_encoded | The mask bytes encoded in base64. |
| gcs_uri | A Google Cloud Storage URI pointing to the mask file. |
ReferenceImage
Defines the input reference image format. A reference image provides additional context to guide video generation, such as style or assets.
| Fields | |
|---|---|
| image | The image data for the reference image. |
| reference_type | The type of reference image, which defines how it influences video generation. Supported values: - |
Video
Defines the input video format.
| Fields | |
|---|---|
| mime_type | The MIME type of the video. Supported MIME types: video/mov, video/mpeg, video/mp4, video/mpg, video/avi, video/wmv, video/mpegps, video/flv. |
| Union field data | The video data. The video can be provided as either base64-encoded bytes or a Google Cloud Storage URI. data can be only one of the following: |
| gcs_uri | A Google Cloud Storage URI pointing to the video file. |
| bytes_base64_encoded | The video bytes encoded in base64. |
VideoObjectTrackingPredictionInstance
Prediction input format for Video Object Tracking.
| Fields | |
|---|---|
| content | The Google Cloud Storage location of the video on which to perform the prediction. |
| mime_type | The MIME type of the video content. Only the following are supported: video/mp4, video/avi, video/quicktime. |
| time_segment_start | The beginning, inclusive, of the video's time segment on which to perform the prediction. Expressed as a number of seconds measured from the start of the video, with "s" appended at the end. Fractions are allowed, up to microsecond precision. |
| time_segment_end | The end, exclusive, of the video's time segment on which to perform the prediction. Expressed as a number of seconds measured from the start of the video, with "s" appended at the end. Fractions are allowed, up to microsecond precision. "inf" or "Infinity" is also allowed and means the end of the video. |
VisionEmbeddingModelInstance
Input format for requesting embeddings from vision models. An embedding is a list of numbers that represents the semantic meaning of text, an image, or a video. Embeddings can be used for many applications, such as searching for similar images or generating recommendations. Each instance must specify exactly one of the text, image, or video fields.
| Fields | |
|---|---|
| image | An image to generate embeddings for. |
| text | Text to generate embeddings for. |
| video | A video to generate embeddings for. |
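The requirement that each instance sets exactly one of `text`, `image`, or `video` can be checked before sending a request. The following is a minimal sketch; `which_embedding_input` is a hypothetical helper and instances are assumed to be plain dicts keyed by the field names above.

```python
def which_embedding_input(instance: dict) -> str:
    """Hypothetical helper: return the single input field set on the instance.

    Raises ValueError when zero or more than one of text, image, and video
    is present.
    """
    present = [
        name for name in ("text", "image", "video")
        if instance.get(name) is not None
    ]
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one of text, image, video; got {present}"
        )
    return present[0]


if __name__ == "__main__":
    # The bucket and object names are placeholders.
    print(which_embedding_input({"text": "a red bicycle"}))
    print(which_embedding_input({"image": {"gcs_uri": "gs://my-bucket/bike.png"}}))
```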
Image
Represents an image input for embedding generation.
| Fields | |
|---|---|
| mime_type | The MIME type of the image. The supported MIME types are: |
| Union field | |
| bytes_base64_encoded | Base64-encoded bytes of the image. |
| gcs_uri | A Cloud Storage URI pointing to the image file. Format: |
Video
Represents a video input for embedding generation.
| Fields | |
|---|---|
| video_segment_config | Configuration for processing a video segment. If specified, embeddings are generated for the segment. If not specified, embeddings are generated for the entire video. |
| Union field | |
| bytes_base64_encoded | Base64-encoded bytes of the video. |
| gcs_uri | A Cloud Storage URI pointing to the video file. Format: |
VideoSegmentConfig
Configuration for processing a segment of a video.
| Fields | |
|---|---|
| start_offset_sec | The start offset of the video segment in seconds. |
| end_offset_sec | The end offset of the video segment in seconds. |
| interval_sec | The interval of the video for which the embedding will be generated. The minimum value for |
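A segment configuration can be assembled and sanity-checked as sketched below. `build_video_segment_config` is a hypothetical helper. The checks that the start offset is non-negative and that the end offset is greater than the start offset are assumptions made for illustration, and no minimum is enforced on `interval_sec` because its limit is not given in the text above.

```python
from typing import Optional


def build_video_segment_config(
    start_offset_sec: float,
    end_offset_sec: float,
    interval_sec: Optional[float] = None,
) -> dict:
    """Hypothetical helper: build a VideoSegmentConfig dict."""
    # Assumption: offsets are measured from the start of the video.
    if start_offset_sec < 0:
        raise ValueError("start_offset_sec must be non-negative")
    # Assumption: the segment must have positive length.
    if end_offset_sec <= start_offset_sec:
        raise ValueError("end_offset_sec must be greater than start_offset_sec")
    config = {
        "start_offset_sec": start_offset_sec,
        "end_offset_sec": end_offset_sec,
    }
    if interval_sec is not None:
        # Passed through unchanged; the allowed range is not checked here.
        config["interval_sec"] = interval_sec
    return config


if __name__ == "__main__":
    # The bucket and object names are placeholders.
    video = {
        "gcs_uri": "gs://my-bucket/clip.mp4",
        "video_segment_config": build_video_segment_config(0, 60, interval_sec=10),
    }
    print(video)
```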