Index
* ImageClassificationPredictionParams (message)
* ImageObjectDetectionPredictionParams (message)
* ImageSegmentationPredictionParams (message)
* TextEmbeddingPredictionParams (message)
* VideoActionRecognitionPredictionParams (message)
* VideoClassificationPredictionParams (message)
* VideoGenerationModelParams (message)
* VideoObjectTrackingPredictionParams (message)
* VisionEmbeddingModelParams (message)
ImageClassificationPredictionParams
Prediction model parameters for Image Classification.
| Fields | |
|---|---|
| confidence_threshold | The Model only returns predictions with at least this confidence score. Default value is 0.0. |
| max_predictions | The Model returns at most this many predictions per instance, taking the top ones by confidence score. If this number is very high, the Model may return fewer predictions. Default value is 10. |
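The interaction of the two fields can be illustrated locally. The following is a minimal sketch of the documented semantics, not the service's implementation; `select_predictions` and the sample labels are hypothetical names chosen for illustration.

```python
from typing import List, Tuple


def select_predictions(
    scored: List[Tuple[str, float]],
    confidence_threshold: float = 0.0,
    max_predictions: int = 10,
) -> List[Tuple[str, float]]:
    """Keep predictions at or above the threshold, then the top N by score."""
    kept = [(label, score) for label, score in scored if score >= confidence_threshold]
    # Highest confidence first, so the slice keeps the top predictions.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:max_predictions]


# Hypothetical scores for a single instance.
scores = [("cat", 0.91), ("dog", 0.40), ("fox", 0.07), ("owl", 0.55)]
top = select_predictions(scores, confidence_threshold=0.3, max_predictions=2)
print(top)  # [('cat', 0.91), ('owl', 0.55)]
```

With the default values (0.0 and 10), all four sample predictions would be returned.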
ImageObjectDetectionPredictionParams
Prediction model parameters for Image Object Detection.
| Fields | |
|---|---|
| confidence_threshold | The Model only returns predictions with at least this confidence score. Default value is 0.0. |
| max_predictions | The Model returns at most this many predictions per instance, taking the top ones by confidence score. Note that the number of returned predictions is also limited by the metadata's predictionsLimit. Default value is 10. |
ImageSegmentationPredictionParams
Prediction model parameters for Image Segmentation.
| Fields | |
|---|---|
| confidence_threshold | When the model predicts the category of each pixel in the image, it only provides predictions for pixels whose confidence is at least this value. All other pixels are classified as background. Default value is 0.5. |
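As a rough local illustration of this thresholding rule, the sketch below relabels low-confidence pixels as background. It assumes per-pixel categories and confidences are available as nested lists; `apply_pixel_threshold` and the `"background"` label string are hypothetical and not part of the API.

```python
from typing import List

BACKGROUND = "background"  # Assumed label for pixels below the threshold.


def apply_pixel_threshold(
    categories: List[List[str]],
    confidences: List[List[float]],
    confidence_threshold: float = 0.5,
) -> List[List[str]]:
    """Keep a pixel's category only if its confidence meets the threshold."""
    return [
        [
            category if confidence >= confidence_threshold else BACKGROUND
            for category, confidence in zip(category_row, confidence_row)
        ]
        for category_row, confidence_row in zip(categories, confidences)
    ]


# A hypothetical 2x2 image.
categories = [["road", "car"], ["sky", "car"]]
confidences = [[0.9, 0.45], [0.5, 0.7]]
mask = apply_pixel_threshold(categories, confidences)
print(mask)  # [['road', 'background'], ['sky', 'car']]
```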
TextEmbeddingPredictionParams
Prediction model parameters for Text Embedding. Text embeddings are numerical representations of text that capture semantic meaning, used for tasks like semantic search, classification, and clustering.
| Fields | |
|---|---|
| auto_truncate | Optional. Whether to silently truncate inputs longer than the maximum input token limit. This behavior is enabled by default. If this option is set to false, inputs longer than the limit will cause an INVALID_ARGUMENT error. |
| output_dimensionality | Parameter to reduce the dimensionality of the output embedding. Some models support this feature, which can reduce storage and computation costs. If you specify this parameter, you must use a value supported by the model. If the model does not support it, or if you specify an unsupported dimension, the request will fail with an |
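The effect of `auto_truncate` can be sketched locally as follows. This is only an illustration under stated assumptions: the token limit of 5 is made up, the sketch assumes the leading tokens are kept (the reference does not say which end is truncated), and a Python `ValueError` stands in for the service's INVALID_ARGUMENT error. `prepare_tokens` is a hypothetical helper.

```python
from typing import List


def prepare_tokens(
    tokens: List[str],
    max_input_tokens: int,
    auto_truncate: bool = True,
) -> List[str]:
    """Mimic the documented auto_truncate behavior on a token list."""
    if len(tokens) <= max_input_tokens:
        return list(tokens)
    if auto_truncate:
        # Assumption: the leading tokens are kept and the tail is dropped.
        return list(tokens[:max_input_tokens])
    # Stand-in for the INVALID_ARGUMENT error returned by the service.
    raise ValueError("INVALID_ARGUMENT: input exceeds the maximum input token limit")


tokens = ["the", "quick", "brown", "fox", "jumps", "over", "the", "dog"]
truncated = prepare_tokens(tokens, max_input_tokens=5)
print(truncated)  # ['the', 'quick', 'brown', 'fox', 'jumps']

try:
    prepare_tokens(tokens, max_input_tokens=5, auto_truncate=False)
    rejected = False
except ValueError:
    rejected = True
```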
VideoActionRecognitionPredictionParams
Prediction model parameters for Video Action Recognition.
| Fields | |
|---|---|
| confidence_threshold | The Model only returns predictions with at least this confidence score. Default value is 0.0. |
| max_predictions | The Model returns at most this many predictions per frame of the video, taking the top ones by confidence score. If this number is very high, the Model may return fewer predictions per frame. Default value is 50. |
VideoClassificationPredictionParams
Prediction model parameters for Video Classification.
| Fields | |
|---|---|
| confidence_threshold | The Model only returns predictions with at least this confidence score. Default value is 0.0. |
| max_predictions | The Model returns at most this many predictions per instance, taking the top ones by confidence score. If this number is very high, the Model may return fewer predictions. Default value is 10,000. |
| segment_classification | Set to true to request segment-level classification. Agent Platform returns labels and their confidence scores for the entire time segment of the video that the user specified in the input instance. Default value is true. |
| shot_classification | Set to true to request shot-level classification. Agent Platform determines the boundaries for each camera shot in the entire time segment of the video that the user specified in the input instance. Agent Platform then returns labels and their confidence scores for each detected shot, along with the start and end time of the shot. WARNING: Model evaluation is not done for this classification type. Its quality depends on the training data, and no metrics are provided to describe that quality. Default value is false. |
| one_sec_interval_classification | Set to true to request classification for a video at one-second intervals. Agent Platform returns labels and their confidence scores for each second of the entire time segment of the video that the user specified in the input instance. WARNING: Model evaluation is not done for this classification type. Its quality depends on the training data, and no metrics are provided to describe that quality. Default value is false. |
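A parameters object carrying the documented defaults might be assembled as in the sketch below. `video_classification_params` is a hypothetical helper, and the snake_case keys simply mirror the field names in this reference; the exact key casing expected in a JSON request is an assumption to verify against the request format you use.

```python
import json
from typing import Any, Dict


def video_classification_params(
    confidence_threshold: float = 0.0,
    max_predictions: int = 10000,
    segment_classification: bool = True,
    shot_classification: bool = False,
    one_sec_interval_classification: bool = False,
) -> Dict[str, Any]:
    """Build a parameters dict; defaults match the values documented above."""
    return {
        "confidence_threshold": confidence_threshold,
        "max_predictions": max_predictions,
        "segment_classification": segment_classification,
        "shot_classification": shot_classification,
        "one_sec_interval_classification": one_sec_interval_classification,
    }


# Request shot-level labels in addition to the default segment-level labels.
params = video_classification_params(confidence_threshold=0.6, shot_classification=True)
print(json.dumps(params, indent=2))
```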
VideoGenerationModelParams
| Fields | |
|---|---|
| sample_count | The number of videos to generate. If not specified, 1 video is generated. |
| storage_uri | The Google Cloud Storage URI for saving the generated videos. The URI must start with |
| fps | The frame rate of the generated videos in frames per second (fps). This value can affect the smoothness of motion in the video. If not specified, a default value appropriate for the model is used. |
| duration_seconds | The target duration of the generated videos in seconds. The actual duration of the generated videos may vary slightly. If not specified, a default value appropriate for the model is used. |
| seed | Seed for random number generation. Providing the same seed with the same input parameters will produce consistent video generation results. If not specified, a random seed is used, resulting in different videos each time. If |
| aspect_ratio | The aspect ratio of the generated videos. Supported values: * |
| resolution | The resolution of the generated videos. Supported values: * |
| person_generation | Controls whether videos of people can be generated, based on age appearance. Supported values: * |
| pubsub_topic | The Cloud Pub/Sub topic to publish video generation progress to. If this field is specified, messages are published to the topic detailing the progress of video generation. The topic must be in the format |
| negative_prompt | Things that shouldn't appear in the generated videos. For example: "low quality", "ugly", "deformed". |
| enable_prompt_rewriting | Deprecated: This field is deprecated and has no effect. Use |
| enhance_prompt | Whether to automatically enhance the prompt before generating videos. If true, the prompt is improved to generate higher quality videos. If prompt enhancement is enabled, providing a |
| generate_audio | Whether to generate audio along with the video. If true, an audio track is generated for the videos. Defaults to true. |
| compression_quality | The compression quality of the generated videos. A lower quality might result in a smaller file size, while a higher quality might result in a better-looking video. Supported values: * |
| task | The task to perform. If not specified, the task is inferred from other input fields. Supported values: * |
| resize_mode | The resize mode for the generated videos. Supported values: * |
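Because most of these fields fall back to a default when omitted, one reasonable client-side pattern is to send only the fields that were explicitly set. The sketch below assumes that approach; `build_video_generation_params` is a hypothetical helper, the field names are taken from the table above, and the example values are illustrative only.

```python
from typing import Any, Dict

# Field names as listed in this reference.
DOCUMENTED_FIELDS = {
    "sample_count", "storage_uri", "fps", "duration_seconds", "seed",
    "aspect_ratio", "resolution", "person_generation", "pubsub_topic",
    "negative_prompt", "enhance_prompt", "generate_audio",
    "compression_quality", "task", "resize_mode",
}
# Documented as deprecated and having no effect, so it is dropped here.
DEPRECATED_FIELDS = {"enable_prompt_rewriting"}


def build_video_generation_params(**fields: Any) -> Dict[str, Any]:
    """Return a dict of explicitly set fields, rejecting unknown names."""
    params: Dict[str, Any] = {}
    for name, value in fields.items():
        if name in DEPRECATED_FIELDS:
            continue
        if name not in DOCUMENTED_FIELDS:
            raise ValueError(f"Unknown parameter: {name}")
        if value is not None:
            # Unset (None) fields are omitted so the documented defaults apply.
            params[name] = value
    return params


gen_params = build_video_generation_params(
    sample_count=2,
    seed=1234,
    fps=None,  # Omitted from the result.
    negative_prompt="low quality",
    generate_audio=False,
    enable_prompt_rewriting=True,  # Deprecated, silently dropped.
)
print(gen_params)
```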
VideoObjectTrackingPredictionParams
Prediction model parameters for Video Object Tracking.
| Fields | |
|---|---|
| confidence_threshold | The Model only returns predictions with at least this confidence score. Default value is 0.0. |
| max_predictions | The Model returns at most this many predictions per frame of the video, taking the top ones by confidence score. If this number is very high, the Model may return fewer predictions per frame. Default value is 50. |
| min_bounding_box_size | Only bounding boxes whose shortest edge is at least this long, expressed relative to the video frame size, are returned. Default value is 0.0. |
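The size filter can be pictured with a small local sketch. It assumes bounding boxes use coordinates normalized to the range 0 to 1 relative to the frame, which is one reading of "relative value of video frame size"; the dictionary keys and `filter_small_boxes` are hypothetical names, not the response schema.

```python
from typing import Dict, List, Union

Box = Dict[str, Union[str, float]]


def filter_small_boxes(boxes: List[Box], min_bounding_box_size: float = 0.0) -> List[Box]:
    """Keep boxes whose shortest edge is at least min_bounding_box_size."""
    kept = []
    for box in boxes:
        width = box["x_max"] - box["x_min"]
        height = box["y_max"] - box["y_min"]
        if min(width, height) >= min_bounding_box_size:
            kept.append(box)
    return kept


# Hypothetical detections in one frame, in normalized coordinates.
boxes = [
    {"id": "a", "x_min": 0.1, "x_max": 0.5, "y_min": 0.2, "y_max": 0.8},
    {"id": "b", "x_min": 0.40, "x_max": 0.43, "y_min": 0.1, "y_max": 0.9},
]
large_enough = filter_small_boxes(boxes, min_bounding_box_size=0.05)
print([box["id"] for box in large_enough])  # ['a']
```

Box `b` is dropped because its width (about 0.03 of the frame) is below the 0.05 minimum, even though it is tall.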
VisionEmbeddingModelParams
Parameter format for the large vision model embedding API.
This type has no fields.