How to integrate Google cloud vision MCP with Google ADK


Google cloud vision
API Key

Google Cloud Vision API adds advanced image analysis—like labeling, OCR, and detection—to apps. It helps you extract structured data and insights from images at scale.

29 Tools

Introduction

This guide walks you through connecting Google cloud vision to Google ADK using the Composio tool router. By the end, you'll have a working Google cloud vision agent that responds to natural language commands: it can bulk import product images from a GCS CSV file, list all Vision AI service locations, and create a new product for image recognition.

This guide will help you understand how to give your Google ADK agent real control over a Google cloud vision account through Composio's Google cloud vision MCP server.

Before we dive in, let's take a quick look at the key ideas and tools involved.


TL;DR

Here's what you'll learn:
  • Get a Google cloud vision account set up and connected to Composio
  • Install the Google ADK and Composio packages
  • Create a Composio Tool Router session for Google cloud vision
  • Build an agent that connects to Google cloud vision through MCP
  • Interact with Google cloud vision using natural language

What is Google ADK?

Google ADK (Agent Development Kit) is Google's framework for building AI agents powered by Gemini models. It provides tools for creating agents that can use external services through the Model Context Protocol (MCP).

Key features include:

  • Gemini Integration: Native support for Google's Gemini models
  • MCP Toolset: Built-in support for Model Context Protocol tools
  • Streamable HTTP: Connect to external services through streamable HTTP
  • CLI and Web UI: Run agents via command line or web interface
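To make the MCP Toolset and Streamable HTTP bullets more concrete: MCP messages are JSON-RPC 2.0 objects, and a client discovers a server's tools with a `tools/list` request. The sketch below only builds such a request body and sends nothing over the network. The helper name `build_tools_list_request` is hypothetical; in practice ADK's McpToolset performs this exchange for you.

```python
import json


def build_tools_list_request(request_id: int) -> str:
    """Serialize a JSON-RPC 2.0 request asking an MCP server for its tools."""
    message = {
        "jsonrpc": "2.0",        # MCP messages use JSON-RPC 2.0 framing
        "id": request_id,        # lets the client match the response to this request
        "method": "tools/list",  # MCP method for tool discovery
    }
    return json.dumps(message)


if __name__ == "__main__":
    print(build_tools_list_request(1))
```

With Streamable HTTP, a body like this is sent as an HTTP POST to the MCP endpoint URL, which is why the agent only needs a URL and headers later in this guide.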

What is the Google cloud vision MCP server, and what's possible with it?

The Google cloud vision MCP server is an implementation of the Model Context Protocol that connects AI agents and assistants such as Claude and Cursor directly to your Google Cloud Vision account. It provides structured and secure access to your image analysis resources, so your agent can perform actions like registering products, managing reference images, listing endpoints, and automating large-scale image operations on your behalf.

  • Product and reference image management: Easily create new products and add reference images for visual search, enabling your agent to organize and expand your vision datasets effortlessly.
  • Bulk import and product set operations: Let your agent import large numbers of reference images into product sets from Cloud Storage CSV files, streamlining dataset curation at scale.
  • Automated product cleanup and deletion: Direct your agent to purge unused or orphan products from your project, keeping your cloud resources tidy without manual effort.
  • Location and endpoint discovery: Quickly list available Vision AI service locations and existing IndexEndpoints, making it easy for your agent to select optimal regions and manage deployment targets.
  • Vision API operation tracking: Retrieve and review ongoing or past Vision API operations, so your agent can monitor processing jobs and ensure workflow transparency.
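The bulk import capability above reads a CSV file from Cloud Storage. As a rough illustration of preparing one, here is a minimal sketch that renders rows as CSV text. The column order and the header-less layout are assumptions based on the Product Search bulk-import format, so check them against the Vision documentation before uploading; the bucket, product set, and product IDs are hypothetical.

```python
import csv
import io

# Column order assumed from the Product Search bulk-import CSV layout.
COLUMNS = ["image-uri", "image-id", "product-set-id", "product-id",
           "product-category", "product-display-name", "labels", "bounding-poly"]


def build_import_csv(rows):
    """Render row dicts as header-less CSV text; missing optional fields stay empty."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow([row.get(column, "") for column in COLUMNS])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [{
        "image-uri": "gs://example-bucket/shoes/img001.jpg",  # hypothetical bucket
        "product-set-id": "demo-set",                         # hypothetical ID
        "product-id": "shoe-001",                             # hypothetical ID
        "product-category": "apparel-v2",
        "labels": "color=black,style=formal",  # csv quotes this field automatically
    }]
    print(build_import_csv(sample), end="")
```

Once the file is uploaded to a bucket, you could ask the agent to import it by giving the `gs://` path in a natural language prompt.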

What is the Composio tool router, and how does it fit here?

What is Composio SDK?

The Composio SDK helps agents find the right tools for a task at runtime. You can plug in multiple toolkits (like Gmail, HubSpot, and GitHub), and the agent will identify the relevant app and action to complete multi-step workflows. This can reduce token usage and improve the reliability of tool calls. Read more here: Getting started with Composio SDK

The tool router generates a secure MCP URL that your agents can access to perform actions.

How the Composio SDK works

The Composio SDK follows a three-phase workflow:

  1. Discovery: Searches for tools matching your task and returns relevant toolkits with their details.
  2. Authentication: Checks for active connections. If missing, creates an auth config and returns a connection URL via Auth Link.
  3. Execution: Executes the action using the authenticated connection.
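The three phases above can be pictured with a toy, in-memory model. This is not the Composio API: the class `ToyRouter`, its methods, and the auth URL are all hypothetical stand-ins that only show the order of operations (discover, check the connection, then execute).

```python
from dataclasses import dataclass, field


@dataclass
class ToyRouter:
    """In-memory stand-in for a tool router; not the Composio API."""
    catalog: dict                                   # tool name -> callable
    connected_users: set = field(default_factory=set)

    def discover(self, query: str) -> list:
        # Phase 1: return tool names that mention the query text.
        return [name for name in self.catalog if query.lower() in name.lower()]

    def authenticate(self, user_id: str) -> str:
        # Phase 2: reuse an active connection or hand back a link to create one.
        if user_id in self.connected_users:
            return "connected"
        return f"https://auth.example.invalid/connect?user={user_id}"

    def execute(self, user_id: str, tool: str, **arguments):
        # Phase 3: run the tool only when the user has an active connection.
        if user_id not in self.connected_users:
            raise PermissionError("no active connection for this user")
        return self.catalog[tool](**arguments)


if __name__ == "__main__":
    router = ToyRouter(catalog={"LIST_LOCATIONS": lambda project: [f"{project}/us-west1"]})
    print(router.discover("locations"))
    print(router.authenticate("alice"))      # a connection link, since alice is new
    router.connected_users.add("alice")      # pretend alice completed the link
    print(router.execute("alice", "LIST_LOCATIONS", project="demo"))
```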

Step-by-step Guide

1

Prerequisites

Before starting, make sure you have:
  • A Google API key for Gemini models
  • A Composio account and API key
  • Python 3.9 or later installed
  • Basic familiarity with Python
2

Getting API Keys for Google and Composio

Google API Key
  • Go to Google AI Studio and create an API key.
  • Copy the key and keep it safe. You will put this in GOOGLE_API_KEY.
Composio API Key and User ID
  • Log in to the Composio dashboard.
  • Go to Settings → API Keys and copy your Composio API key. Use this for COMPOSIO_API_KEY.
  • Decide on a stable user identifier to scope sessions, often your email or a user ID. Use this for COMPOSIO_USER_ID.
3

Install dependencies

bash
pip install google-adk composio python-dotenv

Inside your virtual environment, install the required packages.

What's happening:

  • google-adk is Google's Agent Development Kit
  • composio connects your agent to Google cloud vision via MCP
  • python-dotenv loads environment variables
4

Set up ADK project

bash
adk create my_agent

Set up a new Google ADK project.

What's happening:

  • This creates an agent folder with a root agent file and .env file
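If you want to confirm the scaffold before editing it, a small check like the one below may help. The file name `agent.py` is an assumption about what `adk create` generates for the root agent file; adjust `EXPECTED_FILES` if your ADK version lays the project out differently.

```python
from pathlib import Path

# File names are assumptions about what `adk create` generates.
EXPECTED_FILES = ("agent.py", ".env")


def missing_project_files(project_dir):
    """Return the expected files that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_project_files("my_agent")
    print("Project looks complete." if not missing else f"Missing: {missing}")
```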
5

Set environment variables

bash
GOOGLE_API_KEY=your-google-api-key
COMPOSIO_API_KEY=your-composio-api-key
COMPOSIO_USER_ID=your-user-id-or-email

Save all your credentials in the .env file.

What's happening:

  • GOOGLE_API_KEY authenticates with Google's Gemini models
  • COMPOSIO_API_KEY authenticates with Composio
  • COMPOSIO_USER_ID identifies the user for session management
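A common slip at this step is leaving one of the template values in place. The sketch below flags keys that are missing or that still start with `your-`, as the sample values above do. The helper name `find_env_problems` is hypothetical, and it assumes the variables have already been loaded into the environment (for example by python-dotenv).

```python
import os

REQUIRED_KEYS = ("GOOGLE_API_KEY", "COMPOSIO_API_KEY", "COMPOSIO_USER_ID")


def find_env_problems(env=None):
    """Map each problematic key to 'missing' or 'placeholder'."""
    env = os.environ if env is None else env
    problems = {}
    for key in REQUIRED_KEYS:
        value = (env.get(key) or "").strip()
        if not value:
            problems[key] = "missing"
        elif value.startswith("your-"):
            # The template values in this guide all begin with "your-".
            problems[key] = "placeholder"
    return problems


if __name__ == "__main__":
    print(find_env_problems() or "All required variables look set.")
```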
6

Import modules and validate environment

python
import os
import warnings

from composio import Composio
from dotenv import load_dotenv
from google.adk.agents.llm_agent import Agent
from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPConnectionParams
from google.adk.tools.mcp_tool.mcp_toolset import McpToolset

load_dotenv()

warnings.filterwarnings("ignore", message=".*BaseAuthenticatedTool.*")

GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
COMPOSIO_API_KEY = os.getenv("COMPOSIO_API_KEY")
COMPOSIO_USER_ID = os.getenv("COMPOSIO_USER_ID")

if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY is not set in the environment.")
if not COMPOSIO_API_KEY:
    raise ValueError("COMPOSIO_API_KEY is not set in the environment.")
if not COMPOSIO_USER_ID:
    raise ValueError("COMPOSIO_USER_ID is not set in the environment.")
What's happening:
  • os reads environment variables
  • Composio is the main Composio SDK client
  • StreamableHTTPConnectionParams holds the URL and headers used to reach the MCP endpoint
  • Agent is the Google ADK LLM agent class
  • McpToolset lets the ADK agent call MCP tools over HTTP
7

Create Composio client and Tool Router session

python
composio_client = Composio(api_key=COMPOSIO_API_KEY)

composio_session = composio_client.create(
    user_id=COMPOSIO_USER_ID,
    toolkits=["google_cloud_vision"],
)

# No trailing comma here: a comma would turn the URL into a one-element tuple.
COMPOSIO_MCP_URL = composio_session.mcp.url
print(f"Composio MCP URL: {COMPOSIO_MCP_URL}")
What's happening:
  • Authenticates to Composio with your API key
  • Creates a Tool Router session for your user ID, limited to the google_cloud_vision toolkit
  • Spins up a short-lived MCP endpoint for that session
  • Stores the MCP HTTP URL for the ADK MCP integration
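Because the MCP URL is passed straight into the connection parameters in the next step, it can be worth checking that it really is an http(s) URL string first. A stray trailing comma after an assignment, for example, silently produces a one-element tuple. The helper below is a hypothetical sketch using only the standard library.

```python
from urllib.parse import urlparse


def ensure_http_url(value):
    """Return value unchanged if it is an http(s) URL string, else raise."""
    if not isinstance(value, str):
        raise TypeError(f"expected a URL string, got {type(value).__name__}")
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an http(s) URL: {value!r}")
    return value


if __name__ == "__main__":
    # Hypothetical URL for illustration only.
    print(ensure_http_url("https://mcp.example.invalid/session/abc"))
```

In the agent file you could wrap the assignment as `COMPOSIO_MCP_URL = ensure_http_url(composio_session.mcp.url)`.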
8

Set up the McpToolset and create the Agent

python
composio_toolset = McpToolset(
    connection_params=StreamableHTTPConnectionParams(
        url=COMPOSIO_MCP_URL,
        headers={"x-api-key": COMPOSIO_API_KEY}
    )
)

root_agent = Agent(
    model="gemini-2.5-flash",
    name="composio_agent",
    description="An agent that uses Composio tools to perform actions.",
    instruction=(
        "You are a helpful assistant connected to Composio. "
        "You have the following tools available: "
        "COMPOSIO_SEARCH_TOOLS, COMPOSIO_MULTI_EXECUTE_TOOL, "
        "COMPOSIO_MANAGE_CONNECTIONS, COMPOSIO_REMOTE_BASH_TOOL, COMPOSIO_REMOTE_WORKBENCH. "
        "Use these tools to help users with Google cloud vision operations."
    ),
    tools=[composio_toolset],
)

print("\nAgent setup complete. You can now run this agent directly ;)")
What's happening:
  • Connects the ADK agent to the Composio MCP endpoint through McpToolset
  • Uses Gemini as the model powering the agent
  • Lists exact tool names in instruction to reduce misnamed tool calls
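Since the instruction lists tool names verbatim, one option is to keep them in a single list and build the string from it, so that a renamed tool is changed in one place. This is a sketch of that idea; `build_instruction` is a hypothetical helper, and the tool names are the ones used in the agent definition above.

```python
TOOL_NAMES = [
    "COMPOSIO_SEARCH_TOOLS",
    "COMPOSIO_MULTI_EXECUTE_TOOL",
    "COMPOSIO_MANAGE_CONNECTIONS",
    "COMPOSIO_REMOTE_BASH_TOOL",
    "COMPOSIO_REMOTE_WORKBENCH",
]


def build_instruction(tool_names, product="Google cloud vision"):
    """Compose the agent instruction from one list of tool names."""
    return (
        "You are a helpful assistant connected to Composio. "
        f"You have the following tools available: {', '.join(tool_names)}. "
        f"Use these tools to help users with {product} operations."
    )


if __name__ == "__main__":
    print(build_instruction(TOOL_NAMES))
```

The result can be passed as `instruction=build_instruction(TOOL_NAMES)` when constructing the Agent.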
9

Run the agent

bash
# Run in CLI mode
adk run my_agent

# Or run in web UI mode
adk web

Execute the agent from the project root. The web command opens a web portal where you can chat with the agent.

What's happening:

  • adk run runs the agent in CLI mode
  • adk web opens a web UI for interactive testing

Complete Code

Here's the complete code to get you started with Google cloud vision and Google ADK:

python
import os
import warnings

from composio import Composio
from dotenv import load_dotenv
from google.adk.agents.llm_agent import Agent
from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPConnectionParams
from google.adk.tools.mcp_tool.mcp_toolset import McpToolset

# Load credentials from the .env file created earlier.
load_dotenv()
warnings.filterwarnings("ignore", message=".*BaseAuthenticatedTool.*")

GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
COMPOSIO_API_KEY = os.getenv("COMPOSIO_API_KEY")
COMPOSIO_USER_ID = os.getenv("COMPOSIO_USER_ID")

# Fail early with a clear message if any credential is missing.
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY is not set in the environment.")
if not COMPOSIO_API_KEY:
    raise ValueError("COMPOSIO_API_KEY is not set in the environment.")
if not COMPOSIO_USER_ID:
    raise ValueError("COMPOSIO_USER_ID is not set in the environment.")

# Same client setup as in step 7; only the packages installed in step 3 are needed.
composio_client = Composio(api_key=COMPOSIO_API_KEY)

composio_session = composio_client.create(
    user_id=COMPOSIO_USER_ID,
    toolkits=["google_cloud_vision"],
)

COMPOSIO_MCP_URL = composio_session.mcp.url


composio_toolset = McpToolset(
    connection_params=StreamableHTTPConnectionParams(
        url=COMPOSIO_MCP_URL,
        headers={"x-api-key": COMPOSIO_API_KEY}
    )
)

root_agent = Agent(
    model="gemini-2.5-flash",
    name="composio_agent",
    description="An agent that uses Composio tools to perform actions.",
    instruction=(
        "You are a helpful assistant connected to Composio. "
        "You have the following tools available: "
        "COMPOSIO_SEARCH_TOOLS, COMPOSIO_MULTI_EXECUTE_TOOL, "
        "COMPOSIO_MANAGE_CONNECTIONS, COMPOSIO_REMOTE_BASH_TOOL, COMPOSIO_REMOTE_WORKBENCH. "
        "Use these tools to help users with Google cloud vision operations."
    ),
    tools=[composio_toolset],
)

print("\nAgent setup complete. You can now run this agent directly ;)")

Conclusion

You've successfully integrated Google cloud vision with the Google ADK through Composio's MCP Tool Router. Your agent can now interact with Google cloud vision using natural language commands.

Key takeaways:

  • The Tool Router approach dynamically routes requests to the appropriate Google cloud vision tools
  • Environment variables keep your credentials secure and separate from code
  • Clear agent instructions reduce tool calling errors
  • The ADK web UI provides an interactive interface for testing and development

You can extend this setup by adding more toolkit slugs to the toolkits list in your session configuration.
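As a rough sketch of that extension, the helper below creates a session for several toolkits and drops duplicate slugs first. It mirrors the `create(user_id=..., toolkits=...)` call used earlier in this guide; `RecordingClient` is a hypothetical stand-in so the example runs offline, and the `gmail` slug is an assumption for illustration.

```python
def create_session(client, user_id, toolkits):
    """Create a session for several toolkits, dropping duplicate slugs."""
    unique_toolkits = list(dict.fromkeys(toolkits))  # keeps first-seen order
    return client.create(user_id=user_id, toolkits=unique_toolkits)


class RecordingClient:
    """Stand-in for the Composio client so the sketch runs offline."""

    def create(self, user_id, toolkits):
        return {"user_id": user_id, "toolkits": toolkits}


if __name__ == "__main__":
    session = create_session(
        RecordingClient(),
        "demo-user",
        ["google_cloud_vision", "gmail", "google_cloud_vision"],
    )
    print(session["toolkits"])
```

With the real client, you would pass `composio_client` in place of `RecordingClient()` and read the MCP URL from the returned session as before.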

Supported Tools

Every Google cloud vision action and event your agent gets out of the box.

Annotate Files with Vision API

Tool to perform image detection and annotation for batch files in Google Cloud Vision.

Async Batch Annotate Files

Tool to run asynchronous image detection and annotation for a list of generic files (PDF, TIFF, GIF).

Annotate Images

Run image detection and annotation for a batch of images using Google Cloud Vision API.

Annotate Images Async Batch

Tool to run asynchronous image detection and annotation for a batch of images.

Annotate Location Images

Tool to run image detection and annotation for a batch of images scoped to a specific project and location.

Create Vision Product

Creates a new Product resource in Google Cloud Vision Product Search.

Create Product Set

Creates a new ProductSet resource in Google Cloud Vision Product Search.

Create ReferenceImage

Tool to create a ReferenceImage under a product.

Delete Product

Permanently deletes a Product and its associated reference images from Google Cloud Vision API.

Get Product

Tool to get information associated with a Product.

Get Product Set

Tool to get a ProductSet.

Import Product Sets

Asynchronously imports product sets and reference images from a CSV file stored in Google Cloud Storage.

List Vision AI IndexEndpoints

Lists IndexEndpoints in Vertex AI Vision for a given project and location.

List Locations

Tool to list available Vision AI service locations for a project.

List Vision API Operations

Tool to list operations that match the specified filter.

Purge Products

Tool to asynchronously delete products in a ProductSet or orphan products.

Update Product

Tool to update a Product's mutable fields: displayName, description, and productLabels.

Update Product Set

Tool to update a ProductSet resource.

Add Product to ProductSet

Add a Product to a ProductSet in Google Cloud Vision Product Search.

Cancel Vision Operation

Starts asynchronous cancellation of a long-running Vision API operation.

Delete Vision API Operation

Tool to delete a long-running Vision API operation.

Delete Product Set

Tool to permanently delete a ProductSet.

Delete Reference Image

Permanently removes a reference image from a product in Google Cloud Vision Product Search.

Get Vision API Operation

Retrieves the latest state of a long-running Vision API operation.

Get Reference Image

Tool to get information associated with a ReferenceImage.

List Products in ProductSet

Tool to list Products in a specified ProductSet.

List Projects

List Google Cloud projects accessible to the authenticated user via Cloud Resource Manager API.

List Reference Images

Tool to list reference images for a product.

Remove Product from ProductSet

Removes a Product from a specified ProductSet in Google Cloud Vision API.

Frequently asked questions

With a standalone Google cloud vision MCP server, the agents and LLMs can only access a fixed set of Google cloud vision tools tied to that server. However, with the Composio Tool Router, agents can dynamically load tools from Google cloud vision and many other apps based on the task at hand, all through a single MCP endpoint.

Google ADK fully supports MCP integration, so you can use it with the Tool Router. You get structured tool calling, message history handling, and model orchestration while Tool Router takes care of discovering and serving the right Google cloud vision tools.

You can configure which Google cloud vision scopes and actions are allowed when connecting your account to Composio. You can also bring your own OAuth credentials or API configuration so you keep full control over what the agent can do.

All sensitive data such as tokens, keys, and configuration is fully encrypted at rest and in transit. Composio is SOC 2 Type 2 compliant and follows strict security practices so your Google cloud vision data and credentials are handled as safely as possible.
