How to integrate Google Cloud Vision MCP with Hermes

Google Cloud Vision
Auth: API key

Google Cloud Vision API adds advanced image analysis—like labeling, OCR, and detection—to apps. It helps you extract structured data and insights from images at scale.

29 Tools

Introduction

Hermes is a 24/7 autonomous agent that lives on your computer or server — it remembers what it learns and evolves as your usage grows.

This guide explains how to connect your Google Cloud Vision account to Hermes through either the Composio Connect CLI or Composio Connect MCP. For personal use we recommend the CLI, but MCP works just as well.

What is Composio Connect?

Composio Connect is a consumer offering that lets anyone plug 1,000+ applications directly into their agent harness — including Hermes. It can:

  • Search and load tools from relevant toolkits on-demand, reducing context usage.
  • Chain multiple tools to accomplish complex workflows via a remote workbench, without excessive back-and-forth with the LLM.
  • Manage app authentication end-to-end with zero manual overhead.

Integrating Google Cloud Vision with Hermes

Using Composio Connect CLI

1. Install the Composio CLI

Run the install script directly, or paste https://composio.dev/hermes into your Hermes chat box to have it installed for you.

bash
curl -fsSL https://composio.dev/install | bash

2. Authenticate

Once the CLI is installed, ask Hermes to authenticate with Composio.

3. Connect to Google Cloud Vision

Ask your agent to connect to Google Cloud Vision, or simply request any Google Cloud Vision-related task. Hermes will prompt you to authenticate and authorize access.

4. Done. You're all set with a new Google Cloud Vision connection.


Using Composio Connect MCP

1. Get your MCP URL and API Key

Go to dashboard.composio.dev and copy your Connect MCP URL and API key.

2. Open the Hermes config file

bash
nano ~/.hermes/config.yaml

3. Add the Composio Connect MCP server

yaml
mcp_servers:
  composio:
    url: "https://connect.composio.dev/mcp"
    headers:
      x-consumer-api-key: "YOUR_COMPOSIO_API_KEY"
    connect_timeout: 60
    timeout: 180

Save with Ctrl + O, Enter, then exit with Ctrl + X.

4. Restart your Hermes agent

Once restarted, ask your agent to connect to Google Cloud Vision or request any Google Cloud Vision-related task. It will prompt you to authenticate and authorize access.

5. Done!
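
If you would rather not paste the key by hand, the block from step 3 can also be generated. The following is a minimal Python sketch, not part of Hermes or Composio: it renders the same YAML from an environment variable. The variable name COMPOSIO_API_KEY and the function render_mcp_block are assumptions made for this example; the URL, header name, and timeouts are copied from step 3.

```python
import os

# Values copied from step 3 of this guide.
CONNECT_MCP_URL = "https://connect.composio.dev/mcp"
PLACEHOLDER = "YOUR_COMPOSIO_API_KEY"


def render_mcp_block(api_key: str, url: str = CONNECT_MCP_URL) -> str:
    """Return the mcp_servers YAML block from step 3 with the key filled in."""
    if not api_key or api_key == PLACEHOLDER:
        raise ValueError("set a real Composio API key before writing the config")
    # Reject characters that would break a double-quoted YAML string.
    if any(ch in api_key for ch in ('"', "\\", "\n")):
        raise ValueError("API key contains characters that need YAML escaping")
    return (
        "mcp_servers:\n"
        "  composio:\n"
        f'    url: "{url}"\n'
        "    headers:\n"
        f'      x-consumer-api-key: "{api_key}"\n'
        "    connect_timeout: 60\n"
        "    timeout: 180\n"
    )


if __name__ == "__main__":
    # COMPOSIO_API_KEY is a hypothetical variable name chosen for this sketch.
    key = os.environ.get("COMPOSIO_API_KEY", "")
    if key:
        print(render_mcp_block(key))
    else:
        print("COMPOSIO_API_KEY is not set; nothing rendered")
```

The output can be pasted into ~/.hermes/config.yaml. Refusing the placeholder value catches the common mistake of saving the template unchanged.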

What is the Google Cloud Vision MCP server, and what's possible with it?

The Google Cloud Vision MCP server is an implementation of the Model Context Protocol that connects AI agents and assistants such as Claude and Cursor directly to your Google Cloud Vision account. It provides structured and secure access to your image analysis resources, so your agent can register products, manage reference images, list endpoints, and automate large-scale image operations on your behalf.

  • Product and reference image management: Easily create new products and add reference images for visual search, enabling your agent to organize and expand your vision datasets effortlessly.
  • Bulk import and product set operations: Let your agent import large numbers of reference images into product sets from Cloud Storage CSV files, streamlining dataset curation at scale.
  • Automated product cleanup and deletion: Direct your agent to purge unused or orphan products from your project, keeping your cloud resources tidy without manual effort.
  • Location and endpoint discovery: Quickly list available Vision AI service locations and existing IndexEndpoints, making it easy for your agent to select optimal regions and manage deployment targets.
  • Vision API operation tracking: Retrieve and review ongoing or past Vision API operations, so your agent can monitor processing jobs and ensure workflow transparency.

Way Forward

With Google Cloud Vision connected, Hermes can now act on your behalf whenever it detects a relevant task or you ask it to.

From here, you can extend Hermes further:

  • Connect more apps: Calendar, Slack, Notion, Linear, and hundreds of others are available through the same Composio Connect setup. Each new integration compounds what Hermes can do for you.
  • Build workflows across tools: Once multiple apps are connected, Hermes can chain actions together — turn an email into a calendar invite, a Slack message into a Linear ticket, or a meeting note into a follow-up draft.
  • Let it learn your patterns: The more you use Hermes, the better it gets at anticipating how you'd handle recurring tasks. Give it feedback on drafts and decisions, and it will adapt.

If you run into trouble or want to share what you've built, join the community or check out the Docs for deeper configuration options.

Supported Tools

Every Google Cloud Vision action and event your agent gets out of the box.

Annotate Files with Vision API

Tool to perform image detection and annotation for batch files in Google Cloud Vision.

Async Batch Annotate Files

Tool to run asynchronous image detection and annotation for a list of generic files (PDF, TIFF, GIF).

Annotate Images

Run image detection and annotation for a batch of images using Google Cloud Vision API.
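
As a rough illustration of what a batch annotation involves, the sketch below builds a request body in the shape used by the public Vision API images:annotate REST method. The parameter names that the Composio tool itself accepts are not listed on this page and may differ; build_annotate_request and the bucket path are hypothetical.

```python
import json


def build_annotate_request(image_uris, feature_types, max_results=10):
    """Build a JSON body for images:annotate, one request entry per image."""
    if not image_uris:
        raise ValueError("at least one image URI is required")
    if not feature_types:
        raise ValueError("at least one feature type is required")
    # Every image in the batch is annotated with the same feature list.
    features = [{"type": t, "maxResults": max_results} for t in feature_types]
    return {
        "requests": [
            {"image": {"source": {"imageUri": uri}}, "features": features}
            for uri in image_uris
        ]
    }


if __name__ == "__main__":
    # Hypothetical Cloud Storage path, used only for illustration.
    body = build_annotate_request(
        ["gs://my-bucket/receipt.png"],
        ["TEXT_DETECTION", "LABEL_DETECTION"],
    )
    print(json.dumps(body, indent=2))
```

In practice you would describe the task to Hermes in plain language and let it fill in the equivalent arguments; the sketch is only meant to show which pieces of information the request carries.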

Annotate Images Async Batch

Tool to run asynchronous image detection and annotation for a batch of images.

Annotate Location Images

Tool to run image detection and annotation for a batch of images scoped to a specific project and location.

Create Vision Product

Creates a new Product resource in Google Cloud Vision Product Search.

Create Product Set

Creates a new ProductSet resource in Google Cloud Vision Product Search.

Create ReferenceImage

Tool to create a ReferenceImage under a product.

Delete Product

Permanently deletes a Product and its associated reference images from Google Cloud Vision API.

Get Product

Tool to get information associated with a Product.
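
Product Search resources are addressed by hierarchical resource names in the underlying Vision API. The helpers below are a small sketch of that naming scheme; whether the Composio tools expect full resource names or separate IDs is not stated on this page, and the function names and sample IDs are hypothetical.

```python
def product_name(project: str, location: str, product_id: str) -> str:
    """Resource name of a Product."""
    return f"projects/{project}/locations/{location}/products/{product_id}"


def product_set_name(project: str, location: str, product_set_id: str) -> str:
    """Resource name of a ProductSet."""
    return f"projects/{project}/locations/{location}/productSets/{product_set_id}"


def reference_image_name(project: str, location: str, product_id: str, image_id: str) -> str:
    """Resource name of a ReferenceImage, nested under its Product."""
    return f"{product_name(project, location, product_id)}/referenceImages/{image_id}"


if __name__ == "__main__":
    # Sample IDs are placeholders, not real resources.
    print(product_name("my-project", "us-west1", "sneaker-001"))
    print(reference_image_name("my-project", "us-west1", "sneaker-001", "front"))
```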

Get Product Set

Tool to get a ProductSet.

Import Product Sets

Asynchronously imports product sets and reference images from a CSV file stored in Google Cloud Storage.
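
To make the CSV input concrete, here is a sketch that writes rows in the column order described in the public Product Search bulk import documentation (image URI, image ID, product set ID, product ID, product category, display name, labels, bounding poly), with no header row. Treat the column order as an assumption to verify against the current Google documentation before a real import; build_import_csv and the sample values are hypothetical.

```python
import csv
import io

# Column order assumed from the Product Search bulk import documentation.
COLUMNS = (
    "image-uri",
    "image-id",
    "product-set-id",
    "product-id",
    "product-category",
    "product-display-name",
    "labels",
    "bounding-poly",
)


def format_labels(labels: dict) -> str:
    """Join labels as key=value pairs separated by commas."""
    return ",".join(f"{key}={value}" for key, value in labels.items())


def build_import_csv(rows) -> str:
    """Render one CSV line per image; missing columns are left empty."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        record = dict(row)
        record["labels"] = format_labels(record.get("labels", {}))
        # csv.writer quotes the labels field automatically when it contains commas.
        writer.writerow([record.get(column, "") for column in COLUMNS])
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_import_csv([
        {
            "image-uri": "gs://my-bucket/shoes/sneaker-front.jpg",
            "product-set-id": "shoes",
            "product-id": "sneaker-001",
            "product-category": "apparel-v2",
            "labels": {"color": "black", "style": "sneaker"},
        }
    ]))
```

The resulting file would be uploaded to Cloud Storage, and its gs:// path is what the import step takes as input.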

List Vision AI IndexEndpoints

Lists IndexEndpoints in Vertex AI Vision for a given project and location.

List Locations

Tool to list available Vision AI service locations for a project.

List Vision API Operations

Tool to list operations that match the specified filter.

Purge Products

Tool to asynchronously delete products in a ProductSet or orphan products.

Update Product

Tool to update a Product's mutable fields: displayName, description, and productLabels.
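
Partial updates in the underlying Vision API are driven by a field mask that names which of the three mutable fields to change. The sketch below shows one way to derive that mask from the fields being changed, assuming the snake_case mask paths used by the REST API (display_name, description, product_labels). How the Composio tool exposes the mask is not documented here, and build_product_patch is a hypothetical helper.

```python
# JSON field name -> field mask path for the three mutable Product fields.
MUTABLE_FIELDS = {
    "displayName": "display_name",
    "description": "description",
    "productLabels": "product_labels",
}


def build_product_patch(**changes):
    """Return the update mask and partial Product body for the given changes."""
    if not changes:
        raise ValueError("no fields to update")
    unknown = sorted(set(changes) - set(MUTABLE_FIELDS))
    if unknown:
        raise ValueError(f"not a mutable Product field: {', '.join(unknown)}")
    # Keep mask paths in a stable order regardless of keyword order.
    mask = ",".join(path for field, path in MUTABLE_FIELDS.items() if field in changes)
    return {"updateMask": mask, "product": dict(changes)}


if __name__ == "__main__":
    patch = build_product_patch(
        displayName="Black sneaker",
        productLabels=[{"key": "color", "value": "black"}],
    )
    print(patch["updateMask"])
```

Rejecting unknown field names up front mirrors the restriction in the tool description: anything outside these three fields cannot be changed after the product is created.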

Update Product Set

Tool to update a ProductSet resource.

Add Product to ProductSet

Add a Product to a ProductSet in Google Cloud Vision Product Search.

Cancel Vision Operation

Starts asynchronous cancellation of a long-running Vision API operation.

Delete Vision API Operation

Tool to delete a long-running Vision API operation.

Delete Product Set

Tool to permanently delete a ProductSet.

Delete Reference Image

Permanently removes a reference image from a product in Google Cloud Vision Product Search.

Get Vision API Operation

Retrieves the latest state of a long-running Vision API operation.
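
Asynchronous tools such as imports and purges return a long-running operation that has to be checked until it finishes. A minimal polling sketch follows, assuming the standard operation shape with a done flag and either a response or an error field. The get_operation callable is a stand-in for whatever actually fetches the operation (for example, the agent calling this tool); it is not a real Composio or Google function.

```python
import time


def wait_for_operation(get_operation, name, interval=2.0, timeout=60.0, sleep=time.sleep):
    """Poll get_operation(name) until the operation is done or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        operation = get_operation(name)
        if operation.get("done"):
            # A finished operation carries either an error or a response.
            if "error" in operation:
                raise RuntimeError(f"operation {name} failed: {operation['error']}")
            return operation.get("response", {})
        if time.monotonic() >= deadline:
            raise TimeoutError(f"operation {name} did not finish within {timeout} seconds")
        sleep(interval)


if __name__ == "__main__":
    # Fake fetcher that reports "running" once and then "done".
    states = iter([
        {"name": "operations/demo", "done": False},
        {"name": "operations/demo", "done": True, "response": {"status": "ok"}},
    ])
    print(wait_for_operation(lambda _name: next(states), "operations/demo", interval=0.0))
```

Passing the fetcher and the sleep function as arguments keeps the loop testable without network access.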

Get Reference Image

Tool to get information associated with a ReferenceImage.

List Products in ProductSet

Tool to list Products in a specified ProductSet.

List Projects

List Google Cloud projects accessible to the authenticated user via Cloud Resource Manager API.

List Reference Images

Tool to list reference images for a product.

Remove Product from ProductSet

Removes a Product from a specified ProductSet in Google Cloud Vision API.

Frequently asked questions

With a standalone Google Cloud Vision MCP server, agents and LLMs can only access a fixed set of Google Cloud Vision tools tied to that server. With the Composio Tool Router, agents can dynamically load tools from Google Cloud Vision and many other apps based on the task at hand, all through a single MCP endpoint.

You can use Tool Router with Hermes, which fully supports MCP integration. You get structured tool calling, message history handling, and model orchestration, while Tool Router takes care of discovering and serving the right Google Cloud Vision tools.

You can configure which Google Cloud Vision scopes and actions are allowed when connecting your account to Composio. You can also bring your own OAuth credentials or API configuration so you keep full control over what the agent can do.

All sensitive data such as tokens, keys, and configuration is encrypted at rest and in transit. Composio is SOC 2 Type 2 compliant and follows strict security practices for handling your Google Cloud Vision data and credentials.
