行業Apr 27, 2026

How to Run Claude Cowork and Claude Code with Any LLM (GPT, Grok, Gemini, OpenRouter, Local Models)

Anthropic's third-party inference feature lets you power Claude's agent harness with any model — no $200/month subscription required

Douglas Lai

Share to

How to Run Claude Cowork and Claude Code with Any LLM (GPT, Grok, Gemini, OpenRouter, Local Models)

Anthropic shipped a feature that barely anyone noticed: you can now run Claude Cowork and Claude Code against any LLM you choose — GPT-5, Grok, Gemini, open-weight models via OpenRouter, a local model on your laptop, or your enterprise gateway (Bedrock, Vertex AI, Azure AI Foundry). The agent harness, skills, plugins, MCP servers, and admin controls all stay in place. Only the model behind them changes.

This opens up two compelling scenarios. Individuals who are hitting Claude's Max plan limits — or who simply don't want to pay $100–$200/month — can now point Cowork at a free OpenRouter model and get the full agent experience at near-zero cost. Enterprises that are already running on Bedrock, Vertex, or Foundry can deploy Cowork inside their existing compliance boundary without routing data through Anthropic's first-party infrastructure.

This guide covers what Cowork on third-party inference actually is, who it's for, how to set it up step by step, and how to troubleshoot the most common issues.

Claude Cowork running with Third Party gateway

What Is Cowork on Third-Party Inference?

Cowork on third-party inference (also called "Cowork on 3P") is a configuration mode that replaces the Claude model powering Cowork with any OpenAI-compatible API endpoint. Anthropic's official docs frame three deployment paths:

Regulated industries — organizations with data residency requirements
Enterprise gateway users — companies running Claude through Amazon Bedrock, Google Cloud Vertex AI, or Azure AI Foundry
Individuals on pilot or evaluation setups — anyone who installs Claude Desktop and wants to try a different model

All three paths share the same configuration panel and the same set of admin controls. Every enterprise-grade feature (per-user token caps, MCP allowlist, OpenTelemetry, block auto-updates, remove built-in tools) ships on all three paths.

OpenRouter also works as an LLM gateway — Anthropic doesn't list it in the official docs, but independent testing confirms it. This means any model on OpenRouter's catalog (including many free-tier models) can power Cowork's full agent harness.

Here's a quick comparison of what changes and what stays the same:

Claude Desktop vs. Cowork on 3P gateway comparison

One important note on data residency: During this research preview, Claude on Azure AI Foundry still routes through Anthropic's infrastructure. Amazon Bedrock and Google Cloud Vertex AI are the paths that offer full provider-side data residency.

Who Should Use This Feature?

Two distinct audiences benefit from third-party inference, and the use cases don't overlap much.

Individual users are the primary beneficiaries in the near term. If you're hitting Claude's Max plan weekly usage limits, want to evaluate Cowork before committing to a paid subscription, or are working with a local model containing proprietary code you can't send to any external API, Cowork on 3P solves all three problems. You can run the full agent harness — including file tools, MCP servers, skills, and plugins — against a free model on OpenRouter today.

Enterprise teams get something different: the ability to keep Cowork inside an approved cloud provider boundary. If your organization has already cleared Bedrock, Vertex, or Foundry through your security review, Cowork on 3P lets you deploy to employees without opening a new data flow to Anthropic's first-party infrastructure. The admin controls (token caps, MCP allowlist, OpenTelemetry exporter) work the same way regardless of which provider sits behind them.

Step-by-Step Setup

1. Connect to OpenRouter

No proxy server is needed. You point Claude Desktop directly at OpenRouter's API endpoint.

Open Claude Desktop → Menu → Developer → Configure Third-Party Inference
Set the following values:
- Connection: Gateway
- Gateway base URL: https://openrouter.ai/api
- Gateway API key: your OpenRouter API key
- Gateway auth scheme: x-api-key
In Sandbox & workspace, configure Allowed egress hosts so your agents can reach the web. To allow all sites, you can use a wildcard entry.
Click Apply locally → Relaunch now
Log out of your Anthropic account, then choose Continue with Gateway
You'll see "Setting up Claude's workspace…" — once that completes, you can start chatting

To select your model, use OpenRouter's model identifier in the model picker. A free model that works reliably for agentic tasks: tencent/hy3-preview:free

Local models follow the same pattern — route them through an OpenAI-compatible proxy such as LiteLLM or Ollama's built-in OpenAI endpoint.

2. Import Anthropic Skills

OpenRouter setups ship with the Customize → Skills panel empty. The official Anthropic skills (docx, pdf, pptx, xlsx, skill-creator) need to be installed manually.

Download the Anthropic skills repository as a .zip file
Extract skills-main.zip
Zip each skill folder you want individually (e.g., the docx folder becomes docx.zip)
In Claude Desktop: Customize → Skills → Create skill → Upload a skill — upload each .zip

A practical tip: only import the skills you actually need. Each skill occupies context window space even when it isn't actively running, so keeping the list lean improves performance on smaller models.

3. Import Anthropic Plugins

The same manual import pattern applies to plugins. Two official repositories are worth pulling from:

Knowledge-work plugins (marketing, product management, legal, finance): github.com/anthropics/knowledge-work-plugins
Code tab plugins: github.com/anthropics/claude-plugins-official

Steps:

Download the repo as a .zip and extract it
Zip each plugin folder you want individually
In Claude Desktop: Customize → Personal plugins → + → Create plugin → Upload plugin

Start with no more than 2–3 plugins. For many workflows, zero plugins is the right answer — lean prompts perform better on non-Claude models.

4. Configure MCP Servers

MCP servers run locally on your machine and are configured under Settings → Developer.

Configuring MCP servers in Claude Cowork on a third-party gateway

MCP configurations are written to a dedicated claude_desktop_config.json file (if you don't see the "Local MCP servers" option, see the troubleshooting section below). The official MCP server registry is at github.com/modelcontextprotocol/servers.

As an example, the community mcp-atlassian server gives your agents access to both Jira and Confluence through a single MCP connection — a common enterprise setup.

5. Handle Web Search

OpenRouter doesn't support Anthropic's native web_search tool. You have three options:

Option A — Replace with a Brave Search MCP server (recommended) Add WebSearch to disabledBuiltinTools in your config so the agent doesn't try to use the native tool, then install the Brave MCP server. Brave offers 2,000 free searches per month, or $3 per 1,000 beyond that.

Option B — Switch to Vertex AI or Azure AI Foundry Both providers natively support Anthropic's web_search tool. Amazon Bedrock does not yet support it.

Option C — Wait OpenRouter has discussed routing Anthropic's web_search tool to Perplexity or a similar provider, but this capability hasn't shipped yet.

Troubleshooting

"Configure Third-Party Inference" is missing from Menu → Developer Update Claude Desktop and restart. Then enable developer mode via Help → Troubleshooting → Enable Developer Mode. Users on corporate or Team plans have reported this setting remaining hidden even with developer mode enabled — it may be plan-gated or A/B tested.

"Local MCP servers" doesn't appear in Cowork on 3P Same fix: update Claude Desktop, restart, and enable developer mode.

Connectors show as "Unavailable" This is expected behavior, not a bug. Third-party inference means running without Anthropic's connector infrastructure. Connectors depend on that layer. Use MCP servers as the replacement (see step 4 above).

Tool calling is unreliable on non-Claude models Model quality varies significantly for agentic tasks. Some models handle multi-step MCP calls cleanly; others break on complex flows. If a free model isn't working well for your workflow, try a more capable model on OpenRouter's paid tier or switch to one of the provider-hosted options (Bedrock, Vertex).

Code tab settings don't match Cowork settings This is a known issue acknowledged in Anthropic's docs: some Cowork on 3P configuration keys don't propagate identically to Code-tab sessions yet. Expect this to be resolved as the feature moves out of research preview.

What the Admin Controls Signal

The configuration panel for third-party inference ships with controls that go well beyond what an individual user would need:

Admin telemetry controls in Cowork on third-party gateway

Max tokens per window (per-user soft cap)
Allow user-added MCP servers
OpenTelemetry collector endpoint
Block auto-updates
Remove built-in tools from Cowork

These are enterprise administration controls. Paired with Bedrock, Vertex, and Foundry support, Anthropic's official framing is explicit: Cowork on 3P is designed for organizations whose security, regulatory, or contractual requirements prevent them from sending data to Anthropic's first-party infrastructure.

The individual path being documented — not just an edge case — tells you something about the direction of the product. Claude Desktop is evolving into a managed agent platform where the harness (skills, plugins, MCP servers, sub-agents) is the core value, independent of which model powers it.

Key Takeaways

Running Claude Cowork and Claude Code against third-party LLMs is now a documented, supported feature — not a workaround. Here's what to remember:

Anyone can use it today. Install Claude Desktop, configure OpenRouter as your gateway, and start with a free model. No subscription required.
The full agent harness works. Skills, plugins, MCP servers, and file tools all function the same way regardless of which model is behind them.
Enterprise paths are well-supported. Bedrock and Vertex AI provide full data residency. Foundry routes through Anthropic infrastructure during the current research preview.
Web search requires a workaround. Swap the native tool for Brave's MCP server or switch to Vertex/Foundry.
Model quality matters for agentic tasks. Smaller or free models may struggle with complex multi-step tool calls. Test with your actual workflows before committing to a model.

The feature is in research preview, but the core setup is stable and production-ready for most use cases. Getting started takes about ten minutes.

Cowork on Third-Party API vs. Fully Open-Source Cowork: What's the Difference?

Cowork on third-party inference is a meaningful step toward model flexibility, but it's still built on top of Anthropic's proprietary harness. A fully open-source alternative like Eigent takes a different architectural stance — and the differences matter depending on what you actually need.

The Core Distinction

Claude Cowork on third-party inference lets you swap the model powering Anthropic's closed-source agent platform. You still rely on Anthropic's desktop application, their skills and plugin formats, their session infrastructure, and their update cycle. The model changes; the platform doesn't.

Eigent is the agent platform itself — open source (Apache 2.0), built on CAMEL-AI, and designed to run entirely on your machine. There's no proprietary harness. Every component — orchestration, tool execution, file access, multi-agent coordination — is inspectable, forkable, and self-hostable.

Side-by-Side Comparison

Dimension	Claude Cowork on 3P API	Eigent (Open Source)
Platform source code	Proprietary (closed)	Apache 2.0 open source
Model flexibility	Any OpenAI-compatible endpoint	Claude, GPT, Gemini, Ollama, any provider
Infrastructure	Anthropic's desktop app + cloud connectors	Runs fully on your local machine
Data flow	Connectors route through Anthropic's infrastructure	Data never leaves your machine
Multi-agent orchestration	Research preview, gated access	Production-ready via CAMEL-AI
Skills & plugins	Anthropic-format, manual install for 3P	Open skills system, community-extensible
MCP servers	Supported	200+ MCP tools supported
Admin controls	Token caps, allowlist, OpenTelemetry	Full self-hosted control
Compliance boundary	Bedrock/Vertex for data residency	On-premises, no external dependency
Cost	API inference costs + Claude Desktop	Free (Apache 2.0) + API inference costs
Update control	Optional block via admin config	Full control — you own the binary
Customization	Config-level only	Full source access, fork and extend

When Cowork on Third-Party API Is the Right Choice

If you're already using Claude Desktop as your daily driver and want to extend it to non-Claude models — especially for cost reasons or because your organization has cleared Bedrock or Vertex — Cowork on 3P is the lowest-friction path. You get the familiar interface, existing workflows, and Anthropic's polished UX, with a different model underneath.

It's also the right choice if you depend on Claude's native connectors (Gmail, Notion, Slack, etc.) and don't want to replicate that setup through MCP servers. Those connectors don't work on 3P, but if your primary motivation is just model cost, and you're fine rebuilding search via Brave MCP, the tradeoff can be worth it.

When Fully Open-Source Cowork Is the Right Choice

If data sovereignty is non-negotiable — regulated industries, proprietary codebases, sensitive business logic — running on Eigent eliminates the question entirely. Nothing leaves your machine. There's no Anthropic infrastructure layer to trust or audit.

If you need real model diversity within a single workflow, Eigent's architecture supports assigning different LLMs to different workers in the same multi-agent task. Run Claude Opus for complex reasoning, a local Llama model for document parsing, and GPT for code review — all coordinated by a single CAMEL-AI orchestrator.

And if extensibility matters — building on top of the platform, contributing to it, or auditing exactly what it does — open source is the only honest answer. Cowork on 3P gives you config knobs. Eigent gives you the source code.

Eigent is a free, open-source multi-agent AI platform built on CAMEL-AI. It runs entirely on your machine, supports any LLM provider, and gives you full control over your agent infrastructure.

How to Run Claude Cowork and Claude Code with Any LLM (GPT, Grok, Gemini, OpenRouter, Local Models)

Anthropic's third-party inference feature lets you power Claude's agent harness with any model — no $200/month subscription required

Douglas Lai

Share to

How to Run Claude Cowork and Claude Code with Any LLM (GPT, Grok, Gemini, OpenRouter, Local Models)

This guide covers what Cowork on third-party inference actually is, who it's for, how to set it up step by step, and how to troubleshoot the most common issues.

Claude Cowork running with Third Party gateway

What Is Cowork on Third-Party Inference?

Regulated industries — organizations with data residency requirements
Enterprise gateway users — companies running Claude through Amazon Bedrock, Google Cloud Vertex AI, or Azure AI Foundry
Individuals on pilot or evaluation setups — anyone who installs Claude Desktop and wants to try a different model

Here's a quick comparison of what changes and what stays the same:

One important note on data residency: During this research preview, Claude on Azure AI Foundry still routes through Anthropic's infrastructure. Amazon Bedrock and Google Cloud Vertex AI are the paths that offer full provider-side data residency.

Who Should Use This Feature?

Two distinct audiences benefit from third-party inference, and the use cases don't overlap much.