MCP & Tool Use

An AI without tools is a brain in a jar. Tool use is how agents go from "here's what I think" to "here's what I did." MCP (Model Context Protocol) is an open standard designed to make tools plug-and-play.

1. Watch an Agent Use Tools

Pick a scenario and watch step-by-step as the agent thinks about what tool to use, calls it, gets a result, and responds. The agent isn't following a script — it decides which tool fits the task.
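To make the think, call, result, respond cycle concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the two tools (`get_weather`, `word_count`) are made up, and `choose_tool` is a crude word-overlap stand-in for the decision a real agent would delegate to the language model.

```python
from typing import Callable

# Hypothetical tools: each entry pairs a description (what the agent reads)
# with a function (what actually runs).
TOOLS: dict[str, tuple[str, Callable[[str], str]]] = {
    "get_weather": ("Look up the current weather for a city",
                    lambda city: f"Sunny in {city}"),
    "word_count": ("Count the words in a piece of text",
                   lambda text: str(len(text.split()))),
}

def choose_tool(task: str) -> str:
    """Stand-in for the model's decision: pick the tool whose description
    shares the most words with the task. A real agent asks the LLM instead."""
    task_words = set(task.lower().split())

    def overlap(name: str) -> int:
        return len(task_words & set(TOOLS[name][0].lower().split()))

    return max(TOOLS, key=overlap)

def run_agent(task: str, argument: str) -> str:
    tool_name = choose_tool(task)            # 1. think: which tool fits?
    result = TOOLS[tool_name][1](argument)   # 2-3. call the tool, get a result
    return f"[{tool_name}] {result}"         # 4. respond

print(run_agent("what is the weather in this city", "Paris"))
print(run_agent("count the words in this text", "a b c"))
```

The point of the sketch is that nothing hardcodes which tool runs for which task. The choice comes from matching the task against the tool descriptions.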

But tool use goes beyond simple lookups. Agents can now learn from the same sources humans do. In this example, an agent receives a YouTube tutorial URL, watches the video through an API, extracts step-by-step instructions, then executes each step using its own tools — replicating the tutorial automatically.

How agents learn from the same medium humans learn from — YouTube tutorial to structured steps to tool execution

Now try the interactive demos below to see the basic building blocks — how an agent thinks, calls a tool, and responds:

2. MCP: USB-C for AI

Before MCP, every AI tool integration was custom. Want to connect to Gmail? Build a connector. Slack? Another connector. Database? Yet another. Model Context Protocol (MCP) is a universal standard — one protocol, every tool.

Before MCP: Custom connectors everywhere

Every AI app built its own integration for every tool. Gmail connector for App A, different Gmail connector for App B. Nothing was reusable. N apps × M tools = N×M custom integrations.

After MCP: One standard, everything connects

Build one MCP server for Gmail and every MCP-compatible AI app can use it. The tool speaks MCP, the AI speaks MCP, they just connect. N apps + M tools = N + M implementations (one client per app, one server per tool) instead of N×M.
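The arithmetic is worth seeing with numbers. This sketch assumes one custom connector per app-tool pair before MCP, and one MCP client per app plus one MCP server per tool after. The function names are made up for illustration.

```python
def custom_integrations(apps: int, tools: int) -> int:
    """Before MCP: every app builds its own connector for every tool."""
    return apps * tools

def mcp_implementations(apps: int, tools: int) -> int:
    """After MCP: each app implements a client once, each tool a server once."""
    return apps + tools

for apps, tools in [(3, 3), (10, 20)]:
    print(apps, "apps,", tools, "tools:",
          custom_integrations(apps, tools), "custom connectors vs",
          mcp_implementations(apps, tools), "MCP implementations")
```

With 10 apps and 20 tools, that is 200 connectors versus 30 implementations, and the gap widens as either side grows.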

Diagram: App A, App B, App C ↔ MCP ↔ Gmail, Slack, Database
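What "speaking MCP" looks like on the wire: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with the tool's name and arguments. The sketch below only builds such a request. The tool name `gmail_send` and its arguments are hypothetical, and the transport (stdio or HTTP) is left out entirely.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so responses can be matched to them.
_request_ids = count(1)

def tools_call_request(tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request asking an MCP server to run one tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# "gmail_send" is a made-up tool name used only for illustration.
print(tools_call_request("gmail_send", {"to": "client@example.com", "subject": "Hi"}))
```

Because every server accepts the same request shape, an app that can build this message for one tool can build it for any of them.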

3. Build a Tool Definition

This is what agents actually see when they look at available tools. Try building one: give it a name, describe what it does, and add parameters. Watch the JSON schema update in real time.

Builder fields: Tool Name, Description (what does it do?), Parameters. Output: what the agent sees (JSON Schema).
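The same three inputs (name, description, parameters) can be turned into a tool definition in a few lines. This is a sketch, not any particular platform's API: the classes `Param` and `ToolDefinition` are hypothetical, and the key `inputSchema` follows MCP's naming, while other platforms use different keys for the same idea.

```python
import json
from dataclasses import dataclass, field

@dataclass
class Param:
    name: str
    type: str            # a JSON Schema type such as "string" or "integer"
    description: str
    required: bool = True

@dataclass
class ToolDefinition:
    name: str
    description: str
    params: list[Param] = field(default_factory=list)

    def to_schema(self) -> dict:
        """Render the definition the way an agent would receive it."""
        return {
            "name": self.name,
            "description": self.description,
            "inputSchema": {
                "type": "object",
                "properties": {
                    p.name: {"type": p.type, "description": p.description}
                    for p in self.params
                },
                "required": [p.name for p in self.params if p.required],
            },
        }

# A made-up example tool.
tool = ToolDefinition(
    name="get_weather",
    description="Look up the current weather for a city",
    params=[
        Param("city", "string", "City name, e.g. Paris"),
        Param("units", "string", "metric or imperial", required=False),
    ],
)
print(json.dumps(tool.to_schema(), indent=2))
```

Note that the agent never sees the tool's implementation, only this description and schema.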

4. What Can Agents Actually Do?

Here are the major categories of tools agents can connect to.

Before exploring individual tool categories, here's what a complete agent workflow looks like end-to-end. One agent, one task: find a website's contact form, extract the fields, fill them out, and submit. No human clicks — the agent handles the browser, the reading, and the typing.

Anatomy of an automated agent task — one agent completing a full form-submission workflow from URL to submit
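The "extract the fields" step of that workflow can be sketched with nothing but the standard library. This is an assumption-heavy simplification: a real agent would drive a browser, whereas here the page is a hardcoded HTML string (`SAMPLE_PAGE`), and `plan_submission` only prepares the values to submit without sending anything.

```python
from html.parser import HTMLParser

class FormFieldParser(HTMLParser):
    """Collects the name of every named input, textarea, or select in a form."""

    def __init__(self) -> None:
        super().__init__()
        self.in_form = False
        self.fields: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.in_form = True
        elif self.in_form and tag in {"input", "textarea", "select"}:
            name = dict(attrs).get("name")
            if name:  # unnamed controls (like a bare submit button) are skipped
                self.fields.append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False

def plan_submission(html: str, answers: dict[str, str]) -> dict[str, str]:
    """Map each discovered field to the agent's answer (empty if unknown)."""
    parser = FormFieldParser()
    parser.feed(html)
    return {name: answers.get(name, "") for name in parser.fields}

# Hypothetical contact page used only for this example.
SAMPLE_PAGE = (
    '<form action="/contact">'
    '<input name="email"><textarea name="message"></textarea>'
    '<input type="submit" value="Send">'
    "</form>"
)
print(plan_submission(SAMPLE_PAGE, {"email": "me@example.com", "message": "Hello"}))
```

Separating "work out what to submit" from "submit it" also gives a natural place to pause for human approval before anything is sent.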

Each step in that workflow uses one or more of the tool categories below. Click any card to see what's possible:

📄

Read & Write Files

Create, edit, search, and organize files on disk.

Agents can read source code, create new files, search for patterns across codebases, and edit existing content. This is the foundation of AI-assisted development.

Read: "Find all TODO comments in the codebase"

Write: "Create a new React component for the dashboard"

Search: "Find where the login function is defined"
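As a rough idea of what sits behind a request like "Find all TODO comments in the codebase", here is a sketch of a file-search tool. The function name `find_todos` and the default `*.py` pattern are assumptions, and a production tool would likely restrict which directories it may read.

```python
from pathlib import Path

def find_todos(root: str, pattern: str = "*.py") -> list[tuple[str, int, str]]:
    """Return (file name, line number, line text) for every line containing TODO."""
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if "TODO" in line:
                hits.append((path.name, lineno, line.strip()))
    return hits
```

Calling `find_todos(".")` scans the current directory tree. The agent would receive the list of hits as the tool result and summarize it for the user.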

🌐

Search the Web

Look up current information, documentation, and research.

Web search gets agents past the training-data cutoff. They can search for current events, look up API documentation, find Stack Overflow solutions, and verify facts in real time.

Research: "What's the latest React 19 migration guide?"

Verify: "Is this npm package still maintained?"

Current: "What's the S&P 500 at right now?"

✉

Send Email & Messages

Draft, send, and manage communications across platforms.

Agents can compose and send emails, post Slack messages, create Discord notifications, and manage communication workflows — always with human approval for sensitive actions.

Draft: "Write a follow-up email to the client"

Notify: "Post deployment status to #engineering"

Manage: "Archive all newsletters older than 30 days"

📊

Query Databases

Run SQL queries, analyze data, generate reports.

Turn natural language into SQL. Agents can query your database, aggregate results, identify trends, and generate visualizations — making data accessible to non-technical users.

Query: "How many users signed up last week?"

Analyze: "Show me revenue by region for Q4"

Report: "Generate a monthly churn analysis"
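Here is a sketch of the database side, using an in-memory SQLite table so it runs anywhere. The table, the dates, and the SQL are invented for illustration. In a real agent the SQL string would be generated by the model from the question "How many users signed up last week?", and the `startswith("select")` check is a deliberately naive stand-in for proper read-only database permissions.

```python
import sqlite3

def run_readonly_query(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> list:
    """Run a query only if it is a SELECT. A naive guard, not real security."""
    if not sql.lstrip().lower().startswith("select"):
        raise PermissionError("only SELECT statements are allowed")
    return conn.execute(sql, params).fetchall()

# Hypothetical data: three users, two of whom signed up in the week of Jan 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, signed_up TEXT)")
conn.executemany(
    "INSERT INTO users (signed_up) VALUES (?)",
    [("2024-01-02",), ("2024-01-05",), ("2023-12-20",)],
)

# The SQL an agent might produce for "How many users signed up last week?"
sql = "SELECT COUNT(*) FROM users WHERE signed_up >= ? AND signed_up < ?"
rows = run_readonly_query(conn, sql, ("2024-01-01", "2024-01-08"))
print(rows)
```

Passing the dates as parameters rather than pasting them into the SQL string keeps the generated query safe from injection.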

⌨

Run Code & Commands

Execute scripts, run tests, manage infrastructure.

Agents can execute code in sandboxed environments, run test suites, build projects, deploy applications, and manage git workflows — turning instructions into working software.

Test: "Run the test suite and fix any failures"

Build: "Build the project and show me any warnings"

Deploy: "Push the latest changes to staging"
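A command-running tool can be sketched as a thin wrapper around `subprocess.run` that captures output and enforces a timeout. The function name `run_command` and the result shape are assumptions. Note that this is not a sandbox: real isolation needs a container or a similar boundary around the process.

```python
import subprocess
import sys

def run_command(args: list[str], timeout_s: float = 30.0) -> dict:
    """Run a command and return a structured result the agent can read."""
    try:
        proc = subprocess.run(args, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return {"ok": False, "exit_code": None, "stdout": "",
                "stderr": f"timed out after {timeout_s}s"}
    return {"ok": proc.returncode == 0, "exit_code": proc.returncode,
            "stdout": proc.stdout, "stderr": proc.stderr}

# Stand-in for "run the test suite": a one-line Python script.
result = run_command([sys.executable, "-c", "print('tests passed')"])
print(result["ok"], result["stdout"].strip())
```

Returning the exit code and stderr alongside stdout is what lets an agent notice a failure and try a fix, rather than only seeing the success case.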

🖥

Control Browsers

Navigate sites, fill forms, scrape data, test UIs.

Using tools like Playwright, agents can interact with web applications — clicking buttons, filling forms, navigating pages, and extracting data. Essential for testing and automation.

Test: "Walk through the checkout flow and verify it works"

Scrape: "Extract pricing from these 5 competitor pages"

Automate: "Fill out this expense report from my receipt"

4b. Scale Up: Agent Swarms

One agent filling out one form is useful. But what if you need to hit 50 websites in an hour? Instead of running one agent 50 times, you spawn a swarm — an orchestrator agent that launches multiple worker agents in parallel. Same workflow, multiplied.

Agent swarm — Claude orchestrator spawning 8 parallel agents, each completing the same workflow simultaneously

In this example, a single prompt triggers 8 agents working simultaneously — each navigating to a different site, finding the contact form, and submitting. 8 forms in 60 seconds instead of 8 minutes. This is where tool use stops being a convenience and starts being a competitive advantage.
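The orchestrator pattern can be sketched with `asyncio`. All names here are hypothetical, and `worker` just sleeps briefly as a placeholder for the real navigate, find-form, submit workflow. The semaphore caps how many workers run at once.

```python
import asyncio

async def worker(site: str) -> str:
    """Placeholder for one agent's full workflow on one site."""
    await asyncio.sleep(0.01)  # stands in for navigating and submitting
    return f"submitted form on {site}"

async def orchestrate(sites: list[str], max_parallel: int = 8) -> list[str]:
    """Launch one worker per site, with at most max_parallel running at once."""
    gate = asyncio.Semaphore(max_parallel)

    async def guarded(site: str) -> str:
        async with gate:
            return await worker(site)

    # gather returns results in the same order as the input sites.
    return await asyncio.gather(*(guarded(s) for s in sites))

sites = [f"site-{i}.example" for i in range(8)]
results = asyncio.run(orchestrate(sites))
print(len(results), "forms submitted")
```

Because the workers wait concurrently, total time is close to the slowest single worker rather than the sum of all of them, which is where the speedup comes from.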

5. The Trust Question

"Why would I let AI send emails?" — Fair question. Here's how tool use stays safe.

Guardrails That Matter

Human-in-the-loop: Sensitive actions (send email, delete files, deploy code) require explicit human approval before execution. The agent proposes, you approve.
Scoped permissions: Tools can be read-only or read-write. An agent might read your calendar but not create events. Read your email but not send replies.
Sandboxed execution: Code runs in isolated containers. File access is restricted to specific directories. Network access can be limited.
Audit trails: Every tool call is logged — what was called, with what parameters, and what was returned. Full transparency.
Rate limits: Prevent runaway agents from making 1,000 API calls. Set hard limits on actions per session.
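Several of these guardrails fit in one small wrapper around tool calls. The sketch below is an illustration under assumptions, not a reference design: the class `GuardedToolbox`, the `SENSITIVE` set, and the tool names are made up, and sandboxing is not shown because it lives outside the process.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical list of actions that need a human's sign-off.
SENSITIVE = {"send_email", "delete_file", "deploy"}

@dataclass
class GuardedToolbox:
    tools: dict[str, Callable[..., str]]
    allowed: set[str]                       # scoped permissions
    approve: Callable[[str, dict], bool]    # human-in-the-loop callback
    max_calls: int = 20                     # rate limit on attempts per session
    audit_log: list[dict] = field(default_factory=list)

    def call(self, name: str, **params) -> str:
        if len(self.audit_log) >= self.max_calls:
            outcome = "blocked: rate limit"
        elif name not in self.allowed:
            outcome = "blocked: not permitted"
        elif name in SENSITIVE and not self.approve(name, params):
            outcome = "blocked: approval denied"
        else:
            outcome = self.tools[name](**params)
        # Audit trail: every attempt is recorded, including blocked ones.
        self.audit_log.append({"tool": name, "params": params, "outcome": outcome})
        return outcome

box = GuardedToolbox(
    tools={"read_calendar": lambda: "3 events today",
           "send_email": lambda to: f"sent to {to}"},
    allowed={"read_calendar"},              # read-only to start
    approve=lambda name, params: False,     # a real UI would ask the user
)
print(box.call("read_calendar"))
print(box.call("send_email", to="client@example.com"))
```

Widening `allowed` over time is the code-level version of expanding trust gradually.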

💡

The principle: start narrow, expand with trust

Give agents read-only access first. Once you see they handle it well, add write permissions for low-risk actions. Graduate to autonomous actions only for repeatable, well-tested workflows. Same way you'd onboard a new employee.

Key Takeaways

A good agent has 4 parts: personality, goals, tools, and skills

Personality is who it is (tone, role, expertise). Goals are what it's trying to accomplish. Tools are external capabilities it can call (APIs, databases, email). Skills are learned behaviors — multi-step workflows it knows how to execute, like "deploy a website" or "write a PR review." Tools are atomic actions; skills are choreography.

Tools vs. skills: know the difference

Tool = a single capability: "send email," "read file," "query database." Skill = a workflow that chains tools together: "research a topic, draft an article, optimize for SEO, and publish it." You give an agent tools; the agent develops skills by combining them.
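In code terms, the distinction looks roughly like this. The three tool functions are trivial placeholders invented for the example. The skill is simply the function that chains them.

```python
# Tools: single, atomic capabilities (placeholders for real API calls).
def research(topic: str) -> str:
    return f"notes on {topic}"

def draft(notes: str) -> str:
    return f"article from {notes}"

def publish(article: str) -> str:
    return f"published: {article}"

# Skill: a workflow that chains tools in a useful order.
def write_and_publish(topic: str) -> str:
    return publish(draft(research(topic)))

print(write_and_publish("MCP"))
```

Swapping one tool for another (say, a different publishing target) leaves the skill's structure intact, which is why the two are worth keeping separate.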

The agent decides which tool to use — you don't hardcode it

You describe what each tool does (like the JSON schema in the builder above). The agent reads those descriptions and picks the right tool for each task. That's why good tool descriptions matter more than good tool code.

MCP (Model Context Protocol) is the universal connector

Before MCP, every AI app built custom integrations for every tool. MCP standardizes the connection — like USB-C for AI. Build one MCP server for Gmail, and every AI app can use it. This is why the tool ecosystem is exploding.

Trust is built incrementally

Start with read-only tools. Add write permissions when you're comfortable. Go autonomous only for proven, repeatable workflows. Always keep audit logs. Treat AI agents like new hires — earn trust over time.