Large language models are amazing at reasoning over text, but they become truly useful when they can reach out and run real tools on demand. The usual way to give them that superpower involves running a web server, exposing an HTTP endpoint, wrapping everything in Docker, and hoping your firewall agrees. That is a lot of ceremony for a simple helper script.
In this guide, you will learn how to skip all of that and still get the power of external tools. You will build a Model Context Protocol (MCP) server that talks to clients over plain standard input and output.
The entire server fits in a single Python file, yet it can answer real-world queries by calling Tavily web search and working hand-in-hand with a LangGraph ReAct agent.
By the end you will have a repeatable pattern for shipping new tools in minutes instead of days.
What is MCP and why does it matter?
The Model Context Protocol is an open standard that lets an LLM call a tool as if the tool were one of its own native functions. The caller sends a JSON message that describes the function name and arguments. The server runs the function and replies with JSON that carries the result. Because the payloads are structured, the model can plan multiple steps, call tools in loops, and combine their outputs in complex ways.
MCP deliberately separates the transport layer from the message format. In production you might run MCP over WebSockets or HTTP, but the specification also supports STDIO transport. STDIO means the server reads JSON lines from stdin and writes responses to stdout. No sockets, no ports, no SSL certificates. Anything that can spawn a subprocess can talk MCP.
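To make that concrete, here is a rough sketch of the kind of JSON-RPC payload that travels over STDIO when a client asks a server to run a tool. The envelope follows the MCP specification; the tool name and arguments are illustrative placeholders, not output captured from the server built below.

import json

# Illustrative request a client would write to the server's stdin, one JSON
# object per line. MCP messages are JSON-RPC 2.0: an id, a method, and params.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "tavily_search",               # a tool registered on the server
        "arguments": {"query": "EU AI Act"},   # arguments matching the tool's schema
    },
}
print(json.dumps(request))

# The server replies on stdout with a result carrying the same id, roughly:
# {"jsonrpc": "2.0", "id": 1, "result": {"content": [{"type": "text", "text": "..."}]}}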

What you will build
- A fully working MCP tool server written in Python and powered by the fastmcp package.
- A real web-search tool that calls Tavily’s API to fetch fresh information.
- An end-to-end demo that connects the server to a LangGraph ReAct agent and answers a question about current AI policy.
Prerequisites
Make sure you have Python 3.10+ (the MCP tooling requires it) and install the required packages:
pip install fastmcp requests langgraph langchain-openai langchain-mcp-adapters
Create a Tavily account and export your key. The agent demo in Step 4 also calls OpenAI, so export that key as well:
export TAVILY_API_KEY="your_real_api_key"
export OPENAI_API_KEY="your_openai_api_key"
Building MCP servers: a step-by-step guide
This section walks you through assembling a fully functional MCP tool server from an empty Python file to a live demo that answers real-world questions. Follow each step in order and you’ll end up with a lightweight, STDIO-based service that plugs effortlessly into LangGraph agents and scales without extra infrastructure.
Step 1 — Create the skeleton server
# tavily_server.py
from fastmcp import FastMCP

mcp = FastMCP("tavily-search-server")
FastMCP handles message routing, serialization, and function registration, so the rest of the file focuses on business logic.
Step 2 — Add the Tavily search tool
import os
import requests

TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")

@mcp.tool()
def tavily_search(query: str) -> str:
    """
    Look up the query on Tavily and return the top snippet.
    """
    url = "https://api.tavily.com/search"
    headers = {"Authorization": f"Bearer {TAVILY_API_KEY}"}
    payload = {"query": query, "max_results": 1}

    response = requests.post(url, headers=headers, json=payload, timeout=10)
    response.raise_for_status()

    results = response.json()
    if results.get("results"):
        return results["results"][0]["content"]

    return "No results found."
The decorator marks the function as an MCP-callable tool. From the LLM’s point of view, tavily_search is now as easy to invoke as len().
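Under the hood, FastMCP turns the function's signature and docstring into tool metadata that clients can discover. The exact wire format comes from the MCP specification, but the advertised entry for the tool above looks roughly like this illustrative sketch (not captured output):

# Roughly what a client sees when it lists the server's tools (illustrative):
tool_metadata = {
    "name": "tavily_search",
    "description": "Look up the query on Tavily and return the top snippet.",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}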
Step 3 — Run the server over STDIO
import sys

if __name__ == "__main__":
    # Log to stderr: stdout is reserved for MCP protocol messages.
    print("✅ MCP server has started", file=sys.stderr)
    mcp.run()  # blocks forever, listening on stdin/stdout
You can test the server manually by piping JSON into it, but the real fun starts when you wire it into an agent.
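If hand-crafting JSON-RPC lines feels tedious, a small client script gives you a quicker sanity check. Below is a minimal sketch built on the MCP SDK's client primitives (the same ones the agent uses in the next step); the query string is just an example.

# test_client.py - call the tool directly, no LLM involved
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main():
    # Spawn tavily_server.py as a child process speaking MCP over STDIO.
    server_params = StdioServerParameters(command="python", args=["tavily_server.py"])

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            tools = await session.list_tools()
            print("Available tools:", [tool.name for tool in tools.tools])

            result = await session.call_tool("tavily_search", {"query": "EU AI Act"})
            print(result.content[0].text)


asyncio.run(main())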
Step 4 — Call the tool from a LangGraph ReAct agent
# run_agent.py
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from langchain_mcp_adapters.tools import load_mcp_tools
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI


async def run_agent():
    # Spawn the STDIO server as a child process.
    server_params = StdioServerParameters(command="python", args=["tavily_server.py"])

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Discover the server's tools and wrap them as LangChain tools.
            tools = await load_mcp_tools(session)

            agent_executor = create_react_agent(ChatOpenAI(model="gpt-4"), tools)

            result = await agent_executor.ainvoke(
                {"messages": [("user", "What is the latest news about AI policy in Europe?")]}
            )
            print(result["messages"][-1].content)


if __name__ == "__main__":
    asyncio.run(run_agent())
When you run this script, LangGraph spawns the MCP server as a child process, discovers the tavily_search tool, plans a chain of actions, and prints a concise answer that includes fresh search results.
Key advantages of an STDIO-based MCP server for LLM tool integration
- Zero deployment friction. The server is a plain Python script that speaks over STDIO. You can ship it with your application, embed it in a CLI, or run it as a test fixture.
- Real-time knowledge. Tavily fills the model’s knowledge gaps with up-to-date web content, so the answers remain relevant long after the model’s training cutoff.
- Future-proof integration. Because the interface is MCP, the same server can later be exposed over HTTP or WebSockets without changing the tool code (see the sketch after this list).
- Composable agents. LangGraph’s ReAct loop turns the search tool into one step of a larger reasoning pipeline. You can plug in math solvers, vector search, or database calls the same way.
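To illustrate the future-proofing point: when you switch transports, only the final run call has to change and the tool functions stay untouched. A minimal sketch, assuming a fastmcp version that supports the SSE transport (exact transport names and options vary by version):

# tavily_server.py - same tools, different transport
if __name__ == "__main__":
    mcp.run(transport="sse")  # serve over HTTP/SSE instead of stdin/stdout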
Practical use cases
Finance
Scenario: A financial data provider exposes an MCP server offering structured access to economic reports, central bank updates, and sentiment analysis tools.
Use: Any investment-focused AI agent can ask, “What did the European Central Bank signal about rate cuts this morning?” The server fetches the latest press release and grades its tone.
Result: Agents using this MCP endpoint can generate timely summaries or alerts for analysts, traders, or clients without having to build a data pipeline from scratch.
EdTech
Scenario: An educational content provider hosts an MCP server with tools that surface current news articles, map them to curriculum topics, and auto-generate assessments.
Use: AI tutoring agents can call this server to dynamically create quizzes tied to real-world events. E.g., “Build a quiz on renewable energy using this week’s news.”
Result: Any edtech product can instantly add contextual, fresh learning experiences without owning news ingestion or assessment logic.
Healthcare and medical research
Scenario: A medical knowledge provider maintains a secure MCP server offering access to treatment guidelines, research papers (via PubMed), and diagnostic tools.
Use: Clinical AI agents can call the server to retrieve the latest evidence on rare conditions. E.g., “Get recent guidelines for treating X syndrome.”
Result: Agents embedded in hospitals, apps, or telehealth platforms can offer peer-reviewed insights without accessing raw patient data or managing compliance infrastructure.
Real estate platforms
Scenario: A property intelligence provider hosts an MCP server that aggregates listings, zoning laws, pricing history, and area insights.
Use: Real estate-focused AI agents (e.g., buyer advisors, investor bots) can call the server for hyperlocal data. E.g., “Show properties under ₹1.2 Cr in Indiranagar with price trends.”
Result: Developers building real estate tools don’t need to source, clean, or manage property data; it's all accessible via the provider’s MCP layer.
Conclusion
You have built a lightweight, fully functional MCP server, connected it to a real-time search API, and driven it from a LangGraph agent without touching Docker or web servers. From here you can:
- Register more tools with the same @mcp.tool() decorator, as sketched below.
- Switch the transport to HTTP when you outgrow single-machine setups.
- Tune the agent’s reasoning style or swap in a different LLM.
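Registering another tool follows exactly the same pattern as tavily_search. For instance, a hypothetical helper like this one would be discovered by load_mcp_tools and offered to the agent automatically:

# A hypothetical extra tool, registered the same way as tavily_search.
@mcp.tool()
def word_count(text: str) -> int:
    """Return the number of words in the given text."""
    return len(text.split())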
If you are exploring ways to integrate AI tools into your product without drowning in infrastructure, reach out to Aubergine Solutions. We love helping teams turn promising prototypes into scalable systems.
Happy building!