Package your app as a FastAPI or LangServe endpoint — ready to scale, secure, and serve.
You've built your LangChain app. It runs locally. It answers queries. Maybe it even talks to tools or searches documents.
But now you need to put it behind an API, host it on the web, and serve real users — whether it's your internal team, frontend app, or customers.
This chapter covers exposing your app as a FastAPI endpoint, securing keys and routes, handling errors and latency, deployment options, and LangServe for rapid prototyping.
Let's turn your script into a real product.
FastAPI is the most common way to expose LangChain apps as APIs. It's async-ready, fast, and integrates cleanly with Python's LLM ecosystem.
pip install fastapi uvicorn
Here's a minimal FastAPI app that runs a LangChain Q&A workflow:
from fastapi import FastAPI
from pydantic import BaseModel
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI

app = FastAPI()

# LangChain setup: a single-prompt Q&A chain
llm = OpenAI()
prompt = PromptTemplate(
    input_variables=["question"],
    template="Answer this question clearly: {question}"
)
chain = LLMChain(llm=llm, prompt=prompt)

# Input model for the request body
class Query(BaseModel):
    question: str

@app.post("/ask")
def ask(query: Query):
    result = chain.run(query.question)
    return {"answer": result}
Run the server:
uvicorn main:app --reload
Send a POST request with:
{
"question": "What is LangChain?"
}
And get back an LLM-powered answer via JSON.
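For example, with curl (assuming the server is running locally on the default port 8000):

curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "What is LangChain?"}'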
Before you deploy to the world, make sure you cover the basics: secrets, authentication, error handling, and latency.
Don't hardcode API keys. Use .env files or cloud environment config (e.g., in Render, Vercel, or AWS).
export OPENAI_API_KEY=sk-...
Load with:
import os
openai_api_key = os.getenv("OPENAI_API_KEY")  # LangChain's OpenAI wrapper also reads this env var automatically
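If you keep a local .env file during development, python-dotenv (a separate dependency) can load it at startup. A minimal sketch, assuming the .env file sits next to your app:

# Requires: pip install python-dotenv
import os
from dotenv import load_dotenv

load_dotenv()  # copies values from .env into the process environment
openai_api_key = os.getenv("OPENAI_API_KEY")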
Use basic auth, API keys, or JWT to prevent abuse of your LLM endpoints. Add request limits or quotas to prevent runaway costs.
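Here is a minimal sketch of API-key protection using a FastAPI dependency; the SERVICE_API_KEY variable and the X-API-Key header name are illustrative choices, not a fixed convention. The earlier /ask route then declares the dependency:

import os
from fastapi import Depends, HTTPException
from fastapi.security import APIKeyHeader

api_key_header = APIKeyHeader(name="X-API-Key")

def verify_api_key(api_key: str = Depends(api_key_header)):
    # Compare the caller's key against a shared secret from the environment
    if api_key != os.getenv("SERVICE_API_KEY"):
        raise HTTPException(status_code=401, detail="Invalid API key")

@app.post("/ask", dependencies=[Depends(verify_api_key)])
def ask(query: Query):
    result = chain.run(query.question)
    return {"answer": result}

For request limits and quotas, enforcement usually lives in a middleware or in a gateway or reverse proxy in front of the app.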
Wrap your chain runs in try/except blocks. Log inputs, errors, and token usage. FastAPI integrates with logging easily:
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
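As a sketch, the /ask route from earlier might wrap its chain call like this (the log messages and the 500 response are illustrative):

from fastapi import HTTPException

@app.post("/ask")
def ask(query: Query):
    logger.info("Incoming question: %s", query.question)
    try:
        result = chain.run(query.question)
    except Exception:
        logger.exception("Chain execution failed")
        raise HTTPException(status_code=500, detail="LLM request failed")
    return {"answer": result}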
LLM + tool workflows may take 5–15 seconds. Offload them to background jobs and return a task ID or webhook callback if needed.
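One lightweight option is FastAPI's built-in BackgroundTasks with an in-memory result store. The sketch below is illustrative only; a production setup would use a real task queue (Celery, RQ, etc.) and persistent storage:

import uuid
from fastapi import BackgroundTasks

results: dict[str, str] = {}  # in-memory store, lost on restart

def run_chain_job(task_id: str, question: str):
    results[task_id] = chain.run(question)

@app.post("/ask-async")
def ask_async(query: Query, background_tasks: BackgroundTasks):
    task_id = str(uuid.uuid4())
    background_tasks.add_task(run_chain_job, task_id, query.question)
    return {"task_id": task_id}

@app.get("/result/{task_id}")
def get_result(task_id: str):
    return {"answer": results.get(task_id, "pending")}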
You can deploy LangChain apps like any other FastAPI project, whether on managed platforms such as Render or Vercel, on AWS, or on any server that can run Uvicorn.
Use Gunicorn + Uvicorn workers for production-grade concurrency.
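A typical invocation (the worker count depends on your instance size):

gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker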
If you want an even faster way to expose LangChain chains as REST APIs, try LangServe:
pip install "langserve[all]"
You can turn any chain into an API with just:
from langserve import add_routes
add_routes(app, chain, path="/my-chain")
It auto-generates OpenAPI docs and request/response schemas — perfect for rapid prototyping.
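Putting it together, a minimal server might look like this sketch. Note that add_routes expects a Runnable, so the example composes the prompt and LLM with LCEL (prompt | llm) rather than reusing the LLMChain directly; exact compatibility depends on your LangChain version:

from fastapi import FastAPI
from langserve import add_routes
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI

app = FastAPI(title="LangChain Server")

prompt = PromptTemplate.from_template("Answer this question clearly: {question}")
llm = OpenAI()

# Exposes POST /my-chain/invoke, /my-chain/batch, /my-chain/stream, plus request/response schemas
add_routes(app, prompt | llm, path="/my-chain")

Run it with uvicorn as before, and the interactive OpenAPI docs appear at /docs.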
Before you go live, run through the essentials one more time: keys in environment config rather than code, authentication on every LLM endpoint, logging and error handling around chain runs, and a plan for slow or long-running requests.
Congratulations — you've gone from zero to production-ready LLM apps using LangChain.
You now know how to wrap a LangChain workflow in a FastAPI or LangServe endpoint, secure it, monitor it, and deploy it.
Whether you're powering an internal agent, a startup MVP, or a customer-facing AI tool — your LangChain foundation is now solid.