Know your user: Identity-aware MCP servers with Cosmos DB
Pamela Fox
Demo
What we're building
Authenticated expense tracking MCP server with per-user data in Cosmos DB
Start with the final server running in VS Code, already logged in. Show adding an expense, querying expenses, and how each user only sees their own data. Demo the admin-only stats tool.
Before MCP
Every integration with an AI agent had to be done individually:
🤖 AI agent
Cosmos DB
Slack
GitHub
REST API
SDK
CLI
Before MCP, connecting to a database required one approach, Slack required another, GitHub a third. Every time a new application wanted access to the same services, someone had to wire them all up again.
Model Context Protocol
An open protocol that defines how AI applications get context from external tools and data sources.
🤖 AI agent
Cosmos DB
Slack
GitHub
MCP
MCP
MCP
MCP defines a standard way for models to get context. Covered in depth in session 1. The protocol homepage is modelcontextprotocol.io.
MCP architecture
MCP Host
GitHub Copilot, Claude,
ChatGPT...
MCP Client A
MCP Client B
MCP Server A
MCP Server B
Tools
Prompts
Resources
Tools
Prompts
Resources
MCP
MCP
MCP hosts like VS Code contain MCP clients that connect to servers. Servers expose tools (functions the LLM can call), resources (read-only data), and prompts (instruction templates).
MCP server components
This identity-aware MCP server combines four pieces:
FastMCP
Python framework for the MCP server implementation.
Azure Cosmos DB
Stores each user's expense documents, partitioned by user ID.
Microsoft Entra
Authenticates users and issues the access tokens the server validates.
Microsoft Graph
Checks directory data after sign-in, including admin group membership.
This server authenticates users via Entra, stores their expenses in Cosmos DB partitioned by user ID, and checks group membership via Graph API for admin-only tools.
MCP server architecture
VS Code
MCP client
FastMCP server
tools and middleware
Cosmos DB
per-user data
Microsoft Entra
user authentication
Microsoft Graph
group membership
Bearer
Query
JWT
OBO
VS Code authenticates directly with Entra, gets a token, and sends it to the FastMCP server. The server validates the JWT and queries Cosmos DB with the user's OID. For admin checks, the server uses OBO flow to call Graph API.
Storing per-user data with Cosmos DB
Cosmos DB data model
Azure Cosmos DB is a NoSQL database that stores JSON documents in containers.
Each expense is stored as a document that includes a user_id property:
{
"id": "a1b2c3d4-...",
"user_id": "00000000-0000-...",
"date": "2026-03-31",
"amount": 20.00,
"category": "food",
"description": "Avocado toast",
"payment_method": "visa"
}
Partition key = /user_id
All of a user's expenses are in the same partition
Queries with WHERE user_id = @uid are single-partition (fast)
The user_id is an Entra object ID, unique for each user.
The partition key is the user_id from the Entra OID. This means all of a user's expenses live in the same logical partition. Queries filtered by user_id are single-partition and very efficient. Cross-partition queries are only used for admin-level aggregate stats.
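To make the partition behavior concrete, here is a small pure-Python sketch of how documents group into logical partitions by user_id (the sample data is invented for illustration; real user_id values are Entra object IDs):

```python
from collections import defaultdict

# Sample documents shaped like the expense JSON above; the short
# user_id values stand in for Entra object IDs.
docs = [
    {"id": "1", "user_id": "user-a", "amount": 20.00, "category": "food"},
    {"id": "2", "user_id": "user-b", "amount": 12.50, "category": "travel"},
    {"id": "3", "user_id": "user-a", "amount": 8.75, "category": "food"},
]

# Cosmos DB routes each document to a logical partition by its /user_id
# value, so a query filtered to one user_id only ever touches one partition.
partitions = defaultdict(list)
for doc in docs:
    partitions[doc["user_id"]].append(doc)

print(sorted(partitions))         # ['user-a', 'user-b']
print(len(partitions["user-a"]))  # 2
```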
Setting up the Cosmos DB client
Use azure-cosmos for easy Python integration with Cosmos DB:
import os

from azure.cosmos.aio import CosmosClient
from azure.identity.aio import ManagedIdentityCredential

# Production credential; local dev uses the azd CLI credential instead.
azure_credential = ManagedIdentityCredential(
    client_id=os.environ["AZURE_CLIENT_ID"])
cosmos_client = CosmosClient(
    url=f"https://{os.environ['AZURE_COSMOSDB_ACCOUNT']}.documents.azure.com/",
    credential=azure_credential,
)
cosmos_db = cosmos_client.get_database_client(
    os.environ["AZURE_COSMOSDB_DATABASE"])
cosmos_container = cosmos_db.get_container_client(
    os.environ["AZURE_COSMOSDB_USER_CONTAINER"])
🔒 Cosmos DB supports keyless access via managed identity.
The Cosmos DB client uses managed identity in production and the azd CLI credential for local development. No connection strings or secrets. The container is configured with user_id as the partition key.
Tool: add_user_expense
import uuid
from datetime import date
from typing import Annotated

from fastmcp import Context  # Category / PaymentMethod are app-defined enums

@mcp.tool
async def add_user_expense(
    date: Annotated[date, "Date of the expense in YYYY-MM-DD format"],
    amount: Annotated[float, "Positive numeric amount of the expense"],
    category: Annotated[Category, "Category label"],
    description: Annotated[str, "Human-readable description of the expense"],
    payment_method: Annotated[PaymentMethod, "Payment method used"],
    ctx: Context,
):
    """Add a new expense to Cosmos DB."""
    user_id = await ctx.get_state("user_id")
    if not user_id:
        return "Error: Authentication required"
    expense_item = {
        "id": str(uuid.uuid4()),
        "user_id": user_id,
        "date": date.isoformat(),
        "amount": amount,
        "category": category.value,
        "description": description,
        "payment_method": payment_method.value,
    }
    await cosmos_container.create_item(body=expense_item)
    return f"Successfully added expense: ${amount} for {description}"
The tool uses typed Annotated parameters so the LLM knows exactly what to fill in. The user_id comes from middleware (extracted from the JWT). The expense document includes the user_id as both a property and the partition key value.
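The Category and PaymentMethod types in the signature are not shown on the slide; presumably they are string-valued enums along these lines (the value lists here are illustrative guesses, only "food" and "visa" appear in the sample document):

```python
from enum import Enum

class Category(str, Enum):
    FOOD = "food"
    TRAVEL = "travel"
    OFFICE = "office"
    OTHER = "other"

class PaymentMethod(str, Enum):
    VISA = "visa"
    MASTERCARD = "mastercard"
    CASH = "cash"

# Because the enums are str-valued, .value drops straight into the
# Cosmos DB document, matching the sample JSON on the data model slide.
print(Category.FOOD.value, PaymentMethod.VISA.value)
```

Constraining these parameters to enums also gives the LLM a closed set of valid labels instead of free-form strings.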
Tool: get_user_expenses
@mcp.tool
async def get_user_expenses(ctx: Context):
    """Get the authenticated user's expense data from Cosmos DB."""
    user_id = await ctx.get_state("user_id")
    if not user_id:
        return "Error: Authentication required"
    query = "SELECT * FROM c WHERE c.user_id = @uid ORDER BY c.date DESC"
    parameters = [{"name": "@uid", "value": user_id}]
    expenses_data = []
    async for item in cosmos_container.query_items(
        query=query, parameters=parameters, partition_key=user_id
    ):
        expenses_data.append(item)
    return json.dumps([{
        "date": e.get("date"), "amount": e.get("amount"),
        "category": e.get("category"), "description": e.get("description"),
        "payment_method": e.get("payment_method"),
    } for e in expenses_data], indent=2)
Passing partition_key=user_id makes this a single-partition query.
The query filters by user_id and passes the partition key to the SDK, making it a single-partition read. Parameterized queries prevent injection. The result is JSON that the LLM can summarize or analyze.
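For contrast with the single-partition read above, the admin-only aggregate stats mentioned in the notes need a cross-partition query. A sketch of the two query shapes (the stats SQL is illustrative, not taken from the demo code):

```python
# Single-partition: filtered to one user and routed by passing
# partition_key=user_id to the SDK, so only one partition is read.
user_query = "SELECT * FROM c WHERE c.user_id = @uid ORDER BY c.date DESC"
user_params = [{"name": "@uid", "value": "<entra-object-id>"}]

# Cross-partition: no user_id filter, so the SDK fans the query out across
# all logical partitions -- fine for occasional admin stats, costly otherwise.
stats_query = (
    "SELECT c.category, SUM(c.amount) AS total "
    "FROM c GROUP BY c.category"
)
print(user_query)
print(stats_query)
```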
Show the Azure Cosmos DB extension for VS Code, identifier `ms-azuretools.vscode-cosmosdb`. Browse the container, inspect document structure and partition key values, then query for a specific user's expenses. Show how different users have isolated data.
Demo
Chat with Cosmos DB
Use the Azure MCP server
to interact with your data using natural language queries.
Use GitHub Copilot with the Cosmos DB MCP server installed.
Show how you can ask questions about your data, explore the schema, and get insights — all through natural language.
"List the items in my Azure Cosmos DB for this project using Azure MCP"
Authenticating the user with Microsoft Entra
OAuth-based access flow
An MCP client can make requests to an MCP server on behalf of a signed-in user:
MCP Client
MCP Server
Verifies the access token
and returns results scoped
to that signed-in user.
MCP
Authorization: Bearer <access token>
{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "get_expenses",
    "arguments": {}
  }
}
MCP
{
  "jsonrpc": "2.0",
  "result": {
    "content": [{ "amount": 200 }, ...]
  }
}
This is the high-level access pattern for OAuth-protected MCP over HTTP. The client sends a normal MCP request, but includes a bearer token for the signed-in user. The server validates that token and returns user-scoped results. We'll use the next slides to explain how the client gets recognized and how the auth flow is wired up.
OAuth 2.1 roles in MCP
MCP authentication builds on the standard OAuth 2.1 roles and token flow.
Authorization server (AS)
Microsoft Entra
OAuth 2.1 client
MCP client
OAuth 2.1 resource server
MCP server
Access token
issued by the AS
Resource owner
signed-in user
MCP auth is built on top of OAuth 2.1, which involves four entities:
Authorization server, or AS. For example, Microsoft Entra. It manages users and determines who is allowed access.
OAuth 2.1 client, or MCP client. The application requesting access on behalf of the user, like VS Code, Claude Desktop, or a programmatic agent.
OAuth 2.1 resource server, or MCP server. The server that has the resources the user wants to access.
Resource owner. The user who owns the data and authorizes the client to access it.
Choosing a client registration path
Does the
auth server
already know the
client?
Yes
No
Pre-registration
existing relationship
If not, do both sides support
CIMD?
Yes
No
CIMD
Client ID Metadata Document
(default MCP path)
DCR
Dynamic Client Registration
(legacy fallback)
For this demo: Microsoft Entra does not support CIMD or DCR, so we use pre-registration with VS Code's known client ID. (Alternative approach: put a DCR proxy in front of Entra.)
This tree matches the MCP auth spec's preferred decision order. If the auth server already knows the client, use pre-registration. If not, the modern MCP path is CIMD, where the client identifies itself with a hosted metadata document. If CIMD is unavailable, DCR is the legacy fallback. For this demo, Microsoft Entra does not support CIMD or DCR, so we rely on VS Code's known client ID for pre-registration. Another option would be to place a DCR-capable auth proxy in front of Entra.
Pre-registered OAuth flow for MCP
User
MCP client
Authorization server (AS)
MCP server
asks to use an MCP tool
MCP request without token
401 + PRM metadata
redirects to Entra with known client ID
signs in
returns authorization code
exchanges code for token
returns access token
sends MCP request with bearer token
returns authenticated MCP results
pre-registered client is already
known to the auth server
This is the simplified pre-registration flow we actually use. The client first makes an MCP request and gets a 401 with protected resource metadata. That tells it to go to Entra. Because VS Code is already pre-registered, Entra already knows the client ID. In this setup, admin consent is already granted, so the user just signs in. The client gets an authorization code, exchanges it for a token, and then retries the MCP request with that bearer token.
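The "401 + PRM metadata" step follows RFC 9728 (OAuth 2.0 Protected Resource Metadata). A sketch of what the server's PRM document and WWW-Authenticate header might look like; the server URL and tenant ID are placeholders:

```python
import json

# RFC 9728 protected resource metadata, served at
# /.well-known/oauth-protected-resource on the MCP server.
prm = {
    "resource": "https://<mcp-server>/mcp",
    "authorization_servers": [
        "https://login.microsoftonline.com/<tenant-id>/v2.0"
    ],
    "bearer_methods_supported": ["header"],
}

# The 401 response carries a pointer to that document, which is how the
# client discovers which authorization server to redirect the user to:
www_authenticate = (
    'Bearer resource_metadata='
    '"https://<mcp-server>/.well-known/oauth-protected-resource"'
)

print(json.dumps(prm, indent=2))
```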
Setting up OAuth with Entra in FastMCP
FastMCP sends clients directly to Entra, then validates the returned JWTs with Entra's public signing keys.
import os

from fastmcp import FastMCP
from fastmcp.server.auth import RemoteAuthProvider
from fastmcp.server.auth.providers.azure import AzureJWTVerifier

verifier = AzureJWTVerifier(
    client_id=entra_client_id,
    tenant_id=os.environ["AZURE_TENANT_ID"],
    required_scopes=["user_impersonation"],
)
auth = RemoteAuthProvider(
    token_verifier=verifier,
    authorization_servers=[
        f"https://login.microsoftonline.com/{os.environ['AZURE_TENANT_ID']}/v2.0"],
    base_url=base_url,  # public base URL of this MCP server
)
mcp = FastMCP("Expenses Tracker", auth=auth)
VS Code is pre-registered as an authorized application on our Entra app registration. When the PRM points to Entra, VS Code uses its own first-party Entra GUID and authenticates via the macOS or Windows authentication broker. AzureJWTVerifier validates Entra v2.0 tokens using Entra's public signing keys, and RemoteAuthProvider only needs to serve the PRM metadata; the server just validates the resulting JWT.
Extracting user identity from the token
We pull user_id from the OAuth token claims so that each tool can access the user ID.
from fastmcp.server.dependencies import get_access_token
from fastmcp.server.middleware import Middleware, MiddlewareContext
class UserAuthMiddleware(Middleware):
    def _get_user_id(self):
        token = get_access_token()
        if not (token and hasattr(token, "claims")):
            return None
        return token.claims.get("oid")  # Entra Object ID

    async def on_call_tool(self, context: MiddlewareContext, call_next):
        user_id = self._get_user_id()
        if context.fastmcp_context is not None:
            await context.fastmcp_context.set_state("user_id", user_id)
        return await call_next(context)

mcp = FastMCP("Expenses Tracker", auth=auth,
              middleware=[UserAuthMiddleware()])
Middleware extracts the user's Object ID (oid) from the JWT claims on every tool call. This OID uniquely identifies the user in Entra and becomes the key for storing their data in Cosmos DB. The state is passed through the FastMCP context so tools can access it.
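To see where the oid claim lives inside a token, here is a stdlib-only sketch that decodes a JWT's payload segment. A toy token with a made-up oid is constructed locally for illustration; in production the signature must be verified, which AzureJWTVerifier handles on the server:

```python
import base64
import json

def jwt_claims(token: str) -> dict:
    """Decode a JWT's middle (payload) segment WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

def b64seg(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JWT segment."""
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()

# Toy token with a made-up oid claim, just to show the structure.
toy_token = ".".join([
    b64seg({"alg": "RS256", "typ": "JWT"}),
    b64seg({"oid": "00000000-0000-0000-0000-000000000001",
            "scp": "user_impersonation"}),
    "fake-signature",
])

print(jwt_claims(toy_token)["oid"])  # the Entra object ID the middleware reads
```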
Demo
Full auth flow
Reconnect the server and walk through the 401, metadata discovery, Entra sign-in, and authenticated retry.
Disconnect the MCP server, then reconnect. Show the 401 → PRM discovery → Entra login → authenticated request flow. Show the server logs to confirm no proxy endpoints were hit.
Adding role-based access with Microsoft Graph
Why role-based access?
Some tools should only be available to certain users:
Regular users: add and view their own expenses
Admins: view aggregate stats across all users
Use Entra security groups to define roles:
Create an "MCP Admins" group in Entra
Add admin users to the group
Server checks group membership via Microsoft Graph API
Not all tools should be available to all users. Regular users can only manage their own expenses. Admins can see stats across all users. We use Entra security groups to define who is an admin.
On-Behalf-Of (OBO) flow for Graph API
MCP client
MCP server
Entra
Microsoft Graph
tools/list or admin tool call
OBO exchange using user token
returns Graph access token
check transitive group membership
member or not?
if member ✓
show tool / allow call
if not ✕
hide tool / block call
This is the same role-check flow used during tools listing and again at invocation time. The MCP client either asks for the tools list or attempts an admin tool call. The MCP server uses OBO to exchange the user's token for a Graph token, calls Microsoft Graph to check transitive membership in the admin group, and then either shows or runs the admin tool, or hides and blocks it.
Checking group membership via Graph API
The server uses MSAL for token exchange, then Graph API to check group membership:
import os

import httpx
from msal import ConfidentialClientApplication, TokenCache

confidential_client = ConfidentialClientApplication(
    client_id=entra_client_id,
    client_credential=client_credential,
    authority=f"https://login.microsoftonline.com/{os.environ['AZURE_TENANT_ID']}",
    token_cache=TokenCache(),
)
graph_token = confidential_client.acquire_token_on_behalf_of(
    user_assertion=ctx.token.token,
    scopes=["https://graph.microsoft.com/.default"],
)
client = httpx.AsyncClient()
response = await client.get(
    "https://graph.microsoft.com/v1.0/me/transitiveMemberOf"
    "/microsoft.graph.group"
    f"?$filter=id eq '{group_id}'&$count=true",
    headers={
        "Authorization": f"Bearer {graph_token['access_token']}",
        "ConsistencyLevel": "eventual",
    })
is_admin = response.json().get("@odata.count", 0) > 0
This is the core OBO pattern. The server starts from an existing confidential client, exchanges the user's token for a Graph token, and then calls Graph to check transitive group membership. The ConsistencyLevel: eventual header is required for count-based membership queries.
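The membership check boils down to reading @odata.count from the Graph response. A sketch with a hypothetical response body, assuming the filtered $count query shape used above:

```python
import json

# Hypothetical Graph response to the filtered $count query.
# A count greater than zero means the user is a (transitive)
# member of the admin group.
graph_response_body = json.loads("""
{
  "@odata.count": 1,
  "value": [
    {"displayName": "MCP Admins"}
  ]
}
""")

is_admin = graph_response_body.get("@odata.count", 0) > 0
print(is_admin)  # True
```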
Enforcing role-based access in tools
FastMCP runs this auth check at both tools listing time and tool invocation time:
async def require_admin_group(ctx: AuthContext) -> bool:
    admin_group_id = os.environ["ENTRA_ADMIN_GROUP_ID"]
    graph_token = confidential_client.acquire_token_on_behalf_of(
        user_assertion=ctx.token.token,
        scopes=["https://graph.microsoft.com/.default"],
    )
    return await check_user_in_group(
        graph_token["access_token"], admin_group_id)

@mcp.tool(auth=require_admin_group)
async def get_expense_stats(ctx: Context):
    """Get expense statistics. Only accessible to admins."""
    ...
Non-admin users won't even see get_expense_stats in the tools list.
FastMCP's component-level authorization runs the auth check both at tool discovery (tools/list) and at invocation time. The require_admin_group function uses the OBO flow to get a Graph token and check group membership. Non-admins won't see the tool at all.
Demo
Role-based access
Admin tool visibility
Show that an admin user can see and use the get_expense_stats tool. Switch to a non-admin user and show the tool is hidden. Try to call it directly — it's blocked.
Agent skills for Cosmos DB
Use the cosmosdb-best-practices agent skill to:
Review partition key design
Optimize query performance
Validate data model choices
Check SDK usage patterns
"Review my Cosmos DB data model and suggest improvements"
Install the skills:
npx skills add AzureCosmosDB/cosmosdb-agent-kit
github.com/AzureCosmosDB/cosmosdb-agent-kit
Agent skills give Copilot domain-specific knowledge about Cosmos DB best practices. Ask it to review your data model, partition key design, or query patterns for optimization suggestions. Install the extension from the cosmosdb-agent-kit repo.
Demo
Using agent skills in Copilot
Reviewing Cosmos DB data model and optimizing queries
Use GitHub Copilot with the cosmosdb-best-practices skill. Ask it to review the data model. Show how it analyzes the partition key choice, query patterns, and suggests improvements.