Beyond Copilot: Why Custom AI Agents Matter
Microsoft Copilot for SharePoint handles general-purpose AI tasks well. But enterprises need specialized AI agents that understand their specific business processes, terminology, and data relationships. A healthcare organization needs an agent that understands HIPAA compliance workflows. A financial firm needs an agent that can navigate SOX-regulated document hierarchies.
Custom AI agents built on Azure OpenAI + SharePoint give you that specificity while keeping data within your Azure tenant boundary.
Architecture: SharePoint AI Agent Stack
The architecture flows from the user query through a SharePoint interface (SPFx Web Part, Teams Bot, or Power App) to an Azure Function orchestration layer. That layer calls Azure OpenAI Service (GPT-4o for reasoning, text-embedding-ada-002 for embeddings, GPT-4o-mini for classification and routing), which works alongside Azure AI Search (vector index) and the Microsoft Graph API (real-time access to SharePoint data), with SharePoint document libraries as the source of truth.
Pattern 1: RAG (Retrieval-Augmented Generation) Agent
The most common pattern: an agent that answers questions using your SharePoint content as the knowledge base.
Step 1: Index SharePoint Content into Azure AI Search
# Extract SharePoint documents for Azure AI Search indexing
Connect-PnPOnline -Url "https://contoso.sharepoint.com/sites/Policies" -Interactive
$documents = Get-PnPListItem -List "Documents" -PageSize 500 -Fields "FileLeafRef","FileRef","Modified","Editor","File_x0020_Size"
# A generic List avoids re-copying the whole array on every += inside the loop
$indexBatch = [System.Collections.Generic.List[hashtable]]::new()
foreach ($doc in $documents) {
    $fileUrl = $doc.FieldValues.FileRef
    $fileName = $doc.FieldValues.FileLeafRef
    if ($fileName -match "\.(docx|pdf|pptx|txt|md)$") {
        $tempPath = Join-Path $env:TEMP $fileName
        Get-PnPFile -Url $fileUrl -Path $env:TEMP -FileName $fileName -AsFile -Force
        $content = ""
        if ($fileName -match "\.(txt|md)$") {
            $content = Get-Content $tempPath -Raw
        } else {
            # Placeholder: binary formats need a text-extraction step before they are searchable
            $content = "PENDING_OCR"
        }
        # Azure AI Search keys allow only letters, digits, '_', '-' and '=', so use URL-safe Base64
        $key = [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($fileUrl)).Replace('+', '-').Replace('/', '_')
        $indexBatch.Add(@{
            id = $key
            title = $fileName
            content = $content
            url = "https://contoso.sharepoint.com$fileUrl"
            # Convert to UTC first; the format string only appends a literal "Z"
            lastModified = $doc.FieldValues.Modified.ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ssZ")
            author = $doc.FieldValues.Editor.Email
            fileType = [System.IO.Path]::GetExtension($fileName)
        })
        Remove-Item $tempPath -ErrorAction SilentlyContinue
    }
}
$indexBatch | ConvertTo-Json -Depth 5 | Out-File "sharepoint_index_batch.json" -Encoding UTF8
Write-Host "Prepared $($indexBatch.Count) documents for indexing"
Step 2: Create Azure AI Search Index with Vector Fields
Create an index with standard text fields (id, title, content, url, lastModified, author, fileType) plus a vector field (contentVector) that uses the HNSW algorithm with cosine similarity. The vector field must be declared with 1536 dimensions to match the output size of text-embedding-ada-002.
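As a rough illustration, the index described above could be expressed as a request body for the Azure AI Search "create index" REST call. This is a minimal sketch, assuming the vector-search schema of the 2023-11-01 API version; the index name `sharepoint-docs` and the names `hnsw-cosine` and `hnsw-cosine-profile` are placeholders chosen for this example, and the attribute choices (filterable, facetable, and so on) should be adapted to your own query patterns.

```python
import json


def build_index_definition(name: str = "sharepoint-docs") -> dict:
    """Build an index definition body (assumed 2023-11-01 REST schema)."""
    return {
        "name": name,  # placeholder index name
        "fields": [
            {"name": "id", "type": "Edm.String", "key": True, "filterable": True},
            {"name": "title", "type": "Edm.String", "searchable": True},
            {"name": "content", "type": "Edm.String", "searchable": True},
            {"name": "url", "type": "Edm.String"},
            {"name": "lastModified", "type": "Edm.DateTimeOffset",
             "filterable": True, "sortable": True},
            {"name": "author", "type": "Edm.String", "filterable": True},
            {"name": "fileType", "type": "Edm.String",
             "filterable": True, "facetable": True},
            {
                # 1536 dimensions matches text-embedding-ada-002 output
                "name": "contentVector",
                "type": "Collection(Edm.Single)",
                "searchable": True,
                "dimensions": 1536,
                "vectorSearchProfile": "hnsw-cosine-profile",
            },
        ],
        "vectorSearch": {
            "algorithms": [
                {"name": "hnsw-cosine", "kind": "hnsw",
                 "hnswParameters": {"metric": "cosine"}}
            ],
            "profiles": [
                {"name": "hnsw-cosine-profile", "algorithm": "hnsw-cosine"}
            ],
        },
    }


def to_json(definition: dict) -> str:
    """Serialize the definition so it can be sent as the request body."""
    return json.dumps(definition, indent=2)
```

The vector field points at a profile, and the profile points at the algorithm configuration; keeping those two names consistent is the part most likely to go wrong when editing the definition by hand.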
Step 3: Build the Agent Orchestration Layer
The orchestration function handles:
- Query understanding — classify the user's intent
- Retrieval — search Azure AI Search for relevant SharePoint documents
- Context assembly — combine retrieved documents with system prompt
- Generation — send to Azure OpenAI for response
- Citation — include SharePoint document links in the response
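The five responsibilities above can be sketched as a single function. This is an illustrative outline rather than a finished implementation: `classify`, `retrieve`, and `generate` are hypothetical callables that you would back with GPT-4o-mini, Azure AI Search, and GPT-4o respectively, and the `Chunk` type and prompt wording are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Chunk:
    """One retrieved passage from a SharePoint document (hypothetical shape)."""
    title: str
    url: str
    text: str


def answer_question(
    question: str,
    classify: Callable[[str], str],
    retrieve: Callable[[str, int], List[Chunk]],
    generate: Callable[[str, str], str],
    top_k: int = 5,
) -> Dict[str, object]:
    # 1. Query understanding: label the user's intent
    intent = classify(question)
    # 2. Retrieval: fetch the top-k most relevant chunks
    chunks = retrieve(question, top_k)
    # 3. Context assembly: number each source so the model can cite it
    context = "\n\n".join(
        f"[{i}] {c.title}\n{c.text}" for i, c in enumerate(chunks, start=1)
    )
    system_prompt = (
        "Answer only from the numbered sources below and cite them as [n].\n\n"
        + context
    )
    # 4. Generation: send the system prompt and question to the model
    answer = generate(system_prompt, question)
    # 5. Citation: return SharePoint links alongside the answer
    citations = [
        {"ref": i, "title": c.title, "url": c.url}
        for i, c in enumerate(chunks, start=1)
    ]
    return {"intent": intent, "answer": answer, "citations": citations}
```

Passing the three stages in as callables keeps the orchestration logic testable without any Azure resources, since each stage can be replaced by a stub.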
Key design decisions:
- Chunk size: 500-1000 tokens per chunk for optimal retrieval
- Top-K retrieval: Return top 5-8 most relevant chunks
- Temperature: 0.1-0.3 for factual responses, 0.5-0.7 for creative tasks
- System prompt: Include organization-specific terminology and response format
Pattern 2: Document Processing Agent
An agent that automatically processes documents uploaded to SharePoint.
Workflow
Document uploaded to SharePoint triggers Event Grid, which calls an Azure Function. The function sends the document to Azure Document Intelligence (extract text + structure), then to Azure OpenAI (classify, extract entities, summarize), writes metadata back to SharePoint, and routes the document based on classification.
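The body of that Azure Function might look roughly like the sketch below. Everything here is an assumption for illustration: `extract_text` stands in for a call to Azure Document Intelligence, `analyze` for a call to Azure OpenAI that returns a classification, entities, and a summary, `write_metadata` for a SharePoint update, and the route table and column names (`DocumentType`, `Summary`, `ExtractedEntities`) are hypothetical.

```python
from typing import Callable, Dict, Optional

# Hypothetical mapping from classification label to review queue
DEFAULT_ROUTES: Dict[str, str] = {
    "NDA": "Legal Review",
    "MSA": "Legal Review",
    "Invoice": "Accounts Payable",
}


def process_document(
    file_bytes: bytes,
    extract_text: Callable[[bytes], str],
    analyze: Callable[[str], dict],
    write_metadata: Callable[[dict], None],
    routes: Optional[Dict[str, str]] = None,
    default_route: str = "Manual Triage",
) -> str:
    """Run extract -> analyze -> write-back -> route and return the route."""
    routes = DEFAULT_ROUTES if routes is None else routes
    # Extract text and structure from the uploaded file
    text = extract_text(file_bytes)
    # Classify, extract entities, and summarize;
    # expected keys: "classification", "entities", "summary"
    analysis = analyze(text)
    # Write the results back as SharePoint column values
    write_metadata({
        "DocumentType": analysis["classification"],
        "Summary": analysis["summary"],
        "ExtractedEntities": analysis["entities"],
    })
    # Unknown labels fall through to a human triage queue
    return routes.get(analysis["classification"], default_route)
```

Sending unrecognized classifications to a manual queue, rather than guessing, is one way to keep misclassified documents from being routed silently to the wrong team.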
Use Cases
Contract Processing: Extract parties, effective date, termination date, value, key terms. Classify as NDA, MSA, SOW, Amendment. Route to legal review, set calendar reminders for renewals.
Invoice Processing: Extract vendor, amount, line items, PO number, due date. Validate by matching against PO, checking for duplicates. Route to AP for approval, update financial tracking list.
Resume/Application Processing: Extract candidate name, skills, experience, education. Score by matching against job requirements. Route to hiring manager, update applicant tracking list.
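Taking the invoice use case as an example, the validation step (match against the PO, check for duplicates) is plain business logic that runs after the AI extraction. The sketch below assumes the extracted fields have already been parsed into an `Invoice` record; the field names, the `open_pos` lookup, the `seen` set, and the tolerance value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Invoice:
    """Fields extracted from an invoice (hypothetical shape)."""
    vendor: str
    invoice_number: str
    po_number: str
    amount: float


def validate_invoice(
    invoice: Invoice,
    open_pos: Dict[str, float],
    seen: Set[Tuple[str, str]],
    tolerance: float = 0.01,
) -> List[str]:
    """Return a list of problems; an empty list means the invoice can be routed."""
    problems: List[str] = []
    # Duplicate check: same vendor and invoice number already processed
    if (invoice.vendor, invoice.invoice_number) in seen:
        problems.append("duplicate")
    # PO match: the PO must exist and the amounts must agree within tolerance
    po_amount = open_pos.get(invoice.po_number)
    if po_amount is None:
        problems.append("unknown_po")
    elif abs(po_amount - invoice.amount) > tolerance:
        problems.append("amount_mismatch")
    return problems
```

An invoice with no problems would go to AP for approval; anything else would be flagged for manual review along with the list of reasons.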
Pattern 3: Conversational Knowledge Agent
A Teams bot or SPFx chat widget that answers questions about your SharePoint content.
Agent Capabilities
- Knowledge Q&A — "What is our policy on remote work?"
- Document finding — "Find the latest quarterly report for Project Atlas"
- Process guidance — "How do I submit a purchase requisition?"
- Content creation — "Draft a project status update based on the latest documents in the Project Alpha site"
- Compliance checking — "Does this document comply with our data retention policy?"
System Prompt Design
The system prompt is the most critical component. A well-designed prompt turns a generic LLM into a domain-specific expert. It should define the role, set rules (only answer from retrieved documents, always cite sources, never reveal restricted content), and specify the available SharePoint sites.
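A small builder function is one way to keep those three parts (role, rules, available sites) consistent across agents. The wording of the rules below is only an example, and the function name and parameters are assumptions for this sketch.

```python
from typing import List


def build_system_prompt(org_name: str, role: str, sites: List[str]) -> str:
    """Assemble a system prompt from a role, fixed rules, and a site list."""
    rules = [
        "Only answer from the retrieved documents provided in the context.",
        "Always cite the source document for every claim.",
        "Never reveal restricted content.",
        "If the documents do not contain the answer, say so.",
    ]
    lines = [f"You are {role} for {org_name}.", "", "Rules:"]
    lines += [f"{i}. {rule}" for i, rule in enumerate(rules, start=1)]
    lines += ["", "Available SharePoint sites:"]
    lines += [f"- {site}" for site in sites]
    return "\n".join(lines)
```

Calling `build_system_prompt("Contoso", "an HR policy assistant", ["https://contoso.sharepoint.com/sites/Policies"])` would yield a prompt with the role on the first line, the numbered rules next, and the site list last.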
Security and Governance for AI Agents
Data Boundary Enforcement
All data stays within your Azure tenant:
| Component | Location | Data Residency |
|-----------|----------|---------------|
| SharePoint content | SharePoint Online | Your M365 tenant |
| Azure OpenAI | Your Azure subscription | Your chosen region |
| Azure AI Search | Your Azure subscription | Your chosen region |
| Vector embeddings | Azure AI Search | Your chosen region |
| Agent logic | Azure Functions | Your chosen region |
Permission Enforcement
Critical: Your AI agent must respect SharePoint permissions. Never index content globally — always filter by the requesting user's access at query time.
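One common way to apply that filter is to store the IDs of the groups allowed to read each document in the index and pass an OData filter with every query. The sketch below assumes a filterable collection field named `group_ids` (not part of the index described earlier, so it would need to be added) and assumes the IDs are Microsoft Entra group GUIDs.

```python
import re
from typing import Iterable

# Accept only GUID-shaped values so nothing else can reach the filter string
_GUID = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def build_security_filter(group_ids: Iterable[str], field: str = "group_ids") -> str:
    """Build an OData filter that matches documents shared with any given group."""
    ids = list(group_ids)
    if not ids:
        # No memberships means no readable documents; skip the search entirely
        raise ValueError("user has no group memberships; return no results")
    for gid in ids:
        if not _GUID.match(gid):
            raise ValueError(f"not a valid group id: {gid!r}")
    return f"{field}/any(g: search.in(g, '{','.join(ids)}'))"
```

The resulting string goes into the `filter` parameter of the search request, so documents the user cannot read never reach the model's context.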
Audit Logging
Log every AI agent interaction for compliance: who asked the question, what documents were retrieved, what response was generated, timestamp and session ID, and whether any restricted content was filtered out.
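The fields listed above map naturally onto a structured log record. This is a minimal sketch, assuming one JSON line per interaction; the field names are illustrative, and where the line is sent (Application Insights, a Log Analytics workspace, a SharePoint list) is left to your environment.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class AuditRecord:
    """One agent interaction (illustrative field names)."""
    user: str                  # who asked the question
    question: str
    retrieved_urls: List[str]  # what documents were retrieved
    response: str              # what response was generated
    session_id: str
    filtered_count: int = 0    # restricted items removed before generation
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record: AuditRecord) -> str:
    """Serialize the record as a single JSON line."""
    return json.dumps(asdict(record), ensure_ascii=False)
```

Depending on your retention and privacy requirements, you may prefer to log a hash or a truncated form of the question and response rather than the full text.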
Cost Management
Azure OpenAI costs can escalate quickly at enterprise scale:
| Component | Cost Driver | Optimization |
|-----------|-------------|-------------|
| GPT-4o | Input/output tokens | Cache common queries, use GPT-4o-mini for classification |
| Embeddings | Token count | Batch embeddings, reuse for unchanged documents |
| Azure AI Search | Index size + queries | Optimize chunk size, use semantic ranking selectively |
| Azure Functions | Execution time | Use consumption plan, optimize cold starts |
Budget estimate for 1,000 users: approximately $900-2,950/month total across all Azure services.
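To adapt an estimate like this to your own usage, it helps to make the arithmetic explicit. The function below is a simple model, not a pricing calculator: every parameter is an assumption you supply, the per-million-token rates must come from the current Azure pricing page for your region and model, and `fixed_monthly` stands for the roughly flat costs such as the Azure AI Search tier.

```python
def estimate_monthly_cost(
    users: int,
    queries_per_user: float,
    input_tokens_per_query: float,
    output_tokens_per_query: float,
    input_rate_per_million: float,
    output_rate_per_million: float,
    fixed_monthly: float = 0.0,
) -> float:
    """Estimate monthly spend: token-based model cost plus fixed service cost."""
    queries = users * queries_per_user
    # Token volumes expressed in millions, to match how rates are quoted
    input_millions = queries * input_tokens_per_query / 1_000_000
    output_millions = queries * output_tokens_per_query / 1_000_000
    model_cost = (
        input_millions * input_rate_per_million
        + output_millions * output_rate_per_million
    )
    return model_cost + fixed_monthly
```

Because retrieved chunks are sent as input tokens on every query, the chunk size and top-K settings directly drive `input_tokens_per_query`, which is usually the largest term in the model cost.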
Frequently Asked Questions
Can I use OpenAI directly instead of Azure OpenAI?
You can, but Azure OpenAI keeps data within your Azure tenant boundary, which is critical for HIPAA, SOX, and GDPR compliance. OpenAI's consumer API processes data in OpenAI's infrastructure. For enterprise SharePoint data, Azure OpenAI is the correct choice.
How do I keep the AI Search index in sync with SharePoint changes?
Use SharePoint webhooks or Microsoft Graph change notifications to detect document changes. When a document is created, modified, or deleted, trigger an Azure Function that updates the corresponding Azure AI Search index entry.
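Inside that Azure Function, each change has to be translated into an index action. The sketch below assumes the change types `created`, `updated`, and `deleted`, and assumes the document key is the URL-safe Base64 encoding of the server-relative URL; whatever key scheme you use here must match the one your indexing step writes. The `@search.action` values and the `{"value": [...]}` envelope follow the Azure AI Search document indexing REST format.

```python
import base64
from typing import Dict, List, Optional


def doc_key(server_relative_url: str) -> str:
    """Derive an index key (assumed scheme: URL-safe Base64 of the file URL)."""
    return base64.urlsafe_b64encode(server_relative_url.encode("utf-8")).decode("ascii")


def build_index_action(
    change_type: str,
    server_relative_url: str,
    fields: Optional[Dict[str, object]] = None,
) -> Dict[str, object]:
    """Translate one change event into one index action."""
    key = doc_key(server_relative_url)
    if change_type == "deleted":
        # Only the key is needed to remove an entry
        return {"@search.action": "delete", "id": key}
    if change_type in ("created", "updated"):
        # mergeOrUpload inserts new entries and updates existing ones
        action: Dict[str, object] = {"@search.action": "mergeOrUpload", "id": key}
        action.update(fields or {})
        return action
    raise ValueError(f"unsupported change type: {change_type!r}")


def build_batch(actions: List[Dict[str, object]]) -> Dict[str, object]:
    """Wrap actions in the envelope expected by the indexing endpoint."""
    return {"value": actions}
```

If documents are chunked, a modification or deletion also needs to remove the old chunk entries for that parent document, which this sketch does not cover.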
What happens when SharePoint permissions change?
Your agent should check permissions at query time, not index time. When a user asks a question, filter search results through the Microsoft Graph API to verify the user has access to each retrieved document before including it in the AI response.
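That post-retrieval check can be sketched as a small filter over the search results. Here `can_read` is a hypothetical callable that you would implement with a Microsoft Graph permission lookup made on behalf of the user; the result shape (a dict with a `url` key) is also an assumption.

```python
from typing import Callable, Dict, List, Tuple


def trim_to_accessible(
    results: List[Dict[str, str]],
    user_id: str,
    can_read: Callable[[str, str], bool],
) -> Tuple[List[Dict[str, str]], int]:
    """Keep only results the user may read; also return how many were removed."""
    allowed: List[Dict[str, str]] = []
    removed = 0
    for result in results:
        if can_read(user_id, result["url"]):
            allowed.append(result)
        else:
            removed += 1
    return allowed, removed
```

Only the `allowed` list should be passed to the model; the `removed` count is useful for the audit log entry for that interaction.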
Can the AI agent write back to SharePoint?
Yes. Your agent can create list items, update metadata, upload documents, and trigger workflows. Implement strict authorization controls — the agent should only write to libraries where the requesting user has contribute permissions.
How do I handle documents that are too large for the AI context window?
Use chunking strategies: split large documents into 500-1000 token chunks with 100-token overlap. Store each chunk as a separate search index entry with a parent document reference. Retrieve the most relevant chunks rather than entire documents.
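A sliding window is the simplest way to produce overlapping chunks. The sketch below approximates tokens with whitespace-separated words, which is an assumption made to keep the example dependency-free; in practice you would count tokens with the tokenizer that matches your model. The entry fields `parentId` and `chunkIndex` are illustrative names for the parent document reference.

```python
from typing import Dict, List, Sequence


def chunk_tokens(
    tokens: Sequence[str], chunk_size: int = 800, overlap: int = 100
) -> List[Sequence[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0 or overlap < 0 or overlap >= chunk_size:
        raise ValueError("require 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[Sequence[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window reaches the end, to avoid a trailing fragment
        # that only repeats the overlap
        if start + chunk_size >= len(tokens):
            break
    return chunks


def chunk_document(
    doc_id: str, text: str, chunk_size: int = 800, overlap: int = 100
) -> List[Dict[str, object]]:
    """Turn one document into index entries that reference their parent."""
    words = text.split()  # whitespace words stand in for model tokens
    return [
        {
            "id": f"{doc_id}_{i}",
            "parentId": doc_id,
            "chunkIndex": i,
            "content": " ".join(chunk),
        }
        for i, chunk in enumerate(chunk_tokens(words, chunk_size, overlap))
    ]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of indexing some text twice.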
Getting Started
- Assess your use case — identify the top 3 questions employees ask that SharePoint search fails to answer well
- Prepare your data — ensure target SharePoint libraries have clean, well-structured content
- Start with RAG — the retrieval-augmented generation pattern covers 80% of enterprise needs
- Deploy Azure resources — Azure OpenAI, AI Search, Functions in your subscription
- Build incrementally — start with one library, one use case, 50 pilot users
EPC Group builds custom AI agents for SharePoint using Azure OpenAI, AI Search, and enterprise governance frameworks. We specialize in regulated industries where data security and compliance are non-negotiable. Contact us to discuss your AI agent strategy.
Written by Errin O'Connor
Founder, CEO & Chief AI Architect | Microsoft Press Bestselling Author | 25+ Years Microsoft Ecosystem
Errin O'Connor is a Microsoft Press bestselling author of 4 books covering SharePoint, Power BI, Azure, and large-scale migrations. He leads our SharePoint consulting practice with expertise spanning 500+ enterprise migrations and compliance implementations across HIPAA, SOC 2, and FedRAMP environments.