The Controls Everyone Buys and Nobody Gets Right
Sensitivity labels and Data Loss Prevention (DLP) policies are part of every enterprise Microsoft 365 E5 or Purview deployment, and they are the two capabilities that consistently underperform in the field. The labels get published but never actually classify content correctly. The DLP policies generate thousands of false positives and get bypassed or disabled. The auto-labeling engine mislabels documents. The end result is an expensive compliance investment that fails to produce the evidence auditors actually want.
This deep dive walks through how to design a sensitivity label taxonomy that works, how to configure DLP rules that do not trigger alert fatigue, how to tune auto-labeling, and the specific pitfalls that trip up most enterprise deployments. The content assumes you have Microsoft 365 E5 or Microsoft 365 E3 with add-on licenses for Purview and advanced compliance.
Starting With the Taxonomy
The single most important decision in a sensitivity label deployment is the taxonomy. Too few labels and the classifications carry no meaning. Too many labels and users cannot remember what they mean. The Microsoft-recommended starting point is four or five labels, and the evidence from enterprise deployments supports that range.
The working taxonomy that succeeds across most enterprises:
- Public: Content intended for external distribution. No protection required. Examples include published marketing materials, job postings, and press releases.
- General: Internal business content with no regulatory or sensitivity concerns. No encryption. Usually configured as the default label, so new content that nobody classifies still carries a label.
- Confidential: Sensitive internal content that requires access controls. Encryption enabled. Typical contents include financial reports, employee records, and business strategy documents.
- Highly Confidential: Regulated or strategic content that requires strict controls. Encryption with restricted permissions, usually combined with content marking such as a watermark, header, or footer. Typical contents include M&A documents, health records, unreleased financial results, and source code for critical systems.
- Restricted (optional fifth tier): Content that must never leave specific users or systems. Maximum protection, typically encryption with do-not-forward restrictions. Used for the smallest, most sensitive set of content.
Sub-labels under each parent can represent business-specific variants, such as Confidential / HR or Confidential / Financial. Sub-labels let the organization apply different protections to the same general category depending on the content type, without forcing users to remember 20 different top-level labels.
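As a planning aid, the taxonomy above can be written down as data before anything is configured in Purview. The following is a minimal Python sketch, not a Purview API: the `Label` class, the numeric priorities, and the `resolve` and `is_downgrade` helpers are illustrative assumptions, and the only sub-labels listed are the two named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    name: str
    priority: int          # assumption: higher value means more sensitive
    encrypted: bool
    sublabels: tuple = ()  # business-specific variants under this parent


# The five-tier taxonomy from the list above, least to most sensitive.
TAXONOMY = (
    Label("Public", 0, False),
    Label("General", 1, False),
    Label("Confidential", 2, True, ("HR", "Financial")),
    Label("Highly Confidential", 3, True),
    Label("Restricted", 4, True),
)


def resolve(display_name: str) -> Label:
    """Return the parent label for a 'Parent' or 'Parent / Sub' display name."""
    parent, _, sub = (part.strip() for part in display_name.partition("/"))
    for label in TAXONOMY:
        if label.name == parent:
            if sub and sub not in label.sublabels:
                raise KeyError(f"unknown sub-label: {display_name}")
            return label
    raise KeyError(f"unknown label: {display_name}")


def is_downgrade(old: str, new: str) -> bool:
    """True when relabeling from old to new lowers the sensitivity tier."""
    return resolve(new).priority < resolve(old).priority
```

Keeping sub-labels under a parent means a change from Confidential / HR to Confidential / Financial is not treated as a downgrade, while a change from Highly Confidential to General is.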
How Labels Actually Protect Content
Labels are metadata that can trigger a set of protection actions. The protections that matter in enterprise deployments:
Encryption
When a label applies encryption, the content is encrypted with rights management, and access requires authentication against Microsoft Entra ID (formerly Azure AD). Encryption travels with the file regardless of where it is stored, shared, or copied. Encrypted content can only be opened by users explicitly granted access through the label configuration.
Permission Assignment
Labels can assign a permission level (Co-Owner, Co-Author, Reviewer, Viewer, or a custom set of usage rights) to specific users, groups, or domains. The permissions are embedded in the file and enforced by the application (Word, Excel, PowerPoint, Outlook).
Content Marking
Labels can add visible headers, footers, or watermarks to documents. Content marking is useful for training users to recognize labeled content and for providing visual evidence in print or screenshots.
Access Restrictions
Labels applied to sites can reference a Microsoft Entra authentication context. Conditional Access policies tied to that context can then require specific authentication factors (such as MFA), restrict access from specific locations, or enforce session timeouts when users open the site.
SharePoint Site Scoping
Labels applied to SharePoint sites and Microsoft 365 groups control the default privacy of the site, external sharing capabilities, and device access restrictions. Site-level labels and file-level labels interact to produce the effective protection for any given document.
Auto-Labeling That Actually Works
Auto-labeling is where most deployments fall apart. The engine applies labels automatically based on content inspection, keywords, or trainable classifiers. When it works, auto-labeling eliminates the burden of user classification. When it does not, it mislabels content in ways that break workflows.
Sensitive Information Types
The first auto-labeling mechanism is Sensitive Information Types (SITs). Microsoft provides 150+ built-in SITs for common patterns like credit card numbers, Social Security numbers, and health record numbers. Custom SITs can be defined using regular expressions, keyword lists, and pattern matching.
The trap with SITs is that they match patterns, not context. A SIT for US Social Security numbers will match any 9-digit string with the right pattern, including purchase order numbers, test data, and random numeric strings in log files. Successful deployments always combine SITs with supporting context such as keywords nearby or document metadata.
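The difference between a bare pattern and a pattern with supporting context can be illustrated outside Purview. The Python sketch below is only an approximation of the idea, not how the built-in SIT is implemented: the regular expression, the keyword list, and the 300-character window are assumptions chosen for the example.

```python
import re

# Assumption: a simplified nine-digit pattern, with or without dashes.
SSN_PATTERN = re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b")
# Assumption: a short, illustrative list of supporting keywords.
KEYWORD_PATTERN = re.compile(r"ssn|social security|taxpayer id", re.IGNORECASE)
WINDOW = 300  # characters searched on either side of the match


def find_ssn_candidates(text: str) -> list[dict]:
    """Return every pattern match, flagged by whether a keyword is nearby."""
    results = []
    for match in SSN_PATTERN.finditer(text):
        start = max(0, match.start() - WINDOW)
        end = min(len(text), match.end() + WINDOW)
        results.append({
            "value": match.group(),
            "has_context": KEYWORD_PATTERN.search(text[start:end]) is not None,
        })
    return results
```

A purchase order line such as `PO number 123456789 shipped on Tuesday.` still matches the pattern, but `has_context` is false, so a rule that requires context would not label the document.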
Trainable Classifiers
Trainable classifiers are AI models trained on example documents. Microsoft provides pre-trained classifiers for common content types (contracts, resumes, source code, finance documents), and organizations can train custom classifiers with 50 to 500 example documents per class.
Trainable classifiers work well when the content is stylistically consistent. A custom classifier for policy documents trained on 200 policies will reliably classify new policy documents. Classifiers struggle when content varies widely in style or when the training set is too small or not representative.
Simulation Mode
Before publishing an auto-labeling policy, always run it in simulation mode. Simulation analyzes existing content and shows what would be labeled without actually applying labels. This catches over-labeling and under-labeling before the policy causes real impact.
```powershell
# Connect to Security & Compliance PowerShell (requires the ExchangeOnlineManagement module)
Connect-IPPSSession
# Create the auto-labeling policy in simulation mode
New-AutoSensitivityLabelPolicy -Name "Confidential-Finance-Simulation" -Mode TestWithoutNotifications -SharePointLocation "https://contoso.sharepoint.com/sites/finance" -ApplySensitivityLabel "Confidential / Financial"
# Create the rule inside the policy; the workload is specified on the rule, not the policy.
# "Financial Statement Keywords" is assumed to be a custom SIT that already exists in the tenant.
New-AutoSensitivityLabelRule -Policy "Confidential-Finance-Simulation" -Name "FinancialStatements" -Workload SharePoint -ContentContainsSensitiveInformation @{Name="Financial Statement Keywords"; minCount=3}
# Check the policy mode and status; review the matched items themselves in the Purview portal
Get-AutoSensitivityLabelPolicy -Identity "Confidential-Finance-Simulation" | Format-List Mode, Status, CreatedBy, LastModifiedTime
```
DLP Rules That Do Not Generate Alert Fatigue
DLP policies detect sensitive content as it is sent, shared, or stored, and apply enforcement actions. Common actions include blocking sharing, requiring justification, encrypting the content, or simply alerting.
The single biggest DLP failure mode is alert fatigue. A DLP policy that generates 500 alerts per day gets ignored after the first week. Writing DLP rules that produce useful, actionable alerts requires three design principles.
Principle 1: Context, Not Patterns
DLP rules should match on the combination of a sensitive data pattern AND supporting context. A rule that fires on any 16-digit number will match invoice numbers, order confirmations, and random text in forwarded emails. A rule that fires on a 16-digit number AND the text "credit card" within 300 characters produces dramatically fewer false positives.
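To make the effect of context concrete, here is a small Python sketch that grades 16-digit candidates. It is an illustration under assumptions, not the Purview detection engine: it adds a Luhn checksum as one extra validity signal, uses an assumed keyword list, and reports "high" only when a keyword appears within 300 characters of the number.

```python
import re

# Sixteen digits, optionally separated by spaces or dashes.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")
# Assumption: illustrative context keywords.
CONTEXT = re.compile(r"credit card|card number|payment|charge", re.IGNORECASE)
PROXIMITY = 300  # characters searched on either side of the number


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def classify(text: str) -> list[tuple[str, str]]:
    """Return (number, confidence) for each checksum-valid candidate."""
    findings = []
    for match in CARD_PATTERN.finditer(text):
        if not luhn_valid(match.group()):
            continue  # invoice and order numbers usually fail the checksum
        window = text[max(0, match.start() - PROXIMITY): match.end() + PROXIMITY]
        confidence = "high" if CONTEXT.search(window) else "low"
        findings.append((match.group(), confidence))
    return findings
```

Only the "high" findings would feed a blocking action; "low" findings are candidates for logging.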
Principle 2: Severity Tiers
DLP rules should produce at least three severity tiers: Low (informational, logged only), Medium (user notified, action logged), and High (blocked or quarantined). Tier the rules so that the High severity tier only fires on high-confidence, high-impact violations.
Principle 3: Incident Review Workflow
Every DLP policy needs a named team responsible for reviewing incidents. Without a review workflow, incidents accumulate in the compliance portal and no one investigates them. The working model assigns DLP incidents to the security operations team for initial triage, with escalation to compliance or legal for specific incident types.
Example High-Quality DLP Rule
A DLP rule for credit card numbers in email should look something like this logical structure:
- Match: Credit card number (built-in SIT, high confidence) AND (keyword "payment" OR keyword "charge" OR keyword "card") within 300 characters
- Exclusion: Internal senders who are in the approved finance group
- Action at High severity: Block the message, notify sender with education text, alert the security operations team
- Action at Medium severity: Allow with justification prompt, audit
- Action at Low severity: Allow, log
The combination of pattern, context, exclusions, and tiered actions produces a rule that enforces correctly without burying the security team in alerts.
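The logical structure above can be expressed as a small decision function. This Python sketch is a model for reasoning about the rule, not a DLP policy definition: the `Match` and `Decision` types, the group name `approved-finance`, and the threshold of five card numbers for "high impact" are all assumptions made for the example.

```python
from typing import NamedTuple, Optional


class Match(NamedTuple):
    confidence: str           # "high" or "low", as reported by the detector
    count: int                # card numbers found in the message
    sender_groups: frozenset  # groups the sender belongs to


class Decision(NamedTuple):
    severity: Optional[str]
    actions: tuple


EXEMPT_GROUP = "approved-finance"  # hypothetical group name
HIGH_IMPACT_COUNT = 5              # assumed threshold for "high impact"


def evaluate(match: Match) -> Decision:
    """Apply the exclusion first, then pick the tier and its actions."""
    if EXEMPT_GROUP in match.sender_groups:
        return Decision(None, ("allow",))
    if match.confidence == "high" and match.count >= HIGH_IMPACT_COUNT:
        return Decision("High", ("block", "notify_sender", "alert_soc"))
    if match.confidence == "high":
        return Decision("Medium", ("allow_with_justification", "audit"))
    return Decision("Low", ("allow", "log"))
```

Evaluating the exclusion before any tier keeps approved finance traffic out of the alert queue entirely, which is the point of the exclusion.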
The Site-Level Label Story
Site-level labels are the piece that most deployments miss entirely. A site-level label applied to a SharePoint site or Microsoft 365 group controls the default privacy, external sharing, and device access for the entire site.
The working pattern is to publish site-level labels aligned to the content sensitivity labels, and to require a site label at site creation time. Organizations with this in place get consistent protection across the full site, not just individual files.
```powershell
# Create a label to use for SharePoint sites and Microsoft 365 groups
New-Label -Name "SiteConfidential" -DisplayName "Confidential Site" -Tooltip "Site containing confidential content" -Comment "Site-level label for confidential content"
# Configure the site and group protections for the label.
# -SiteAndGroupProtectionBlockAccess blocks access from unmanaged devices.
Set-Label -Identity "SiteConfidential" -SiteAndGroupProtectionEnabled $true -SiteAndGroupProtectionPrivacy Private -SiteAndGroupProtectionAllowAccessToGuestUsers $false -SiteAndGroupProtectionBlockAccess $true
# Publish the site label through a label policy
New-LabelPolicy -Name "ConfidentialSitePolicy" -Labels "SiteConfidential" -ExchangeLocation "All"
```
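One check that follows from aligning site labels with content labels is looking for files labeled more sensitively than the site that holds them. The Python sketch below shows that comparison on plain data. The priority numbers and the input format (a dictionary of file name to label) are assumptions for illustration, and the function does not call SharePoint or Purview.

```python
# Assumption: label priority order, higher value means more sensitive.
PRIORITY = {
    "Public": 0,
    "General": 1,
    "Confidential": 2,
    "Highly Confidential": 3,
    "Restricted": 4,
}


def find_mismatches(site_label: str, file_labels: dict) -> list:
    """Return the names of files whose label outranks the site's label."""
    site_rank = PRIORITY[site_label]
    return sorted(
        name for name, label in file_labels.items()
        if PRIORITY[label] > site_rank
    )
```

A Highly Confidential file on a Confidential site would be returned by this check, which usually means either the file belongs on a more restricted site or the site label is too low.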
Common Pitfalls and How to Avoid Them
Over 18 months of enterprise deployments, four pitfalls consistently cause the most pain.
Pitfall 1: Published without testing. Labels are pushed to users before the IT team validates the auto-labeling behavior, the user experience, and the compatibility with existing workflows. The fix is mandatory simulation mode for auto-labeling and mandatory pilot groups for user-facing labels before organization-wide rollout.
Pitfall 2: Encryption without a key strategy. Labels that encrypt content tie that content to the rights management infrastructure of the organization's Microsoft Entra (formerly Azure AD) tenant. Organizations that do not have a documented key strategy face problems when they need to share content with acquired companies, divested entities, or partner organizations. The fix is defining the key strategy, including Bring Your Own Key (BYOK) or Double Key Encryption (DKE) for the most sensitive content, before encryption is deployed at scale.
Pitfall 3: DLP false positives erode trust. Overly broad DLP rules trigger on routine business activity, users learn to ignore the alerts or find workarounds, and the compliance value of DLP collapses. The fix is rule design that combines pattern matching with strong contextual signals, plus continuous rule tuning based on false positive analysis.
Pitfall 4: No ongoing label maintenance. The label taxonomy is designed in Week 1 and never updated. Business needs evolve, new regulations come into effect, and the labels drift from reality. The fix is a quarterly label review that assesses label usage, incident patterns, user feedback, and regulatory changes, and updates the taxonomy accordingly.
Measurement and Continuous Improvement
Successful sensitivity label and DLP programs track specific metrics.
- Percentage of new content that receives a label within 24 hours of creation
- Distribution of labels applied (if 95 percent of content is General, the taxonomy is too coarse)
- Auto-labeling accuracy measured against a sample audit
- DLP incidents per 1,000 users per week, trended over time
- Mean time to review a DLP incident
- Percentage of DLP incidents that result in substantive enforcement action (if under 5 percent, the rules are too noisy)
Reviewing these metrics monthly surfaces the trends that drive the next round of taxonomy and rule tuning.
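Several of these metrics reduce to simple arithmetic on counts that most teams already export. The Python sketch below computes three of them and applies the two thresholds mentioned in the list (95 percent General, 5 percent enforcement). The function name, the input shape, and the assumption that `users` is greater than zero are all illustrative.

```python
def review(label_counts: dict, incidents: int, enforced: int, users: int) -> dict:
    """Summarize label distribution and DLP noise for one reporting period.

    Assumes users > 0. label_counts maps label name to number of items.
    """
    total = sum(label_counts.values())
    general_share = label_counts.get("General", 0) / total if total else 0.0
    enforcement_rate = enforced / incidents if incidents else 0.0
    return {
        "general_share": round(general_share, 3),
        "incidents_per_1000_users": round(incidents * 1000 / users, 1),
        "enforcement_rate": round(enforcement_rate, 3),
        # Thresholds taken from the metric list above.
        "taxonomy_too_coarse": general_share >= 0.95,
        "rules_too_noisy": incidents > 0 and enforcement_rate < 0.05,
    }
```

For example, 960 General items out of 1,000 labeled items, and 12 enforced incidents out of 400 across 8,000 users, would raise both flags.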
Getting Started
The fastest path to a working sensitivity label and DLP program is a 90-day structured deployment. In the first 30 days, design the taxonomy, build label protection definitions, and deploy to a pilot group. In the next 30 days, deploy auto-labeling in simulation mode and tune based on results. In the final 30 days, enable enforcement for the pilot group, measure outcomes, and begin planned organization-wide rollout.
Our SharePoint specialists run 90-day compliance deployments across healthcare, finance, and regulated industries. Contact our team to scope a sensitivity label and DLP engagement, or review our SharePoint consulting services for the full methodology.
Written by the SharePoint Support Team
Senior SharePoint Consultants | 25+ Years Microsoft Ecosystem Experience
Our senior SharePoint consultants bring deep expertise spanning 500+ enterprise migrations and compliance implementations across HIPAA, SOC 2, and FedRAMP environments. We cover SharePoint Online, Microsoft 365, migrations, Copilot readiness, and large-scale governance.