Sensitive Personal Information Detection

This system lets each tenant decide whether sensitive personal information should be detected and masked before a request is sent to a downstream service. The platform gateway sends qualifying requests to a dedicated redaction service, and that service calls Azure AI Language PII detection to detect sensitive entities such as names, addresses, financial details, and medical information.

Platform scope:
This shared Language integration is limited to PII detection. The AI Hub does not use Azure AI Language for summarization, sentiment, key phrase extraction, classification, question answering, or orchestration workflows. New non-PII text-analysis workloads should be built on tenant Azure AI Foundry model deployments instead.

Broad Sensitive Data Detection
The Azure language service uses machine learning to detect and mask many types of sensitive information, including names, addresses, financial details, and medical terms.

Overview

The system uses Azure AI Language PII detection to identify and redact sensitive information in requests. This machine-learning approach gives broader coverage than a simple list of regular expressions because it can use context, not just text patterns, to decide what should be masked.

Supported Entity Types

The Azure language service can detect and redact a wide range of sensitive personal information types:

Personal Identifiers: Names, email addresses, phone numbers
Financial Data: Credit card numbers, bank account numbers
Government IDs: Social Security Numbers, passport numbers, driver's license numbers
Location Data: Physical addresses, GPS coordinates
Medical Information: Medical record numbers, health insurance information

Architecture

Requests first enter the shared API Management gateway, often shortened to APIM in the implementation. Tenant-specific gateway policies then decide whether sensitive-data redaction should run. When redaction is enabled, the pii-anonymization policy fragment sends the full request body to the external PII Redaction Service, which runs as a container app inside the shared internal container app environment. That service makes the Azure language service calls, breaks large content into manageable pieces, checks that every part was processed, and returns the fully redacted body to the gateway. The service address is stored in the API Management named value piiExternalRedactionUrl.

Request Flow

A client request reaches the API Management gateway and includes content that may contain sensitive information.
The tenant's gateway policy reads that tenant's redaction settings.
The pii-anonymization fragment sends the request body and redaction settings to the PII Redaction Service by calling POST /redact.
The redaction service breaks large messages into smaller documents, sends them to the language service in bounded concurrent batches, retries transient throttling and 5xx responses within the request deadline, and then reassembles the response.
The service returns the redacted body together with coverage status and diagnostic details.
The API Management gateway replaces the original request body with the redacted content and forwards the request to the backend service.

How It Works

A tenant request arrives at the API Management gateway.
The tenant gateway policy, generated from api_policy.xml.tftpl, routes the request to the correct backend based on the path. Examples include openai, documentintelligence, ai-search, and storage.
For OpenAI requests, when sensitive-data redaction is enabled, the tenant policy sets configuration variables from the tenant's apim_policies.pii_redaction settings:
- piiExcludedCategories - Categories to exclude from detection
- piiDetectionLanguage - Language code for detection accuracy
- piiFailClosed - Whether to block requests on failure (fail-closed) or pass through (fail-open)
- piiScanRoles - JSON array of message roles to scan (default: user, assistant, tool)
- piiExternalRedactionUrl - Base URL of the PII Redaction Container App
The policy includes the pii-anonymization policy fragment.
The fragment creates a JSON payload that contains the original request body and the redaction settings, then sends it to the external PII Redaction Service by using <send-request> with a POST /redact call and a 90-second timeout.
The PII Redaction Service:
- Extracts chat messages and chunks long messages into documents (max 5 000 chars each, word-boundary split).
- Calls /language/:analyze-text in batches of up to 5 documents, with bounded concurrency and a maximum of 15 batches per request.
- Enforces a per-attempt timeout (10 s), honors Retry-After for 429 responses, applies exponential backoff for 5xx responses, and keeps all retry/backoff activity within an 85-second request deadline.
- Reassembles the redacted text back into the original message structure.
- Returns a JSON response with status, full_coverage, redacted_body, and diagnostics.
The API Management gateway validates the response by checking for HTTP 200, status == "ok", and full_coverage == true. It then stores the redacted body in piiAnonymizedContent, and the tenant policy replaces the original body by using <set-body>.

Example Redaction
Input: "My email is alice@example.com and my SSN is 123-45-6789"
Output: "My email is ##################### and my SSN is ###########"

Configuration Guide

Shared Configuration (All Environments)

Sensitive-data detection depends on two shared platform components being available: the Azure language service must be enabled through shared_config.language_service.enabled, and the PII Redaction Service container app must be deployed. The redaction service address is passed into API Management through the named value piiExternalRedactionUrl. The language service does not allow public network access. Instead, it is reached through a private endpoint, and the redaction service calls it over the virtual network.

Per-Tenant Configuration

Sensitive-data redaction is controlled separately for each tenant through API Management policy settings at apim_policies.pii_redaction.enabled. The tenant gateway policy only includes the redaction logic when both conditions are true: the tenant has turned redaction on, and the shared language service is enabled for the platform.

# Example tenant policy flags
apim_policies {
  pii_redaction {
    enabled                 = true

    # Optional: Fail-closed mode (default: false)
    # When true: blocks requests with 503 if Language Service fails
    # When false: passes through unredacted content on failure (fail-open)
    fail_closed             = false

    # Optional: Exclude specific PII categories from detection
    excluded_categories     = ["PhoneNumber", "Address"]

    # Optional: Language for detection (default: "en")
    detection_language      = "en"

    # Optional: Message roles to scan (default: ["user", "assistant", "tool"])
    scan_roles              = ["user", "assistant", "tool"]
  }
  rate_limiting {
    enabled = true
    tokens_per_minute = 10000
  }
  usage_logging {
    enabled = true
  }
  streaming_metrics {
    enabled = true
  }
  tracking_dimensions {
    enabled = true
  }
}

PII Redaction Configuration Options

Option	Type	Default	Description
`enabled`	boolean	false	Enable or disable PII redaction for this tenant
`fail_closed`	boolean	false	Controls failure behavior when Language Service is unavailable. When `true`, blocks requests with HTTP 503 error if PII service fails (fail-closed mode). When `false` (default), allows requests through with unredacted content on failure (fail-open mode). See FAQ below for details on choosing between modes.
`excluded_categories`	list(string)	[]	PII categories to exclude from detection (e.g., ["PhoneNumber", "Address"]). See Microsoft Learn for available categories.
`detection_language`	string	"en"	Language code for PII detection. Controls detection accuracy for language-specific entities.
`scan_roles`	list(string)	["user", "assistant", "tool"]	Message roles to scan for PII. Only messages with these roles are sent through the PII Redaction Service. Other roles (e.g., "system") pass through unredacted.

Flexible Tenant Control

Each tenant can independently enable or disable PII redaction based on their specific requirements. This allows for granular control over data privacy settings while sharing the same infrastructure.

Security Model

Managed Identity Authentication

Language Service uses managed identity only (local_auth_enabled = false) rather than API keys. The PII Redaction Service Container App authenticates to Language Service via its own managed identity with the Cognitive Services User role.

Private Network Access

Language Service network ACLs default to deny and it is accessed via a private endpoint. The PII Redaction Service runs on the shared internal Container App Environment and reaches Language Service over the VNet.

APIM routes requests to the PII Redaction Service over VNet-internal ingress. The external service handles all Language Service authentication via its own Managed Identity.

Observability

PII Detection Diagnostics

The pii-anonymization fragment emits rich diagnostic traces to Application Insights at multiple stages of the request lifecycle:

pii-outbound (verbose): Logged before sending to the PII Redaction Service. Includes the outbound body, service URL, request ID, and subscription ID.
pii-inbound (verbose): Logged when the PII Redaction Service responds. Includes HTTP status code, response duration (pii-duration-ms), and response size (pii-response-bytes).
pii-anonymization (information/error): Logged after processing the response.
- On success: pii-redaction-succeeded=true, pii-content-changed, pii-coverage-full, pii-entity-count, pii-document-count
- On failure: pii-redaction-succeeded=false, pii-failure-reason (e.g., no-response, payload-too-large, http-500, incomplete-coverage, service-error: ...)
request-id: Correlation ID for tracing the request across APIM and the PII Redaction Service
subscription-id: Tenant identifier for per-tenant analytics

These diagnostics enable monitoring of PII detection effectiveness, tracking service health and latency, and understanding failure modes across the APIM → PII Redaction Service → Language Service chain.

OpenAI Usage Logging

For OpenAI requests, the tenant policy template sets routing metadata variables before including the openai-usage-logging fragment:

backendId: APIM backend identifier (e.g., tenant-openai)
routeLocation: Azure region for the backend (single-backend for now)
routeName: Route identifier for future intelligent routing
deploymentName: Extracted from the URL path (e.g., gpt-4)

These dimensions are logged to Application Insights along with token usage data, enabling detailed cost allocation, chargeback analytics, and deployment-level monitoring even in the current single-backend setup.

Additional Monitoring Fragments

Fragment	Purpose	Status
`tracking-dimensions`	Extract session, user, app, and correlation IDs from headers	Active
`openai-usage-logging`	Detailed usage tracking to Application Insights with routing metadata	Active
`openai-streaming-metrics`	Token metrics for streaming responses	Reserved for future use
`intelligent-routing`	Priority-based multi-backend selection	Reserved for future use

Frequently Asked Questions

What kinds of entities are supported?

The PII fragment is designed for enterprise PII detection and supports names, addresses, SSN, medical terms, financial data, and many other entity types. Azure Language Service uses machine learning to detect entities based on context.

What happens if Language Service fails?

Failure behavior is controlled by the fail_closed configuration setting:

Fail-Open Mode (fail_closed = false, default)

PII redaction operates on a best-effort basis. If Language Service fails:

If the Language Service returns a non-200 response, the original unredacted content is passed through
If the response cannot be parsed, the original unredacted content is passed through
If the Language Service is temporarily unavailable, requests continue to flow without redaction
All failures are logged in diagnostics with appropriate status codes and error details

Use when: Service availability is prioritized over PII protection, or when PII redaction is a defense-in-depth measure rather than a strict requirement.

Fail-Closed Mode (fail_closed = true)

PII redaction is strictly enforced. If Language Service fails:

If the Language Service returns a non-200 response, the request is blocked with HTTP 503
If the response cannot be parsed, the request is blocked with HTTP 503
If the Language Service is unavailable, all requests are blocked until service is restored
Clients receive a JSON error response with code PiiRedactionFailed

Use when: Strict PII protection is required and no unredacted content should reach downstream services. Ensures that sensitive data is never exposed, even during service outages.

Configuration example:

pii_redaction {
  enabled     = true
  fail_closed = true  # Block requests if PII service fails
}

Does this apply to all requests?

PII anonymization is tenant-controlled and is applied (in the current routing) for OpenAI requests when enabled via apim_policies.pii_redaction.enabled.

What is the performance impact?

PII anonymization routes each request to the PII Redaction Service Container App, which uses bounded batch concurrency and transient retry handling for 429 and 5xx responses. For typical payloads (a few short messages), the total round-trip adds roughly 1–3 seconds. Larger payloads with many messages may take longer due to multiple batches, but the total processing deadline remains capped at 85 seconds, fitting within the APIM 90 s send-request timeout.

Can I customize which entity types are redacted?

Yes. Use the excluded_categories setting in apim_policies.pii_redaction to specify PII categories that should not be detected or redacted. For example:

pii_redaction {
  enabled = true
  excluded_categories = ["PhoneNumber", "Address"]
}

See Microsoft Learn: PII Entity Categories for the full list of available categories.

How does the PII Redaction Service handle large payloads?

The PII Redaction Service handles the Language API's per-call limits transparently:

Messages are chunked into documents of up to 5 000 characters using word-boundary splitting
Documents are sent to Language Service in batches of up to 5 per API call (max 15 batches = 75 documents), with bounded concurrency to control fan-out
Payloads exceeding 75 documents are rejected with HTTP 413
A per-attempt timeout (10 s) and total request deadline (85 s) prevent runaway processing, and transient 429 and 5xx responses are retried only within that same 85-second budget
Full-coverage verification ensures every document was successfully redacted before returning

Troubleshooting

Issue	Cause	Solution
503 PiiRedactionFailed (fail-closed)	PII Redaction Service returned non-200, incomplete coverage, or was unreachable	Check PII Redaction Service Container App logs and health endpoint (`/health`). Verify the service is running and Language Service is reachable from the CAE subnet.
413 from PII Redaction Service	Payload exceeds 75-document limit (15 batches × 5 docs)	The request payload is too large for PII processing. Reduce message count or length.
DNS / private endpoint not ready	Private DNS zone integration incomplete	Deployment includes a DNS wait step; ensure private DNS zone integration is complete before testing
Transient failures after deploy	Role assignment propagation delay	The PII Redaction Service managed identity needs `Cognitive Services User` on Language Service. Propagation can take 5–10 minutes.
No redaction occurring	PII detection not enabled	Confirm tenant `apim_policies.pii_redaction.enabled` is true, shared `language_service.enabled` is true, and `piiExternalRedactionUrl` is set.

For Technical Details
See the APIM Policy README for deep technical documentation on policy fragments, infrastructure components, and configuration schema.