Sensitive Personal Information Detection
This system lets each tenant decide whether sensitive personal information should be detected and masked before a request is sent to a downstream service. The platform gateway sends qualifying requests to a dedicated redaction service, and that service calls the Azure language service to detect sensitive entities such as names, addresses, financial details, and medical information.
The Azure language service uses machine learning to detect and mask many types of sensitive information, including names, addresses, financial details, and medical terms.
Overview
The system uses the Azure language service to identify and redact sensitive information in requests. This machine-learning approach gives broader coverage than a simple list of regular expressions because it can use context, not just text patterns, to decide what should be masked.
Supported Entity Types
The Azure language service can detect and redact a wide range of sensitive personal information types:
- Personal Identifiers: Names, email addresses, phone numbers
- Financial Data: Credit card numbers, bank account numbers
- Government IDs: Social Security Numbers, passport numbers, driver's license numbers
- Location Data: Physical addresses, GPS coordinates
- Medical Information: Medical record numbers, health insurance information
Architecture
Requests first enter the shared API Management gateway, often shortened to APIM in the implementation. Tenant-specific gateway policies then decide whether sensitive-data redaction should run. When redaction is enabled, the pii-anonymization policy fragment sends the full request body to the external PII Redaction Service, which runs as a container app inside the shared internal container app environment. That service makes the Azure language service calls, breaks large content into manageable pieces, checks that every part was processed, and returns the fully redacted body to the gateway. The service address is stored in the API Management named value piiExternalRedactionUrl.
Request Flow
- A client request reaches the API Management gateway and includes content that may contain sensitive information.
- The tenant's gateway policy reads that tenant's redaction settings.
- The
pii-anonymizationfragment sends the request body and redaction settings to the PII Redaction Service by callingPOST /redact. - The redaction service breaks large messages into smaller documents, sends them to the language service in bounded concurrent batches, retries transient throttling and 5xx responses within the request deadline, and then reassembles the response.
- The service returns the redacted body together with coverage status and diagnostic details.
- The API Management gateway replaces the original request body with the redacted content and forwards the request to the backend service.
How It Works
- A tenant request arrives at the API Management gateway.
- The tenant gateway policy, generated from
api_policy.xml.tftpl, routes the request to the correct backend based on the path. Examples includeopenai,documentintelligence,ai-search, andstorage. - For OpenAI requests, when sensitive-data redaction is enabled, the tenant policy sets configuration variables from the tenant's
apim_policies.pii_redactionsettings:piiExcludedCategories- Categories to exclude from detectionpiiDetectionLanguage- Language code for detection accuracypiiFailClosed- Whether to block requests on failure (fail-closed) or pass through (fail-open)piiScanRoles- JSON array of message roles to scan (default: user, assistant, tool)piiExternalRedactionUrl- Base URL of the PII Redaction Container App
- The policy includes the
pii-anonymizationpolicy fragment. - The fragment creates a JSON payload that contains the original request body and the redaction settings, then sends it to the external PII Redaction Service by using
<send-request>with aPOST /redactcall and a 90-second timeout. - The PII Redaction Service:
- Extracts chat messages and chunks long messages into documents (max 5 000 chars each, word-boundary split).
- Calls
/language/:analyze-textin batches of up to 5 documents, with bounded concurrency and a maximum of 15 batches per request. - Enforces a per-attempt timeout (10 s), honors
Retry-Afterfor 429 responses, applies exponential backoff for 5xx responses, and keeps all retry/backoff activity within an 85-second request deadline. - Reassembles the redacted text back into the original message structure.
- Returns a JSON response with
status,full_coverage,redacted_body, anddiagnostics.
- The API Management gateway validates the response by checking for HTTP 200,
status == "ok", andfull_coverage == true. It then stores the redacted body inpiiAnonymizedContent, and the tenant policy replaces the original body by using<set-body>.
Input: "My email is alice@example.com and my SSN is 123-45-6789"
Output: "My email is ##################### and my SSN is ###########"
Configuration Guide
Shared Configuration (All Environments)
Sensitive-data detection depends on two shared platform components being available: the Azure language service must be enabled through shared_config.language_service.enabled, and the PII Redaction Service container app must be deployed. The redaction service address is passed into API Management through the named value piiExternalRedactionUrl. The language service does not allow public network access. Instead, it is reached through a private endpoint, and the redaction service calls it over the virtual network.
Per-Tenant Configuration
Sensitive-data redaction is controlled separately for each tenant through API Management policy settings at apim_policies.pii_redaction.enabled. The tenant gateway policy only includes the redaction logic when both conditions are true: the tenant has turned redaction on, and the shared language service is enabled for the platform.
# Example tenant policy flags
apim_policies {
pii_redaction {
enabled = true
# Optional: Fail-closed mode (default: false)
# When true: blocks requests with 503 if Language Service fails
# When false: passes through unredacted content on failure (fail-open)
fail_closed = false
# Optional: Exclude specific PII categories from detection
excluded_categories = ["PhoneNumber", "Address"]
# Optional: Language for detection (default: "en")
detection_language = "en"
# Optional: Message roles to scan (default: ["user", "assistant", "tool"])
scan_roles = ["user", "assistant", "tool"]
}
rate_limiting {
enabled = true
tokens_per_minute = 10000
}
usage_logging {
enabled = true
}
streaming_metrics {
enabled = true
}
tracking_dimensions {
enabled = true
}
}
PII Redaction Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
enabled |
boolean | false | Enable or disable PII redaction for this tenant |
fail_closed |
boolean | false | Controls failure behavior when Language Service is unavailable. When true, blocks requests with HTTP 503 error if PII service fails (fail-closed mode). When false (default), allows requests through with unredacted content on failure (fail-open mode). See FAQ below for details on choosing between modes. |
excluded_categories |
list(string) | [] | PII categories to exclude from detection (e.g., ["PhoneNumber", "Address"]). See Microsoft Learn for available categories. |
detection_language |
string | "en" | Language code for PII detection. Controls detection accuracy for language-specific entities. |
scan_roles |
list(string) | ["user", "assistant", "tool"] | Message roles to scan for PII. Only messages with these roles are sent through the PII Redaction Service. Other roles (e.g., "system") pass through unredacted. |
Flexible Tenant Control
Each tenant can independently enable or disable PII redaction based on their specific requirements. This allows for granular control over data privacy settings while sharing the same infrastructure.
Security Model
Managed Identity Authentication
Language Service uses managed identity only (local_auth_enabled = false) rather than API keys. The PII Redaction Service Container App authenticates to Language Service via its own managed identity with the Cognitive Services User role.
Private Network Access
Language Service network ACLs default to deny and it is accessed via a private endpoint. The PII Redaction Service runs on the shared internal Container App Environment and reaches Language Service over the VNet.
APIM routes requests to the PII Redaction Service over VNet-internal ingress. The external service handles all Language Service authentication via its own Managed Identity.
Observability
PII Detection Diagnostics
The pii-anonymization fragment emits rich diagnostic traces to Application Insights at multiple stages of the request lifecycle:
- pii-outbound (verbose): Logged before sending to the PII Redaction Service. Includes the outbound body, service URL, request ID, and subscription ID.
- pii-inbound (verbose): Logged when the PII Redaction Service responds. Includes HTTP status code, response duration (
pii-duration-ms), and response size (pii-response-bytes). - pii-anonymization (information/error): Logged after processing the response.
- On success:
pii-redaction-succeeded=true,pii-content-changed,pii-coverage-full,pii-entity-count,pii-document-count - On failure:
pii-redaction-succeeded=false,pii-failure-reason(e.g.,no-response,payload-too-large,http-500,incomplete-coverage,service-error: ...)
- On success:
- request-id: Correlation ID for tracing the request across APIM and the PII Redaction Service
- subscription-id: Tenant identifier for per-tenant analytics
These diagnostics enable monitoring of PII detection effectiveness, tracking service health and latency, and understanding failure modes across the APIM → PII Redaction Service → Language Service chain.
OpenAI Usage Logging
For OpenAI requests, the tenant policy template sets routing metadata variables before including the openai-usage-logging fragment:
- backendId: APIM backend identifier (e.g.,
tenant-openai) - routeLocation: Azure region for the backend (single-backend for now)
- routeName: Route identifier for future intelligent routing
- deploymentName: Extracted from the URL path (e.g.,
gpt-4)
These dimensions are logged to Application Insights along with token usage data, enabling detailed cost allocation, chargeback analytics, and deployment-level monitoring even in the current single-backend setup.
Additional Monitoring Fragments
| Fragment | Purpose | Status |
|---|---|---|
tracking-dimensions |
Extract session, user, app, and correlation IDs from headers | Active |
openai-usage-logging |
Detailed usage tracking to Application Insights with routing metadata | Active |
openai-streaming-metrics |
Token metrics for streaming responses | Reserved for future use |
intelligent-routing |
Priority-based multi-backend selection | Reserved for future use |
Frequently Asked Questions
What kinds of entities are supported?
The PII fragment is designed for enterprise PII detection and supports names, addresses, SSN, medical terms, financial data, and many other entity types. Azure Language Service uses machine learning to detect entities based on context.
What happens if Language Service fails?
Failure behavior is controlled by the fail_closed configuration setting:
Fail-Open Mode (fail_closed = false, default)
PII redaction operates on a best-effort basis. If Language Service fails:
- If the Language Service returns a non-200 response, the original unredacted content is passed through
- If the response cannot be parsed, the original unredacted content is passed through
- If the Language Service is temporarily unavailable, requests continue to flow without redaction
- All failures are logged in diagnostics with appropriate status codes and error details
Use when: Service availability is prioritized over PII protection, or when PII redaction is a defense-in-depth measure rather than a strict requirement.
Fail-Closed Mode (fail_closed = true)
PII redaction is strictly enforced. If Language Service fails:
- If the Language Service returns a non-200 response, the request is blocked with HTTP 503
- If the response cannot be parsed, the request is blocked with HTTP 503
- If the Language Service is unavailable, all requests are blocked until service is restored
- Clients receive a JSON error response with code
PiiRedactionFailed
Use when: Strict PII protection is required and no unredacted content should reach downstream services. Ensures that sensitive data is never exposed, even during service outages.
Configuration example:
pii_redaction {
enabled = true
fail_closed = true # Block requests if PII service fails
}
Does this apply to all requests?
PII anonymization is tenant-controlled and is applied (in the current routing) for OpenAI requests when enabled via apim_policies.pii_redaction.enabled.
What is the performance impact?
PII anonymization routes each request to the PII Redaction Service Container App, which uses bounded batch concurrency and transient retry handling for 429 and 5xx responses. For typical payloads (a few short messages), the total round-trip adds roughly 1–3 seconds. Larger payloads with many messages may take longer due to multiple batches, but the total processing deadline remains capped at 85 seconds, fitting within the APIM 90 s send-request timeout.
Can I customize which entity types are redacted?
Yes. Use the excluded_categories setting in apim_policies.pii_redaction to specify PII categories that should not be detected or redacted. For example:
pii_redaction {
enabled = true
excluded_categories = ["PhoneNumber", "Address"]
}
See Microsoft Learn: PII Entity Categories for the full list of available categories.
How does the PII Redaction Service handle large payloads?
The PII Redaction Service handles the Language API's per-call limits transparently:
- Messages are chunked into documents of up to 5 000 characters using word-boundary splitting
- Documents are sent to Language Service in batches of up to 5 per API call (max 15 batches = 75 documents), with bounded concurrency to control fan-out
- Payloads exceeding 75 documents are rejected with HTTP 413
- A per-attempt timeout (10 s) and total request deadline (85 s) prevent runaway processing, and transient 429 and 5xx responses are retried only within that same 85-second budget
- Full-coverage verification ensures every document was successfully redacted before returning
Troubleshooting
| Issue | Cause | Solution |
|---|---|---|
| 503 PiiRedactionFailed (fail-closed) | PII Redaction Service returned non-200, incomplete coverage, or was unreachable | Check PII Redaction Service Container App logs and health endpoint (/health). Verify the service is running and Language Service is reachable from the CAE subnet. |
| 413 from PII Redaction Service | Payload exceeds 75-document limit (15 batches × 5 docs) | The request payload is too large for PII processing. Reduce message count or length. |
| DNS / private endpoint not ready | Private DNS zone integration incomplete | Deployment includes a DNS wait step; ensure private DNS zone integration is complete before testing |
| Transient failures after deploy | Role assignment propagation delay | The PII Redaction Service managed identity needs Cognitive Services User on Language Service. Propagation can take 5–10 minutes. |
| No redaction occurring | PII detection not enabled | Confirm tenant apim_policies.pii_redaction.enabled is true, shared language_service.enabled is true, and piiExternalRedactionUrl is set. |
See the APIM Policy README for deep technical documentation on policy fragments, infrastructure components, and configuration schema.