Available AI Services
This page describes the services that the platform makes available to tenant teams, what a tenant actually receives when it is onboarded, and how access is controlled. It is intended to answer the practical question, “What can my team use here?”
Quick Navigation
OpenAI baseline plus tenant-specific Cohere and Mistral
Document Intelligence, Language PII, Speech
Shared vs. dedicated resource breakdown
Model Providers
Each tenant accesses model deployments through its dedicated AI Foundry Project endpoint, routed via APIM. OpenAI is the baseline model family for tenants across environments. Additional provider families such as Cohere and Mistral are enabled only when they are explicitly deployed for a tenant.
OpenAI Baseline Models
Chat & Reasoning Models
| Model | Kind | Best For | Quota (Canada East) |
|---|---|---|---|
| gpt-4.1 | Chat | Complex reasoning, long context | 30,000 TPM |
| gpt-4.1-mini | Chat | Fast, cost-efficient tasks | 150,000 TPM |
| gpt-4.1-nano | Chat | High-throughput simple tasks | 150,000 TPM |
| gpt-4o | Chat | Multimodal inputs (text + images) | 30,000 TPM |
| gpt-4o-mini | Chat | Fast multimodal tasks | 150,000 TPM |
| gpt-5-mini | Chat | Next-gen compact model | 10,000 TPM |
| gpt-5-nano | Chat | Next-gen ultra-fast tasks | 150,000 TPM |
| gpt-5.1-chat | Chat Preview | Next-gen preview chat model | 5,000 TPM |
| o1 | Reasoning | Step-by-step scientific / math problems | 5,000 TPM |
| o3-mini | Reasoning | Cost-efficient multi-step reasoning | 5,000 TPM |
| o4-mini | Reasoning | Latest compact reasoning model | 10,000 TPM |
| gpt-5.1-codex-mini | Code | Code generation & completion | 10,000 TPM |
Embedding Models
Embedding models convert text into dense vector representations — essential for search, RAG (Retrieval Augmented Generation), and semantic similarity tasks.
| Model | Dimensions | Best For | Quota (Canada East) |
|---|---|---|---|
| text-embedding-3-large | 3,072 | Highest accuracy RAG retrieval | 10,000 TPM |
| text-embedding-3-small | 1,536 | Fast, cost-efficient semantic search | 10,000 TPM |
| text-embedding-ada-002 | 1,536 | Legacy compatibility | 10,000 TPM |
Tenant-Specific Provider Additions
The current test deployment for ai-hub-admin includes additional non-OpenAI provider families. These are not currently universal tenant defaults and should be treated as explicitly assigned deployments.
| Provider | Currently Documented Deployments | Current Scope | Notes |
|---|---|---|---|
| Cohere | cohere-command-a, Cohere-rerank-v4.0-pro, Cohere-rerank-v4.0-fast |
ai-hub-admin only test | Configured in tenant IaC and deployed through the Foundry stack. Several other Cohere catalog models were evaluated but are not currently available in BC Gov Private Marketplace. |
| Mistral AI | Mistral-Large-3, mistral-document-ai-2505, mistral-document-ai-2512 |
ai-hub-admin only test | Chat traffic uses the OpenAI-compatible route /openai/v1/chat/completions. Document AI uses /providers/mistral/azure/ocr. The legacy non-OpenAI Mistral chat route is intentionally rejected by APIM. |
Cognitive Services
Beyond the model providers listed above, each tenant can access the following Azure AI capabilities through the hub.
Document Intelligence
Extract structured data from unstructured documents — forms, invoices, PDFs, scanned images. Supports custom models trained on your document types.
Dedicated instance per tenant — View setup guide →
Language Service / PII Detection
Detect and redact Personally Identifiable Information (PII) from text using Microsoft's pre-trained ML models. Supports named entity recognition, sentiment analysis, and key phrase extraction.
Shared hub service — View PII guide →
Speech Services
Convert speech to text and text to speech. Supports real-time transcription, batch audio processing, and custom voice models. Optimized for Canadian English and French.
Dedicated instance per tenant — See FAQ for details
AI Search
Managed vector + full-text search service for building RAG pipelines, semantic search over documents, and hybrid retrieval. Integrates directly with AI Foundry for grounding model responses.
Dedicated instance per tenant
What Each Tenant Gets
The hub operates on a shared platform, dedicated data model. Expensive control-plane infrastructure is shared; all data-plane services are isolated per tenant.
Shared Platform Infrastructure
Provisioned once, used by all tenants. Cost is split proportionally.
- AI Foundry Hub — shared model registry & endpoint
- API Management (APIM) — unified API gateway
- Application Gateway + WAF — TLS termination, routing
- Virtual Network & Private DNS — private connectivity
- Azure Container Registry — shared container images
- Log Analytics Workspace — centralized monitoring
Dedicated Per-Tenant Resources
Provisioned exclusively for your team. Cost is directly attributed to you.
- AI Foundry Project — your own API endpoint & prompt flows
- Model Deployments — OpenAI baseline plus any tenant-specific Cohere or Mistral deployments assigned to your project
- Document Intelligence — your own instance & models
- AI Search — your own indexes for RAG
- Speech Services — your own instance
- Key Vault — your secrets, isolated from other tenants
- Storage Account — your documents & data
- Cosmos DB — your NoSQL database (when enabled)
- Resource Group —
rg-{tenant}-{env} - APIM Subscription Key — your API credential
Accessing Your API Credentials
Approved tenant administrators can view and copy their APIM subscription keys for each environment directly from the portal — no platform team involvement required.
Credentials Panel
Navigate to your tenant's detail page in the Tenant Onboarding Portal. If your account is listed as a tenant admin and the tenant is approved, a Credentials panel appears below the configuration summary.
What You Can See
- Three environment tabs (dev / test / prod) — credentials load on demand when you select a tab.
- Primary and Secondary subscription keys — values are always masked. Use the copy button to place a key on your clipboard.
- Rotation metadata (when rotation is enabled) — shows last rotation date, next scheduled rotation, and which slot is currently safe to use.
- Tenant info panel (expandable) — lists your base APIM URL, enabled services, and deployed model names and capacities.
Security Notes
- Key values are never displayed as text in the page. Only copy-to-clipboard is supported.
- Access is restricted to users listed in your tenant's
admin_usersconfiguration. - Each environment's credentials are fetched independently from the hub Key Vault using the portal's Managed Identity.