Microscopic technical focus: Configuring Microsoft Foundry resources, projects, managed identities, RBAC, private networking, Bicep, and Azure CLI deployment.
Beginner explanation: Microsoft Foundry configuration gives GenAI apps a governed project, identity boundary, and network path. A model call must pass both permission and connectivity checks.
Operational split for this point: start with Microsoft Foundry resource, then verify Microsoft Foundry project and Managed identity before trusting any production outcome. The exam is testing whether the candidate can locate the missing dependency, not whether the candidate recognizes every service name in the scenario.
For this knowledge point, the target objects are Microsoft Foundry resource, Microsoft Foundry project, Managed identity, RBAC assignment, Private endpoint, Bicep deployment. The exam usually describes one broken link in that chain. The correct answer is the option that restores the missing operational dependency rather than the option that only describes the platform at a high level.
Why-layer: Microsoft Foundry resource becomes exam-relevant only when the surrounding dependency chain can run. In this topic, GenAI applications fail before inference when project RBAC, managed identity binding, private endpoint approval, or DNS integration is incomplete. The correct configuration matters because it changes the state that controls execution, authorization, resolution, evaluation, or observability; a nearby but unrelated action leaves the same failure mode in place.
Decision tree: if the scenario describes access failure, inspect identity and RBAC before changing compute or code; if it describes unresolved assets, inspect name, version, and scope; if it describes runtime failure, inspect logs, endpoint invocation, metrics, or evaluation output; if it describes quality degradation, inspect data, retrieval, evaluation, and monitoring evidence before changing the model.
Common mistakes: Selecting a familiar Azure service without checking the missing dependency in the scenario. Treating a successful create operation as proof of runtime behavior. Choosing a monitoring action when the scenario asks for configuration or access remediation.
Practice question: A GenAI application cannot call a Microsoft Foundry model deployment from a private network even though the application has a valid identity token.
A. Assign the application identity the required Microsoft Foundry or Azure OpenAI resource role and validate private endpoint DNS resolution from the application network. B. Rotate the API key because every Microsoft Foundry access issue is a credential leak. C. Switch to a different foundation model deployment without changing network or RBAC settings. D. Add more application logging before testing private endpoint connectivity.
Correct Answer: A
Explanation: A is correct because model invocation requires both authorization and network reachability. B, C, and D do not repair the missing RBAC or private DNS dependency.
The common decision point is: A valid token is insufficient when the client resolves the public endpoint while public network access is disabled. Therefore, read every scenario for the actor, the resource scope, the object version, the network path, the metric threshold, and the expected observable result.
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Microsoft Foundry project | Access boundary | User, group, service principal, managed identity | No project operation until role assigned | Microsoft Foundry resource and Entra ID | Deployment or evaluation action is denied |
| Managed identity | Token use | System-assigned or user-assigned | Not bound to caller by default | Application hosting service | API request receives 401 or 403 |
| Private endpoint | Connection state | Pending, approved, rejected | No private route until approved | VNet and resource networking | Client cannot reach endpoint |
| Private DNS | Name resolution | Private IP for service FQDN | Public resolution by default | DNS zone link | Requests route to blocked public endpoint |
az role assignment create --assignee <principal-id> --role "Cognitive Services User" --scope <foundry-resource-id>
Command type: Azure CLI RBAC verification for Entra identity and Azure AI resource scope.
Reason: Assign the caller identity at the resource scope because token possession alone does not grant model invocation rights.
Checkpoint: Role assignment list shows Cognitive Services User for the managed identity.
az network private-endpoint-connection list --id <foundry-resource-id>
Command type: Azure CLI network verification for the Microsoft Foundry or Azure OpenAI resource; confirm the exact resource ID from the active environment.
Reason: Check private endpoint approval because a pending connection does not create a usable private route; confirm the exact resource ID from Microsoft Foundry management center or Azure resource properties.
Checkpoint: Connection state is Approved.
nslookup <resource-name>.openai.azure.com
Command type: network/DNS rehearsal command for private endpoint validation.
Reason: Validate DNS from the application network because private endpoint traffic depends on resolving the public FQDN to a private IP.
Checkpoint: Name resolves to a private address.
curl -H "Authorization: Bearer <token>" https://<resource-name>.openai.azure.com/openai/deployments/<deployment>/chat/completions?api-version=<version>
Command type: network/API rehearsal command for the selected app or Azure OpenAI endpoint; confirm URL, header, and API version before use.
Reason: Call the endpoint after RBAC and DNS checks to prove inference works from the intended network.
Checkpoint: API returns a model response rather than 401, 403, or network timeout.
A user, workflow, or deployment command targets Microsoft Foundry resource and submits configuration to Azure control plane or a project runtime. Azure validates identity, resource scope, quota, version references, and network reachability because the runtime cannot safely use an object that is not authorized, versioned, reachable, or measurable. The configured object then participates in the runtime path through Microsoft Foundry project, Managed identity, RBAC assignment. This sequence works because each object unlocks the next dependency: identity allows access, versioning allows reproducibility, network resolution allows execution, and telemetry allows verification. When the workload executes, telemetry, status output, logs, API response, or evaluation metrics prove whether the chain is complete. If the chain breaks, the failure appears as the operational symptom described in the scenario: GenAI applications fail before inference when project RBAC, managed identity binding, private endpoint approval, or DNS integration is incomplete. An incorrect configuration creates the observed failure because it changes a nearby object while leaving the actual missing dependency unresolved.
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Verify caller authorization | az role assignment list --assignee <principal-id> --scope <foundry-resource-id> -o table |
The application identity has the required Microsoft Foundry or Azure OpenAI resource role at the expected scope. Command type: Azure CLI RBAC verification for Entra identity and Azure AI resource scope. |
| Check private endpoint approval | az network private-endpoint-connection list --id <foundry-resource-id> |
The private endpoint connection is Approved before private traffic is expected to work. Command type: Azure CLI network verification for the Microsoft Foundry or Azure OpenAI resource; confirm the exact resource ID from the active environment. |
| Validate private DNS from workload network | nslookup <resource-name>.openai.azure.com |
The service FQDN resolves to a private address from the application network. Command type: network/DNS rehearsal command for private endpoint validation. |
| Prove endpoint reachability | curl -H "Authorization: Bearer <token>" https://<resource-name>.openai.azure.com/openai/deployments/<deployment>/chat/completions?api-version=<version> |
The call returns a model response instead of authorization or network errors. Command type: network/API rehearsal command for the selected app or Azure OpenAI endpoint; confirm URL, header, and API version before use. |
Microscopic technical focus: Selecting models, deploying serverless endpoints, managing versions, and configuring provisioned throughput.
Beginner explanation: A foundation model deployment is not just model selection. It is a capacity, version, region, quota, and endpoint decision.
Operational split for this point: start with Foundation model, then verify Model deployment and Serverless API endpoint before trusting any production outcome. The exam is testing whether the candidate can locate the missing dependency, not whether the candidate recognizes every service name in the scenario.
For this knowledge point, the target objects are Foundation model, Model deployment, Serverless API endpoint, Managed compute option, Provisioned throughput unit, Model version. The exam usually describes one broken link in that chain. The correct answer is the option that restores the missing operational dependency rather than the option that only describes the platform at a high level.
Why-layer: Foundation model becomes exam-relevant only when the surrounding dependency chain can run. In this topic, High-volume workloads receive throttling or unstable latency when provisioned throughput and quota are not planned. The correct configuration matters because it changes the state that controls execution, authorization, resolution, evaluation, or observability; a nearby but unrelated action leaves the same failure mode in place.
Decision tree: if the scenario describes access failure, inspect identity and RBAC before changing compute or code; if it describes unresolved assets, inspect name, version, and scope; if it describes runtime failure, inspect logs, endpoint invocation, metrics, or evaluation output; if it describes quality degradation, inspect data, retrieval, evaluation, and monitoring evidence before changing the model.
Common mistakes: Selecting a familiar Azure service without checking the missing dependency in the scenario. Treating a successful create operation as proof of runtime behavior. Choosing a monitoring action when the scenario asks for configuration or access remediation.
Practice question: A workload needs predictable latency and capacity for a selected foundation model, and the team must decide between available deployment and throughput options.
A. Select a supported model/version/region and configure deployment capacity or provisioned throughput based on latency and volume requirements. B. Choose the largest available model because larger models always reduce production latency. C. Increase max output tokens to prevent deployment throttling. D. Monitor total calls only after users begin receiving 429 responses.
Correct Answer: A
Explanation: A is correct because production deployment is a model, region, quota, and capacity decision. B can increase latency/cost, C affects generation length, and D detects throttling after capacity planning failed.
The common decision point is: Model selection must balance latency, context window, modality, region, cost, and throughput instead of choosing the largest model by default. Therefore, read every scenario for the actor, the resource scope, the object version, the network path, the metric threshold, and the expected observable result.
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Model deployment | Model and version | Supported model catalog entry and version | No endpoint until deployed | Region availability and quota | Application references deployment that does not exist |
| Serverless endpoint | Capacity mode | Provider-managed throughput | No reserved isolation | Model support and endpoint access | Variable latency under load |
| Provisioned throughput | Reserved capacity | Throughput unit count | Unavailable until quota approved | Supported model and region | 429 throttling or missed latency target |
| Deployment metric | Operational signal | Total calls, latency, throttling, token usage | Unobserved until monitored | Azure Monitor metric stream | Capacity issue is misdiagnosed |
az cognitiveservices account deployment list -g rg-ai300-genai -n <account>
Command type: Azure CLI verification for Azure OpenAI/Cognitive Services deployment state; confirm current parameters, region support, and model availability.
Reason: List deployments first so the application uses a deployment name that actually exists in the target account.
Checkpoint: The intended deployment appears with succeeded state.
az cognitiveservices account show-usage -g rg-ai300-genai -n <account>
Command type: Azure CLI verification for Azure OpenAI/Cognitive Services deployment state; confirm current parameters, region support, and model availability.
Reason: Check quota before deployment because capacity failures are quota constraints, not prompt or application bugs.
Checkpoint: Usage output shows available quota for the selected model family.
az cognitiveservices account deployment create --name <account> --resource-group rg-ai300-genai --deployment-name gpt-prod --model-name <model> --model-version <version> --model-format OpenAI --sku-name Standard --sku-capacity 30
Command type: Azure CLI verification for Azure OpenAI/Cognitive Services deployment state; confirm current parameters, region support, and model availability.
Reason: Create the deployment with explicit model and capacity so runtime calls target a stable endpoint configuration.
Checkpoint: Deployment provisioning state is Succeeded.
az monitor metrics list --resource <deployment-resource-id> --metric TotalCalls,ThrottledCalls,Latency
Command type: Azure Monitor CLI verification; confirm metric names for the selected Azure resource type.
Reason: Monitor deployment metrics because successful creation does not prove capacity is adequate under production traffic.
Checkpoint: Metrics show call volume, throttling, and latency by time window.
A user, workflow, or deployment command targets Foundation model and submits configuration to Azure control plane or a project runtime. Azure validates identity, resource scope, quota, version references, and network reachability because the runtime cannot safely use an object that is not authorized, versioned, reachable, or measurable. The configured object then participates in the runtime path through Model deployment, Serverless API endpoint, Managed compute option. This sequence works because each object unlocks the next dependency: identity allows access, versioning allows reproducibility, network resolution allows execution, and telemetry allows verification. When the workload executes, telemetry, status output, logs, API response, or evaluation metrics prove whether the chain is complete. If the chain breaks, the failure appears as the operational symptom described in the scenario: High-volume workloads receive throttling or unstable latency when provisioned throughput and quota are not planned. An incorrect configuration creates the observed failure because it changes a nearby object while leaving the actual missing dependency unresolved.
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| List model deployments | az cognitiveservices account deployment list -g rg-ai300-genai -n <account> |
The intended deployment name appears in the target Azure OpenAI account. Command type: Azure CLI verification for Azure OpenAI/Cognitive Services deployment state; confirm current parameters, region support, and model availability. |
| Inspect account quota | az cognitiveservices account show-usage -g rg-ai300-genai -n <account> |
Usage and limit values show whether requested capacity is available. Command type: Azure CLI verification for Azure OpenAI/Cognitive Services deployment state; confirm current parameters, region support, and model availability. |
| Verify deployment state | az cognitiveservices account deployment show --name <account> --resource-group rg-ai300-genai --deployment-name gpt-prod |
Provisioning state and model/version match the release plan. Command type: Azure CLI verification for Azure OpenAI/Cognitive Services deployment state; confirm current parameters, region support, and model availability. |
| Review serving metrics | az monitor metrics list --resource <deployment-resource-id> --metric TotalCalls,ThrottledCalls,Latency |
Metrics expose traffic, throttling, and latency after deployment. Command type: Azure Monitor CLI verification; confirm metric names for the selected Azure resource type. |
Microscopic technical focus: Designing prompt variants, comparing prompt performance, and controlling releases through Git repositories.
Beginner explanation: A prompt is production logic. Treat prompt text like code: version it, review it, evaluate it, and keep a rollback point.
Operational split for this point: start with Prompt file, then verify Prompt variant and Evaluation dataset before trusting any production outcome. The exam is testing whether the candidate can locate the missing dependency, not whether the candidate recognizes every service name in the scenario.
For this knowledge point, the target objects are Prompt file, Prompt variant, Evaluation dataset, Git branch, Pull request, Release tag. The exam usually describes one broken link in that chain. The correct answer is the option that restores the missing operational dependency rather than the option that only describes the platform at a high level.
Why-layer: Prompt file becomes exam-relevant only when the surrounding dependency chain can run. In this topic, A prompt regression cannot be diagnosed when the active prompt text, model deployment, dataset, and metric output are not tied to a commit. The correct configuration matters because it changes the state that controls execution, authorization, resolution, evaluation, or observability; a nearby but unrelated action leaves the same failure mode in place.
Decision tree: if the scenario describes access failure, inspect identity and RBAC before changing compute or code; if it describes unresolved assets, inspect name, version, and scope; if it describes runtime failure, inspect logs, endpoint invocation, metrics, or evaluation output; if it describes quality degradation, inspect data, retrieval, evaluation, and monitoring evidence before changing the model.
Common mistakes: Selecting a familiar Azure service without checking the missing dependency in the scenario. Treating a successful create operation as proof of runtime behavior. Choosing a monitoring action when the scenario asks for configuration or access remediation.
Practice question: A prompt change improves one test case but breaks production behavior, and the team needs review, evaluation evidence, and rollback.
A. Commit the prompt variant and evaluation dataset to Git, run evaluation in CI, and tag the approved prompt version. B. Edit the production prompt directly in the portal and document the change in chat history. C. Rename the prompt file so reviewers can distinguish it from the old version. D. Rely on a few manual chat transcripts as release evidence.
Correct Answer: A
Explanation: A is correct because prompt changes need review, evaluation evidence, and rollback. B bypasses source control, C is naming not governance, and D is anecdotal testing.
The common decision point is: Editing a production prompt in a UI gives speed but loses repeatable comparison, review, rollback, and release traceability. Therefore, read every scenario for the actor, the resource scope, the object version, the network path, the metric threshold, and the expected observable result.
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Prompt file | Version source | Git commit, branch, tag | Uncontrolled text until committed | Repository and review process | Active prompt cannot be traced |
| Evaluation dataset | Regression cases | JSONL or tabular test cases | Not linked until committed | Prompt evaluation workflow | Prompt passes anecdotal testing only |
| Pull request | Review gate | Approvals, checks, comments | No gate on direct edit | Branch protection and CI | Regression enters production |
| Release tag | Rollback pointer | Semantic tag or commit SHA | No rollback marker | Deployment process | Previous stable prompt cannot be restored |
git checkout -b prompt/claims-routing-v2
Command type: Git source-control verification.
Reason: Use a branch so prompt experimentation is isolated from the production prompt path.
Checkpoint: Branch name appears in git status.
git add prompts/claims-routing.prompt.yml evaluations/claims-routing.dataset.jsonl
Command type: Git source-control verification.
Reason: Commit prompt and evaluation data together because a prompt version without its test evidence is not release-ready.
Checkpoint: Git status shows both files staged.
gh workflow run evaluate-prompts.yml --ref prompt/claims-routing-v2
Command type: GitHub CLI workflow verification.
Reason: Run evaluation before merge so quality and safety regressions block the prompt release.
Checkpoint: Workflow result contains metric output for the branch.
git tag prompt-claims-routing-v2-approved
Command type: Git source-control verification.
Reason: Tag the approved prompt commit so rollback can target a known stable version.
Checkpoint: Git log and tag list show the approved release point.
A user, workflow, or deployment command targets Prompt file and submits configuration to Azure control plane or a project runtime. Azure validates identity, resource scope, quota, version references, and network reachability because the runtime cannot safely use an object that is not authorized, versioned, reachable, or measurable. The configured object then participates in the runtime path through Prompt variant, Evaluation dataset, Git branch. This sequence works because each object unlocks the next dependency: identity allows access, versioning allows reproducibility, network resolution allows execution, and telemetry allows verification. When the workload executes, telemetry, status output, logs, API response, or evaluation metrics prove whether the chain is complete. If the chain breaks, the failure appears as the operational symptom described in the scenario: A prompt regression cannot be diagnosed when the active prompt text, model deployment, dataset, and metric output are not tied to a commit. An incorrect configuration creates the observed failure because it changes a nearby object while leaving the actual missing dependency unresolved.
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Check active branch | git status --short --branch |
Prompt work is isolated on the expected feature branch. Command type: Git source-control verification. |
| Verify prompt and evaluation files | git diff --name-only --cached |
Prompt and evaluation dataset changes are staged together. Command type: Git source-control verification. |
| Inspect evaluation workflow result | gh run view <run-id> --log |
Evaluation workflow logs include metric output for the prompt change. Command type: GitHub CLI workflow verification. |
| Confirm release tag | git tag --list prompt-claims-routing-v2-approved |
The approved prompt version has a rollback marker. Command type: Git source-control verification. |
What must be configured before Microsoft Foundry projects can securely use production data and model deployments?
Configure Foundry resources, project environments, managed identities, RBAC, networking, and access to dependent services.
GenAIOps infrastructure requires the same operational discipline as other production platforms. The project must have identity-based access to the resources it uses, and private networking must be configured when data or endpoints are restricted. A deployment that creates only a Foundry project without validating identity and network dependencies is incomplete. AI-300 expects candidates to identify the missing platform dependency in these scenarios.
Demand Score: 90
Exam Relevance Score: 97
When should Foundry infrastructure be deployed with Bicep templates and Azure CLI instead of manual portal configuration?
Use Bicep and Azure CLI when environments must be repeatable, reviewable, governed, and integrated with CI/CD automation.
Manual configuration is hard to audit and reproduce across development, test, and production environments. Infrastructure as code captures resource definitions, identity assignments, network settings, and deployment parameters in source control. This supports change review and repeatable promotion. In AI-300 exam scenarios, IaC is the right choice when the requirement emphasizes consistency, governance, secure deployment, or automated provisioning.
Demand Score: 85
Exam Relevance Score: 94
How should a team select a foundation model for a production generative AI workload?
Select the model by matching task quality, latency, cost, safety behavior, context requirements, and deployment constraints to the workload.
The best model is not always the largest or newest option. Production selection should be based on measured quality for the target task, acceptable response time, token cost, safety requirements, and availability of the desired deployment mode. AI-300 scenarios often include trade-offs between quality, throughput, and cost, so the correct answer usually evaluates model options against workload evidence instead of choosing by name alone.
Demand Score: 88
Exam Relevance Score: 95
When are provisioned throughput units appropriate for foundation model deployments?
Use provisioned throughput when predictable high-volume workloads require reserved capacity and more consistent performance.
Provisioned throughput is an operational capacity decision. It can help production workloads that need predictable throughput, but it may be unnecessary for small, variable, or experimental workloads. The exam may describe scaling pressure, rate limits, latency instability, or guaranteed capacity needs. The correct response weighs volume, performance requirements, and cost before selecting provisioned capacity.
Demand Score: 86
Exam Relevance Score: 94
Why should prompts be versioned in Git for GenAIOps workflows?
Prompt versioning makes prompt changes reviewable, comparable, reproducible, and linked to evaluation results.
Prompts are production artifacts. A small prompt change can affect quality, safety, cost, and latency, so teams need source control, review history, and a way to compare variants. Git-based prompt management also supports rollback and traceability from deployment behavior back to the exact prompt version. AI-300 includes prompt lifecycle questions because GenAI operations depend on controlled prompt changes as much as model changes.
Demand Score: 92
Exam Relevance Score: 97
How should prompt variants be compared before one is promoted to production?
Run the variants against representative evaluation datasets and compare quality, safety, cost, latency, and failure cases.
Prompt comparison should be evidence-driven. A prompt that looks better on a few examples may regress on groundedness, harmful content handling, response length, or token cost. Evaluation datasets and telemetry let teams decide which variant meets production goals. In exam scenarios, the strongest answer usually combines versioned prompts with measured comparison rather than relying on manual inspection alone.
Demand Score: 89
Exam Relevance Score: 96