Microscopic technical focus: Provisioning workspace dependencies, datastore bindings, compute targets, and IAM boundaries.
Beginner explanation: Think of the workspace as the operations control room. It records jobs and assets, but the actual data, secrets, images, and compute capacity live in dependent Azure resources that must be permissioned separately.
Operational split for this point: start with the Machine Learning workspace, then verify the workspace managed identity and the datastore before trusting any production outcome. The exam tests whether the candidate can locate the missing dependency, not whether the candidate recognizes every service name in the scenario.
For this knowledge point, the target objects are Machine Learning workspace, Workspace managed identity, Datastore, Compute cluster, Storage account, Private endpoint. The exam usually describes one broken link in that chain. The correct answer is the option that restores the missing operational dependency rather than the option that only describes the platform at a high level.
Why-layer: The Machine Learning workspace becomes exam-relevant only when its surrounding dependency chain can run. In this topic, training jobs queue or fail when compute quota, datastore identity, private DNS, or storage firewall dependencies are missing. The correct configuration matters because it changes the state that controls execution, authorization, resolution, evaluation, or observability; a nearby but unrelated action leaves the same failure mode in place.
Decision tree: if the scenario describes access failure, inspect identity and RBAC before changing compute or code; if it describes unresolved assets, inspect name, version, and scope; if it describes runtime failure, inspect logs, endpoint invocation, metrics, or evaluation output; if it describes quality degradation, inspect data, retrieval, evaluation, and monitoring evidence before changing the model.
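The decision tree above can be read as a lookup from symptom class to first inspection target. A minimal shell sketch follows; the function name `first_check` and the symptom labels (`access`, `unresolved-asset`, `runtime`, `quality`) are hypothetical names chosen for illustration and are not exam terminology.

```shell
# Map a symptom class to the first thing to inspect, following the decision tree.
# Symptom labels are illustrative; any other label is rejected.
first_check() {
  case "$1" in
    access)           echo "identity and RBAC" ;;
    unresolved-asset) echo "asset name, version, and scope" ;;
    runtime)          echo "logs, endpoint invocation, metrics, or evaluation output" ;;
    quality)          echo "data, retrieval, evaluation, and monitoring evidence" ;;
    *)                echo "unknown symptom class: $1" >&2; return 1 ;;
  esac
}
```

For example, `first_check access` prints the identity and RBAC target, which is the branch that applies to a failed datastore mount.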
Common mistakes: Confusing workspace Contributor with Storage Blob Data Contributor. Creating a datastore before verifying that the job identity can read the storage container. Forgetting private DNS when public access is disabled.
Practice question: A company creates an Azure Machine Learning workspace, but training jobs cannot mount data from a locked-down storage account. The question asks which workspace dependency or identity permission must be configured before jobs can run.
A. Assign Storage Blob Data Contributor to the workspace or compute managed identity on the storage scope, then validate the datastore with a smoke-test job.
B. Assign Contributor on the Azure Machine Learning workspace to the data scientist group.
C. Increase the compute cluster maximum node count to provide more training capacity.
D. Enable Application Insights on the workspace and review request telemetry.
Correct Answer: A
Explanation: A is correct because the failure is data-plane storage authorization during job execution. B affects management-plane workspace operations, C affects capacity, and D improves observability but does not grant blob access.
The common decision point is: A workspace Contributor can create workspace objects, yet jobs still fail data reads when the job identity lacks storage data-plane access. Therefore, read every scenario for the actor, the resource scope, the object version, the network path, the metric threshold, and the expected observable result.
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Machine Learning workspace | Managed identity | System-assigned or user-assigned | System-assigned identity is normally created with the workspace; a user-assigned identity must be attached explicitly | Entra ID principal creation | Jobs cannot authenticate to protected datastores |
| Datastore | Credential mode | Account key, SAS, service principal, user identity, managed identity | Metadata only until credential or identity path works | Storage account, container, RBAC, firewall | Mount or download step fails with authorization error |
| Compute cluster | Node range | min=0 or greater, max within quota | Scaled down when idle | Regional VM quota and workspace | Job remains queued or provisioning fails |
| Private endpoint | DNS resolution | Approved private IP with linked private DNS zone | Public endpoint used unless restricted | VNet, subnet, DNS zone, firewall | Studio or job traffic cannot reach workspace or storage |
az ml workspace show -g rg-ai300 -n mlw-ai300-dev --query identity.principalId -o tsv
Command type: Azure ML CLI verification; confirm extension/version in the lab environment.
Reason: Read the workspace principal ID first because role assignment cannot target an identity that has not materialized in Entra ID.
Checkpoint: A non-empty principal ID is returned.
az role assignment create --assignee <principal-id> --role "Storage Blob Data Contributor" --scope <storage-id>
Command type: Azure CLI RBAC assignment for an Entra identity at the storage scope; confirm parameters against the active Azure CLI version.
Reason: Assign data-plane storage access because workspace Contributor does not authorize blob reads inside the training container.
Checkpoint: Role assignment list shows Storage Blob Data Contributor at the storage account or container scope.
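The first two steps above can be chained so the role assignment never runs against an empty principal ID. This is a sketch under stated assumptions: the function name `grant_blob_access` is hypothetical, the storage account is assumed to live in the same resource group as the workspace, and the role is scoped to the whole storage account rather than a single container.

```shell
# Sketch: look up the workspace identity, then grant it blob data-plane access.
# Arguments: resource group, workspace name, storage account name.
grant_blob_access() {
  rg="$1"; ws="$2"; sa="$3"

  # Step 1: read the workspace managed identity principal ID.
  principal_id=$(az ml workspace show -g "$rg" -n "$ws" --query identity.principalId -o tsv)
  if [ -z "$principal_id" ]; then
    echo "workspace $ws returned no managed identity principal ID" >&2
    return 1
  fi

  # Step 2: resolve the storage account resource ID to use as the role scope
  # (assumes the account is in the same resource group as the workspace).
  storage_id=$(az storage account show -g "$rg" -n "$sa" --query id -o tsv)

  # Step 3: assign the data-plane role to the managed identity.
  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Storage Blob Data Contributor" \
    --scope "$storage_id"
}

# Example call (placeholder storage account name):
# grant_blob_access rg-ai300 mlw-ai300-dev <storage-account-name>
```

If jobs only read data, a narrower role such as Storage Blob Data Reader may be enough; the sketch keeps the role named in this section.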
az ml datastore create --file datastore.yml -g rg-ai300 -w mlw-ai300-dev
Command type: Azure ML CLI verification; confirm extension/version in the lab environment.
Reason: Register the datastore after RBAC is available so the datastore reference resolves to a storage path the job identity can actually use.
Checkpoint: Datastore show output contains the expected account, container, and identity-based access mode.
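The `datastore.yml` file passed to the command above is not shown in this section. One plausible shape, assuming an identity-based blob datastore, is sketched below. The datastore name `trainingdata` matches the name used in the verification table; the account and container values are placeholders to fill in. Omitting a `credentials` block is what makes access depend on the identity and its RBAC role.

```shell
# Write a credential-less blob datastore definition for
# `az ml datastore create --file datastore.yml`.
cat > datastore.yml <<'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/azureBlob.schema.json
name: trainingdata
type: azure_blob
description: Identity-based blob datastore for training data (no account key or SAS stored).
account_name: <storage-account-name>
container_name: <container-name>
EOF
```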
az ml job create --file train-smoke-test.yml -g rg-ai300 -w mlw-ai300-dev
Command type: Azure ML CLI verification; confirm extension/version in the lab environment.
Reason: Submit a small smoke-test job because datastore creation alone does not prove runtime mount behavior.
Checkpoint: The job reaches Completed and logs show successful data access.
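A smoke-test job only needs to mount the datastore and list its contents. The sketch below writes a hypothetical `train-smoke-test.yml`; the folder path, the compute cluster name, and the public `library/python:latest` image are assumptions for illustration, and a network-isolated workspace would likely need an image it can actually pull.

```shell
# Write a minimal command job that mounts the datastore read-only and lists it.
cat > train-smoke-test.yml <<'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
display_name: datastore-smoke-test
# A failed mount or an authorization error makes the listing, and so the job, fail.
command: ls -la ${{inputs.training_data}}
inputs:
  training_data:
    type: uri_folder
    path: azureml://datastores/trainingdata/paths/<folder>/
    mode: ro_mount
environment:
  image: library/python:latest
compute: azureml:<compute-cluster-name>
EOF
```

Running the job on the same compute cluster that production training will use keeps the test representative of the real identity and network path.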
A user, workflow, or deployment command targets the Machine Learning workspace and submits configuration to the Azure control plane or a project runtime. Azure validates identity, resource scope, quota, version references, and network reachability, because the runtime cannot safely use an object that is not authorized, versioned, reachable, or measurable. The configured object then participates in the runtime path through the workspace managed identity, the datastore, and the compute cluster. The sequence works because each object unlocks the next dependency: identity allows access, versioning allows reproducibility, network resolution allows execution, and telemetry allows verification. When the workload executes, telemetry, status output, logs, API responses, or evaluation metrics show whether the chain is complete. If the chain breaks, the failure appears as the operational symptom described in the scenario: training jobs queue or fail when compute quota, datastore identity, private DNS, or storage firewall dependencies are missing. An incorrect fix leaves the failure in place because it changes a nearby object while the actual missing dependency stays unresolved.
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Confirm workspace identity | az ml workspace show -g rg-ai300 -n mlw-ai300-dev --query identity.principalId -o tsv | A principal ID is returned. Command type: Azure ML CLI verification; confirm extension/version in the lab environment. |
| Verify storage data-plane role | az role assignment list --assignee <principal-id> --scope <storage-id> -o table | Storage Blob Data Contributor appears at the expected scope. Command type: Azure CLI RBAC verification for Entra identity and Azure AI resource scope. |
| Inspect datastore binding | az ml datastore show -n trainingdata -g rg-ai300 -w mlw-ai300-dev | Account, container, and credential mode match the intended design. Command type: Azure ML CLI verification; confirm extension/version in the lab environment. |
| Prove runtime data access | az ml job show --name <smoke-test-job> -g rg-ai300 -w mlw-ai300-dev --query status | Smoke-test job reaches Completed after reading the datastore. Command type: Azure ML CLI verification; confirm extension/version in the lab environment. |
Microscopic technical focus: Versioning data assets, environments, components, and registry-shared artifacts.
Beginner explanation: An asset is a versioned contract. A file path, Conda file, or script folder becomes exam-relevant only when Azure ML can resolve it by name, version, and scope.
Operational split for this point: start with the data asset, then verify the environment and the component before trusting any production outcome. The exam tests whether the candidate can locate the missing dependency, not whether the candidate recognizes every service name in the scenario.
For this knowledge point, the target objects are Data asset, Environment, Component, Registry, Model asset, Workspace asset reference. The exam usually describes one broken link in that chain. The correct answer is the option that restores the missing operational dependency rather than the option that only describes the platform at a high level.
Why-layer: A data asset becomes exam-relevant only when its surrounding dependency chain can run. In this topic, pipelines fail when an asset version is omitted, archived, or resolved from the wrong workspace scope. The correct configuration matters because it changes the state that controls execution, authorization, resolution, evaluation, or observability; a nearby but unrelated action leaves the same failure mode in place.
Decision tree: if the scenario describes access failure, inspect identity and RBAC before changing compute or code; if it describes unresolved assets, inspect name, version, and scope; if it describes runtime failure, inspect logs, endpoint invocation, metrics, or evaluation output; if it describes quality degradation, inspect data, retrieval, evaluation, and monitoring evidence before changing the model.
Common mistakes: Selecting a familiar Azure service without checking the missing dependency in the scenario. Treating a successful create operation as proof of runtime behavior. Choosing a monitoring action when the scenario asks for configuration or access remediation.
Practice question: A pipeline in a production workspace cannot reuse a component or environment created in a development workspace. The question asks how to make the asset version discoverable and reproducible across workspaces.
A. Publish the component or environment as a versioned asset in an Azure ML registry and reference that registry version from the production pipeline.
B. Copy the component YAML file into the production repository and keep the same display name.
C. Give the production workspace Reader access to the development workspace.
D. Rename the pipeline job so it matches the development job name.
Correct Answer: A
Explanation: A is correct because cross-workspace reuse depends on asset name, version, and registry scope. B copies text but not a registered asset, C does not create a resolvable component contract, and D changes only metadata.
The common decision point is: A local workspace component cannot be reused in another workspace until it is published or referenced through a registry. Therefore, read every scenario for the actor, the resource scope, the object version, the network path, the metric threshold, and the expected observable result.
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Data asset | Version | Immutable named version or latest label | Unversioned local path until created | Datastore or URI path | Pipeline input resolves to stale or missing data |
| Environment | Image and dependencies | Curated image, custom image, Conda specification | Draft YAML before registration | ACR/base image/package feed | Job fails during image build or import |
| Component | Interface contract | Inputs, outputs, command, environment | Local YAML until registered | Workspace or registry scope | Pipeline compilation cannot resolve component |
| Registry asset | Sharing scope | Registry name plus asset name/version | Unavailable outside workspace | Registry permissions and region support | Production workspace cannot consume approved asset |
az ml environment create --file environment.yml -g rg-ai300 -w mlw-ai300-dev
Command type: Azure ML CLI verification; confirm extension/version in the lab environment.
Reason: Register the environment first because component execution must reference a reproducible runtime image.
Checkpoint: Environment list shows the expected name and version.
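The `environment.yml` file used above could look like the following sketch, which pairs a base image with a Conda specification and pins an explicit version. The environment name `train-env`, the package list, and the base image tag are assumptions for illustration; check that the chosen base image is still published before relying on it.

```shell
# Conda specification referenced by the environment definition.
cat > conda.yml <<'EOF'
name: train-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip
  - pip:
      - scikit-learn
EOF

# Environment definition for `az ml environment create --file environment.yml`.
cat > environment.yml <<'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/environment.schema.json
name: train-env
version: 1
image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04
conda_file: conda.yml
description: Base image plus Conda specification, registered under an explicit version.
EOF
```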
az ml component create --file component.yml -g rg-ai300 -w mlw-ai300-dev
Command type: Azure ML CLI verification; confirm extension/version in the lab environment.
Reason: Create the component after its environment exists so the component contract can resolve its runtime dependency.
Checkpoint: Component show output contains inputs, outputs, command, and environment reference.
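A `component.yml` that carries the four contract elements named in the checkpoint (inputs, outputs, command, environment reference) might look like the sketch below. The component name `train_model`, the `./src` folder, the `train.py` script, and the `train-env` environment reference are hypothetical.

```shell
# Component definition for `az ml component create --file component.yml`.
cat > component.yml <<'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
name: train_model
display_name: Train model
version: 1
type: command
inputs:
  training_data:
    type: uri_folder
outputs:
  model_output:
    type: uri_folder
code: ./src
# Pinned name:version reference to a registered environment.
environment: azureml:train-env:1
command: >-
  python train.py
  --training_data ${{inputs.training_data}}
  --model_output ${{outputs.model_output}}
EOF
```

The environment is referenced by name and version rather than by a floating label, so the component resolves to the same runtime each time it runs.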
az ml component create --file component.yml --registry-name ai300registry
Command type: Azure ML CLI verification; confirm extension/version in the lab environment.
Reason: Publish to registry when another workspace must consume the same component without copying YAML by hand.
Checkpoint: Registry component list shows the approved version.
az ml job create --file pipeline.yml -g rg-ai300-prod -w mlw-ai300-prod
Command type: Azure ML CLI verification; confirm extension/version in the lab environment.
Reason: Run the pipeline in the consuming workspace to prove registry-scoped asset resolution works.
Checkpoint: Pipeline job graph resolves the registry component version and starts execution.
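In the consuming workspace, the pipeline refers to the shared component through a registry-scoped URI instead of a workspace-local name. The sketch below writes a hypothetical `pipeline.yml`; the component name `train_model`, its version, the compute cluster name, and the presence of a `trainingdata` datastore in the production workspace are all assumptions.

```shell
# Pipeline definition for `az ml job create --file pipeline.yml` in the production workspace.
cat > pipeline.yml <<'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline
display_name: train-from-registry-component
settings:
  default_compute: azureml:<compute-cluster-name>
inputs:
  training_data:
    type: uri_folder
    path: azureml://datastores/trainingdata/paths/<folder>/
jobs:
  train:
    type: command
    # Registry-scoped reference: registry name, asset name, and explicit version.
    component: azureml://registries/ai300registry/components/train_model/versions/1
    inputs:
      training_data: ${{parent.inputs.training_data}}
EOF
```

A component published to a registry is generally expected to reference an environment that is also published to a registry, so the environment may need the same treatment before the component create step succeeds; confirm this against the current registry documentation.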
A user, workflow, or deployment command targets the data asset and submits configuration to the Azure control plane or a project runtime. Azure validates identity, resource scope, quota, version references, and network reachability, because the runtime cannot safely use an object that is not authorized, versioned, reachable, or measurable. The configured object then participates in the runtime path through the environment, the component, and the registry. The sequence works because each object unlocks the next dependency: identity allows access, versioning allows reproducibility, network resolution allows execution, and telemetry allows verification. When the workload executes, telemetry, status output, logs, API responses, or evaluation metrics show whether the chain is complete. If the chain breaks, the failure appears as the operational symptom described in the scenario: pipelines fail when an asset version is omitted, archived, or resolved from the wrong workspace scope. An incorrect fix leaves the failure in place because it changes a nearby object while the actual missing dependency stays unresolved.
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| List registered environment version | az ml environment show -n <env-name> --version <version> -g rg-ai300 -w mlw-ai300-dev | Environment resolves with expected image and dependencies. Command type: Azure ML CLI verification; confirm extension/version in the lab environment. |
| Inspect component contract | az ml component show -n <component-name> --version <version> -g rg-ai300 -w mlw-ai300-dev | Inputs, outputs, command, and environment reference are present. Command type: Azure ML CLI verification; confirm extension/version in the lab environment. |
| Verify registry asset | az ml component show -n <component-name> --version <version> --registry-name ai300registry | Registry returns the shared component version. Command type: Azure ML CLI verification; confirm extension/version in the lab environment. |
| Confirm consuming pipeline state | az ml job show --name <pipeline-job> -g rg-ai300-prod -w mlw-ai300-prod --query status | Pipeline compiles and starts with registry asset references. Command type: Azure ML CLI verification; confirm extension/version in the lab environment. |
Microscopic technical focus: Deploying secure Machine Learning infrastructure with Bicep, Azure CLI, GitHub Actions, and source control.
Beginner explanation: IaC means the environment can be recreated by a pipeline. The exam cares less about the template language itself and more about whether identity, network, and resource dependencies are repeatable.
Operational split for this point: start with the Bicep module, then verify the GitHub Actions workflow and the federated credential before trusting any production outcome. The exam tests whether the candidate can locate the missing dependency, not whether the candidate recognizes every service name in the scenario.
For this knowledge point, the target objects are Bicep module, GitHub Actions workflow, Federated credential, Azure CLI deployment, Network rule, Git repository. The exam usually describes one broken link in that chain. The correct answer is the option that restores the missing operational dependency rather than the option that only describes the platform at a high level.
Why-layer: A Bicep module becomes exam-relevant only when its surrounding dependency chain can run. In this topic, automation fails before resource deployment when the workflow lacks the id-token permission or the federated credential subject does not match the branch. The correct configuration matters because it changes the state that controls execution, authorization, resolution, evaluation, or observability; a nearby but unrelated action leaves the same failure mode in place.
Decision tree: if the scenario describes access failure, inspect identity and RBAC before changing compute or code; if it describes unresolved assets, inspect name, version, and scope; if it describes runtime failure, inspect logs, endpoint invocation, metrics, or evaluation output; if it describes quality degradation, inspect data, retrieval, evaluation, and monitoring evidence before changing the model.
Common mistakes: Selecting a familiar Azure service without checking the missing dependency in the scenario. Treating a successful create operation as proof of runtime behavior. Choosing a monitoring action when the scenario asks for configuration or access remediation.
Practice question: A team needs repeatable Azure ML environments and wants GitHub Actions to deploy workspaces, storage, network rules, and compute without storing long-lived secrets.
A. Use Bicep or ARM deployment from GitHub Actions with OIDC federation and validate the resource-group deployment output.
B. Create the workspace manually in the portal and export screenshots for audit evidence.
C. Store an Owner client secret in GitHub repository secrets and reuse it for every environment.
D. Run Azure CLI commands locally from an engineer workstation after each merge.
Correct Answer: A
Explanation: A is correct because it provides repeatable infrastructure and short-lived CI identity. B and D are not repeatable pipeline controls, while C works technically but creates long-lived credential risk.
The common decision point is: Secret-based deployment can work but creates rotation and leakage risk; OIDC federation provides short-lived workflow authentication. Therefore, read every scenario for the actor, the resource scope, the object version, the network path, the metric threshold, and the expected observable result.
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Bicep module | Resource graph | Workspace, storage, key vault, ACR, networking | Template only until deployed | Resource group and provider registration | Deployment fails or creates incomplete dependency graph |
| Federated credential | Subject claim | Repository, branch, environment, or workflow subject | No trust relationship | Entra app and GitHub OIDC token | azure/login fails before deployment |
| GitHub workflow | Permissions | id-token: write and contents: read | Token unavailable by default | OIDC federation and Azure login | Workflow cannot request Azure token |
| Deployment output | State evidence | Succeeded, failed, outputs JSON | No evidence until queried | Azure Resource Manager deployment record | Pipeline cannot prove created resources |
az deployment group validate -g rg-ai300 -f infra/main.bicep
Command type: Azure CLI verification; confirm parameters against the active Azure CLI version.
Reason: Validate before deployment because syntax and parameter errors should fail before any resource mutation occurs.
Checkpoint: Validation returns no template errors.
az deployment group create -g rg-ai300 -f infra/main.bicep
Command type: Azure CLI verification; confirm parameters against the active Azure CLI version.
Reason: Deploy the full dependency graph together so workspace resources and dependent services remain consistent.
Checkpoint: Deployment provisioningState is Succeeded.
az ad app federated-credential create --id <app-id> --parameters credential.json
Command type: Azure CLI verification; confirm parameters against the active Azure CLI version.
Reason: Create OIDC trust so GitHub can obtain short-lived Azure tokens without storing a client secret.
Checkpoint: The federated credential subject matches the repository branch or environment.
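The `credential.json` passed to the command above defines the trust relationship. A sketch for a branch-scoped subject follows; the credential name, the `main` branch, and the `<org>/<repo>` placeholder are assumptions, while the issuer and audience are the standard values for GitHub Actions OIDC with Entra ID.

```shell
# Federated credential parameters for
# `az ad app federated-credential create --id <app-id> --parameters credential.json`.
cat > credential.json <<'EOF'
{
  "name": "github-main-branch",
  "issuer": "https://token.actions.githubusercontent.com",
  "subject": "repo:<org>/<repo>:ref:refs/heads/main",
  "description": "Trust GitHub Actions runs from the main branch",
  "audiences": ["api://AzureADTokenExchange"]
}
EOF
```

The subject must match the token GitHub issues for the run. A workflow triggered from another branch, or one bound to a GitHub environment, presents a different subject and fails at azure/login.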
gh run view <run-id> --log
Command type: GitHub CLI workflow verification.
Reason: Inspect workflow logs because CI evidence must show the template was deployed by the pipeline identity.
Checkpoint: Logs show azure/login and deployment steps completed successfully.
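The workflow whose logs are inspected above needs the `id-token: write` permission before azure/login can request a token. The sketch below writes a hypothetical workflow file; the file name, job name, action versions, and secret names (`AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, `AZURE_SUBSCRIPTION_ID`) are assumptions. The secrets hold identifiers only, not a client secret.

```shell
# Write a minimal OIDC-based deployment workflow.
mkdir -p .github/workflows
cat > .github/workflows/deploy-infra.yml <<'EOF'
name: deploy-aml-infra
on:
  push:
    branches: [main]
permissions:
  id-token: write   # lets the job request a GitHub OIDC token
  contents: read    # lets the job check out the repository
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Deploy Bicep template
        run: az deployment group create -g rg-ai300 -f infra/main.bicep
EOF
```

The trigger branch here lines up with a branch-scoped federated credential subject; if the workflow runs from a different ref, the subject in the credential must change with it.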
A user, workflow, or deployment command targets the Bicep module and submits configuration to the Azure control plane or a project runtime. Azure validates identity, resource scope, quota, version references, and network reachability, because the runtime cannot safely use an object that is not authorized, versioned, reachable, or measurable. The configured object then participates in the runtime path through the GitHub Actions workflow, the federated credential, and the Azure CLI deployment. The sequence works because each object unlocks the next dependency: identity allows access, versioning allows reproducibility, network resolution allows execution, and telemetry allows verification. When the workload executes, telemetry, status output, logs, API responses, or evaluation metrics show whether the chain is complete. If the chain breaks, the failure appears as the operational symptom described in the scenario: automation fails before resource deployment when the workflow lacks the id-token permission or the federated credential subject does not match the branch. An incorrect fix leaves the failure in place because it changes a nearby object while the actual missing dependency stays unresolved.
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Validate template syntax | az deployment group validate -g rg-ai300 -f infra/main.bicep | Validation completes without template errors. Command type: Azure CLI verification; confirm parameters against the active Azure CLI version. |
| Inspect deployment state | az deployment group show -g rg-ai300 -n main --query properties.provisioningState | Provisioning state is Succeeded. Command type: Azure CLI verification; confirm parameters against the active Azure CLI version. |
| Verify federated credential | az ad app federated-credential list --id <app-id> -o table | Credential subject matches GitHub repository branch or environment. Command type: Azure CLI verification; confirm parameters against the active Azure CLI version. |
| Inspect workflow evidence | gh run view <run-id> --log | Log shows azure/login and deployment steps succeeded. Command type: GitHub CLI workflow verification. |
When training jobs cannot mount data from a secured storage account, what should be checked before changing the compute configuration?
Check the workspace or compute managed identity, storage data-plane RBAC, datastore configuration, firewall rules, and private DNS path.
Azure Machine Learning workspace Contributor permissions do not automatically grant blob data access during job execution. The job runtime needs a resolvable datastore and an identity with roles such as Storage Blob Data Contributor at the correct storage scope. If networking is restricted, the private endpoint and DNS path must also be valid. This is a common AI-300 scenario because the visible failure often appears as a training or mount issue, while the root cause is identity or network dependency.
Demand Score: 94
Exam Relevance Score: 98
Why should Azure Machine Learning assets be registered with explicit versions before they are used in production pipelines?
Explicit versions make data assets, environments, components, and models reproducible and resolvable by pipeline jobs.
Production pipelines should not depend on local files, implicit latest behavior, or unregistered runtime definitions. Versioned assets create stable contracts for inputs, environments, component interfaces, and model artifacts. They also make rollback, audit, and cross-workspace promotion possible. The exam often tests whether candidates can identify when a failed pipeline is caused by the wrong asset scope, missing version, or unregistered dependency.
Demand Score: 90
Exam Relevance Score: 96
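One way to catch implicit latest behavior before a pipeline reaches production is to scan the job YAML for floating references. The sketch below is a simple text check, not an Azure ML feature; the function name `find_unpinned_refs` is hypothetical, and it only recognizes the `@latest` and `/labels/latest` reference forms.

```shell
# Print (with line numbers) every line that references an asset by a floating label.
# Exit status is 0 when at least one floating reference is found, 1 when none is.
find_unpinned_refs() {
  grep -nE '@latest|/labels/latest' "$1"
}

# Example use in a pre-merge check:
# if find_unpinned_refs pipeline.yml; then
#   echo "pin the references above to explicit versions" >&2
#   exit 1
# fi
```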
When should an Azure ML registry be used instead of copying component YAML files between workspaces?
Use a registry when approved assets must be shared, versioned, and consumed consistently across multiple workspaces.
Copying YAML preserves text but does not create a shared, versioned asset contract. A registry lets teams publish components, environments, and models that production workspaces can resolve by name and version. This supports governance, reuse, and repeatable promotion from development to production. In exam scenarios, the correct action usually publishes or references the asset through the proper registry scope instead of manually copying files.
Demand Score: 86
Exam Relevance Score: 94
What should a GitHub Actions workflow prove after provisioning Azure Machine Learning infrastructure with Bicep and Azure CLI?
It should prove that the workspace, dependent resources, identities, network controls, and basic Azure ML operations are usable after deployment.
Infrastructure as code is not complete just because deployment commands succeed. A reliable workflow should validate the resulting workspace identity, datastore access, compute availability, role assignments, and network reachability. This matters because AI-300 focuses on operationalizing AI workloads, where provisioning must result in a platform that can actually run jobs, resolve assets, and support secure lifecycle automation.
Demand Score: 88
Exam Relevance Score: 95
How should restricted network access be handled for an Azure Machine Learning workspace that must use private resources?
Configure private endpoints, storage firewall access, private DNS zones, and identity permissions as one dependency chain.
Private networking failures are rarely solved by changing only one setting. The workspace, storage, container registry, key vault, and compute path may all need reachable private endpoints and correct DNS resolution. Even with the right private endpoint, jobs can still fail if the runtime identity lacks access to the protected resource. AI-300 expects candidates to connect network isolation with authentication and runtime validation.
Demand Score: 91
Exam Relevance Score: 97
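A quick DNS check from a machine inside the virtual network can show whether a storage or workspace name resolves through the private DNS zone. The sketch below is an illustration under stated assumptions: the function names are hypothetical, `getent` is assumed to be available (typical on Linux), and the private endpoint is assumed to use an RFC 1918 IPv4 address.

```shell
# Return success when the argument is an RFC 1918 private IPv4 address.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Resolve a hostname and report whether the first answer is private.
check_private_resolution() {
  fqdn="$1"
  ip=$(getent hosts "$fqdn" | awk '{print $1; exit}')
  if [ -z "$ip" ]; then
    echo "$fqdn: no DNS answer"
    return 1
  fi
  if is_private_ipv4 "$ip"; then
    echo "$fqdn -> $ip (private)"
  else
    echo "$fqdn -> $ip (public)"
    return 1
  fi
}

# Example call (placeholder account name):
# check_private_resolution <storage-account-name>.blob.core.windows.net
```

A public answer from inside the network usually points at a missing or unlinked private DNS zone. A private answer proves name resolution only, so identity and firewall checks still apply.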
What is the safest way to validate a newly configured datastore and compute target before using them in a production training pipeline?
Run a small smoke-test job that reads from the datastore on the intended compute target and verify successful completion.
Creating a datastore or compute cluster only proves the control-plane object exists. It does not prove that the job runtime can authenticate, mount or download data, resolve private network names, or obtain compute quota. A smoke-test job gives concrete evidence that the configured dependency chain works under runtime conditions. This aligns with the exam emphasis on observable operational readiness.
Demand Score: 89
Exam Relevance Score: 96