Fast review map for this domain:
| Exam signal | First object to inspect | Correct-answer pattern |
|---|---|---|
| Many jobs need the right execution shape | Workspace compute and warehouse configuration | Choose job, serverless, warehouse, classic, or shared compute based on workload isolation and runtime needs |
| The same team needs governed object layout | Unity Catalog catalog, schema, volume, table, view, materialized view | Create the namespace layer before granting or ingesting data |
| External data must be reachable without copying first | Foreign catalog connection and DDL boundary | Validate connection, object ownership, and managed versus external table behavior |
| Business users need discoverable data | AI/BI Genie instructions and object descriptions | Document semantic intent in Unity Catalog rather than relying on notebook comments |
Flow: Workspace → Compute → Unity Catalog namespace → Data objects → Discovery metadata
Practice Question: A data engineering team runs scheduled ETL jobs and interactive ad hoc SQL. The ETL jobs must not inherit user notebook libraries, while analysts need low-latency SQL queries. What configuration should be selected first?
A. Run both workloads on one shared all-purpose cluster so libraries are already installed.
B. Use job compute for scheduled ETL and a SQL warehouse for analyst SQL workloads.
C. Store all source files as CSV so both workloads read the same format.
D. Grant all analysts CAN MANAGE permission on the job cluster.
Correct Answer: B.
Explanation: B is correct because the workload type owns the compute decision: scheduled ETL needs job-scoped execution and interactive SQL needs a warehouse. A mixes isolation boundaries. C changes storage format, not execution behavior. D broadens permissions and does not choose the correct compute shape. Exam Takeaway: Select the object that owns the dependency; the distractor pattern is an adjacent Databricks feature that is technically real but does not satisfy the scenario's first blocking condition.
Compute questions start with execution ownership. A job cluster, all-purpose cluster, SQL warehouse, serverless resource, or ML runtime is not just a size choice; it determines startup behavior, library visibility, user attachment, isolation, and which workload API is available.
The exam trap is to resize or reuse compute before proving that the workload is running in the correct execution boundary. A package installed interactively may not exist on job compute. A SQL warehouse cannot repair Spark notebook dependency state. A shared cluster can make a job pass during testing while hiding library or permission drift.
The operational drill is to read the workload type, match it to compute, then validate runtime, library, pool/autoscale, and permission state. Correct answers usually avoid broad CAN MANAGE grants and choose the narrow compute resource that can produce repeatable run evidence.
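The drill of reading the workload type and matching it to compute can be sketched as a small decision function. This is a simplified illustration, not an official selection rule: the `Workload` class, its `kind` labels, and the returned strings are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a workload for this sketch."""
    kind: str        # e.g. "etl", "adhoc_sql", "exploration"
    scheduled: bool  # True when a job scheduler starts the run


def choose_compute(workload: Workload) -> str:
    """Map a workload to the compute shape this section recommends."""
    # Interactive or dashboard SQL belongs on a SQL warehouse.
    if workload.kind == "adhoc_sql":
        return "sql_warehouse"
    # Scheduled runs need isolated, repeatable job compute.
    if workload.scheduled:
        return "job_compute"
    # Collaborative, unscheduled notebook work fits all-purpose compute.
    return "all_purpose_cluster"


if __name__ == "__main__":
    print(choose_compute(Workload(kind="etl", scheduled=True)))
    print(choose_compute(Workload(kind="adhoc_sql", scheduled=False)))
```

The function decides on workload type before any sizing question is asked, which mirrors the order the exam expects: boundary first, capacity second.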
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Job compute | Cluster lifecycle | Per-job ephemeral to reusable job cluster | Not created until job run or configured | Job task definition and workspace quota | Job fails to start or reuses an overprivileged shared cluster |
| Serverless compute | Execution boundary | Supported serverless SQL or notebook/job scenarios | Disabled or region-policy dependent | Workspace enablement and supported workload type | Scenario asks for fast startup but selected classic cluster adds management overhead |
| SQL warehouse | Size and scaling | 2X-Small through 4X-Large, with multi-cluster scaling when available | Stopped or auto-stop | Warehouse permission and query workload profile | Interactive SQL users wait behind ETL jobs |
| Photon acceleration | Runtime feature | Enabled where supported by runtime and workload | Runtime dependent | Compatible Databricks Runtime and query pattern | Expected SQL/Delta acceleration is absent |
| Cluster pool | Warm instance reuse | Minimum and maximum idle instances | No pool | VM SKU availability and workspace policy | Job startup latency remains high because compute is cold |
Exam implementation pattern:
Use databricks clusters list or databricks warehouses list only as validation evidence, not as the fix itself. Command confidence note: Commands shown in this section are verification-oriented examples. Validate exact Databricks CLI syntax against the active CLI and workspace version before using it as an authoritative production procedure.
The chain starts when a user, job, or SQL query requests execution. Azure Databricks checks the selected compute resource, policy, permissions, runtime, installed libraries, and startup state before user code runs.
If the runtime and dependency layer match the workload, the notebook, task, or SQL query receives the expected Spark, SQL, or ML environment. If the dependency is only installed in a different session or the user lacks attach permission, execution fails before the transformation logic can prove anything.
This is why a compute answer must prove both capability and boundary: workload type, runtime feature, dependency installation, and access permission all participate in the same startup chain.
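The startup chain described above can be modeled as an ordered list of checks where the first failing hop is the one to fix. This is a teaching sketch only: the hop names and the `first_failed_hop` helper are hypothetical and do not correspond to any Databricks API.

```python
from typing import Dict, Optional

# Order follows the chain described in the text: compute resource, policy,
# permissions, runtime, installed libraries, then startup state.
STARTUP_CHAIN = [
    "compute_resource",
    "policy",
    "permissions",
    "runtime",
    "libraries",
    "startup_state",
]


def first_failed_hop(state: Dict[str, bool]) -> Optional[str]:
    """Return the first hop that is not satisfied, or None if all pass.

    A hop missing from `state` is treated as unverified, hence failed.
    """
    for hop in STARTUP_CHAIN:
        if not state.get(hop, False):
            return hop
    return None


if __name__ == "__main__":
    observed = {
        "compute_resource": True,
        "policy": True,
        "permissions": True,
        "runtime": True,
        "libraries": False,  # package was installed only interactively
        "startup_state": True,
    }
    print(first_failed_hop(observed))
```

Treating an unverified hop as failed reflects the section's point that a correct answer must show evidence for each boundary rather than assume it.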
Exam Trap Summary: Do not resize compute until workload type, runtime version, Photon need, pool/autoscale behavior, and permission boundary are verified.
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Validate compute inventory | Azure Databricks workspace > Compute; or Databricks CLI active-version validation: databricks clusters list | Cluster purpose, policy, and state match the intended workload |
| Validate SQL warehouse state | Azure Databricks workspace > SQL Warehouses; or active-version CLI: databricks warehouses list | Warehouse is running or stopped with the expected size and permissions |
| Validate runtime feature | Cluster details > Configuration > Databricks Runtime and Photon setting | Runtime supports the selected workload and feature state is visible |
| Validate permission boundary | Compute resource > Permissions | Only intended principals can attach, restart, manage, or use the resource |
Practice Question: A scheduled training notebook succeeds when an engineer manually installs a Python package, but the Lakeflow Job fails with ModuleNotFoundError on job compute. What should be fixed first?
A. Move the target table to CSV so the package is unnecessary.
B. Install the dependency as a job or cluster-scoped library and validate the runtime supports the ML workload.
C. Grant SELECT on every table in the catalog.
D. Increase the SQL warehouse size.
Correct Answer: B.
Explanation: B is correct because the failure is dependency availability on the execution compute, not data format, table permission, or SQL serving capacity. The package must be installed where the scheduled job actually runs.
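One way to make the dependency travel with the scheduled run is to declare it on the job task itself. The sketch below builds such a definition as a plain Python dictionary. The field names (`job_clusters`, `new_cluster`, `tasks`, `libraries`, `pypi`) follow the Databricks Jobs API as best understood here and should be verified against the active API version; the job name, cluster key, package pin, and angle-bracket placeholders are hypothetical.

```python
import json
from typing import Dict, List


def build_job_spec(notebook_path: str, pypi_packages: List[str]) -> Dict:
    """Return a job definition whose task declares its own libraries."""
    return {
        "name": "nightly-training",  # hypothetical job name
        "job_clusters": [
            {
                "job_cluster_key": "training_cluster",
                "new_cluster": {
                    # Placeholders: fill in from the workspace, not from this sketch.
                    "spark_version": "<ml-runtime-version>",
                    "node_type_id": "<node-type>",
                    "num_workers": 2,
                },
            }
        ],
        "tasks": [
            {
                "task_key": "train",
                "job_cluster_key": "training_cluster",
                "notebook_task": {"notebook_path": notebook_path},
                # Libraries declared here are installed on the job compute,
                # independent of any interactive notebook session.
                "libraries": [{"pypi": {"package": p}} for p in pypi_packages],
            }
        ],
    }


if __name__ == "__main__":
    spec = build_job_spec("<notebook-path>", ["example-package==1.0.0"])
    print(json.dumps(spec, indent=2))
```

Declaring the package on the task keeps the dependency in the same definition as the run, so a clean job cluster starts with the library instead of relying on a manual install.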
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Library installation | Package source | Workspace file, PyPI, Maven, CRAN, wheel, or notebook-scoped package | No attached library unless configured | Compute permission and network/package repository access | Notebook imports fail even though the cluster is running |
| Notebook-scoped library | Session dependency | Installed inside the current notebook session | Absent at session start | Notebook execution order and package compatibility | Scheduled job fails because dependency was installed only interactively |
| Cluster-scoped library | Compute dependency | Attached to all sessions on a cluster | Not installed until library is attached and cluster restarts if required | CAN MANAGE or policy-permitted library install rights | Different users see different import behavior on shared compute |
| Machine learning runtime | Runtime feature set | Databricks Runtime ML or supported ML feature setting | Standard runtime unless selected | Compatible node type, runtime version, and workspace policy | ML libraries or feature store/client behavior is unavailable |
| Compute access permission | Attach/use/manage boundary | CAN ATTACH TO, CAN RESTART, CAN MANAGE, or workspace-supported equivalents | Creator or admin controlled | Workspace permission model and compute policy | Job or notebook execution cannot attach to the selected compute because the principal lacks the required compute permission |
Exam implementation pattern:
import <package>; print(<package>.__version__).
Exam Trap Summary: Do not rely on notebook-scoped installs for scheduled jobs; put required packages on the job or cluster runtime that actually executes the task.
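The import check above can be extended into a small report that runs as the first step of the job itself, so the evidence comes from the compute that executes the task. This is a minimal sketch using only the Python standard library; the helper names and the package names passed in are hypothetical.

```python
from importlib import metadata
from typing import Dict, List, Optional


def installed_versions(packages: List[str]) -> Dict[str, Optional[str]]:
    """Return the installed version of each distribution, or None if absent."""
    report: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


def missing_packages(packages: List[str]) -> List[str]:
    """List the distributions that are not installed in this environment."""
    return [name for name, ver in installed_versions(packages).items() if ver is None]


if __name__ == "__main__":
    # Replace with the distributions the scheduled task actually needs.
    required = ["example-package"]
    print(installed_versions(required))
    print("missing:", missing_packages(required))
```

Running this in an interactive notebook and again as a job task makes the difference between the two execution contexts visible: the same list can be complete in one and incomplete in the other.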
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Validate attached libraries | Azure Databricks workspace > Compute > target compute > Libraries | Required package, version, and install status are visible |
| Validate notebook import | Local lab rehearsal in notebook: import <package>; print(<package>.__version__) | Package imports in the same execution context used by the job |
| Validate ML runtime | Compute details > Databricks Runtime | Runtime or feature setting matches the ML workload requirement |
| Validate compute permission | Compute resource > Permissions | The principal has only the attach, restart, or manage permission required by the scenario |
Practice Question: A team wants separate development and production namespaces, governed file access for landing files, and table-level grants. Which object order best establishes the control boundary?
A. Create a catalog and schema, create a volume for landing files, then create tables and views under the schema.
B. Create notebooks first, then let users write tables into whichever schema exists.
C. Grant workspace admin to every engineer so namespace creation is not blocked.
D. Create a SQL warehouse before deciding the Unity Catalog namespace.
Correct Answer: A.
Explanation: A is correct because Unity Catalog namespaces and volumes define the governance boundary before data objects are created. B allows uncontrolled placement. C uses excessive privilege. D can run queries but does not establish data ownership or securable hierarchy.
Catalog, schema, volume, table, view, materialized view, and naming-boundary design must be studied as a concrete Azure Databricks operating path: identify the owning object, the prerequisite state, the change mechanism, and the verification signal.
The correct action is the smallest action that changes the controlling dependency while preserving governance, repeatability, and observable evidence.
Wrong options usually name real features at the wrong layer, so the learner should eliminate any option that skips parent scope, identity, data-state, run-state, or monitoring proof.
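The parent-before-child order from the practice question can be sketched as a function that emits the namespace DDL in dependency order. The statements use Databricks SQL forms (`CREATE CATALOG`, `CREATE SCHEMA`, `CREATE VOLUME`) that should be validated in the target workspace; the catalog, schema, and volume names are hypothetical, and the identifier check is a deliberately strict assumption for this example.

```python
import re
from typing import List

# Strict identifier pattern used only by this sketch to avoid unsafe names.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def namespace_ddl(catalog: str, schema: str, volume: str) -> List[str]:
    """Return DDL in parent-before-child order: catalog, schema, volume."""
    for name in (catalog, schema, volume):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return [
        f"CREATE CATALOG IF NOT EXISTS {catalog}",
        f"CREATE SCHEMA IF NOT EXISTS {catalog}.{schema}",
        f"CREATE VOLUME IF NOT EXISTS {catalog}.{schema}.{volume}",
    ]


if __name__ == "__main__":
    # One namespace per environment keeps dev and prod grants separate.
    for env in ("dev", "prod"):
        for statement in namespace_ddl(env, "sales", "landing"):
            print(statement + ";")
```

Tables and views would be created only after these statements succeed, which is the ordering option A describes.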
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Catalog | Top-level namespace | Environment, domain, or sharing boundary | No custom catalog until created | Metastore assignment and CREATE CATALOG privilege | Tables are created in an uncontrolled default namespace |
| Schema | Second-level grouping | Application, subject area, or lifecycle layer | Absent until created | Catalog ownership, or USE CATALOG and CREATE SCHEMA privileges on the catalog | Permissions cannot be scoped cleanly to a data product |
| Volume | File storage object | Managed or external volume path | Absent until created | Storage credential and external location when external | Files are accessed through unmanaged paths and bypass governance |
| Managed table | Storage ownership | Unity Catalog managed storage | Created when table DDL executes | Catalog and schema storage location | Drop semantics or lifecycle expectations are misunderstood |
| Materialized view | Precomputed query object | Supported refresh behavior | Not refreshed until scheduled or triggered | Base object permissions and refresh compute | Queries return stale or inaccessible data |
Exam implementation pattern:
SHOW CATALOGS, SHOW SCHEMAS, or Catalog Explorer.
The chain starts with identity resolution: Azure Databricks maps the user, group, or service principal to Unity Catalog privileges or external Azure resource permissions.
Unity Catalog then evaluates parent namespace traversal, object action, and fine-grained policy. For external storage, the storage credential or managed identity must also be authorized on the cloud resource. A failure at any hop can look like a table problem even when the table definition is correct.
Correct remediation changes the failed hop and preserves auditability. Broad workspace admin grants can mask the failure, but they do not prove the securable object or cloud resource was governed correctly.
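The hop-by-hop evaluation can be illustrated with a simplified model that finds the first missing privilege on the path to a table. This sketch assumes the usual Unity Catalog pattern of USE CATALOG, USE SCHEMA, and SELECT, with privileges granted on a parent applying to its children. It ignores ownership, ALL PRIVILEGES, row filters, column masks, and cloud storage authorization, and the `grants` dictionary shape is hypothetical.

```python
from typing import Dict, Optional, Set


def first_missing_privilege(
    grants: Dict[str, Set[str]], catalog: str, schema: str, table: str
) -> Optional[str]:
    """Return the first missing hop for reading a table, or None if readable.

    `grants` maps a securable's full name to the privileges one principal
    holds on it (a hypothetical structure for this sketch).
    """
    cat = catalog
    sch = f"{catalog}.{schema}"
    tbl = f"{catalog}.{schema}.{table}"
    # Each hop lists where the privilege may be granted: the securable
    # itself first, then any parent it could be inherited from.
    hops = [
        ("USE CATALOG", [cat]),
        ("USE SCHEMA", [sch, cat]),
        ("SELECT", [tbl, sch, cat]),
    ]
    for privilege, securables in hops:
        if not any(privilege in grants.get(s, set()) for s in securables):
            return f"{privilege} on {securables[0]}"
    return None


if __name__ == "__main__":
    analyst = {
        "prod": {"USE CATALOG"},
        "prod.sales.orders": {"SELECT"},
    }
    print(first_missing_privilege(analyst, "prod", "sales", "orders"))
```

In this example the table grant is correct, yet the read fails at the schema hop, which is the situation the text describes as a failure that looks like a table problem.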
Exam Trap Summary: Do not create notebooks, warehouses, or tables before catalog, schema, volume, and storage boundaries are defined.
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Validate catalog namespace | SQL verification: SHOW CATALOGS; | Expected catalog appears and follows naming convention |
| Validate schema placement | SQL verification: SHOW SCHEMAS IN <catalog>; | Schemas map to environment or domain requirements |
| Validate volume object | Catalog Explorer > catalog > schema > Volumes | Volume path and type match governed file-access requirement |
| Validate table ownership | SQL verification: DESCRIBE EXTENDED <catalog>.<schema>.<table>; | Provider, location, owner, and comment match the design |
Practice Question: A scenario requires querying an external operational database through Unity Catalog without copying its data into Delta tables. Which object is the controlling requirement?
A. A foreign catalog backed by a configured connection.
B. A managed Delta table created with CTAS.
C. A cluster pool sized for the operational database.
D. A row filter on a local view.
Correct Answer: A.
Explanation: A is correct because federation uses a connection-backed foreign catalog to expose remote objects. B copies or materializes data locally. C affects compute startup, not federation. D controls local result visibility but does not connect to the external database.
Connection-backed federation, external locations, and table-definition control must be studied as a concrete Azure Databricks operating path: identify the owning object, the prerequisite state, the change mechanism, and the verification signal.
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Connection | External system binding | Supported federation source | Absent until configured | Credential, network reachability, and metastore permission | Foreign catalog cannot enumerate remote objects |
| Foreign catalog | Federated namespace | Remote database object map | Absent until created | Connection object and CREATE FOREIGN CATALOG privilege | Queries fail or expose the wrong remote database |
| External location | Cloud storage path authorization | ADLS Gen2 URL or supported cloud path | Unconfigured | Storage credential and Azure role assignment | External tables cannot safely reference files |
| DDL statement | Definition operation | CREATE, ALTER, DROP, COMMENT, GRANT | No object change until executed | Object ownership and schema privileges | Table metadata does not match source or governance need |
| Managed/external table choice | Storage lifecycle | Managed by Unity Catalog or external path | Scenario dependent | Storage policy and retention expectation | DROP behavior conflicts with data retention |
Exam implementation pattern:
SHOW CREATE TABLE and DESCRIBE EXTENDED metadata.
The chain follows connection-backed federation, external locations, and table-definition control from request to control-plane validation to runtime evidence.
A valid prerequisite lets the operation proceed; a missing prerequisite fails before the visible artifact can produce the expected result.
The exam answer should change the first failed dependency and confirm it with observable state.
Exam Trap Summary: Do not copy remote data when federation is required; choose the connection-backed foreign catalog before CTAS or managed-table materialization.
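The connection-then-catalog dependency can be sketched by generating the two statements in order. The statement forms (`CREATE CONNECTION ... TYPE ... OPTIONS` and `CREATE FOREIGN CATALOG ... USING CONNECTION`) and the `secret()` references follow Databricks SQL as best understood here; option keys vary by source type and must be checked against current documentation. All names, the host, and the secret scope and key names are hypothetical.

```python
from typing import List


def federation_ddl(
    connection: str,
    foreign_catalog: str,
    source_type: str,
    host: str,
    port: int,
    database: str,
    secret_scope: str,
) -> List[str]:
    """Return DDL for a connection first, then the foreign catalog using it."""
    create_connection = (
        f"CREATE CONNECTION IF NOT EXISTS {connection} TYPE {source_type} "
        f"OPTIONS (host '{host}', port '{port}', "
        # Credentials are read from a secret scope instead of being inlined.
        f"user secret('{secret_scope}', 'user'), "
        f"password secret('{secret_scope}', 'password'))"
    )
    create_catalog = (
        f"CREATE FOREIGN CATALOG IF NOT EXISTS {foreign_catalog} "
        f"USING CONNECTION {connection} OPTIONS (database '{database}')"
    )
    return [create_connection, create_catalog]


if __name__ == "__main__":
    for statement in federation_ddl(
        connection="ops_pg",
        foreign_catalog="ops_remote",
        source_type="postgresql",
        host="<host>",
        port=5432,
        database="<database>",
        secret_scope="<scope>",
    ):
        print(statement + ";")
```

No CTAS or managed table appears in this sequence, which is the point of the trap: the data stays in the remote system and only the metadata mapping is created.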
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Validate federation connection | Catalog Explorer > External Data > Connections | Connection exists, owner is correct, and source type matches scenario |
| Validate foreign catalog | SQL verification: SHOW CATALOGS; then inspect catalog type in Catalog Explorer | Catalog is foreign and linked to the intended connection |
| Validate external table metadata | SQL verification: DESCRIBE EXTENDED <catalog>.<schema>.<table>; | Location references the approved external path |
| Validate DDL result | SQL verification: SHOW CREATE TABLE <catalog>.<schema>.<table>; | Definition preserves expected columns, storage provider, and table properties |
Practice Question: Analysts using AI/BI features repeatedly confuse net revenue with gross revenue because the table columns are named similarly. What should the data engineer improve first?
A. Increase the SQL warehouse size.
B. Add table and column descriptions and configure AI/BI Genie instructions for the dataset.
C. Move the table from Delta to CSV.
D. Disable lineage tracking for the schema.
Correct Answer: B.
Explanation: B is correct because the failure is semantic discovery, not compute. A may speed queries but will not teach the meaning of measures. C weakens table functionality. D removes governance evidence.
Semantic instructions, object descriptions, and discovery evidence in Unity Catalog must be studied as a concrete Azure Databricks operating path: identify the owning object, the prerequisite state, the change mechanism, and the verification signal.
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Table comment | Human-readable definition | Business description and usage guidance | Blank unless provided | Object ownership or ALTER privilege | Users misinterpret columns or choose wrong source |
| Column comment | Field-level meaning | Metric, identifier, status, or timestamp definition | Blank unless provided | Table ownership and DDL permission | Generated analysis uses ambiguous column semantics |
| AI/BI Genie instruction | Conversational analytics guidance | Workspace-supported instruction text | Not configured | Relevant data object and supported AI/BI experience | Questions map to the wrong dimension or metric |
| Data lineage | Dependency signal | Upstream and downstream object relationships | Populated by supported operations | Supported query or pipeline execution | Impact analysis misses a dependent dataset |
| Owner metadata | Accountability field | User, group, or service principal | Creator or assigned owner | Governance role assignment | No accountable steward for fixes or explanations |
Exam implementation pattern:
DESCRIBE EXTENDED output and the Catalog Explorer Overview, Columns, and Lineage tabs as validation evidence.
The chain follows semantic instructions, object descriptions, and discovery evidence in Unity Catalog from request to control-plane validation to runtime evidence.
Exam Trap Summary: Do not treat semantic confusion as compute slowness; fix table comments, column metadata, lineage, or Genie instructions before resizing a warehouse.
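Fixing the metadata can be sketched as generating comment statements for a table and its ambiguous columns. The statement forms (`COMMENT ON TABLE ... IS` and `ALTER TABLE ... ALTER COLUMN ... COMMENT`) are Databricks SQL and should be validated in the target workspace. The table name, column names, and the business definitions in the example are hypothetical, and the backslash escaping assumes default string-literal parsing.

```python
from typing import Dict, List


def escape_sql(text: str) -> str:
    """Escape backslashes and single quotes for a single-quoted SQL literal."""
    return text.replace("\\", "\\\\").replace("'", "\\'")


def comment_ddl(
    table: str, table_comment: str, column_comments: Dict[str, str]
) -> List[str]:
    """Return one table comment statement plus one statement per column."""
    statements = [f"COMMENT ON TABLE {table} IS '{escape_sql(table_comment)}'"]
    for column, comment in column_comments.items():
        statements.append(
            f"ALTER TABLE {table} ALTER COLUMN {column} "
            f"COMMENT '{escape_sql(comment)}'"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical definitions; real ones come from the data owner.
    for statement in comment_ddl(
        "prod.sales.revenue",
        "One row per invoice",
        {
            "gross_revenue": "Invoice total before refunds",
            "net_revenue": "Gross revenue minus refunds",
        },
    ):
        print(statement + ";")
```

Storing the definitions on the governed object means every consumer, including AI/BI features, reads the same description instead of depending on notebook comments.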
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Validate table description | Catalog Explorer > table > Overview; or SQL: DESCRIBE EXTENDED <catalog>.<schema>.<table>; | Comment explains grain, purpose, and owner |
| Validate column definitions | Catalog Explorer > table > Columns | Important measures and dimensions include clear descriptions |
| Validate Genie instruction state | Supported AI/BI Genie configuration path for the data object | Instructions mention business synonyms and calculation constraints |
| Validate lineage evidence | Catalog Explorer > table > Lineage | Upstream and downstream objects are visible for supported operations |
When should DP-750 learners choose job compute instead of an all-purpose cluster for a scheduled Azure Databricks workload?
Choose job compute when the workload needs repeatable, isolated execution with controlled runtime, libraries, and permissions for each scheduled run.
Job compute is designed for automated workloads such as Lakeflow Jobs tasks. It avoids hidden dependencies from interactive notebook sessions and helps ensure that each run uses the intended runtime, library set, and identity boundary. An all-purpose cluster is better for collaborative exploration, but it can blur dependency and permission ownership in production scenarios.
Demand Score: 91
Exam Relevance Score: 97
Why is a SQL warehouse usually the right compute target for analyst self-service SQL instead of a notebook cluster?
A SQL warehouse is optimized for SQL query serving, BI concurrency, warehouse permissions, auto-stop behavior, and low-latency interactive analytics.
The exam often separates Spark job execution from SQL serving. Analysts who run dashboards or ad hoc SQL usually need a warehouse because it provides the SQL execution boundary and concurrency model expected by Databricks SQL. A notebook cluster may run SQL commands, but it does not provide the same serving model or operational separation from engineering jobs.
Demand Score: 89
Exam Relevance Score: 95
What should be checked first when a notebook works interactively but fails as a Lakeflow Job with a missing Python package?
Check whether the package is installed on the job or cluster compute that actually runs the scheduled task.
Notebook-scoped installs can disappear when a job starts on clean job compute. The fix is usually to attach the dependency to the executing compute, use an appropriate Databricks Runtime or ML runtime, and validate the import in the same execution context as the job. Resizing compute or granting table permissions does not solve a dependency that fails before data access.
Demand Score: 93
Exam Relevance Score: 98
How should catalogs, schemas, volumes, tables, and views be organized when a team needs governed development and production boundaries?
Create the catalog and schema boundaries first, then create governed volumes for files and tables or views under the correct schema.
Unity Catalog organizes securable objects in a hierarchy. Catalogs often represent environment, domain, or sharing boundaries, while schemas group data products or lifecycle layers. Volumes provide governed file access before data becomes queryable tables. Creating notebooks or warehouses first does not establish the governance boundary that later grants and lineage depend on.
Demand Score: 88
Exam Relevance Score: 96
When should a foreign catalog be used in Azure Databricks?
Use a foreign catalog when users need to query a supported external database through Unity Catalog without first copying the data into local Delta tables.
A foreign catalog is backed by a connection to a remote system and exposes remote objects through the Unity Catalog namespace. It is different from an external table over cloud storage and different from CTAS into a managed table. The controlling requirement is whether the data should remain remote while still being discoverable and queryable through governed metadata.
Demand Score: 84
Exam Relevance Score: 93