Practice question: A lightweight chat client returns a deployment-not-found error after the model works in the Foundry playground. What should the developer verify first?
A. Whether the system prompt includes more examples
B. Whether the temperature is lower than 0.2
C. Whether the application uses image input
D. Whether the client configuration uses the exact Foundry deployment name and endpoint
Correct Answer: D
Explanation: D is correct because the portal test proves the deployment works, so the client route is the likely failure. A and B affect response behavior after routing succeeds. C is unrelated unless the app sends images.
Foundry chat and agent questions are runtime-routing questions. The core objects are the system prompt or agent instruction, the deployment or agent ID, configured tools, and the client settings that send the request to the intended runtime object.
The drill is to validate in the portal before expanding code. If the playground or agent test works, a client failure usually points to endpoint, deployment name, agent ID, or credential mismatch. If the portal behavior is wrong, the instruction or tool boundary must be corrected first.
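The mismatch check described above can be rehearsed without calling any service. The following is a minimal sketch, assuming the portal values have been copied by hand into a dictionary. The setting names `FOUNDRY_ENDPOINT` and `FOUNDRY_DEPLOYMENT`, the example values, and the helper `find_config_mismatches` are hypothetical and are not part of any Foundry SDK.

```python
def find_config_mismatches(portal: dict[str, str], client: dict[str, str]) -> list[str]:
    """Return the names of settings whose client value differs from the portal value."""
    mismatches = []
    for key, expected in portal.items():
        actual = client.get(key, "")
        # Compare exactly: a stray space or a slightly different name
        # is treated as a mismatch in this sketch.
        if actual != expected:
            mismatches.append(key)
    return mismatches


# Hypothetical values copied from the portal's deployment details.
portal_values = {
    "FOUNDRY_ENDPOINT": "https://example-resource.example.com/",
    "FOUNDRY_DEPLOYMENT": "chat-prod",
}
# Hypothetical values the client actually loaded (the deployment name differs).
client_values = {
    "FOUNDRY_ENDPOINT": "https://example-resource.example.com/",
    "FOUNDRY_DEPLOYMENT": "chat-production",
}

print(find_config_mismatches(portal_values, client_values))  # ['FOUNDRY_DEPLOYMENT']
```

In a real client the second dictionary would typically be filled from environment variables, so the same comparison shows which variable to correct before touching the prompt.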
To apply this on the exam, treat every option as a proposed dependency. The correct option is the one that unlocks the blocked workflow and can be verified. Wrong options are often useful in another situation, but they fail here because they tune a later step, address a different modality, or repair a symptom without satisfying the scenario requirement.
For a beginner, this table means: prove the Foundry object works in the portal, then make sure the client is calling that exact object.
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| System prompt | Behavior constraint | Role, boundaries, response format, safety rule | Empty or generic | Model request | Model produces off-scope or unstructured responses |
| Model deployment | Invocation target | Deployment name and endpoint | Not callable until created | Client SDK configuration | 404 deployment not found or wrong model family |
| Agent instruction | Task policy | Goal, allowed actions, tool limits | Undefined | Agent runtime | Agent chooses unsupported actions or ignores constraints |
| Client application | Configuration source | Endpoint, deployment/agent ID, credential | Local placeholder | Environment variables and SDK | Authentication or routing failure |
Suggested Lab Validation Ideas:
The following paths and commands are conceptual lab-style examples for practice. Adapt them to the current Microsoft documentation, SDK/API version, subscription permissions, and project environment before using them in a real implementation.
Portal path: Microsoft Foundry > Models + endpoints > Playground - portal test for model deployment response behavior.
Portal path: Microsoft Foundry > Agents > Test - portal test for single-agent instruction and tool behavior.
Python SDK rehearsal: python app.py - local client validation after setting endpoint, credential, and deployment or agent ID environment variables.
The system prompt or agent instruction sets the behavior contract. The client request then targets a Foundry deployment or agent endpoint with credentials. The runtime loads the model and any configured agent tools, produces a response, and records evidence in the portal test surface or trace. If the deployment name, agent ID, or instruction is wrong, the failure appears as routing errors, unsupported behavior, or off-scope output.
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Validate deployment identity | Microsoft Foundry portal > Models + endpoints > deployment details | Deployment name in code exactly matches the Foundry deployment |
| Validate agent behavior | Microsoft Foundry portal > Agents > Test conversation | Agent follows instruction boundaries and uses only configured tools |
| Validate client call path | Run lightweight client and inspect response/status | Client receives a successful response from the intended deployment or agent |
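
The "inspect response/status" step in the table can be turned into a small triage helper. This is a sketch based on general HTTP status conventions (2xx success, 401/403 authentication or authorization, 404 not found), not on any documented Foundry error contract; the function name `triage_client_failure` and its messages are hypothetical.

```python
def triage_client_failure(status_code: int) -> str:
    """Map an HTTP status code to the first thing to check (a study heuristic)."""
    if 200 <= status_code < 300:
        return "call path works: compare behavior with the portal test"
    if status_code in (401, 403):
        return "check the credential and its permissions"
    if status_code == 404:
        return "check the endpoint and the exact deployment name or agent ID"
    # Anything else needs the full error body and the current documentation.
    return "inspect the full error body and the service documentation"


for code in (200, 401, 404, 500):
    print(code, "->", triage_client_failure(code))
```

The helper mirrors the order used in this section: routing and credentials are checked before prompt or parameter tuning.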
Practice question: A kiosk must listen to a visitor question and answer aloud. Which implementation sequence best matches the requirement?
A. Recognize speech to text, generate or retrieve the answer, then synthesize the answer to speech
B. Use sentiment analysis first, then generate an image response
C. Send the raw audio to a text-only chat model without transcription
D. Use text-to-speech before the question is converted to text
Correct Answer: A
Explanation: A is correct because the workflow requires audio input and audio output. B targets the wrong output. C breaks when the model cannot consume raw audio. D reverses the dependency.
Text and speech implementation questions are direction-sensitive. Spoken input must become text before a text-only reasoning step can use it, while a spoken answer requires generated or selected text to become audio.
The drill is to label the conversion direction and the expected response field. A transcript, sentiment label, key phrase list, synthesized audio stream, and spoken answer are different completion states, so the selected tool must create the one requested by the scenario.
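The conversion directions can be made explicit with two small types. The sketch below uses stub functions only; `Audio`, `Text`, `recognize_speech`, `answer_question`, `synthesize_speech`, and `kiosk_turn` are hypothetical names that stand in for real speech and model calls, which are not shown.

```python
from dataclasses import dataclass


@dataclass
class Audio:
    label: str  # stands in for real audio bytes


@dataclass
class Text:
    content: str


def recognize_speech(audio: Audio) -> Text:
    """Stub for speech-to-text: audio in, transcript out."""
    return Text(f"transcript of {audio.label}")


def answer_question(question: Text) -> Text:
    """Stub for the text-only reasoning step."""
    return Text(f"answer to [{question.content}]")


def synthesize_speech(text: Text) -> Audio:
    """Stub for text-to-speech: text in, audio out."""
    return Audio(f"spoken form of [{text.content}]")


def kiosk_turn(visitor_audio: Audio) -> Audio:
    # The type hints document the direction: Audio -> Text -> Text -> Audio.
    transcript = recognize_speech(visitor_audio)
    reply = answer_question(transcript)
    return synthesize_speech(reply)


print(kiosk_turn(Audio("visitor question")).label)
```

Reading the signatures is the drill: each step's return type is the completion state it produces, so a scenario that asks for a spoken answer must end in a function that returns `Audio`.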
Read this table as a direction check: decide whether the app is converting speech to text, text to speech, or text to analysis output before writing code.
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Text analysis call | Output type | Key phrases, entities, sentiment, summary | No analysis until requested | Text input and service capability | Application receives free-form text instead of required labels |
| Speech recognition | Direction | Audio to text | Not configured | Audio stream and language | Spoken input is not converted into prompt text |
| Speech synthesis | Direction | Text to audio | Not configured | Voice selection and output format | Application cannot return spoken responses |
| Audio format | Encoding/sample assumptions | Service-supported WAV/PCM or SDK stream | Client dependent | Audio source and SDK stream settings | Speech call fails or returns poor recognition |
Suggested Lab Validation Ideas:
The following paths and commands are conceptual lab-style examples for practice. Adapt them to the current Microsoft documentation, SDK/API version, subscription permissions, and project environment before using them in a real implementation.
Portal path: Microsoft Foundry > Tools > Speech - portal verification of speech capability and configuration.
Python SDK rehearsal: python speech_demo.py --input prompt.wav - local validation for audio-to-text or spoken-response flow; adapt to current SDK package names.
Python SDK rehearsal: python text_analysis_demo.py --text reviews.txt - local validation for key phrase, entity, sentiment, or summary output.
Audio and text solutions are directional. A microphone input must become text before a text-only model can reason over it, while a spoken answer requires generated text to be synthesized into audio. Text analysis tasks are evaluated by structured labels or extracted values. The exam answer must keep input direction, service capability, and expected output connected.
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Validate speech direction | Inspect app requirement and Speech tool configuration | Configuration is speech-to-text for prompts or text-to-speech for spoken output |
| Validate text analysis contract | Run sample input through the text analysis call | Response contains the required entities, key phrases, sentiment, or summary fields |
| Validate audio compatibility | Inspect audio format or SDK stream settings | Audio input is accepted and produces transcript or playable output |
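
The audio compatibility check in the table can be rehearsed locally with Python's standard `wave` module, which reads the header of a WAV file. This is a sketch under stated assumptions: the expected format (mono, 16-bit, 16 kHz) is chosen for illustration only and must be replaced with whatever the current service documentation requires; the helper names are hypothetical.

```python
import io
import wave

# Assumed target format for this sketch only; check the service documentation.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "frame_rate_hz": 16000}


def describe_wav(stream) -> dict:
    """Read the header of a WAV stream and return its basic format."""
    with wave.open(stream, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "frame_rate_hz": wav.getframerate(),
        }


def format_problems(actual: dict, expected: dict = EXPECTED) -> list[str]:
    """List every format attribute that differs from the expected value."""
    return [
        f"{key}: got {actual[key]}, expected {value}"
        for key, value in expected.items()
        if actual[key] != value
    ]


def make_silent_wav(frame_rate_hz: int) -> io.BytesIO:
    """Build a short silent mono 16-bit WAV in memory for testing."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(frame_rate_hz)
        wav.writeframes(b"\x00\x00" * 100)  # 100 frames of silence
    buffer.seek(0)
    return buffer


print(format_problems(describe_wav(make_silent_wav(8000))))
```

An empty list means the file matches the assumed format; any entry names the attribute to fix before blaming the speech call itself.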
Practice question: An app must inspect a photo of equipment and explain whether a warning light is visible. Which capability is required?
A. An image generation model
B. A deployed multimodal model that can receive the image and prompt
C. Text-to-speech output
D. A document extraction schema
Correct Answer: B
Explanation: B is correct because the app must interpret an existing image. A creates new images. C changes text into audio. D extracts structured document fields and is not the first choice for visual reasoning.
Vision and image generation questions are modality-transformation questions. Vision interpretation starts from an existing visual input and returns labels or reasoning; image generation starts from a prompt and returns a new visual artifact.
The drill is to read scenario verbs carefully. Interpret, describe, identify, and detect point to visual understanding. Generate, create, or produce a picture points to image generation. The validation evidence must match that direction.
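The verb drill can be practiced with a tiny keyword check. This is a study aid rather than a real classifier: the verb sets come from the paragraph above, matching is on exact lowercase words only, and the function name `classify_stem` is hypothetical.

```python
import re

UNDERSTANDING_VERBS = {"interpret", "describe", "identify", "detect"}
GENERATION_VERBS = {"generate", "create", "produce"}


def classify_stem(stem: str) -> str:
    """Guess the task direction from the verbs that appear in a scenario stem."""
    words = set(re.findall(r"[a-z]+", stem.lower()))
    wants_understanding = bool(words & UNDERSTANDING_VERBS)
    wants_generation = bool(words & GENERATION_VERBS)
    if wants_understanding and not wants_generation:
        return "visual understanding"
    if wants_generation and not wants_understanding:
        return "image generation"
    # No listed verb, or verbs from both sets: the stem needs a closer read.
    return "unclear: read the stem again"


print(classify_stem("Describe the damage visible in an uploaded photo"))
print(classify_stem("Generate a product mockup from a text prompt"))
print(classify_stem("Inspect a photo and explain what is visible"))
```

The third stem returns the "unclear" result because "inspect" and "explain" are not in either set, which is a reminder that the verb lists are cues and not a complete rule.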
For exam use, this table helps separate two easy-to-confuse choices: understanding an image that already exists versus generating a new image.
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Visual input | Prompt attachment | Image plus text instruction | Not included unless sent | Multimodal deployment | Model answers without seeing image evidence |
| Vision interpretation | Result type | Description, classification, detected content | Depends on prompt and model | Image quality and capability | Wrong labels or missing visual reasoning |
| Image generation | Output artifact | Generated image from prompt | No image until model call | Generative image model and safety filters | No visual asset is produced |
| Safety filter | Content policy response | Allowed, revised, or blocked | Service controlled | Prompt and policy | Request is blocked or altered without handling |
Suggested Lab Validation Ideas:
The following paths and commands are conceptual lab-style examples for practice. Adapt them to the current Microsoft documentation, SDK/API version, subscription permissions, and project environment before using them in a real implementation.
Portal path: Microsoft Foundry > Playground > multimodal test - portal validation that image plus prompt input is accepted.
Python SDK rehearsal: python vision_prompt.py --image meter.jpg --question 'What reading is visible?' - local validation for visual interpretation.
Python SDK rehearsal: python image_generate.py --prompt 'product mockup on white background' - local validation for generated image output.
Vision interpretation sends existing pixels to a model that can inspect visual features and combine them with the text instruction. Image generation starts with text and returns new pixels. The underlying request shape is therefore different. A correct exam choice follows the direction of transformation: image-to-answer for vision, prompt-to-image for generation.
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Validate multimodal input | Run a visual prompt test with an attached image | Response uses visible image evidence rather than generic text |
| Validate generated asset | Inspect image generation response or output file path | A new image artifact is returned or a policy response is handled |
| Validate workload split | Compare scenario verb: interpret, describe, detect, generate, create | Selected service direction matches the requested transformation |
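
The difference in request shape between the two directions can be illustrated with plain dictionaries. The payload keys below (`direction`, `image_base64`, `prompt`) and the helper names are hypothetical and do not reproduce any real Foundry or SDK request format; the only point is that a vision request must carry the existing image, while a generation request carries text alone.

```python
import base64


def build_vision_request(image_bytes: bytes, question: str) -> dict:
    """Image-to-answer: the request carries the existing image and a prompt."""
    return {
        "direction": "image-to-answer",
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "prompt": question,
    }


def build_generation_request(prompt: str) -> dict:
    """Prompt-to-image: the request carries only text; pixels come back."""
    return {"direction": "prompt-to-image", "prompt": prompt}


def carries_image(request: dict) -> bool:
    """Check that image evidence is actually attached to the request."""
    return bool(request.get("image_base64"))


vision = build_vision_request(b"fake image bytes", "What reading is visible?")
generation = build_generation_request("product mockup on white background")
print(carries_image(vision), carries_image(generation))  # True False
```

A check like `carries_image` guards against the failure state listed earlier, where the model answers without seeing any image evidence.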
Practice question: A lightweight app must process recorded service calls and extract customer name, requested product, issue category, and promised follow-up date. What should the developer validate first?
A. That an image generation model can create a visual summary
B. That text-to-speech produces a natural voice
C. That a chat prompt can produce a paragraph summary
D. That Content Understanding returns the required structured fields from the audio
Correct Answer: D
Explanation: D is correct because the scenario requires structured extraction from audio. A and B target visual or spoken output. C may summarize but does not guarantee field-level output for the app.
Content Understanding questions are schema-evidence questions. The operational object is not a paragraph summary; it is the repeatable extraction result that an application can map into fields, tables, confidence checks, and review workflows.
The drill is to design from the target fields backward. If the app needs customer name, issue category, line items, dates, or amounts, the correct answer must preserve schema, status, confidence, and client mapping rather than only producing natural-language output.
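Designing from the target fields backward can be sketched as a coverage check. The field names below are snake_case versions of the fields in the service-call scenario; the `ExtractedField` type, the 0.0 to 1.0 confidence scale, and the sample result are assumptions made for illustration and do not reflect the actual Content Understanding response format.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    value: str
    confidence: float  # assumed to be reported on a 0.0 to 1.0 scale


# Target fields from the scenario; the exact names are hypothetical.
REQUIRED_FIELDS = ("customer_name", "requested_product", "issue_category", "follow_up_date")


def missing_fields(result: dict[str, ExtractedField]) -> list[str]:
    """Return required fields that are absent or empty in an extraction result."""
    return [
        name for name in REQUIRED_FIELDS
        if name not in result or not result[name].value.strip()
    ]


# Hypothetical extraction result with one required field missing.
sample = {
    "customer_name": ExtractedField("A. Rivera", 0.97),
    "requested_product": ExtractedField("router", 0.91),
    "issue_category": ExtractedField("connectivity", 0.88),
}
print(missing_fields(sample))  # ['follow_up_date']
```

A paragraph summary would fail this check entirely, which is why the structured-extraction option is the one that satisfies the scenario.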
For this table, think like an app developer: the key question is whether the tool returns fields your program can store, check, and send for review.
| Object | Attribute | Value Range | Default State | Dependency | Failure State |
|---|---|---|---|---|---|
| Extraction schema | Target fields | Names, types, tables, confidence | Absent until defined | Business output contract | Response is unstructured and cannot populate the app |
| Content source | Media type | Document, image, audio, video | Not uploaded or referenced | Tool support and file quality | No extraction or partial extraction |
| Analyzer/test run | Validation state | Succeeded, failed, needs review | Untested | Sample content and schema | Fields are missing before client integration |
| Client mapping | Field binding | JSON field to app property | Manual or absent | Extraction response | Application stores wrong values or drops confidence signals |
Suggested Lab Validation Ideas:
The following paths and commands are conceptual lab-style examples for practice. Adapt them to the current Microsoft documentation, SDK/API version, subscription permissions, and project environment before using them in a real implementation.
Portal path: Microsoft Foundry > Tools > Content Understanding - portal validation for analyzer/schema and test results.
Python SDK rehearsal: python extract_content.py --file invoice.pdf - local client validation for structured extraction; adapt to current Content Understanding SDK/API surface.
API rehearsal: inspect JSON response fields, confidence scores, and status - response-shape validation rather than deployment command.
The app sends a content item to an analyzer that applies a schema or extraction objective. The service processes the media, returns structured fields with status and confidence evidence, and the client maps those fields into business objects. If the answer uses generic summarization, the chain loses schema, confidence, and repeatable field binding.
| Task | Precise Command or Path | Verification Standard |
|---|---|---|
| Validate schema coverage | Microsoft Foundry > Tools > Content Understanding analyzer view | All required business fields are represented in the analyzer or extraction output |
| Validate extraction result | Run a sample file and inspect JSON/status output | Expected fields, values, and confidence indicators are present |
| Validate client mapping | Review application mapping from extraction response to data model | Each required field is stored correctly and low-confidence cases are handled |
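
The client mapping row can be sketched as a function that binds response fields to an application record and flags low-confidence values for review. The response shape (`status`, `fields`, `value`, `confidence`), the `ServiceCallRecord` type, and the 0.80 review threshold are all hypothetical; a real response must be inspected against the current API documentation before writing this mapping.

```python
from dataclasses import dataclass, field

REVIEW_THRESHOLD = 0.80  # hypothetical cut-off for human review


@dataclass
class ServiceCallRecord:
    values: dict[str, str] = field(default_factory=dict)
    needs_review: list[str] = field(default_factory=list)


def map_extraction(response: dict) -> ServiceCallRecord:
    """Bind each extracted field to the record and keep the confidence signal."""
    record = ServiceCallRecord()
    for name, item in response["fields"].items():
        record.values[name] = item["value"]
        # Low-confidence values are stored but flagged, not silently trusted.
        if item["confidence"] < REVIEW_THRESHOLD:
            record.needs_review.append(name)
    return record


# Hypothetical extraction response.
response = {
    "status": "Succeeded",
    "fields": {
        "customer_name": {"value": "A. Rivera", "confidence": 0.97},
        "follow_up_date": {"value": "2025-03-14", "confidence": 0.55},
    },
}
record = map_extraction(response)
print(record.values, record.needs_review)
```

Keeping `needs_review` on the record is one way to avoid the failure state where the application drops confidence signals during mapping.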
When building a Foundry chat solution, what should be validated before expanding prompts or adding more examples?
Validate that the selected deployment can be invoked successfully and supports the required chat or agent capability.
Implementation questions usually start with a working dependency chain: deployment, endpoint, authentication, and supported capability. Prompt examples can improve behavior only after the app can call the correct model. In Foundry scenarios, a minimal test invocation is strong evidence that the configuration path is ready.
Demand Score: 92
Exam Relevance Score: 97
When should a Foundry single-agent solution be used instead of a simple one-turn chat response?
Use a single-agent solution when the scenario requires goal-driven interaction, tool use, or multi-step task handling within a controlled workflow.
A chat response is appropriate for direct text generation or question answering. An agent is more suitable when the solution must coordinate steps, use configured tools, or follow a task flow. AI-901 candidates should look for workflow cues rather than assuming every conversational requirement is just a basic chat completion.
Demand Score: 86
Exam Relevance Score: 93
What is the best Foundry workload choice when an application must convert spoken customer audio into text for analysis?
Use a speech-to-text capability, then process the resulting transcript as needed.
The required transformation is audio input to text output. Text-to-speech would solve the opposite problem, and generic chat does not by itself transcribe audio. Exam stems that use verbs such as transcribe, dictate, or capture spoken input usually indicate speech-to-text before later text analysis or summarization.
Demand Score: 89
Exam Relevance Score: 95
How should a learner choose between image generation and vision analysis in Foundry?
Choose image generation to create new images, and choose vision analysis or a multimodal model to interpret existing images.
The difference is the direction of the task. If the app needs a generated visual from a prompt, image generation fits. If it needs to detect, describe, classify, or answer questions about an uploaded image, a vision or multimodal capability is required. This distinction is a common AI-901 workload-pattern trap.
Demand Score: 88
Exam Relevance Score: 96
What should be used when a Foundry solution must return invoice date, vendor name, line items, and totals from uploaded documents?
Use Azure Content Understanding or an information extraction workflow with a schema for the required fields.
The output requirement is structured data from documents, not a freeform summary. Content Understanding and extraction workflows are designed to map document content into fields, tables, and entities. A generic chat answer may describe the invoice but would not reliably satisfy the business output contract.
Demand Score: 94
Exam Relevance Score: 98
Why should deterministic settings be considered for classification or extraction tasks in Foundry?
They help produce more consistent outputs when the task requires stable labels, fields, or decisions.
Classification and extraction usually need repeatable results, so high randomness can make outputs inconsistent. Creative generation can benefit from more variation, but structured tasks are judged by reliability and format adherence. AI-901 scenarios may frame this as choosing parameters that match the task purpose after the model capability is already correct.
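The effect of randomness on label stability can be illustrated with a temperature-scaled softmax, a common way sampling randomness is controlled in language models. This is a generic numerical sketch with made-up scores; it does not show how any particular Foundry deployment exposes or applies its parameters.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate labels.
logits = [2.0, 1.0, 0.5]
for temperature in (1.0, 0.2):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At the lower temperature almost all probability mass sits on the top-scoring label, so repeated runs pick the same label far more often, which is the consistency that classification and extraction tasks need.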
Demand Score: 85
Exam Relevance Score: 92