Training and deploying machine learning models are crucial stages of the ML lifecycle. Training teaches a model to make predictions from input data; deployment makes the trained model accessible for real-world use.
In this section, we will explore the following topics in detail: model training (data splitting, gradient descent, regularization, cross-validation, and evaluation) and model deployment (deployment strategies, versioning, monitoring, and scaling).
Model training is the process of teaching a machine learning model to learn patterns in data so that it can make predictions. During training, the model's parameters are adjusted to minimize prediction error, as measured by a loss function. Let's break down the essential components of model training.
Before training a model, the dataset needs to be divided into three parts: a training set (used to fit the model), a validation set (used to tune hyperparameters and detect overfitting), and a test set (held out for the final, unbiased performance estimate).
from sklearn.model_selection import train_test_split

# Assume X (features) and y (labels) are defined
# First hold out 30% of the data, then split that half-and-half into validation and test sets
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

print(f"Training set size: {len(X_train)}")
print(f"Validation set size: {len(X_val)}")
print(f"Test set size: {len(X_test)}")
How a machine learning model is trained depends on whether the data is labeled (supervised learning) or unlabeled (unsupervised learning).
Gradient Descent is a popular optimization algorithm that minimizes the loss function by iteratively adjusting the model's parameters. At each iteration, it computes the gradient of the loss with respect to the parameters and then moves the parameters a small step in the opposite direction of the gradient, scaled by the learning rate.
There are different types of gradient descent based on how much data is used for each update: batch gradient descent (the entire dataset per update), stochastic gradient descent (a single example per update), and mini-batch gradient descent (a small batch of examples per update).
Example: Gradient Descent Update
# Pseudo-code for a single gradient descent weight update
learning_rate = 0.01
weights = weights - learning_rate * gradient
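To make the update rule concrete, here is a small self-contained sketch of batch gradient descent fitting a linear model y ≈ w·x + b under mean squared error. The toy data, learning rate, and iteration count are illustrative choices, not prescriptions:

```python
import numpy as np

# Toy data: y = 2x + 1 plus a little noise
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(100, 1))
y = 2 * X[:, 0] + 1 + rng.normal(0, 0.05, size=100)

# Parameters start at zero
w, b = 0.0, 0.0
learning_rate = 0.1

for _ in range(500):
    preds = w * X[:, 0] + b
    error = preds - y
    # Gradients of mean squared error with respect to w and b
    grad_w = 2 * np.mean(error * X[:, 0])
    grad_b = 2 * np.mean(error)
    # Step in the opposite direction of the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"Learned w={w:.2f}, b={b:.2f}")  # should approach w = 2, b = 1
```

The same update rule underlies the stochastic and mini-batch variants; they differ only in how many examples are used to estimate the gradient at each step.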
Regularization is used to prevent overfitting, which happens when the model learns the training data too well, including noise. Regularization adds a penalty to the loss function to reduce the complexity of the model, making it more generalizable.
Some popular regularization techniques include L1 regularization (lasso), which adds the sum of absolute coefficient values to the loss and can drive irrelevant coefficients to exactly zero; L2 regularization (ridge), which adds the sum of squared coefficient values and shrinks all coefficients; and, for neural networks, dropout and early stopping.
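As a quick illustration of L1 versus L2 regularization in scikit-learn (the synthetic data and alpha values are illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
# Only the first two features actually matter
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, size=50)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: shrinks all coefficients
lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty: zeroes out irrelevant ones

print("OLS coefficients:  ", np.round(ols.coef_, 2))
print("Ridge coefficients:", np.round(ridge.coef_, 2))
print("Lasso coefficients:", np.round(lasso.coef_, 2))
```

Comparing the printed coefficients shows the characteristic behavior: ridge shrinks everything a little, while lasso pushes the coefficients of the eight irrelevant features toward zero.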
Cross-validation is used to evaluate how well the model generalizes to unseen data. The most common methods are:
K-Fold Cross-Validation: Splits the data into K subsets or "folds." For each fold, the model is trained on the other K-1 folds and tested on the current fold. This ensures that the model is evaluated on all parts of the data.
Leave-One-Out Cross-Validation (LOOCV): A special case of cross-validation where K is set equal to the number of data points. This method is computationally expensive but can be useful for small datasets.
Example: K-Fold Cross-Validation
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
scores = cross_val_score(model, X, y, cv=5)
print(f"Cross-validation scores: {scores}")
Once the model is trained, it needs to be evaluated to understand how well it performs. Evaluation metrics vary depending on whether the task is classification or regression:
For Classification: accuracy, precision, recall, F1-score, and ROC AUC.
For Regression: mean squared error (MSE), mean absolute error (MAE), and the R² score.
Example: Model Evaluation
from sklearn.metrics import accuracy_score

# Assume predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy}")
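For regression tasks, the corresponding scikit-learn metrics can be computed in the same way; the values below are purely illustrative:

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Illustrative true and predicted values for a regression task
y_true = [3.0, 2.5, 4.0, 5.5]
y_pred = [2.8, 2.7, 4.2, 5.0]

mse = mean_squared_error(y_true, y_pred)   # penalizes large errors quadratically
mae = mean_absolute_error(y_true, y_pred)  # average absolute deviation
r2 = r2_score(y_true, y_pred)              # proportion of variance explained
print(f"MSE: {mse:.4f}, MAE: {mae:.4f}, R^2: {r2:.4f}")
```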
After the model has been trained and evaluated, the next step is deployment. This involves making the model accessible for use in real-world applications.
There are different strategies to deploy machine learning models depending on the application and resources available:
There are various ways to deploy models on cloud platforms like Azure:
Azure Kubernetes Service (AKS): A robust solution for deploying machine learning models as Docker containers, suitable for large-scale and high-availability applications.
Azure App Service: A simpler option for deploying models as RESTful APIs that can be accessed via HTTP requests, suitable for smaller-scale applications.
Azure Container Instances (ACI): A lightweight option for small-scale, quick deployments. Useful for less complex models.
Azure Functions: A serverless platform for deploying models in event-driven applications where you don’t need to manage the underlying infrastructure.
Once your model is deployed, it's essential to ensure it continues to perform well over time. This involves model versioning and monitoring.
Model Versioning: As you improve and update your model, it's crucial to track different versions of the model. This allows you to roll back to an earlier version if a new one underperforms, compare performance across versions, and reproduce past results.
In platforms like Azure Machine Learning, model versioning allows you to store and manage different iterations of your model.
Model Monitoring: Over time, a model's performance may degrade due to changes in the underlying data. For example, if new data has a different distribution than the data the model was trained on, it can cause performance issues. Continuous monitoring involves tracking prediction quality and service health over time, detecting data drift between training and production data, and alerting when performance falls below an acceptable threshold.
In Azure, there are built-in tools that monitor the performance of deployed models. Additionally, Azure can trigger a re-training process automatically when performance drops, ensuring the model stays up-to-date.
As your model is deployed and starts receiving requests, you'll likely face the challenge of scaling. Scaling ensures that your model can handle increasing workloads without performance degradation.
Scaling: This involves increasing the computational resources (e.g., more CPU, RAM, or GPUs) to handle more traffic or process more data. In cloud platforms like Azure, you can scale vertically, by moving to machines with more capacity, or horizontally, by adding more instances behind a load balancer.
Scaling is crucial when your model needs to handle large-scale, high-throughput applications like real-time predictions for millions of users.
Cost Management: Managing the cost of your deployment is important, especially when you're scaling. Azure provides tools that let you monitor resource usage and spending, set budgets and cost alerts, and scale resources down when demand is low.
You can set up alerts and automatic scaling to ensure your model is running efficiently without unnecessary overhead.
Training and deploying machine learning models is a multi-step process that involves not just building an effective model but also preparing it for real-world usage. Here's a recap of the key points covered:
Model Training: splitting data into training, validation, and test sets; optimizing parameters with gradient descent; preventing overfitting with regularization; validating with cross-validation; and evaluating with task-appropriate metrics.
Model Deployment: choosing a deployment option (AKS, App Service, ACI, or Azure Functions); versioning and monitoring the deployed model; and scaling while managing cost.
By following these steps and strategies, you'll be able to not only build a well-performing machine learning model but also deploy it efficiently, ensuring its usefulness in real-world applications.
Deploying a machine learning model in Azure ML involves two main steps:
Model Registration: Saving the trained model in the Azure ML workspace for reuse or deployment.
Model Deployment: Making the registered model accessible as a web service.
Once a model is trained and saved (e.g., as a .pkl file), you can register it in your workspace.
from azureml.core.model import Model

model = Model.register(
    workspace=ws,
    model_path="outputs/model.pkl",  # Path to the model file
    model_name="credit_model"        # Name to register the model as
)
workspace: The Azure ML workspace where the model is stored.
model_path: Path to the model artifact.
model_name: Logical name for the model version control.
You need to define how the model will process incoming data.
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig

myenv = Environment.get(workspace=ws, name="my-environment")

inference_config = InferenceConfig(
    entry_script="score.py",  # Contains init() and run() methods
    environment=myenv
)
score.py should implement init() (model loading) and run() (prediction).
myenv contains environment specs (e.g., Conda dependencies, Python version).
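A minimal score.py along these lines might look as follows. Note this is a sketch: the model filename and the {"data": [...]} payload shape are assumptions that must match how you registered the model and how clients will call the service:

```python
import json
import os

import joblib


def init():
    # Called once when the service starts: load the registered model.
    # AZUREML_MODEL_DIR is set by Azure ML to the model's mount path.
    global model
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR", "."), "model.pkl")
    model = joblib.load(model_path)


def run(raw_data):
    # Called for every scoring request: parse JSON, predict, return JSON.
    try:
        data = json.loads(raw_data)["data"]
        predictions = model.predict(data)
        return json.dumps({"predictions": predictions.tolist()})
    except Exception as e:
        return json.dumps({"error": str(e)})
```

Returning errors as JSON (rather than raising) makes failed requests easier to diagnose from the client side.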
For light, cost-effective deployments:
from azureml.core.webservice import AciWebservice

deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(
    workspace=ws,
    name="credit-service",
    models=[model],
    inference_config=inference_config,
    deployment_config=deployment_config
)
service.wait_for_deployment(show_output=True)
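Once deployed, the service can be called with a JSON payload. A hypothetical example follows; the payload shape and feature values are assumptions that must match your entry script, and the live call is commented out because it requires the deployed service object:

```python
import json

# Hypothetical payload; the {"data": [...]} shape must match what score.py expects
payload = json.dumps({"data": [[0.2, 0.4, 0.1, 0.7]]})

# With a live service object returned by Model.deploy:
# predictions = service.run(payload)
# print(predictions)
print(payload)
```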
The DP-100 exam regularly tests your knowledge of model deployment workflows in Azure.
Understanding CLI/Python-based deployment gives you flexibility in real-world projects and exams.
In real applications, model performance can degrade over time (a phenomenon called data drift). Azure ML supports automated retraining pipelines that can be triggered when monitored metrics (like accuracy) fall below a threshold.
Set up model performance monitoring
Define the retraining pipeline
Set up an automated trigger
Re-deploy the updated model
Ensures the model remains accurate as data evolves.
Enables MLOps workflows for continuous integration and delivery (CI/CD) of ML models.
Helps meet business SLAs by reducing manual retraining overhead.
Suppose your deployed service tracks daily accuracy:
If accuracy < 0.88 → Azure Monitor triggers an event.
An Azure Logic App or Function runs the pipeline: preprocessing → training → evaluation.
New model replaces the old one in the production endpoint.
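The trigger logic in this scenario can be sketched in plain Python; retrain_pipeline and deploy_model below are hypothetical stand-ins for your actual Azure ML pipeline run and deployment calls:

```python
ACCURACY_THRESHOLD = 0.88  # from the scenario above


def check_and_retrain(daily_accuracy, retrain_pipeline, deploy_model):
    """Trigger retraining when monitored accuracy falls below the threshold."""
    if daily_accuracy < ACCURACY_THRESHOLD:
        new_model = retrain_pipeline()  # preprocessing -> training -> evaluation
        deploy_model(new_model)         # replace the model at the production endpoint
        return True
    return False


# Simulate the monitor with stand-in callables
triggered = check_and_retrain(0.85, lambda: "model_v2", lambda m: None)
print(f"Retraining triggered: {triggered}")
```

In Azure, the monitoring side of this logic would live in Azure Monitor, with a Logic App or Function invoking the pipeline, as described above.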
What advantage do Azure ML training pipelines provide compared to standalone training scripts?
Training pipelines enable automated, reusable, and scalable ML workflows consisting of multiple steps.
Pipelines allow data preparation, training, evaluation, and registration tasks to be defined as separate steps connected in a workflow. Each step can run on different compute resources and can be reused across experiments.
This modular approach improves maintainability and automation, particularly for production ML workflows. Standalone scripts typically run a single training process and lack orchestration capabilities.
Demand Score: 82
Exam Relevance Score: 88
What is the primary purpose of registering a model in Azure Machine Learning?
Model registration stores and versions trained models so they can be deployed, tracked, and reused.
After training, a model artifact is registered in the Azure ML model registry. This process assigns a version number and metadata, allowing teams to manage multiple model versions and maintain reproducibility.
Registered models can be easily deployed to endpoints or referenced in pipelines. Without registration, model artifacts remain temporary outputs of experiment runs.
Demand Score: 76
Exam Relevance Score: 84
What is a managed online endpoint in Azure Machine Learning?
A managed online endpoint is a fully managed REST API endpoint used for real-time model inference.
Managed online endpoints host deployed models and automatically handle scaling, load balancing, and infrastructure management. They allow applications to send requests to a REST API and receive predictions in real time.
Azure ML manages container deployment, monitoring, and scaling policies, which simplifies operational management compared to manual infrastructure setups.
Demand Score: 86
Exam Relevance Score: 90
How does batch deployment differ from online deployment in Azure Machine Learning?
Batch deployment processes large datasets asynchronously, while online deployment handles real-time prediction requests.
Batch endpoints run inference jobs on stored data such as files or tables and return predictions after processing is complete. They are commonly used for large-scale scoring tasks such as generating predictions for thousands of records.
Online endpoints respond instantly to API requests and are designed for interactive applications requiring low latency.
Demand Score: 80
Exam Relevance Score: 87
Why should model evaluation steps be included in an Azure ML pipeline before deployment?
Evaluation steps ensure that only models meeting defined performance criteria are deployed.
Including evaluation stages allows automated validation of model metrics such as accuracy, precision, or recall. Pipelines can include conditional steps that register or deploy models only if metrics exceed predefined thresholds.
This prevents poorly performing models from reaching production and supports automated CI/CD for machine learning workflows.
Demand Score: 73
Exam Relevance Score: 82
Why is containerization used when deploying Azure ML models?
Containerization ensures consistent runtime environments for model inference.
Azure ML packages models together with dependencies into Docker containers. This guarantees that the same libraries and runtime environment used during development are available during deployment.
Containers also simplify scaling and orchestration because they can run across multiple nodes in a managed environment.
Demand Score: 69
Exam Relevance Score: 78