Machine Learning Operations (MLOps): Microsoft Azure
MLOps (or ML Ops) is a set of practices that aims to deploy and maintain machine learning models in production reliably and efficiently. The term is a compound of “machine learning” and DevOps, the continuous development practice from software engineering.
This project was built by following the excellent YouTube playlist by MG.
YouTube link: https://www.youtube.com/playlist?list=PLiQS6N-W1p3m9squzZ2cPgGdH5SBhjY6f
GitHub link (MG): https://github.com/MG-Microsoft/MLOps_Workshop.git
GitHub link (mine): https://github.com/Sabyasachi6215/MLOPs.git
Agenda :
- Creation of an Azure DevOps project.
- Creation of a resource group, variable group, and service connection in Azure.
- Infrastructure as Code.
- MLOps CI pipeline.
- Creation of the CD pipeline.
Create an Azure DevOps Project :
Create a resource group:
Create a Service Connection:
It can be found under Project Settings → Service connections.
Create a variable group:
It can be found in Azure DevOps under Pipelines → Library.
Create : Infrastructure as Code
Step 1: Create the pipeline:
Step 2: Configure the pipeline:
The Infrastructure as Code pipeline definition can be found at:
environment_setup/iac-create-environment-pipeline-arm.yml
In the file “iac-create-environment-pipeline-arm.yml”, only the “group” variable has to be changed:
- e.g. → group: mlops-wsh-vg (put the name of your variable group here)
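For reference, this is roughly how the variables section of that YAML file looks (the group name below is the example used in this project; put your own variable group name there):
variables:
- group: mlops-wsh-vg # name of the variable group created under Pipelines → Library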
# Output after running the infrastructure pipeline.
After running the pipeline, check the resource group in the Azure portal.
The following resources will be created:
Open the Azure Machine Learning workspace; in this case it is named MLOPSws1aml8.
Create and Configure a Compute Target :
Create a CI pipeline
Creating the CI pipeline requires two steps:
a) Create the agent job :
We will use the classic editor to create a pipeline without YAML.
# installing requirements.txt
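The classic-editor step simply installs the Python dependencies. A rough YAML-equivalent sketch (the requirements file path is an assumption, based on the package_requirement folder that is copied later in this pipeline):
steps:
- task: UsePythonVersion@0
  inputs:
    versionSpec: '3.x' # use a Python 3 interpreter on the agent
- script: pip install -r package_requirement/requirements.txt # path is an assumption
  displayName: 'Install requirements'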
# testing the files :
Script :
pytest training/train_test.py --doctest-modules --junitxml=junit/test-results.xml --cov=data_test --cov-report=xml --cov-report=html
# publishing the test results :
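In the classic editor this is the “Publish Test Results” task. A YAML sketch of the same step, assuming the JUnit file produced by the pytest command above (the run title is just an example):
- task: PublishTestResults@2
  condition: succeededOrFailed() # publish results even if the tests fail
  inputs:
    testResultsFormat: 'JUnit'
    testResultsFiles: '**/test-results.xml'
    testRunTitle: 'Unit tests' # title is an example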
# Installing Azure CLI
Script :
az extension add -n azure-cli-ml
# Creating the Azure ML workspace
Script :
az ml workspace create -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -l $(azureml.location) --exist-ok --yes
# Azure ML compute cluster
Script :
az ml computetarget create amlcompute -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(amlcompute.clusterName) -s $(amlcompute.vmSize) --min-nodes $(amlcompute.minNodes) --max-nodes $(amlcompute.maxNodes) --idle-seconds-before-scaledown $(amlcompute.idleSecondsBeforeScaledown)
# upload data to data store
Script :
az ml datastore upload -w $(azureml.workspaceName) -g $(azureml.resourceGroup) -n $(az ml datastore show-default -w $(azureml.workspaceName) -g $(azureml.resourceGroup) --query name -o tsv) -p data -u insurance --overwrite true
# Make metadata and models directories
Script :
mkdir metadata && mkdir models
# Training Model
Script :
az ml run submit-script -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -e $(experiment.name) --ct $(amlcompute.clusterName) -d conda_dependencies.yml -c train_insurance -t ../metadata/run.json train_aml.py
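The -d flag points to a conda environment definition. A minimal sketch of what conda_dependencies.yml might contain, assuming a scikit-learn based training script (the package list is illustrative, not the workshop's exact file):
name: train_insurance_env
channels:
  - conda-forge
dependencies:
  - python=3.8
  - pip
  - pip:
      - scikit-learn     # assumed, since the model is registered with --model-framework ScikitLearn below
      - pandas
      - azureml-defaults # Azure ML helpers needed for runs and deployment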
# Azure Model registry
Script :
az ml model register -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(model.name) -f metadata/run.json --asset-path outputs/models/insurance_model.pkl -d "Classification model for claim-filing prediction" --tag "data"="insurance" --tag "model"="classification" --model-framework ScikitLearn -t metadata/model.json
# Downloading the Model:
Script :
az ml model download -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -i $(jq -r .modelId metadata/model.json) -t ./models --overwrite
# Copy the files :
Contents:
**/metadata/*
**/models/*
**/deployment/*
**/tests/integration/*
**/package_requirement/*
# Publish Pipeline Artifact :
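In the classic editor these are the “Copy Files” and “Publish Pipeline Artifact” tasks. A rough YAML equivalent using the contents listed above (the artifact name is just an example):
- task: CopyFiles@2
  inputs:
    SourceFolder: '$(Build.SourcesDirectory)'
    Contents: |
      **/metadata/*
      **/models/*
      **/deployment/*
      **/tests/integration/*
      **/package_requirement/*
    TargetFolder: '$(Build.ArtifactStagingDirectory)'
- task: PublishPipelineArtifact@1
  inputs:
    targetPath: '$(Build.ArtifactStagingDirectory)'
    artifact: 'landing' # artifact name is an example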
b) Configure variable names in the pipeline:
# Below is an example :
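A sketch of the variable names the scripts above expect in the variable group (entered as name/value pairs in the Library); the values are placeholders, not prescriptions:
azureml.resourceGroup: mlops-wsh-rg # resource group created earlier (example value)
azureml.workspaceName: mlops-wsh-aml # Azure ML workspace name (example value)
azureml.location: westeurope # example region
amlcompute.clusterName: cpu-cluster # example compute cluster name
amlcompute.vmSize: STANDARD_DS2_V2 # example VM size
amlcompute.minNodes: 0
amlcompute.maxNodes: 2
amlcompute.idleSecondsBeforeScaledown: 300
experiment.name: insurance_classification # example experiment name
model.name: insurance_model # example model name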
# unit tests passed :
# After running the CI pipeline :
Create a Deployment Pipeline
There are three parts:
a) Add artifacts: add the artifact from the CI pipeline.
b) Deploy to staging.
c) Deploy to production.
# Create a release pipeline :
Deploy to Staging Area
# Adding Python into the agent :
# Add ML Extension:
Script :
az extension add -n azure-cli-ml
# Deploy to Azure Container instance :
Script :
az ml model deploy -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(service.name.staging) -f ../metadata/model.json --dc aciDeploymentConfigStaging.yml --ic inferenceConfig.yml --overwrite
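A minimal sketch of what aciDeploymentConfigStaging.yml could contain, assuming the Azure ML CLI (v1) deployment-config schema; the resource values are examples:
computeType: ACI
containerResourceRequirements:
  cpu: 1        # example CPU allocation for the container instance
  memoryInGB: 1 # example memory allocation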
# bash : install requirements :
# Staging Test :
Script :
pytest staging_test.py --doctest-modules --junitxml=junit/test-results.xml --cov-report=xml --scoreurl $(az ml service show -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(service.name.staging) --query scoringUri -o tsv)
# Publish Staging test results :
b) Configure variable names in the pipeline:
# Below is an example :
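Besides the CI variables, the staging stage needs the name of the staging web service; the value is only an example:
service.name.staging: insurance-service-aci # example name for the ACI web service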
# After deployment to Staging area :
Deploy to Production Area :
# use Python installation :
# Install CLI ML extension :
Script :
az extension add -n azure-cli-ml
# Create an AKS (Kubernetes) cluster
Script :
az ml computetarget create aks -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(aks.clusterName) -s $(aks.vmSize) -a $(aks.agentCount)
# Deploy to AKS
Script :
az ml model deploy -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(service.name.prod) -f ../metadata/model.json --dc aksDeploymentConfigProd.yml --ic inferenceConfig.yml --ct $(aks.clusterName) --overwrite
# install python requirements
# Production test
Script :
pytest prod_test.py --doctest-modules --junitxml=junit/test-results.xml --cov=integration_test --cov-report=xml --cov-report=html --scoreurl $(az ml service show -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(service.name.prod) --query scoringUri -o tsv) --scorekey $(az ml service get-keys -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(service.name.prod) --query primaryKey -o tsv)
# Publish test results :
b) Configure variable names in the pipeline:
# Below is an example :
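The production stage additionally needs the AKS and production service variables; the values are examples only:
aks.clusterName: aks-prod # example AKS cluster name
aks.vmSize: Standard_D3_v2 # example VM size
aks.agentCount: 3 # example node count
service.name.prod: insurance-service-aks # example name for the AKS web service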
After deployment to Prod :