Machine Learning Operations (MLOps): Microsoft Azure

Sabyasachi Ghosh
Dec 26, 2022

Photo by Patrick Robert Doyle on Unsplash

MLOps or ML Ops is a set of practices that aims to deploy and maintain machine learning models in production reliably and efficiently. The word is a compound of “machine learning” and the continuous development practice of DevOps in the software field.

MLOps cycle

This project was made by following the excellent YouTube playlist by MG.

YouTube link: https://www.youtube.com/playlist?list=PLiQS6N-W1p3m9squzZ2cPgGdH5SBhjY6f

GitHub link (MG): https://github.com/MG-Microsoft/MLOps_Workshop.git

GitHub link (me): https://github.com/Sabyasachi6215/MLOPs.git

Agenda:

  1. Create an Azure DevOps project.
  2. Create a resource group, variable group, and service connection in Azure.
  3. Infrastructure as code.
  4. MLOps CI pipeline.
  5. Create the CD pipeline.

Create an Azure DevOps project:

Create a resource group:

Create a service connection:

Project Settings → Service connections

Create a variable group:

It can be found in Azure DevOps (Pipelines → Library).

Create: infrastructure as code

Step 1: Create the pipeline:

Step 2: Configure the pipeline:

This pipeline is created by importing an existing YAML file from the repo.

The infrastructure-as-code pipeline can be found at:

environment_setup/iac-create-environment-pipeline-arm.yml

In the file "iac-create-environment-pipeline-arm.yml", only the "group" entry under variables has to be changed:

  • e.g. → group: mlops-wsh-vg (put your variable group name here)

# Output after running the infrastructure pipeline.

After running the pipeline, check the result under Resource groups in the Azure portal.

The following infrastructure will be created:

Infrastructure created

Open the Azure Machine Learning workspace; in this case it is shown as MLOPSws1aml8.

Create and configure a compute:

Select a sufficiently large compute, because the deployment will use Kubernetes, which needs a higher configuration.

Create a CI pipeline

Creating the CI pipeline requires two steps:

a) Create the agent:

We will use the classic editor to create a pipeline without YAML.

Select the Azure Repos Git repository that was already pushed.
Select an empty job.

# Installing requirements.txt

# Bash script → choose the script path: "package_requirement/install_requirements.sh"

# Testing the files:

Script:

pytest training/train_test.py --doctest-modules --junitxml=junit/test-results.xml --cov=data_test --cov-report=xml --cov-report=html

# Publishing the test results:
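The --junitxml flag in the test command makes pytest write a JUnit-style XML report, and the publish step uploads that file. As a rough illustration of what the report carries, the Python sketch below tallies the counters in such a file. The sample XML is hypothetical and trimmed (it is not output from this project), and real reports contain more attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed pytest JUnit report used only for illustration.
SAMPLE_REPORT = """<testsuites>
  <testsuite name="pytest" errors="0" failures="1" skipped="0" tests="3">
    <testcase classname="training.train_test" name="test_one"/>
    <testcase classname="training.train_test" name="test_two"/>
    <testcase classname="training.train_test" name="test_three">
      <failure message="assertion failed"/>
    </testcase>
  </testsuite>
</testsuites>"""


def summarize_junit(xml_text):
    """Sum the tests/failures/errors/skipped counters over all test suites."""
    root = ET.fromstring(xml_text)
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    # iter() also yields the root element itself, so this handles both a
    # <testsuites> wrapper and a bare <testsuite> root.
    for suite in root.iter("testsuite"):
        for key in totals:
            totals[key] += int(suite.get(key, 0))
    return totals


print(summarize_junit(SAMPLE_REPORT))
```

In the pipeline itself the file to read would be junit/test-results.xml, the path given to --junitxml above.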

# Installing the Azure CLI ML extension

Script:

az extension add -n azure-cli-ml

# Creating the Azure ML workspace

Script:

az ml workspace create -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -l $(azureml.location) --exist-ok --yes

Azure ML workspace

# Azure ML compute cluster

Script:

az ml computetarget create amlcompute -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(amlcompute.clusterName) -s $(amlcompute.vmSize) --min-nodes $(amlcompute.minNodes) --max-nodes $(amlcompute.maxNodes) --idle-seconds-before-scaledown $(amlcompute.idleSecondsBeforeScaledown)

# Upload data to the datastore

Script:

az ml datastore upload -w $(azureml.workspaceName) -g $(azureml.resourceGroup) -n $(az ml datastore show-default -w $(azureml.workspaceName) -g $(azureml.resourceGroup) --query name -o tsv) -p data -u insurance --overwrite true

# Create the metadata and models directories

Script:

mkdir metadata && mkdir models

# Training the model

Script:

az ml run submit-script -g $(azureml.resourceGroup) -w $(azureml.workSpaceName) -e $(experiment.name) --ct $(amlcompute.clusterName) -d conda_dependencies.yml -c train_insurance -t ../metadata/run.json train_aml.py

# Azure model registry

Script:

az ml model register -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(model.name) -f metadata/run.json --asset-path outputs/models/insurance_model.pkl -d "Classification Model for filing a claim prediction" --tag "data"="insurance" --tag "model"="classification" --model-framework ScikitLearn -t metadata/model.json

Model registry

# Downloading the model:

Script:

az ml model download -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -i $(jq -r .modelId metadata/model.json) -t ./models --overwrite

Downloading the model
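The download command gets the model ID by running jq against metadata/model.json, the file written by the registration step. For readers who do not know jq, here is a minimal Python sketch of the same lookup. Only the modelId key comes from the command above; the sample value and the other field are hypothetical.

```python
import json
from pathlib import Path


def parse_model_id(metadata_text):
    """Return the modelId field from the JSON text, like `jq -r .modelId`."""
    return json.loads(metadata_text)["modelId"]


def read_model_id(metadata_path):
    """Read a metadata file such as metadata/model.json and return its modelId."""
    return parse_model_id(Path(metadata_path).read_text(encoding="utf-8"))


# Hypothetical file content; the real file is produced by `az ml model register -t`.
sample = '{"modelId": "insurance_model:1", "workspaceName": "example-ws"}'
print(parse_model_id(sample))
```

If the file does not contain modelId, this sketch raises a KeyError. jq would print null instead, and the download step would then fail with a less obvious error.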

# Copy the files:

Contents:

  • **/metadata/*
    **/models/*
    **/deployment/*
    **/tests/integration/*
    **/package_requirement/*

# Publish Pipeline Artifact :

b) Configure variable names in pipeline:

# Below is an example :
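The CI scripts above reference a fixed set of pipeline variables through the $(name) syntax, and each of them needs a value on the pipeline's Variables tab. Azure Pipelines also exposes non-secret variables to scripts as environment variables, with the name upper-cased and dots replaced by underscores. The Python sketch below lists the variables used in this article and checks which ones are missing. The helper names are made up for illustration and are not part of the project.

```python
import os

# Pipeline variables referenced by the CI scripts in this article.
CI_VARIABLES = [
    "azureml.resourceGroup",
    "azureml.workspaceName",
    "azureml.location",
    "amlcompute.clusterName",
    "amlcompute.vmSize",
    "amlcompute.minNodes",
    "amlcompute.maxNodes",
    "amlcompute.idleSecondsBeforeScaledown",
    "experiment.name",
    "model.name",
]


def env_name(variable):
    """Map a pipeline variable name to its environment variable name."""
    return variable.replace(".", "_").upper()


def missing_variables(variables, environ):
    """Return the pipeline variables that have no non-empty value in environ."""
    return [name for name in variables if not environ.get(env_name(name))]


if __name__ == "__main__":
    # On a build agent this prints an empty list once everything is configured.
    print(missing_variables(CI_VARIABLES, os.environ))
```

Running a check like this as an early script step is optional, but it fails fast with a readable list instead of a cryptic CLI error later in the job.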

# Unit tests passed:

# After running the CI pipeline:

Create a Deployment Pipeline

There are 3 parts:

a) Add artifacts: add them from the CI pipeline

b) Deploy to staging

c) Deploy to production

# Create a release pipeline:

The entire deployment architecture

Deploy to the staging area

# Adding Python to the agent:

# Add the ML extension:

Script:

az extension add -n azure-cli-ml

# Deploy to Azure Container Instances:

Script:

az ml model deploy -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(service.name.staging) -f ../metadata/model.json --dc aciDeploymentConfigStaging.yml --ic inferenceConfig.yml --overwrite

# Bash: install requirements:

# Staging test:

Script:

pytest staging_test.py --doctest-modules --junitxml=junit/test-results.xml --cov-report=xml --scoreurl $(az ml service show -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(service.name.staging) --query scoringUri -o tsv)
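The --scoreurl option in the command above is not a built-in pytest flag. Custom options like this are normally registered in a conftest.py next to the tests, using pytest's pytest_addoption hook. The sketch below shows one way to do that. It is an assumption about how such options can be wired up, not a copy of the repo's actual conftest.py, and get_score_url is a hypothetical helper.

```python
# conftest.py (sketch): register the custom command-line options.


def pytest_addoption(parser):
    """Hook that pytest calls at startup so conftest.py can add options."""
    parser.addoption(
        "--scoreurl",
        action="store",
        default="",
        help="Scoring URI of the deployed web service",
    )
    parser.addoption(
        "--scorekey",
        action="store",
        default="",
        help="Authentication key, used for the production deployment",
    )


def get_score_url(config):
    """Return the --scoreurl value, failing loudly if it was not passed."""
    url = config.getoption("--scoreurl")
    if not url:
        raise ValueError("--scoreurl was not provided on the command line")
    return url
```

Inside a test, the value can then be read through pytest's built-in request fixture, for example with get_score_url(request.config).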

# Publish staging test results:

b) Configure variable names in pipeline:

# Below is an example:

# After deployment to the staging area:

Deploy to the prod area:

# Use Python installation:

# Install the CLI ML extension:

Script:

az extension add -n azure-cli-ml

# Create AKS (Kubernetes)

Script:

az ml computetarget create aks -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(aks.clusterName) -s $(aks.vmSize) -a $(aks.agentCount)

# Deploy to AKS

Script:

az ml model deploy -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(service.name.prod) -f ../metadata/model.json --dc aksDeploymentConfigProd.yml --ic inferenceConfig.yml --ct $(aks.clusterName) --overwrite

# Install Python requirements

# Production test

Script:

pytest prod_test.py --doctest-modules --junitxml=junit/test-results.xml --cov=integration_test --cov-report=xml --cov-report=html --scoreurl $(az ml service show -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(service.name.prod) --query scoringUri -o tsv) --scorekey $(az ml service get-keys -g $(azureml.resourceGroup) -w $(azureml.workspaceName) -n $(service.name.prod) --query primaryKey -o tsv)
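The production test receives both a scoring URI and a primary key, because the AKS service is called with key-based authentication, whereas the staging command passes only the URI. A key-authenticated Azure ML web service expects the key in an "Authorization: Bearer" header. The Python sketch below shows how such a request could be built with the standard library. The payload shape is hypothetical, since it depends on the scoring script in the deployment folder, and the function names are made up for illustration.

```python
import json
import urllib.request


def build_score_request(score_url, payload, score_key=None):
    """Build a POST request for the scoring endpoint; add the key if given."""
    headers = {"Content-Type": "application/json"}
    if score_key:
        # Key-based auth: the primary key is sent as a bearer token.
        headers["Authorization"] = "Bearer " + score_key
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(score_url, data=body, headers=headers, method="POST")


def score(score_url, payload, score_key=None, timeout=30):
    """Send the request and decode the JSON response (requires a live service)."""
    request = build_score_request(score_url, payload, score_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical payload; the real input schema is defined by the scoring script.
example = build_score_request("http://example.invalid/score", {"data": [[0, 1, 2]]}, "my-key")
print(example.get_method(), example.full_url)
```

Leaving score_key as None gives the unauthenticated form of the request, which matches the staging command.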

# Publish test results:

b) Configure variable names in pipeline:

# Below is an example:

After deployment to prod:
