Kubeflow on Azure Training Course
Kubeflow is a framework for running Machine Learning workloads on Kubernetes. TensorFlow is one of the most popular machine learning libraries. Kubernetes is an orchestration platform for managing containerized applications.
This instructor-led, live training (online or onsite) is aimed at engineers who wish to deploy machine learning workloads to the Azure cloud.
By the end of this training, participants will be able to:
- Install and configure Kubernetes, Kubeflow and other needed software on Azure.
- Use Azure Kubernetes Service (AKS) to simplify the work of initializing a Kubernetes cluster on Azure.
- Create and deploy a Kubernetes pipeline for automating and managing ML models in production.
- Train and deploy TensorFlow ML models across multiple GPUs and machines running in parallel.
- Leverage other Azure managed services to extend an ML application.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Course Outline
Introduction
- Kubeflow on Azure vs. on-premises vs. other public cloud providers
Overview of Kubeflow Features and Architecture
Overview of the Deployment Process
Activating an Azure Account
Preparing and Launching GPU-enabled Virtual Machines
Setting up User Roles and Permissions
Preparing the Build Environment
Selecting a TensorFlow Model and Dataset
Packaging Code and Frameworks into a Docker Image
Setting up a Kubernetes Cluster Using AKS
Staging the Training and Validation Data
Configuring Kubeflow Pipelines
Launching a Training Job
Visualizing the Training Job at Runtime
Cleaning up After the Job Completes
Troubleshooting
Summary and Conclusion
Requirements
- An understanding of machine learning concepts.
- Knowledge of cloud computing concepts.
- A general understanding of containers (Docker) and orchestration (Kubernetes).
- Some Python programming experience is helpful.
- Experience working with a command line.
Audience
- Data science engineers.
- DevOps engineers interested in machine learning model deployment.
- Infrastructure engineers interested in machine learning model deployment.
- Software engineers wishing to automate the integration and deployment of machine learning features with their application.
Open Training Courses require 5+ participants.
Testimonials (5)
It was very much what we asked for—and quite a balanced amount of content and exercises that covered the different profiles of the engineers in the company who participated.
Arturo Sanchez - INAIT SA
Course - Microsoft Azure Infrastructure and Deployment
I've got to try out resources that I've never used before.
Daniel - INIT GmbH
Course - Architecting Microsoft Azure Solutions
very friendly and helpful
Aktar Hossain - Unit4
Course - Building Microservices with Microsoft Azure Service Fabric (ASF)
the ML ecosystem not only MLFlow but Optuna, hyperops, docker, docker-compose
Guillaume GAUTIER - OLEA MEDICAL
Course - MLflow
The practical part, I was able to perform exercises and to test the Microsoft Azure features
Alex Bela - Continental Automotive Romania SRL
Course - Programming for IoT with Azure
Related Courses
DeepSeek: Advanced Model Optimization and Deployment
14 Hours
This instructor-led, live training in Japan (online or onsite) is aimed at advanced-level AI engineers and data scientists with intermediate-to-advanced experience who wish to enhance DeepSeek model performance, minimize latency, and deploy AI solutions efficiently using modern MLOps practices.
By the end of this training, participants will be able to:
- Optimize DeepSeek models for efficiency, accuracy, and scalability.
- Implement best practices for MLOps and model versioning.
- Deploy DeepSeek models on cloud and on-premise infrastructure.
- Monitor, maintain, and scale AI solutions effectively.
Microsoft Azure Infrastructure and Deployment
35 Hours
Architecting Microsoft Azure Solutions
14 Hours
This training enables delegates to improve their Microsoft Azure solution design skills.
After this training, delegates will understand the features and capabilities of Azure services, be able to identify trade-offs, and make decisions when designing public and hybrid cloud solutions.
During the training, delegates will define the appropriate infrastructure and platform solutions to meet functional, operational, and deployment requirements throughout the solution life cycle.
Building Microservices with Microsoft Azure Service Fabric (ASF)
21 Hours
This instructor-led, live training in Japan (online or onsite) is aimed at developers who wish to learn how to build microservices on Microsoft Azure Service Fabric (ASF).
By the end of this training, participants will be able to:
- Use ASF as a platform for building and managing microservices.
- Understand key microservices programming concepts and models.
- Create a cluster in Azure.
- Deploy microservices on premises or in the cloud.
- Debug and troubleshoot a live microservice application.
Developing Intelligent Bots with Azure
14 Hours
The Azure Bot Service combines the power of the Microsoft Bot Framework and Azure Functions to enable rapid development of intelligent bots.
In this instructor-led, live training, participants will learn how to create an intelligent bot using Microsoft Azure.
By the end of this training, participants will be able to:
- Learn the fundamentals of intelligent bots
- Learn how to create intelligent bots using cloud applications
- Understand how to use the Microsoft Bot Framework, the Bot Builder SDK, and the Azure Bot Service
- Understand how to design bots using bot patterns
- Develop their first intelligent bot using Microsoft Azure
Audience
- Developers
- Hobbyists
- Engineers
- IT Professionals
Format of the course
- Part lecture, part discussion, exercises and heavy hands-on practice
Introduction to Azure
7 Hours
In this instructor-led, live training in Japan (onsite or remote), participants will learn the fundamental concepts, components, and services of Microsoft Azure as they step through the creation of a sample cloud application.
By the end of this training, participants will be able to:
- Understand the basics of Microsoft Azure
- Understand the different Azure tools and services
- Learn how to use Azure for building cloud applications
Programming for IoT with Azure
14 Hours
Internet of Things (IoT) is a network infrastructure that connects physical objects and software applications wirelessly, allowing them to communicate with each other and exchange data via network communications, cloud computing, and data capture. Azure is a comprehensive set of cloud services which offers an IoT Suite consisting of preconfigured solutions that help developers accelerate development of IoT projects.
In this instructor-led, live training, participants will learn how to develop IoT applications using Azure.
By the end of this training, participants will be able to:
- Understand the fundamentals of IoT architecture
- Install and configure Azure IoT Suite
- Learn the benefits of using Azure in programming IoT systems
- Implement various Azure IoT services (IoT Hub, Functions, Stream Analytics, Power BI, Cosmos DB, DocumentDB, IoT Device Management)
- Build, test, deploy, and troubleshoot an IoT system using Azure
Audience
- Developers
- Engineers
Format of the course
- Part lecture, part discussion, exercises and heavy hands-on practice
Note
- To request a customized training for this course, please contact us to arrange.
Kubeflow
35 Hours
This instructor-led, live training in Japan (online or onsite) is aimed at developers and data scientists who wish to build, deploy, and manage machine learning workflows on Kubernetes.
By the end of this training, participants will be able to:
- Install and configure Kubeflow on premises and in the cloud using AWS EKS (Elastic Kubernetes Service).
- Build, deploy, and manage ML workflows based on Docker containers and Kubernetes.
- Run entire machine learning pipelines on diverse architectures and cloud environments.
- Use Kubeflow to spawn and manage Jupyter notebooks.
- Build ML training, hyperparameter tuning, and serving workloads across multiple platforms.
Kubeflow Fundamentals
28 Hours
This instructor-led, live training in Japan (online or onsite) is aimed at developers and data scientists who wish to build, deploy, and manage machine learning workflows on Kubernetes.
By the end of this training, participants will be able to:
- Install and configure Kubeflow on premises and in the cloud.
- Build, deploy, and manage ML workflows based on Docker containers and Kubernetes.
- Run entire machine learning pipelines on diverse architectures and cloud environments.
- Use Kubeflow to spawn and manage Jupyter notebooks.
- Build ML training, hyperparameter tuning, and serving workloads across multiple platforms.
Kubeflow on AWS
28 Hours
This instructor-led, live training in Japan (online or onsite) is aimed at engineers who wish to deploy Machine Learning workloads to an AWS EC2 server.
By the end of this training, participants will be able to:
- Install and configure Kubernetes, Kubeflow and other needed software on AWS.
- Use EKS (Elastic Kubernetes Service) to simplify the work of initializing a Kubernetes cluster on AWS.
- Create and deploy a Kubernetes pipeline for automating and managing ML models in production.
- Train and deploy TensorFlow ML models across multiple GPUs and machines running in parallel.
- Leverage other AWS managed services to extend an ML application.
Kubernetes on Azure (AKS)
14 Hours
In this instructor-led, live training in Japan (online or onsite), participants will learn how to set up and manage a production-scale container environment using Kubernetes on AKS.
By the end of this training, participants will be able to:
- Configure and manage Kubernetes on AKS.
- Deploy, manage and scale a Kubernetes cluster.
- Deploy containerized (Docker) applications on Azure.
- Migrate an existing Kubernetes environment from on-premises to AKS.
- Integrate Kubernetes with third-party continuous integration (CI) software.
- Ensure high availability and disaster recovery in Kubernetes.
MLflow
21 Hours
This instructor-led, live training (online or onsite) is aimed at data scientists who wish to go beyond building ML models and optimize the ML model creation, tracking, and deployment process.
By the end of this training, participants will be able to:
- Install and configure MLflow and related ML libraries and frameworks.
- Appreciate the importance of trackability, reproducibility, and deployability of an ML model.
- Deploy ML models to different public clouds, platforms, or on-premises servers.
- Scale the ML deployment process to accommodate multiple users collaborating on a project.
- Set up a central registry to experiment with, reproduce, and deploy ML models.
MLOps: CI/CD for Machine Learning
35 Hours
This instructor-led, live training in Japan (online or onsite) is aimed at engineers who wish to evaluate the approaches and tools available today to make an intelligent decision on the path forward in adopting MLOps within their organization.
By the end of this training, participants will be able to:
- Install and configure various MLOps frameworks and tools.
- Assemble the right kind of team with the right skills for constructing and supporting an MLOps system.
- Prepare, validate and version data for use by ML models.
- Understand the components of an ML Pipeline and the tools needed to build one.
- Experiment with different machine learning frameworks and servers for deploying to production.
- Operationalize the entire machine learning process so that it's reproducible and maintainable.
MLOps for Azure Machine Learning
14 Hours
This instructor-led, live training (online or onsite) is aimed at machine learning engineers who wish to use Azure Machine Learning and Azure DevOps to facilitate MLOps practices.
By the end of this training, participants will be able to:
- Build reproducible workflows and machine learning models.
- Manage the machine learning lifecycle.
- Track and report model version history, assets, and more.
- Deploy production-ready machine learning models anywhere.