3rd Annual MLOps World Conference on Machine Learning in Production 2022

June 7, 2022

June 10, 2022

9:00 am To 6:00 pm

123 Queen Street West, Toronto, ON M5H 2M9

https://evbrw.se/3F5WZaq

$0 - $279

Event Description

An initiative for anybody who currently has, or eventually will have, ML/AI models in production. Come out for all the sessions, or just a couple!

Thank you for joining us here.

The goal for MLOps World is to help companies put more models into production environments, effectively, responsibly, and efficiently. Whether you’re working towards a live production environment, or currently working in production, this is geared towards you on your journey.

We hope to see you soon.

-TMLS and MLOps World Team

Special Notes

**If you’d like to attend virtually only and would like to request a special pass, email us at info@mlopsworld.com.

**Some speakers may elect to not have their talk recorded

** Completion of workshop certificates available upon request

** Visa letters available upon request

80+ speakers from across the ML/AI industry.

Hosted by the team at the Toronto Machine Learning Society (TMLS), an event dedicated to those working in ML/AI.

Each ticket includes:

  • Access to 40+ virtual workshops to help build and deploy (Kubernetes, etc.) – June 7-8
  • Access to 80+ in-person talks – June 9-10
  • Access to 90+ hours of recordings
  • Access to in-app brain-dates, parties and networking
  • Access to the in-person Start-up Expo, Career Fair and Demo Sessions
  • Access to the in-person Women in AI celebration
  • Network and connect through our event app
  • Q+A with speakers
  • Channels to share your work with the community
  • Run your own chat groups and virtual gatherings!

www.mlopsworld.com

Too few companies have effective AI leaders and an effective AI strategy. 

Taken from the real-life experiences of our community, the Steering Committee has selected the top applications, achievements and knowledge areas to highlight across this dynamic event.

Talk Tracks include:

– Real World Case Studies

– Business & Strategy

– Technical & Research (levels 1-7)

– Workshops (levels 1-7)

– In-person coding sessions

Top Industries Served:

  • Technology & Service
  • Computer Software
  • Banking & Financial Services
  • Insurance
  • Hospital & Health Care
  • Automotive
  • Telecommunications
  • Environmental Services
  • Food & Beverages
  • Marketing & Advertising

We believe these events should be as accessible as possible and set our ticket passes accordingly.

MLOps World is an international community group of practitioners trying to better understand the science of deploying ML models into live production environments, and everything both technical and non-technical that goes with it!

Created initially by the Toronto Machine Learning Society (TMLS), this initiative is intended to unite and support the wider AI ecosystem: the companies, practitioners, academics, and open-source contributors operating within it.

With an explorative approach, our initiatives address the unique needs of our community of over 10,000 ML researchers, professionals, entrepreneurs and engineers, and are intended to empower its members and propel productionized ML. Our community gatherings and events attempt to re-imagine what it means to have a connected community, offering support, growth and inclusion for all participants.

FAQs

Q: Is this a virtual or an in-person conference?

A portion is virtual and a portion is in person (the full conference will not be completely hybrid):

– June 7-8: Bonus workshop days 1 and 2, held virtually (for ticket holders)

– June 9-10: Conference talks, expo, workshops, coding sessions (in person)

**If you’d like to attend virtually only, you can request a special pass by emailing info@mlopsworld.com with ONLINE ONLY PASS in the Subject Header.**

Q: What is your in-person conference policy?

We’re aware that everyone’s comfort levels and risk tolerance can vary. We are working to support every attendee’s level of comfort with regard to interactions and socializing; that will be indicated through Green/Yellow/Red badge indicators.

We also take all safety precautions very seriously and follow local health and safety guidelines in accordance with the City of Toronto and Marriott Hotels.

If you’re unsure or have personal requirements, message us! We’re happy to work with you to provide a safe and enjoyable experience.

Q: Which sessions are going to be recorded? When will the recordings be available and do I have access to them?

Most sessions will be recorded during the event (subject to speaker permission) and will be made available to attendees approximately 2-4 weeks after the event; recordings remain available for 12 months after release.

Q: Are there ID or minimum age requirements to enter the event? There are none. Everyone is welcome.

Q: How can I contact the organizer with any questions? Please email info@mlopsworld.com

Q: What’s the refund policy? Tickets are refundable up to 30 days before the event.

Q: Why should I attend? From over 300 submissions, the committee has selected the top sessions to help your learning. From hands-on coding workshops to case studies, you won’t find a conference gathering that packs in as much information at such a low cost of entry. Come join our community and celebrate the major triumphs of the year, as well as the main lessons learned. Aside from the sessions, there will also be brain-dates via the app for networking, and evening socials that provide opportunities to meet peers and build your network.

Q: Who will attend? Please see our Who Attends section for a full breakdown. Participants range from data scientists to engineers, business executives, and students. We’ll have multiple tracks and in-app brain-dates to accommodate various vantage points and maturity levels.

Q: Can I speak at the event?

You can submit an abstract here. Submissions are reviewed by our committee.

*Content is non-commercial and speaking spots cannot be purchased. 

Q: Will you give out the attendee list? No, we do our best to ensure attendees are not inundated with messages. We allow attendees to stay in contact through our Slack channel and follow-up monthly socials.

Q: Can my company have a display? Yes, there will be spaces for company displays. You can inquire at faraz@mlopsworld.com

Machine Learning Monitoring in Production: Lessons Learned from 30+ Use Cases

Lina Weichbrodt, Lead Machine Learning Engineer, DKB Bank

Abstract: 

Traditional software monitoring best practices are not enough to detect problems with machine learning stacks. How can you detect issues and be alerted in real-time?

This talk will give you a practical guide on how to do machine learning monitoring: which metrics should you implement and in which order? Can you use your team’s existing monitoring and dashboard tools, or do you need an MLOps Platform?

Technical Level: 5/7

What you will learn: 

  • Monitor the four golden signals, plus add machine learning monitoring
  • For ML monitoring, prioritize monitoring the response of the service
  • You often don’t need a new tool: use the tools you already have and add a few metrics
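As a plain-Python illustration of the second and third bullets (not code from the talk; the `ResponseMonitor` class and its thresholds are invented for this sketch), a service's responses can be tracked with a sliding window and compared against a reference value using simple metrics you could feed into an existing dashboard:

```python
from collections import deque

class ResponseMonitor:
    """Track the distribution of model responses over a sliding window."""

    def __init__(self, window_size=1000, reference_mean=0.5, tolerance=0.2):
        self.responses = deque(maxlen=window_size)
        self.reference_mean = reference_mean
        self.tolerance = tolerance

    def record(self, prediction):
        self.responses.append(prediction)

    def mean(self):
        return sum(self.responses) / len(self.responses)

    def drifted(self):
        # Alert when the windowed mean strays too far from the reference.
        return abs(self.mean() - self.reference_mean) > self.tolerance

monitor = ResponseMonitor(reference_mean=0.5, tolerance=0.2)
for p in [0.45, 0.55, 0.5]:
    monitor.record(p)
print(monitor.drifted())  # False: responses match the reference

for p in [0.95] * 10:
    monitor.record(p)
print(monitor.drifted())  # True: the windowed mean has drifted upward
```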

Implementing MLOps Practices on AWS using Amazon SageMaker

Shelbee Eigenbrode, Principal AI/ML Specialist Solutions Architect / Bobby Lindsey, AI/ML Specialist Solutions Architect / Kirit Thadaka, ML Solutions Architect, Amazon Web Services (AWS)

Abstract: 

In this workshop, attendees will get hands-on with SageMaker Pipelines to implement ML pipelines that incorporate CI/CD practices.

Technical Level: 5/7

What’s unique about this talk: 

The opportunity to get hands-on

What you will learn: 

Familiarity with end-to-end features of Amazon SageMaker used in implementing ML pipelines

Automated Machine Learning & Tuning with FLAML

Qingyun Wu, Assistant Professor, Penn State University, and  Chi Wang, Principal Researcher, Microsoft Research

Abstract: 

In this tutorial, we will provide an in-depth and hands-on training on Automated Machine Learning & Tuning with a fast python library FLAML. FLAML finds accurate machine learning models automatically, efficiently and economically. It frees users from selecting learners and hyperparameters for each learner.

Technical Level: 4/7

What’s unique about this talk: 

In addition to a set of hands-on examples, the speakers will also share some rule-of-thumbs, pitfalls, open problems, and challenges learned from AutoML practice.

What you will learn: 

  • How to use FLAML to find accurate ML models with low computational resources for common ML tasks.
  • How to leverage the flexible and rich customization choices provided in FLAML to customize your AutoML or tuning tasks.

Taking MLOps 0-60: How to Version Control, Unify Data and Manage Code Lifecycles

Jimmy Whitaker, Chief Scientist of AI,  Pachyderm

Abstract: 

Machine learning models are never done. The world is always changing and models rely on data to learn useful information about this world. In ML systems we need to be able to embrace change without sacrificing reliability. But how do we do it? MLOps. MLOps, the process of operationalizing your machine learning technology, is fundamental to any organization leveraging AI. However, the complexities of machine learning require managing two lifecycles: the code and the data. Pachyderm is a platform that provides the foundation for unifying these two lifecycles.
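One piece of unifying the code and data lifecycles is making data versions as trackable as code commits. As a hedged, stdlib-only illustration (this is not Pachyderm's actual mechanism, which versions repositories of files), content-addressing gives identical data an identical version id:

```python
import hashlib
import json

def version_dataset(records):
    """Content-address a dataset: identical data yields an identical version
    id, so data versions can be tracked alongside code commits."""
    blob = json.dumps(records, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:12]

v1 = version_dataset([{"x": 1}, {"x": 2}])
v2 = version_dataset([{"x": 2}, {"x": 1}])  # different order = different data
print(v1 == version_dataset([{"x": 1}, {"x": 2}]))  # True: deterministic
print(v1 == v2)  # False: the data changed, so the version changed
```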

Technical Level: 2/7

Deploy High Scale ML Models Without the Hustle

Pavel Klushin, Head of Solutions Architecture, QWAK

Technical Level: 4/7

What you will learn: 

How to deploy ML models to production

How to MLEM Your Models to Production

Mikhail Sveshnikov, MLEM Lead Developer, Iterative

Abstract: 

MLEM, a new open-source product from Iterative, will help you store, access, package and deploy your models in different scenarios. I will present MLEM, we’ll go through a simple tutorial, and we’ll discuss other use cases where MLEM can help fellow MLOps engineers.

Technical Level: 4/7

What you will learn: 

What MLEM is (besides the popular meme) and how its features can be used in an MLOps engineer’s day-to-day work, such as wrapping models into web services or dockerizing them (but that is not all of it!). And probably a bit of knowledge of MLEM’s inner workings.
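To illustrate the general idea of storing a model together with enough metadata to load and serve it uniformly (a stdlib-only sketch with invented function names; MLEM's real API is different and does much more), consider:

```python
import json
import os
import pickle
import tempfile

def save_model(model, path, metadata):
    """Bundle a picklable model object with JSON metadata stored next to it."""
    with open(path, "wb") as f:
        pickle.dump(model, f)
    with open(path + ".meta.json", "w") as f:
        json.dump(metadata, f)

def load_model(path):
    """Load the model and its metadata back as a pair."""
    with open(path, "rb") as f:
        model = pickle.load(f)
    with open(path + ".meta.json") as f:
        metadata = json.load(f)
    return model, metadata

# A toy "model": a linear model stored as plain data so it pickles anywhere.
model = {"weights": [2.0], "bias": 1.0}
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
save_model(model, path, {"framework": "plain-python", "version": 1})

loaded, meta = load_model(path)
prediction = sum(w * x for w, x in zip(loaded["weights"], [20.5])) + loaded["bias"]
print(prediction, meta["version"])  # 42.0 1
```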

Production ML for Mission-Critical Applications

Robert Crowe, TensorFlow Developer Engineer,  Google

Abstract: 

Deploying advanced Machine Learning technology to serve customers and/or business needs requires a rigorous approach and production-ready systems. This is especially true for maintaining and improving model performance over the lifetime of a production application. Unfortunately, the issues involved and approaches available are often poorly understood. An ML application in production must address all of the issues of modern software development methodology, as well as issues unique to ML and data science. Often ML applications are developed using tools and systems which suffer from inherent limitations in testability, scalability across clusters, training/serving skew, and the modularity and reusability of components. In addition, ML application measurement often emphasizes top level metrics, leading to issues in model fairness as well as predictive performance across user segments.

Rigorous analysis of model performance at a deep level, including edge and corner cases is a key requirement of mission-critical applications. Measuring and understanding model sensitivity is also part of any rigorous model development process.

We discuss the use of ML pipeline architectures for implementing production ML applications, and in particular we review Google’s experience with TensorFlow Extended (TFX), as well as available tooling for rigorous analysis of model performance and sensitivity. Google uses TFX for large scale ML applications, and offers an open-source version to the community. TFX scales to very large training sets and very high request volumes, and enables strong software methodology including testability, hot versioning, and deep performance analysis.

Technical Level: 5/7

What you will learn: 

  • How Production ML is fundamentally different from Research or Academic ML
  • Methods and architectures for creating an MLOps infrastructure that adapts to change
  • Review of several approaches to implementing MLOps in production settings

Using RIME to Eliminate AI Failures

Daniel Glogowski, Head of Product and Jerry Liu, Machine Learning Lead,  Robust Intelligence

Abstract: 

AI Fails, All the Time: AI Failure is when you train an ML model and it behaves poorly in production because of issues like novel corner case inputs, upstream ETL changes, and distributional drift. Data science teams constantly face these issues and more, spending time root causing and firefighting. Data science teams may optimize for a single performance metric like accuracy, but this is inadequate to prevent AI Failure. Combatting AI Failure takes time and energy. Robust Intelligence helps to prevent AI Failure so that you can focus on what truly matters.

RIME Prevents AI Failures: The Robust Intelligence Model Engine (RIME) helps your team accelerate your AI lifecycle. Detect Weaknesses: Train a candidate model, and automatically discover its individual weaknesses with AI Stress Testing. Go beyond simply optimizing for model performance. Improve the model with automatic suggestions. Compare with other candidate models. Establish and enforce standards across your organization. Prevent AI Failure: Confidently deploy the best model into production with AI Firewall with one line of code. Observe your model in production and automate the discovery and remediation of any issues that occur post-deployment. Automatically flag, block, or impute erroneous data in real-time.

Technical Level: 4/7

What you will learn: 

How to help prevent AI Failure so that you can focus on what truly matters.

Implementing a Parallel MLOps Test Pipeline for Open Source Development

Miguel Gonzalez-Fierro, Principal Data Science Manager,  Microsoft

Abstract: 

GitHub has become a hugely popular service for building software, open source or private. As part of the continuous development and integration process, frequent, reliable and efficient testing of repository code is necessary. GitHub provides functionality and resources for automating testing workflows (GitHub Workflows), which allow for both managed and self-hosted test machines.

However, managed hosts are of computational size that is limited for many machine learning workloads. Moreover, they don’t include GPU hosts currently. As for self-hosted machines, there is the inconvenience and cost of keeping machines online 24 x 7. Another issue is that it is cumbersome to distribute test jobs to multiple machines.

Our goal is to leverage Azure Machine Learning along with GitHub Workflows in order to address these issues. With AzureML, we can access powerful compute with both CPU and GPU. Bringing the compute online is automatic and on demand for all the testing jobs. Moreover, we can easily distribute testing jobs to multiple hosts, in order to limit the end-to-end execution time of the workflow.

We show a configuration for achieving the above programmatically, which we have developed as part of the Microsoft Recommenders repository (https://github.com/microsoft/recommenders/), which is a popular open-source repository that we maintain and develop. In our setting, we have three workflows that trigger nightly runs as well as a workflow triggered by pull requests.

Nightly workflows, in particular, include smoke and integration tests and are long (more than 6 hours) if run sequentially. Using our parallelized approach on AzureML, we have managed to bring the end-to-end time down to less than 1 hour. We also discuss how to divide the tests into groups in order to maximize machine utilization.
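The test-grouping idea can be sketched as a simple scheduling problem: assign the longest tests first to the currently least-loaded group, so the groups finish at roughly the same time. This is a generic illustration with invented test names, not the repository's actual tooling:

```python
import heapq

def group_tests(durations, n_groups):
    """Greedy longest-processing-time scheduling: place each test (longest
    first) into the least-loaded group to balance total wall-clock time."""
    heap = [(0.0, i) for i in range(n_groups)]  # (current load, group index)
    heapq.heapify(heap)
    groups = [[] for _ in range(n_groups)]
    for name, minutes in sorted(durations.items(), key=lambda kv: -kv[1]):
        load, i = heapq.heappop(heap)
        groups[i].append(name)
        heapq.heappush(heap, (load + minutes, i))
    return groups

# Hypothetical test durations in minutes.
tests = {"smoke_a": 5, "integration_b": 90, "integration_c": 80,
         "unit_d": 2, "e2e_e": 60}
groups = group_tests(tests, 2)
print(groups)  # two groups with roughly balanced total runtime
```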

We also talk about how we retrieve the logs associated with runs from AzureML and register them as artifacts on GitHub. This allows one to view the progress of testing jobs from the GitHub Actions dashboard, which makes monitoring and debugging of errors easier.

Technical Level: 5/7

What’s unique about this talk: 

We have one of the most sophisticated test pipelines among GitHub repositories related to machine learning.

What you will learn: 

People who attend this session will learn:

  • Best practices for testing GitHub repositories of Python code, based on our experience with the Microsoft/Recommenders repository
  • Guidelines on testing in economical ways
  • How to use GitHub workflows to set up testing pipelines
  • How to benefit from Azure Machine Learning capabilities in order to automate testing jobs that run in parallel

MLOps Beyond Training: Simplifying and Automating the Operational Pipeline

Yaron Haviv, Co-Founder & CTO,  Iguazio

Abstract: 

Most data science teams start their AI journey from what they perceive to be the logical beginning: building AI models using manually extracted datasets. Operationalizing machine learning, in the sense of considering all the requirements of the business (handling online and federated data sources, scale, performance, security, continuous operations, etc.), comes as an afterthought, making it hard and resource-intensive to create real business value with AI.

Technical Level: 5/7

What you will learn: 

How to simplify and automate your production pipeline to bring data science to production faster and more efficiently

How to Treat Your Data Platform Like a Product: 5 Key Best Practices

Barr Moses, CEO & Co-Founder, Monte Carlo

Abstract: 

Your team just migrated to a data mesh (or so they think). Your CTO is all in on this “modern data stack,” or as he calls it: “The Enterprise Data Discovery.” To satisfy your company’s insatiable appetite for data, you may even be building a complex, multi-layered data ecosystem: in other words, a data platform. Still, it’s one thing to build a data platform, but how do you ensure it actually drives value for your business?

In this fireside chat, Barr Moses, CEO & co-founder of Monte Carlo, will walk through why best in class data teams are treating their data platforms like product software and how to get started with reliability and scale in mind.

Technical Level: 3/7

What’s unique about this talk: 

I’ve never discussed these best practices before at a public talk or in a blog article; they’re pulled from my own experience at Monte Carlo working with 100s of data teams attempting to build their own data platforms.

What you will learn: 

5 best practices (across technology, processes, and culture) for treating your data platform like a scalable, measurable product with machine learning and automation.

WarpDrive: Orders-of-Magnitude Faster Multi-Agent Deep RL on a GPU

Stephan Zheng, Lead Research Scientist; Tian Lan, Senior Applied Scientist; Sunil Srinivasa, Research Engineer, Salesforce

Abstract: 

Reinforcement learning is a powerful tool that has enabled big technical successes in AI, including superhuman gameplay, optimizing data center cooling, nuclear fusion control, economic policy analysis, etc. For wider real-world deployment, users need to be able to run RL workflows efficiently and quickly. WarpDrive is an open-source framework that runs multi-agent deep RL end-to-end on a GPU. This enables orders of magnitude faster RL.

In this talk, we will review how WarpDrive works and several new features introduced since its first release in Sep 2021. These include automatic GPU utilization tuning, distributed training on multiple GPUs, and sharing multiple GPU blocks across a simulation. These features result in throughput scaling linearly with the number of devices, to a scale of millions of agents. WarpDrive also provides several utility functions that improve quality-of-life and enable users to quickly implement and train RL workflows.

Technical Level: 6/7

What’s unique about this talk: 

Accessible explanations of the latest features, demos, and future roadmap.

What you will learn: 

How WarpDrive enables you to run reinforcement learning orders of magnitude faster.

Supercharging MLOps with the Petuum Platform

Aurick Qiao, Ph.D., CEO and Tong Wen, Director of Engineering, Petuum

Abstract: 

Today’s widespread practice of ad hoc integration between many fragmented ML tools leaves hard-to-fill gaps in end-to-end automation, scalability, and management of AI/ML applications. With the Petuum Platform, ML applications and infrastructure can be composed quickly and flexibly from standardized and reusable building blocks, thus transforming MLOps from craft production into a repeatable assembly-line process. We will discuss new innovations in Composable, Automatic, and Scalable ML (CASL), developed in collaboration with CMU, UC Berkeley, and Stanford, and how they play a pivotal role in the Petuum Platform.

Technical Level: 3/7

What you will learn: 

This workshop will show how your team can easily compose, manage, and monitor AI/ML infrastructure across multiple systems on a single pane of glass, seamlessly scale ML pipelines from local development to batch execution and online serving, and optimize end-to-end ML pipelines in an automatic and cost-efficient way.

Scaling ML Embedding Models to Serve a Billion Queries

Senthilkumar Gopal, Senior Engineering Manager (Search ML), eBay Inc.

Abstract: 

This talk is aimed at providing a deeper insight into the scale, challenges and solutions formulated for powering embeddings-based visual search at eBay. It walks the audience through the model architecture, the application architecture for serving users, the workflow pipelines built to produce the embeddings used by Cassini (eBay’s search engine), and the unique challenges faced during this journey. It provides key insights specific to embedding handling and how to scale systems to provide real-time clustering-based solutions for users.

Technical Level: 5/7

What’s unique about this talk: 

Most of the online content dwells on pieces of the required infrastructure without providing an end-to-end coherent picture. Most critically, it does not relate the model architecture to the pipelines, or show how the pipelines and the model architecture/parameters influence each other. This talk also goes into the workings of a large-scale search engine and how the application architecture influences the operational aspects to enable the scale required.

What you will learn: 

The audience will learn how to productionize embedding-based data pipelines, key challenges and potential solutions, and an introduction to different quantization algorithms with their advantages and disadvantages. The audience will also get a deeper view of how data pipelines and workflows are modeled for optimal scale.

Personalized Recommendations and Search with Retrieval and Ranking at scale on Hopsworks

Jim Dowling, CEO,  Hopsworks

Abstract: 

Personalized recommendations and personalized search systems at scale are increasingly being built on retrieval and ranking architectures based on the two-tower embedding model. This architecture requires a lot of infrastructure. A single user query will cause a large fanout of traffic to the backend, with hundreds of database lookups in a feature store, similarity search in an embedding store, and model outputs from both a query embedding model and a ranking model. You will also need to index your items in the embedding store using an item embedding model, and instrument your existing systems to store observations of user queries and the items they select.
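The retrieval step of a two-tower system can be sketched in a few lines: score candidate item embeddings against the query embedding and keep the top k. This is a toy illustration with invented data and tiny vectors, not Hopsworks code (real systems use approximate nearest-neighbor search over millions of items):

```python
def dot(u, v):
    """Inner-product similarity between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))

def top_k(query_embedding, item_embeddings, k=2):
    """Retrieval: score every candidate item against the query embedding and
    return the k highest-scoring item ids."""
    scored = sorted(item_embeddings.items(),
                    key=lambda kv: -dot(query_embedding, kv[1]))
    return [item_id for item_id, _ in scored[:k]]

# Invented 2-d item embeddings produced by a hypothetical item tower.
items = {"a": [1.0, 0.0], "b": [0.9, 0.4], "c": [0.0, 1.0]}
print(top_k([1.0, 0.1], items, k=2))  # ['a', 'b']
```

In a production system the retrieved candidates would then be passed to a separate ranking model, as the abstract describes.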

Technical Level: 6/7

What’s unique about this talk: 

The only integrated open-source platform for scalable retrieval and ranking systems.

What you will learn: 

How to build a state-of-the-art two tower model for personalized recommendations that scales with Hopsworks.

Accelerating Transformers with Hugging Face Optimum and Infinity

Philipp Schmid, Machine Learning Engineer and Lewis Tunstall, Machine Learning Engineer, Hugging Face

Abstract: 

Since their introduction in 2017, Transformers have become the de facto standard for tackling a wide range of NLP tasks in both academia and industry. However, in many situations accuracy is not enough — your state-of-the-art model is not very useful if it’s too slow or large to meet the business requirements of your application.

Technical Level: 5/7

What you will learn: 

How Hugging Face Optimum and Infinity provide developers with the tools to easily optimize Transformers with techniques such as quantization and pruning.
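As a rough illustration of what quantization does (a naive per-tensor affine scheme in plain Python, not Optimum's implementation), floats are mapped to 8-bit integers with a scale and offset, trading a small amount of precision for a 4x smaller representation:

```python
def quantize_8bit(weights):
    """Affine 8-bit quantization: map floats onto integers in [0, 255]
    using a per-tensor scale and offset."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # guard against constant weights
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover approximate float weights from the quantized integers."""
    return [v * scale + lo for v in q]

weights = [-0.5, 0.0, 0.25, 1.0]
q, scale, lo = quantize_8bit(weights)
restored = dequantize(q, scale, lo)
error = max(abs(a - b) for a, b in zip(weights, restored))
print(error < 0.01)  # True: quantization error is small
```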

Parallelizing Your ETL with Dask on Kubeflow

Jacob Tomlinson, Senior Software Engineer, NVIDIA

Abstract: 

Kubeflow is a popular MLOps platform built on Kubernetes for designing and running Machine Learning pipelines for training models and providing inference services. Kubeflow has a notebook service that lets you launch interactive Jupyter servers (and more) on your Kubernetes cluster. Kubeflow also has a pipelines service with a DSL library written in Python for designing and building repeatable workflows that can be executed on your cluster, either ad-hoc or on a schedule. It also has tools for hyperparameter tuning and running model inference servers, everything you need to build a robust ML service.

Technical Level: 5/7

What’s unique about this talk: 

It’s common to talk about parallelism and GPU acceleration at the model training stage, but we are working hard to also accelerate ETL stages. There isn’t a huge amount of content online about this yet.

What you will learn: 

Data Scientists commonly use Python tools like Pandas on their laptops with CPU compute. Production systems are usually distributed multi-node GPU setups. Dask is an open source Python library that takes the pain out of scaling up from laptop to production.

What’s in the box? Automatic ML Model Containerization

 Clayton Davis, Head of Data Science, Modzy

Abstract: 

This talk will include a deep dive on building machine learning (ML) models into container images to run in production for inference. Based on our experience setting up ML container builds for many customers, we’ll share a set of best practices for ensuring secure, multi-tenant image builds that avoid lock-in, and we’ll also cover some tooling (chassis.ml) and a standard (Open Model Interface (OMI)) to execute this process. Data scientists and developers will walk away with an understanding of the merits of a standard container specification that allows for interoperability, portability, and security for models to seamlessly be integrated into production applications.

Technical Level: 5/7

What you will learn: 

Prerequisite: Basic familiarity with ML models and/or common ML frameworks (PyTorch, scikit-learn, etc.)

A Zero-Downtime Set-up for Models: How and Why

Anouk Dutrée, Product Owner, UbiOps

Abstract: 

When a model is in production you ideally want zero-downtime. Whenever the model is needed it should be ready to respond. This issue is two-sided, on one hand you need to make sure that there is no down-time when updating your model, on the other hand you need to ensure that a request can be processed even if your model itself fails. In this talk we will take you through the set-up we use to ensure zero-downtime when updating models, and how this set-up can be expanded to ensure you can handle failing models as well.

Technical Level: 4/7

What’s unique about this talk: 

I personally find most of the articles on this topic to be too specific to one part of the chain. In this talk I want to go over the entire process as a whole and cover the two sides of downtime (i.e. downtime caused by maintenance, and downtime because the model fails).

What you will learn: 

  • How to create an easy to work with zero-downtime set-up for data science models using smart routing
  • How to expand this set-up to a champion challenger set-up to ensure there is always a model available that can take over if a model fails unexpectedly
  • What a champion challenger set-up is
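The champion-challenger idea in the last two bullets can be sketched with a tiny routing function (invented names, not UbiOps' API): try the champion model, and fall back to the challenger if it fails, so the request is still served with no visible downtime:

```python
def route(request, champion, challenger):
    """Smart routing sketch: serve from the champion model; if it raises,
    fall back to the challenger so the caller always gets a response."""
    try:
        return champion(request), "champion"
    except Exception:
        return challenger(request), "challenger"

def broken_model(request):
    raise RuntimeError("model crashed")

def backup_model(request):
    return 0.5  # a safe default prediction from the challenger

print(route({"f": 1}, broken_model, backup_model))   # (0.5, 'challenger')
print(route({"f": 1}, lambda r: 0.9, backup_model))  # (0.9, 'champion')
```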

MLOps is Just HPC in Disguise: A Real-World, No-Nonsense Guide to Upgrading Your Workflow

Victor Sonck, Evangelist,  ClearML

Abstract: 

Does the following sound familiar to you? Overwriting existing plots and model files, having to put a model in production in 10 days, or running out of GPU availability again. If it does, this workshop is for you: you’ll end up with a set of tools and workflows that can make your life so much easier, and you’ll increase your productivity by automating mundane tasks.

Technical Level: 4/7

What you will learn: 

A set of tools, tips, tricks and example workflows they can use in their own life to help alleviate common data science challenges.

Critical Use of MLOps in Finance: Using Cloud-managed ML Services that Scale

Vinnie Saini, Director and Senior Principal, Enterprise Data Architecture & Cloud Strategy, Scotiabank

Abstract: 

With ML Engineering being a superset of Software Engineering, treating data as a first-class citizen is key to ML Engineering. The talk will focus on how leveraging MLOps is key to improving the quality and consistency of machine learning solutions, managing the lifecycle of your models with the goals of:

– Faster experimentation and development of models

– Faster deployment of models into production

– Quality assurance and end-to-end lineage tracking

With trained machine learning models deployed as web services in the cloud or locally, we’ll see how deployments use CPU, GPU, or field-programmable gate arrays (FPGA) for inferencing, using different compute targets:

– Container Instance

– Kubernetes Service

– development environment

Technical Level: 5/7

What you will learn: 

This talk is intended for technology leaders and enterprise architects who want to understand the details of MLOps in practice:

  • Capture the governance data for the end-to-end ML lifecycle
  • Monitor ML applications for operational and ML-related issues
  • Compare model inputs between training and inference, explore model-specific metrics, and provide monitoring and alerts on your ML infrastructure
  • Automate the end-to-end ML lifecycle with pipelines to continuously roll out new ML models alongside your other applications and services

Building Real-Time ML Features with Feast, Spark, Redis, and Kafka

Danny Chiao, Engineering Lead and Achal Shah, Software Engineer, Tecton/Feast

Abstract: 

This workshop will focus on the core concepts underlying Feast, the open-source feature store. We’ll explain how Feast integrates with underlying data infrastructure including Spark, Redis, and Kafka, to provide an interface between models and data.

Technical Level: 4/7

What you will learn: 

We’ll provide coding examples to showcase how Feast can be used to:

  • Curate features in online and offline storage
  • Process features in real-time
  • Ensure data consistency between training and serving environments
  • Serve feature data online for real-time inference
  • Quickly create training datasets
  • Share and re-use features across models
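As a minimal sketch of what an online feature lookup looks like (an in-memory stand-in with invented names, not Feast's actual API or storage), the latest feature values are kept per entity key and fetched at inference time:

```python
import time

class OnlineFeatureStore:
    """In-memory sketch of an online store: latest feature values per entity
    key, fetched by name at inference time."""

    def __init__(self):
        self._rows = {}

    def ingest(self, entity_id, features):
        # Merge new values over existing ones and stamp the update time.
        self._rows[entity_id] = {**self._rows.get(entity_id, {}), **features,
                                 "_updated": time.time()}

    def get_online_features(self, entity_id, names):
        # Missing features come back as None, so models can handle gaps.
        row = self._rows.get(entity_id, {})
        return {n: row.get(n) for n in names}

store = OnlineFeatureStore()
store.ingest("user_42", {"avg_order_value": 31.5, "orders_7d": 3})
print(store.get_online_features("user_42", ["orders_7d", "avg_order_value"]))
```

A real feature store adds the offline side as well, so the same feature definitions produce consistent values for both training datasets and online serving.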

Generalizing Diversity: Machine Learning Operationalization for Pharma Research

Daniel Butnaru, Principal Architect & Head of Scientific Software Engineering & Architecture, Roche

Abstract: 

Many machine learning use cases in pharma research are transitioning from a one-off scenario, where the model is built once and run a few times, to repeated usage of the same model in critical research workflows. This shift significantly raises the bar on the quality and setup necessary to train and deploy ML models. Given the number and diversity of ML models, how does a larger enterprise go about leveraging an MLOps platform? How does one ensure seamless operational embedding of ML models in a heterogeneous enterprise operational landscape?

Technical Level: 4/7

What’s unique about this talk: 

It shows MLOps scenarios in early pharma research. Also, some of the presented models consist of 100s of individual models that need to be delivered as one, which is a rather unique setup.

What you will learn: 

  • How Roche Pharma Research operationalizes molecular property predictors (100s of models)
  • Why the embedding of the model in operational systems needs to be considered from the start
  • Implementation patterns for formalizing the exchange between data scientists, ML engineers, and data engineers
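The "100s of individual models delivered as one" pattern mentioned above can be sketched as a single versioned predictor that fans out to many per-property models. This is a hypothetical pure-Python illustration of the pattern, not Roche's actual implementation:

```python
class PropertyPredictorBundle:
    """Bundle many per-property models behind one predict() interface.

    Each model is any callable taking a molecule representation and
    returning a score; the bundle is versioned and deployed as one unit.
    """

    def __init__(self, models):
        self.models = dict(models)  # property name -> model callable

    def predict(self, molecule):
        # Fan out the same input to every per-property model.
        return {name: model(molecule) for name, model in self.models.items()}

# Toy stand-ins for trained predictors (real bundles would hold 100s):
bundle = PropertyPredictorBundle({
    "solubility": lambda mol: len(mol) * 0.1,
    "toxicity": lambda mol: 0.5,
})
result = bundle.predict("CCO")  # e.g. a SMILES string
```

Deploying the bundle as one artifact keeps the operational surface small even as the number of individual models grows.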

Lessons Learned from DAG-based Workflow Orchestration

Kevin Kho, Senior Open Source Community Engineer, Prefect

Abstract: 

Workflow orchestration has traditionally been closely coupled to the concept of Directed Acyclic Graphs (DAGs). Building data pipelines involved registering a static graph containing all the tasks and their respective dependencies. During workflow execution, this graph would be traversed and executed. The orchestration engine would then be responsible for determining which tasks to trigger based on the success and failure of upstream tasks.

This system was sufficient for standard batch processing-oriented data engineering pipelines but proved to be constraining for some emerging common use cases. Data professionals would have to compromise their vision to get their workflow to fit in a DAG.

For example,

1. How do I re-run a part of my workflow based on a downstream condition?

2. How do I execute a long-running workflow?

3. How do I dynamically add tasks to the DAG during runtime?

These constraints led to the development of Prefect Orion (Prefect 2.0), a DAG-less workflow orchestration system that emphasizes runtime flexibility and an enhanced developer experience. By removing the DAG constraint, Orion offers an interface to workflow orchestration that feels more Pythonic than ever. Developers need to wrap only as little code as they want in order to get observability into a specific task of their workflows.

Technical Level: 5/7

What’s unique about this talk: 

A lot of the content here comes from supporting the Prefect community over the last three years and the difficulties we recognized with traditional orchestration systems. Few people have the experience of supporting thousands of use cases and extracting insights from them.

What you will learn: 

You will learn about workflow orchestration and why pinning it to the Directed Acyclic Graph concept proved to be limiting. You will also learn how to spin up your own free open-source orchestrator.
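The "wrap as little code as you want" idea can be illustrated with a toy decorator (hypothetical pure Python, not Prefect's actual API): because tasks are created when they are called rather than registered in a static graph up front, the set of tasks can depend on data only known at runtime.

```python
import functools

RUN_LOG = []  # records each task invocation for observability

def task(fn):
    """Toy task decorator: wraps any function to record its execution.

    In a DAG-less orchestrator, tasks come into existence when called,
    so their number can depend on runtime data -- something a static,
    pre-registered DAG cannot express.
    """
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        RUN_LOG.append((fn.__name__, args, result))
        return result
    return wrapper

@task
def fetch_batch_count():
    return 3  # imagine this value comes from an API at runtime

@task
def process(batch):
    return batch * 10

def flow():
    # The number of `process` tasks is decided at runtime, not registration time.
    n = fetch_batch_count()
    return [process(i) for i in range(n)]

results = flow()  # RUN_LOG now holds one entry per task run
```

A static DAG would need the three `process` tasks declared before execution; here they simply appear as the flow runs.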

Defending Against Decision Degradation with Full-Spectrum Model Monitoring: Case Study and AMA

Mihir Mathur, Product Manager, Machine Learning, Lyft

Abstract: 

ML models at Lyft make millions of high stakes decisions per day including decisions for real-time pricing, physical safety classification, fraud detection, and much more. Preventing models from degrading and making ineffective decisions is therefore critical. Over the past two years, we’ve invested in building a full-spectrum model monitoring solution to catch and prevent model degradation.

In this talk, we’ll discuss our suite of approaches for model monitoring including real-time feature validation, performance drift detection, anomaly detection, and model score monitoring as well as the cultural change needed to get ML practitioners to effectively monitor their models. We’ll also discuss the impact our monitoring system delivered by catching problems.

Technical Level: 4/7

What you will learn: 

  • Why model monitoring is needed
  • Challenges in building a model monitoring system
  • How to prioritize among a plethora of things that can be built
  • Overview of Lyft’s model monitoring architecture
  • How to cause cultural change at a company for better AI/ML practices
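One common building block behind the drift-detection approaches mentioned above is the Population Stability Index (PSI), which compares a feature's training-time distribution with what the model sees at serving time. This is a generic illustrative sketch, not Lyft's implementation; the 0.2 threshold is a common rule of thumb, not a universal constant:

```python
import math

def psi(expected, actual, n_bins=10):
    """Population Stability Index: a common feature-drift score.

    ~0 means the two distributions match; values above ~0.2 are often
    treated as significant drift (thresholds are illustrative).
    """
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / n_bins or 1.0

    def proportions(sample):
        counts = [0] * n_bins
        for x in sample:
            idx = min(int((x - lo) / width), n_bins - 1)
            counts[idx] += 1
        eps = 1e-6  # floor empty bins to avoid log(0)
        return [max(c / len(sample), eps) for c in counts]

    p, q = proportions(expected), proportions(actual)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

train = [0.1 * i for i in range(100)]            # training-time feature values
serve_ok = [0.1 * i for i in range(100)]         # same distribution -> PSI ~ 0
serve_drift = [5 + 0.1 * i for i in range(100)]  # shifted distribution -> large PSI
```

In production, a score like this would be computed per feature on a schedule and wired to the alerting system.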

Leaner, Greener and Faster Pytorch Inference with Quantization

Suraj Subramanian, Developer Advocate, PyTorch

Abstract: 

Quantization refers to the practice of taking a neural network’s painstakingly tuned FP32 parameters and rounding them to integers – without destroying accuracy, while actually making the model leaner, greener, and faster. In this session, we’ll learn more about this sorcery from first principles and see how this is implemented in PyTorch. We’ll break down all of the available approaches to quantize your model, their benefits and pitfalls, and most importantly how you can make an informed decision for your use case. Finally, we put our learnings to the test on a large non-academic model to see how this works in the real world.

Technical Level: 4/7

What you will learn: 

  • Foundations of quantization in deep learning
  • Summary of current research in this area
  • Approaches to quantization, their benefits and pitfalls
  • How to debug issues with your quantized model
  • Choosing the quantization workflow for your particular use case
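As a first-principles illustration of the float-to-integer rounding described above, here is a generic affine (asymmetric) 8-bit quantization sketch in plain Python. It shows the core arithmetic only, not PyTorch's internal implementation:

```python
def quantize_params(values, n_bits=8):
    """Compute scale and zero point so that:
    real_value ~= scale * (quantized - zero_point)
    """
    qmin, qmax = 0, 2 ** n_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard against a zero range
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(values, scale, zero_point):
    # Round each float to the nearest representable 8-bit integer.
    return [max(0, min(255, round(v / scale) + zero_point)) for v in values]

def dequantize(q, scale, zero_point):
    return [scale * (qi - zero_point) for qi in q]

weights = [-0.75, -0.1, 0.0, 0.3, 1.2]  # toy FP32 parameters
scale, zp = quantize_params(weights)
q = quantize(weights, scale, zp)
recovered = dequantize(q, scale, zp)
# Each recovered weight is within one quantization step (`scale`) of the original.
```

Per-tensor affine quantization like this is the simplest scheme; the talk's comparison of approaches (dynamic, static, quantization-aware training) builds on the same arithmetic.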

Scale and Accelerate the Distributed Model Training in Kubernetes Cluster

Jack Jin, Lead ML Infra Engineer, Zoom

Abstract: 

In order to orchestrate deep learning workloads that scale across multiple GPUs and nodes, Kubernetes offers a compelling solution. With Kubernetes and the Kubeflow PyTorchJob, we can easily schedule and track a distributed training job on a multi-GPU single node, and on multi-GPU multi-node setups, in a shared GPU resource pool. To accelerate deep learning training at Zoom, we enable RDMA over Converged Ethernet (RoCE) to bypass the CPU kernel and offload the TCP/IP protocol. We apply this technology in Kubernetes with SR-IOV via the NVIDIA Network Operator in a heterogeneous GPU cluster with 4 GPUs
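A distributed training job of the kind described can be declared to Kubernetes roughly as follows. This is a hedged sketch of a Kubeflow PyTorchJob manifest: the image name, replica counts, and GPU counts are placeholders, and the RDMA/SR-IOV networking configuration used at Zoom is omitted:

```yaml
# Hypothetical Kubeflow PyTorchJob (all names and sizes are illustrative)
apiVersion: kubeflow.org/v1
kind: PyTorchJob
metadata:
  name: example-distributed-training
spec:
  pytorchReplicaSpecs:
    Master:
      replicas: 1
      template:
        spec:
          containers:
            - name: pytorch
              image: example-registry/train:latest  # placeholder image
              resources:
                limits:
                  nvidia.com/gpu: 4   # GPUs per node (illustrative)
    Worker:
      replicas: 2                     # add workers to scale to multi-node
      template:
        spec:
          containers:
            - name: pytorch
              image: example-registry/train:latest
              resources:
                limits:
                  nvidia.com/gpu: 4
```

The operator sets up the rendezvous between master and workers, so scaling out is largely a matter of changing the `Worker` replica count.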

Organizers

Toronto Machine Learning Society (TMLS)

https://torontomachinelearning.com/

About the Organizers

TMLS events bring together business leaders, researchers and applied ML practitioners. TMLS is a community of over 5,000 practitioners, researchers, entrepreneurs and executives. We work to highlight global opportunities and foster growth in local ecosystems.
