Schedule
1:00pm ET
ML Highlights from 2021 and lessons for 2022
Oren Etzioni, CEO at Allen Institute for Artificial Intelligence (AI2)
2021 was a year full of advances in machine learning, natural language processing, and computer vision. Inspired by Sebastian Ruder’s blog post, ML and NLP Research Highlights of 2021, this talk will summarize 15 highlights and suggest lessons for 2022 and beyond.
Technical Track
Business Track
1:30pm ET
Testing ML models for production
Shivika K Bisen, Lead Data Scientist at PAXAFE
Machine learning models are an integral part of our lives and are becoming indispensable to the decision-making process in many businesses. When ML algorithms make a mistake, it can not only adversely affect user trust but can also cause loss of business and, in some sectors such as healthcare, loss of life. How do you know that the model you’ve been developing is reliable enough to be deployed in the real world? In this talk, we will take a closer look at testing ML models for production. The main components of the talk will be: a) unit testing, b) API integration testing, and c) simulation testing for ML models.
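For readers who want a concrete picture of the first component, unit testing, here is a minimal, hypothetical sketch in Python. It trains a toy scikit-learn classifier and checks output shape, probability range, and a directional expectation; it is illustrative only and not taken from the talk.

```python
# Hypothetical pytest-style unit tests for a toy classifier (not from the talk).
import numpy as np
from sklearn.linear_model import LogisticRegression


def train_model():
    # Tiny synthetic dataset: the label is 1 when the first feature is positive.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = (X[:, 0] > 0).astype(int)
    return LogisticRegression().fit(X, y)


def test_output_shape_and_range():
    model = train_model()
    proba = model.predict_proba(np.zeros((5, 3)))
    assert proba.shape == (5, 2)
    assert np.all((proba >= 0) & (proba <= 1))


def test_directional_expectation():
    # Increasing the first feature should not decrease P(y = 1).
    model = train_model()
    low = model.predict_proba([[-2.0, 0.0, 0.0]])[0, 1]
    high = model.predict_proba([[2.0, 0.0, 0.0]])[0, 1]
    assert high >= low


if __name__ == "__main__":
    test_output_shape_and_range()
    test_directional_expectation()
    print("all tests passed")
```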
Recommendation systems: From A/B testing to deep learning
Uri Goren, Head of Recommendation at Argmax
Recommendation systems have received a lot of attention recently due to the increase in online shopping. Recommendation always goes hand in hand with measurement and experimentation. In this talk we will cover contextual bandits, a technique that combines both aspects and bakes machine/deep learning into the process. Contextual bandits are increasingly being adopted in industry and are used by recommendation giants such as Netflix, Facebook, and Expedia, among many others.
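As a rough illustration of the technique (not code from the talk), a contextual bandit keeps a reward model per arm, mostly exploits the arm its model currently scores highest for the user's context, and occasionally explores. A toy epsilon-greedy version in Python, with hypothetical names throughout:

```python
# Toy epsilon-greedy contextual bandit; each arm keeps simple linear
# regression statistics. Illustrative sketch only.
import numpy as np


class EpsilonGreedyBandit:
    def __init__(self, n_arms, n_features, epsilon=0.1):
        self.epsilon = epsilon
        # Per-arm ridge-regression statistics: A = X^T X + I, b = X^T r
        self.A = [np.eye(n_features) for _ in range(n_arms)]
        self.b = [np.zeros(n_features) for _ in range(n_arms)]

    def select(self, context):
        if np.random.rand() < self.epsilon:
            return np.random.randint(len(self.A))            # explore
        scores = [context @ np.linalg.solve(A, b) for A, b in zip(self.A, self.b)]
        return int(np.argmax(scores))                         # exploit

    def update(self, arm, context, reward):
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context


# Tiny simulation: arm 0 works better for one user segment, arm 1 for the other.
bandit = EpsilonGreedyBandit(n_arms=2, n_features=2)
for _ in range(2000):
    ctx = np.eye(2)[np.random.randint(2)]                    # one-hot user context
    arm = bandit.select(ctx)
    best = 0 if ctx[0] == 1.0 else 1
    reward = float(np.random.rand() < (0.7 if arm == best else 0.3))
    bandit.update(arm, ctx, reward)
```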
2:00pm ET
Talk by Sanjay Yermalkar
Sanjay Yermalkar, Sr. Director, Data Science Engineering at Anthem
Abstract Coming Soon
Stop Making Data Scientists Do Systems
Emily Curtin, Senior Machine Learning Engineer at Mailchimp
Data Scientists aren’t Systems Engineers, so why do our tools expect them to understand arcane k8s errors? Why do our people systems effectively model them as weird web developers? Many organizations are lacking in a practical understanding of the Data Scientist persona from a UX perspective. By defining what Data Scientists are good at, and more importantly what they’re not good at, we as MLOps professionals and organizational leaders can build on that understanding and let Data Scientists do their best work.
3 Key Takeaways
- The best tools for Data Scientists are low/no-systems, not low/no-code.
- Velocity comes from good tooling; quality comes from good incentives.
- Infrastructure abstraction should be a top priority for MLOps professionals.
2:30pm ET
It's The Data, Stupid! How Improving ML Datasets Is The Best Way To Improve Model Performance
Peter Gao, CEO at Aquarium
When working to improve an ML model, many teams will immediately turn to fancy models or hyperparameter tuning to eke out small performance gains. However, the majority of model improvement can come from holding the model code fixed and properly curating the data it’s trained on! In this talk, Peter discusses why data curation is a key part of model iteration, covers some common data and model problems, and shows how to build workflows and team structures that efficiently identify and fix these problems in order to improve your model’s performance.
Informed Guesser, Minimum Viable Model, Heuristic First: Using ML to solve the Right Problems
Eduardo Bonet, Staff Full Stack Engineer – MLOps at Gitlab
As Machine Learning moves past its hype, the industry is entering a more mature phase in which ML is no longer perceived as a magic wand, but as a risky yet powerful tool for solving a new set of problems, one that requires heavy investment in people and infrastructure. In this product-focused talk, we will look at steps we can take to decrease the risk of a Machine Learning solution dying in the prototype phase: which types of problems are the best fit, ideas for handling stakeholder expectations, how to translate business metrics into model metrics, and how to be more confident that we are solving the right problems.
3:00pm ET
15 min break
3:15pm ET
Panel Discussion: How to put ML successfully into production
Shivika K Bisen, Lead Data Scientist at PAXAFE
Emily Curtin, Senior Machine Learning Engineer at Mailchimp
Eduardo Bonet, Staff Full Stack Engineer – MLOps at Gitlab
Niko Laskaris, Head of Strategic Projects at Comet
Technical Track
Business Track
4:00pm ET
Talk by Resham Sarkar
Resham Sarkar, Sr Manager – Data Science at Slice
Abstract Coming Soon
External Data: You only own 1% of the data, what about the rest?
Alexander Izydorczyk, Head of Data Science at Coatue Management
Abstract Coming Soon
4:30pm ET
How Feature Stores Enable Operational ML
Kevin Stumpf, Co-Founder and CTO at Tecton
Getting Machine Learning applications into production is hard. When those applications are core to the business and need to run in real time, the challenge becomes even harder. Feature Stores are designed to solve the data engineering challenges of production ML applications, tackling four key problems:
1. Real-time and streaming data are difficult to incorporate into ML models
2. ML teams are stuck building complex data pipelines
3. Feature engineering is duplicated across the organization
4. Data issues break models in production
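As a loose sketch of the idea (a toy, in-memory illustration only, not Tecton's API or code from the talk), a feature store pairs a shared registry of feature definitions with an online store that models can query by entity key at serving time:

```python
# Hypothetical, in-memory "feature store" sketch -- illustrative only.
from datetime import datetime, timezone


class FeatureStore:
    def __init__(self):
        self._definitions = {}   # feature name -> transformation function
        self._online = {}        # (feature name, entity key) -> (value, timestamp)

    def register(self, name, fn):
        """Define a feature transformation once so teams don't duplicate it."""
        self._definitions[name] = fn

    def materialize(self, name, raw_rows):
        """Batch/stream job: compute features and push them to the online store."""
        fn = self._definitions[name]
        now = datetime.now(timezone.utc)
        for entity_key, raw in raw_rows.items():
            self._online[(name, entity_key)] = (fn(raw), now)

    def get_online_features(self, names, entity_key):
        """Low-latency lookup used by the model at serving time."""
        return {n: self._online[(n, entity_key)][0] for n in names}


store = FeatureStore()
store.register("avg_order_value", lambda orders: sum(orders) / len(orders))
store.materialize("avg_order_value", {"user_42": [10.0, 30.0, 20.0]})
print(store.get_online_features(["avg_order_value"], "user_42"))  # {'avg_order_value': 20.0}
```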
Talk by Resham Sarkar
Abstract Coming Soon
5:00pm ET
Building Interactive Machine Learning Demos Fast
Abubakar Abid, Machine Learning Team Lead at Hugging Face
Building machine learning demos is important so that non-technical collaborators and end users (e.g. customers, business teams, quality testers) can provide feedback on model development. However, it can be a time-consuming process, as it involves front-end engineering, design experience, and model deployment. In this presentation, we will talk about an open-source Python package, Gradio, which allows machine learning engineers to quickly generate a visual interface for their ML models entirely in Python. Gradio makes accessing any ML model as easy as opening a URL in your browser. We will provide a technical overview of Gradio and discuss real-world use cases in which Gradio has been used to accelerate machine learning workflows.
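For context, Gradio's core abstraction is an Interface that wraps a Python function and generates a web UI for it. A minimal sketch follows; the toy classify function is a stand-in for a real model, not an example from the talk.

```python
import gradio as gr


def classify(text):
    # Stand-in for a real model: in practice you would call model.predict(text) here.
    positive = 0.8 if "good" in text.lower() else 0.3
    return {"positive": positive, "negative": 1 - positive}


# Wrap the function in a visual interface: a textbox in, a label with confidences out.
demo = gr.Interface(fn=classify, inputs="text", outputs="label", title="Sentiment demo")
demo.launch()  # serves the demo locally; launch(share=True) generates a shareable URL
```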
Talk by Gideon Mendels
Abstract Coming Soon
Schedule
Option 2
1:00pm ET
Introduction/Talk by Gideon
Technical Track
Business Track
1:30pm ET
Testing ML models for production
Shivika K Bisen, Lead Data Scientist at PAXAFE
Machine learning models are an integral part of our lives and are becoming indispensable to the decision-making process in many businesses. When ML algorithms make a mistake, it can not only adversely affect user trust but can also cause loss of business and, in some sectors such as healthcare, loss of life. How do you know that the model you’ve been developing is reliable enough to be deployed in the real world? In this talk, we will take a closer look at testing ML models for production. The main components of the talk will be: a) unit testing, b) API integration testing, and c) simulation testing for ML models.
Recommendation systems: From A/B testing to deep learning
Uri Goren, Head of Recommendation at Argmax
Recommendation systems have received a lot of attention recently due to the increase in online shopping. Recommendation always goes hand in hand with measurement and experimentation. In this talk we will cover contextual bandits, a technique that combines both aspects and bakes machine/deep learning into the process. Contextual bandits are increasingly being adopted in industry and are used by recommendation giants such as Netflix, Facebook, and Expedia, among many others.
2:00pm ET
Talk by Sanjay Yermalkar
Sanjay Yermalkar, Sr. Director, Data Science Engineering at Anthem
Abstract Coming Soon
Stop Making Data Scientists Do Systems
Emily Curtin, Senior Machine Learning Engineer at Mailchimp
Data Scientists aren’t Systems Engineers, so why do our tools expect them to understand arcane k8s errors? Why do our people systems effectively model them as weird web developers? Many organizations are lacking in a practical understanding of the Data Scientist persona from a UX perspective. By defining what Data Scientists are good at, and more importantly what they’re not good at, we as MLOps professionals and organizational leaders can build on that understanding and let Data Scientists do their best work.
3 Key Takeaways
- The best tools for Data Scientists are low/no-systems, not low/no-code.
- Velocity comes from good tooling; quality comes from good incentives.
- Infrastructure abstraction should be a top priority for MLOps professionals.
2:30pm ET
It's The Data, Stupid! How Improving ML Datasets Is The Best Way To Improve Model Performance
Peter Gao, CEO at Aquarium
When working to improve an ML model, many teams will immediately turn to fancy models or hyperparameter tuning to eke out small performance gains. However, the majority of model improvement can come from holding the model code fixed and properly curating the data it’s trained on! In this talk, Peter discusses why data curation is a key part of model iteration, covers some common data and model problems, and shows how to build workflows and team structures that efficiently identify and fix these problems in order to improve your model’s performance.
Informed Guesser, Minimum Viable Model, Heuristic First: Using ML to solve the Right Problems
Eduardo Bonet, Staff Full Stack Engineer – MLOps at Gitlab
As Machine Learning moves past its hype, the industry is entering a more mature phase in which ML is no longer perceived as a magic wand, but as a risky yet powerful tool for solving a new set of problems, one that requires heavy investment in people and infrastructure. In this product-focused talk, we will look at steps we can take to decrease the risk of a Machine Learning solution dying in the prototype phase: which types of problems are the best fit, ideas for handling stakeholder expectations, how to translate business metrics into model metrics, and how to be more confident that we are solving the right problems.
3:00pm ET
15 min break
3:15pm ET
Panel Discussion: How to put ML successfully into production
Technical Track
Business Track
4:00pm ET
Talk by Resham Sarkar
Resham Sarkar, Sr Manager – Data Science at Slice
Abstract Coming Soon
What's missing from the Modern Data Stack?
Alexander Izydorczyk, Head of Data Science at Coatue Management
Abstract Coming Soon
4:30pm ET
Talk by Kevin Stumpf
Kevin Stumpf, Co-Founder and CTO at Tecton
Abstract Coming Soon
ML Highlights from 2021 with thanks to Sebastian Ruder
Oren Etzioni, CEO at Allen Institute for Artificial Intelligence (AI2)
2021 was a year full of advances in machine learning, natural language processing, and computer vision. Inspired by Sebastian Ruder’s blog post, ML and NLP Research Highlights of 2021, this talk will summarize 15 highlights and takeaways.
5:00pm ET
Talk by Abubakar Abid
Abubakar Abid, Machine Learning Team Lead at Hugging Face
Abstract Coming Soon
Talk by Saira Kazmi
Saira Kazmi, Senior Director – Enterprise Data Engineering Strategy and AI at CVS Health
Abstract Coming Soon