Session detail

LLM Evaluations & Reinforcement Learning for Shopify Sidekick on Rails

LLMs
AI
AI Agents
Architecture
Rails
Machine Learning
This talk explores building production LLM systems through Shopify Sidekick's Rails architecture, covering orchestration patterns and tool integration strategies. We'll establish statistically rigorous LLM-based evaluation frameworks that move beyond subjective "vibe testing." Finally, we'll demonstrate how robust evaluation systems become critical infrastructure for reinforcement learning pipelines, and explore how RL can learn to hack evaluations, along with strategies to mitigate this.

Sep 05 - 15:45 to 16:15

Track 1

About the speakers

Andrew McNamara

Director Applied ML, Shopify

Andrew McNamara is the Director of Applied ML at Shopify, where he leads Sidekick, an AI assistant that helps merchants run their businesses more effectively. With over 15 years of experience in conversational AI, Andrew began licensing NLP technology to Samsung and LG in 2011 at startup Maluuba, which was later acquired by Microsoft. After working at Microsoft Research, he played a key role in launching Bing Chat (now Copilot), helping pioneer conversational search. At Shopify, Andrew focuses on building AI systems that empower merchants to grow and manage their businesses with greater ease and efficiency.