Building AI On-Call Assistants & Data Pipelines with Paarth Chothani at Uber

Audio Brief

This episode covers Uber's development of generative AI tools, focusing on their internal on-call assistant called Genie, and the broader shift from Retrieval-Augmented Generation to advanced agentic AI systems. There are four key takeaways from this discussion.

First, successful AI initiatives often begin by addressing a clear, focused problem. Uber's Genie originated as a hackathon project, designed to improve on-call engineer productivity and reduce reliance on human experts. It functions as a central knowledge base, ingesting and processing data from diverse internal sources like Slack, wikis, Jira, and source code to efficiently answer complex technical questions.

Second, enterprise-scale Retrieval-Augmented Generation, or RAG, demands a deeply customized and multi-faceted approach. Simple, off-the-shelf RAG solutions are insufficient for achieving high accuracy and reliability at scale. Success requires developing custom data pipelines for ingesting varied internal data, sophisticated chunking strategies, and specialized retrieval methods tailored to specific use cases. This entire process often leverages technologies like Spark for large-scale data processing, with embeddings stored in hosted vector databases such as OpenSearch.

Third, the next evolution in AI-powered tooling is moving decisively toward agentic systems. This signifies a shift beyond basic chatbots that primarily retrieve information, to intelligent agents capable of understanding user intent, breaking down complex problems into sub-tasks, and executing actions across disparate internal systems. This fundamental change transforms user interaction with infrastructure, enabling automated tasks like updating permissions or provisioning resources, vastly improving operational efficiency.

Fourth, the current landscape for building with large language models mirrors the early, pioneering days of the Hadoop ecosystem and Big Data. Engineers are effectively operating from a blank slate, engaged in the foundational work of constructing new infrastructure, discovering novel applications, and defining best practices from the ground up. This era presents both challenges and immense opportunities for innovation. This transformative period for generative AI is fundamentally reshaping how software is developed, deployed, and utilized across the enterprise.

Episode Overview

  • An introduction to Paarth Chothani, a Staff Software Engineer at Uber, and his work on the Michelangelo ML platform and Gen AI applications.
  • A deep dive into "Genie," an internal on-call assistant at Uber that evolved from a hackathon project to a sophisticated, agent-based system.
  • Discussion on the challenges of building Retrieval-Augmented Generation (RAG) systems at scale, including data ingestion, accuracy, and moving beyond simple information retrieval.
  • Exploration of the shift from basic chatbots to "agentic" AI that can understand intent, break down problems, and take actions on behalf of users.
  • A look at the future of Gen AI in software development, comparing the current innovation cycle to the early days of Hadoop and Big Data.

Key Concepts

  • Genie at Uber: An internal on-call assistant built to improve engineer productivity. It acts as a central knowledge base, ingesting data from sources like Slack, wikis, Jira, Google Docs, and source code to answer technical questions and reduce dependency on human experts.
  • From RAG to Agentic AI: The discussion highlights the limitations of a "plain RAG" system. The team evolved Genie into an "Agentic Genie" by building a framework of sub-agents and supervisor agents that can understand user intent, break down complex problems into sub-problems, and take actions (e.g., updating permissions, provisioning resources). A toy sketch of this supervisor/sub-agent pattern appears after the Takeaways below.
  • Data Ingestion Pipeline: Genie is powered by a data pipeline that ingests information from diverse internal sources. This process involves using technologies like Spark for large-scale data processing and chunking, creating embeddings, and storing them in a hosted vector database like OpenSearch (see the pipeline sketch after this list).
  • Customization is Key: A central theme is that off-the-shelf Gen AI solutions are not enough for enterprise needs. Success depends on customizing each layer of the stack, from data ingestion and retrieval strategies to building specialized agents for specific use cases.
  • The New Frontier of Development: The hosts and guest draw parallels between the current state of building with LLMs and the early, formative days of the Hadoop ecosystem. It's a "blank slate" era where engineers are building new infrastructure and discovering novel applications from the ground up.
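
To make the ingestion pipeline concrete, here is a minimal sketch of the Spark → chunk → embed → OpenSearch flow described above. This is not Uber's code: the export path, host name, index name, chunk sizes, and the choice of sentence-transformers for embeddings are all illustrative assumptions.

```python
# A minimal sketch of a Genie-style ingestion pipeline, not Uber's actual code.
# Assumptions: documents are already exported as JSON with "doc_id" and "text"
# fields; chunk sizes, the embedding model, and the index schema are guesses.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, explode, udf
from pyspark.sql.types import ArrayType, StringType

spark = SparkSession.builder.appName("genie-style-ingest").getOrCreate()

def chunk(text, size=1000, overlap=200):
    """Naive fixed-size chunking with overlap; production systems
    typically use smarter, structure-aware splitting."""
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

chunk_udf = udf(chunk, ArrayType(StringType()))

# Hypothetical export location for wiki/Jira/Slack dumps.
docs = spark.read.json("s3://my-bucket/knowledge-export/*.json")
chunks = (docs
          .withColumn("chunk", explode(chunk_udf(col("text"))))
          .select("doc_id", "chunk"))

def embed_and_index(rows):
    """Runs once per partition: embed each chunk and bulk-write to OpenSearch."""
    from opensearchpy import OpenSearch, helpers
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    client = OpenSearch(hosts=[{"host": "vectors.internal", "port": 9200}])  # hypothetical host
    batch = list(rows)
    if not batch:
        return
    vectors = model.encode([r["chunk"] for r in batch])
    actions = ({"_index": "genie-chunks",  # index needs a knn_vector mapping for "embedding"
                "_source": {"doc_id": r["doc_id"],
                            "chunk": r["chunk"],
                            "embedding": vec.tolist()}}
               for r, vec in zip(batch, vectors))
    helpers.bulk(client, actions)

chunks.foreachPartition(embed_and_index)
```

At query time, retrieval is then a k-NN search against the same index (again, names are hypothetical, and the index must be created with OpenSearch's k-NN support enabled):

```python
# Hypothetical query-time retrieval against the index built above.
from opensearchpy import OpenSearch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
client = OpenSearch(hosts=[{"host": "vectors.internal", "port": 9200}])

query_vec = model.encode(["How do I debug a failing Spark job?"])[0].tolist()
resp = client.search(index="genie-chunks", body={
    "size": 5,
    "query": {"knn": {"embedding": {"vector": query_vec, "k": 5}}},
})
contexts = [hit["_source"]["chunk"] for hit in resp["hits"]["hits"]]
```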

Quotes

  • At 00:00 - "This is gonna absolutely change how we build software and we're not even getting into infrastructure." - Host Eldad Farkash provides the opening context, emphasizing the transformative potential of Gen AI beyond just application development.
  • At 03:06 - "Genie is like, think of it as like your on-call assistant... you go to a Slack channel, let's say you have a problem with Spark... and because these technologies are widely used, you have to wait a lot." - Paarth Chothani explains the core business problem Genie solves: reducing the time engineers spend waiting for support from infrastructure teams.
  • At 07:55 - "And does it feel like doing the Hadoop days at the beginning again? So it's kind of like, okay, basically blank sheet, we need to come up with the whole stack from scratch." - Host Eldad Farkash compares the current, fast-moving landscape of building Gen AI applications to the pioneering era of Big Data.
  • At 08:05 - "We'll also start taking actions on your behalf, right? And that's really where I see... agents become so powerful that you can have different sub-agents which can take a specific action for you." - Paarth Chothani describes the evolution of Genie from an information retrieval tool to an agent that can execute tasks.

Takeaways

  • Start with a clear, focused problem. Uber's Genie began as a hackathon project to solve a tangible pain point for on-call engineers, which provided a clear path for development and adoption.
  • Enterprise RAG requires a multi-faceted approach. A simple RAG setup is often insufficient; achieving high accuracy at scale demands a custom data pipeline, the ability to ingest from diverse sources (wikis, Jira, code), and sophisticated retrieval strategies tailored to specific use cases.
  • The next evolution in AI-powered tooling is agentic systems. Moving beyond chatbots that only retrieve information, the goal is to build agents that can understand intent, break down tasks, and execute actions across different systems, fundamentally changing how users interact with internal infrastructure. A minimal sketch of the pattern follows.
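
Below is a toy sketch of the supervisor/sub-agent pattern referenced in the Key Concepts and in this takeaway. It is not Uber's framework: the intent classifier is a keyword stub standing in for an LLM call, and each sub-agent merely formats a string where a real one would call an internal API.

```python
# A toy sketch of the supervisor / sub-agent pattern, not Uber's framework.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class AgentResult:
    agent: str
    output: str

def permissions_agent(request: str) -> AgentResult:
    # Hypothetical action: a real agent would call an internal permissions service.
    return AgentResult("permissions", f"Updated permissions for: {request}")

def provisioning_agent(request: str) -> AgentResult:
    # Hypothetical action: a real agent would call a resource-provisioning API.
    return AgentResult("provisioning", f"Provisioned resources for: {request}")

def retrieval_agent(request: str) -> AgentResult:
    # Fallback: answer from the RAG knowledge base (see the ingestion sketch above).
    return AgentResult("retrieval", f"Looked up docs for: {request}")

SUB_AGENTS: Dict[str, Callable[[str], AgentResult]] = {
    "permissions": permissions_agent,
    "provisioning": provisioning_agent,
    "retrieval": retrieval_agent,
}

def classify_intent(request: str) -> str:
    """Keyword stub standing in for an LLM-based intent classifier."""
    lowered = request.lower()
    if "permission" in lowered or "access" in lowered:
        return "permissions"
    if "provision" in lowered or "quota" in lowered:
        return "provisioning"
    return "retrieval"

def supervisor(request: str) -> AgentResult:
    """Routes a user request to the sub-agent matching its intent."""
    return SUB_AGENTS[classify_intent(request)](request)

print(supervisor("I need access to the payments Kafka topic"))
```

The key design choice is that the supervisor owns routing while each sub-agent owns a single, narrow capability, which is what lets a system grow from pure retrieval to taking actions like permission updates or resource provisioning.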