Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: DELTA LAKE
Skill Level: INTERMEDIATE
Duration: 20 MIN
In Bruce’s career in cyber warfare and enterprise cybersecurity, he worked on many of the highest-profile botnet and nation-state takedowns in history. He also helped build the technology in one of the world’s most advanced SOCs. Bruce will explain what he learned from that experience and why it prompted him to leave early retirement, sell his beloved sports car and co-found ziggiz. We all know there’s more data than ever. Anyone close to cybersecurity also knows that SIEMs, typically at the center of enterprise cybersecurity operations, have become too expensive even at the highest levels of government and the Fortune 100.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENERGY AND UTILITIES, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: APACHE SPARK, DLT
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. Is stream processing the future? We think so — and we’re building it with you using the latest capabilities in Apache Spark™ Structured Streaming. If you're a power user, this session is for you: we’ll demo new advanced features, from state transformations to real-time mode. If you prefer simplicity, this session is also for you: we’ll show how Lakeflow Declarative Pipelines simplifies managing streaming pipelines. And if you’re somewhere in between, we’ve got you covered — we’ll explain when to use your own streaming jobs versus Lakeflow Declarative Pipelines.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: FINANCIAL SERVICES
Technologies: APACHE SPARK, DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
SMBC, a major Japanese multinational financial services institution, has embarked on an initiative to build a GenAI-powered, modern and well-governed cloud data platform on Azure/Databricks. This initiative aims to build an enterprise data foundation encompassing loans, deposits, securities, derivatives, and other data domains. Its primary goals are:
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA LAKE, MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Retail and CPG companies face growing pressure to better forecast demand, optimize pricing and manage inventory — yet traditional approaches take months to deploy and often require extensive engineering support. In this session, we will showcase Workcloud Modeling Studio, a low-code/no-code ML platform designed for data scientists working in retail and CPG. Learn how this tool improves forecasting accuracy and accelerates time-to-value from months to hours. We will walk through a real-world use case of demand forecasting for a retailer using Zebra's Modeling Studio. This talk will demonstrate how to build, train and deploy an ML forecasting pipeline — without reinventing the wheel.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: EDUCATION
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS WORKFLOWS
Skill Level: BEGINNER
Duration: 20 MIN
The demand for skilled Databricks data engineers continues to rise as enterprises accelerate their adoption of the Databricks platform. However, navigating the complex ecosystem of data engineering tools, frameworks and best practices can be overwhelming. This session provides a structured roadmap to becoming an expert Databricks data engineer, offering a clear progression from foundational skills to advanced capabilities. Acadford, a leading training provider, has successfully trained thousands of data engineers on Databricks, equipping them with the skills needed to excel in their careers and obtain professional certifications. Drawing on this experience, we will guide attendees through the most in-demand skills and knowledge areas through a combination of structured learning and practical insights. Key takeaways:
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS WORKFLOWS, DLT
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. Databricks Serverless revolutionizes data engineering and analytics by eliminating the complexities of infrastructure management. This talk will provide an overview of this powerful serverless compute option, highlighting how it enables practitioners to focus solely on building robust data pipelines. We'll explore the core benefits, including automatic scaling, cost optimization and seamless integration with the Databricks ecosystem. Learn how serverless workflows simplify the orchestration of various data tasks, from ingestion to dashboards, ultimately accelerating time-to-insight and boosting productivity. This session is ideal for data engineers, data scientists and analysts looking to leverage the agility and efficiency of serverless computing in their data workflows.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: HEALTH AND LIFE SCIENCES
Technologies: DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Health Catalyst (HCAT) transformed its CI/CD strategy by replacing a rigid, internal deployment tool with Databricks Asset Bundles (DABs), unlocking greater agility and efficiency. This shift streamlined deployments across both customer workspaces and HCAT's core platform, accelerating time to insights and driving continuous innovation. By adopting DABs, HCAT ensures feature parity, standardizes metric stores across clients, and rapidly delivers tailored analytics solutions. Attendees will gain practical insights into modernizing CI/CD pipelines for healthcare analytics, leveraging Databricks to scale data-driven improvements. HCAT's next-generation platform, Health Catalyst Ignite™, integrates healthcare-specific data models, self-service analytics, and domain expertise—powering faster, smarter decision-making.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT, PUBLIC SECTOR
Technologies: APACHE ICEBERG, MOSAIC AI
Skill Level: ADVANCED
Duration: 40 MIN
This session introduces ByteDance’s challenges in data management and model training, and how they are addressed by Magnus (enhanced Apache Iceberg) and Byted Streaming (customized Mosaic Streaming). Magnus uses Iceberg’s branch/tag features to manage massive datasets and checkpoints efficiently. With enhanced metadata and a custom C++ data reader, Magnus achieves optimal sharding, shuffling and data loading. Flexible table migration, detailed metrics and built-in full-text indexes on Iceberg tables further ensure training reliability. When training with ultra-large datasets, ByteDance faced scalability and performance issues. Given Mosaic Streaming’s scalability in distributed training and its clean code structure, the team chose and customized it to resolve challenges like slow startup, high resource consumption and limited data source compatibility. In this session, we will explore Magnus and Byted Streaming, discuss their enhancements and demonstrate how they enable efficient and robust distributed training.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, TRAVEL AND HOSPITALITY
Technologies: MLFLOW, DSPY, MOSAIC AI
Skill Level: ADVANCED
Duration: 40 MIN
A production-ready GenAI application is more than the framework itself. As with ML, you need a unified platform to create an end-to-end workflow for production-quality applications. In this session, learn how to build agents that access all your data and models through function calling. Then, learn how DSPy enables agents to interact with each other to ensure the question is answered correctly. We will demonstrate a chatbot, powered by multiple agents, that can answer and reason about questions the base LLM does not know, including very specialized topics.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. Did you know that you can integrate with your favorite BI tools directly from Databricks SQL? You don’t even need to stand up an additional warehouse. This session shows the integrations with Microsoft Power Platform, Power BI, Tableau and dbt so you can have a seamless integration experience. Directly connect your Databricks workspace with Fabric and Power BI workspaces or Tableau to publish and sync data models, with defined primary and foreign keys, between the two platforms.
Type: LIGHTNING TALK
Track: DATA WAREHOUSING
Industry: ENERGY AND UTILITIES, PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
In this session, we will share NCS’s approach to implementing a Databricks Lakehouse architecture, focusing on key lessons learned and best practices from our recent implementations. By integrating Databricks SQL Warehouse, the dbt Transform framework and our innovative test automation framework, we’ve optimized performance and scalability while ensuring data quality. We’ll dive into how Unity Catalog enabled robust data governance, empowering business units with self-serve analytical workspaces to create insights while maintaining control. Through the use of solution accelerators, rapid environment deployment and pattern-driven ELT frameworks, we’ve fast-tracked time-to-value and fostered a culture of innovation. Attendees will gain valuable insights into accelerating data transformation, governance and scaling analytics with Databricks.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: DATA MARKETPLACE, AI/BI, DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
Growth in capital markets thrives on innovation, agility and real-time insights. This session highlights how leading firms use Databricks’ Data Intelligence Platform to uncover opportunities, optimize trading strategies and deliver personalized client experiences. Learn how advanced analytics and AI help organizations expand their reach, improve decision-making and unlock new revenue streams. Industry leaders share how unified data platforms break down silos, deepen insights and drive success in a fast-changing market. Discover how data intelligence empowers capital markets firms to thrive in today’s competitive landscape!
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, MOSAIC AI, PYTORCH
Skill Level: INTERMEDIATE
Duration: 40 MIN
Scaling large language models (LLMs) and multimodal architectures requires efficient data management and computational power. NVIDIA NeMo Framework Megatron-LM on Databricks is an open source solution that integrates GPU acceleration and advanced parallelism with Databricks Delta Lakehouse, streamlining workflows for pre-training and fine-tuning models at scale. This session highlights context parallelism, a unique NeMo capability for parallelizing over sequence lengths, making it ideal for video datasets with large embeddings. Through the case study of TwelveLabs’ Pegasus-1 model, learn how NeMo empowers scalable multimodal AI development, from text to video processing, setting a new standard for LLM workflows.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: EDUCATION, PUBLIC SECTOR
Technologies: AI/BI
Skill Level: BEGINNER
Duration: 40 MIN
Government leaders overwhelmingly recognize the potential benefits of AI as critical to long-term strategic goals of efficiency, but implementation challenges and security concerns could be obstacles to success.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES, MANUFACTURING
Technologies: DELTA LAKE, MLFLOW, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us for an insightful presentation on creating a robust data architecture to drive business outcomes in the age of Generative AI. Santosh Kudva, GE Vernova Chief Data Officer, and Kevin Tollison, EY AI Consulting Partner, will share their expertise on transforming data strategies to unleash the full potential of AI. Learn how GE Vernova, a dynamic enterprise born from the 2024 spin-off of GE, revamped its diverse data landscape. They will provide a look into how they integrated the pre-spin-off Finance Data Platform into the GE Vernova Enterprise Data & Analytics ecosystem using Databricks to enable high-performance, AI-led analytics. Don't miss this opportunity to hear from industry leaders and gain valuable insights to elevate your data strategy and AI success.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session will explore how Adobe uses a sophisticated data security architecture built on the Databricks Data Intelligence Platform, along with the Open Cybersecurity Schema Framework (OCSF), to enable scalable, real-time threat detection across more than 10 PB of security data. We’ll compare different approaches to OCSF implementation and demonstrate how Adobe processes massive security datasets efficiently — reducing query times by 18%, maintaining 99.4% SLA compliance, and supporting 286 security users across 17 teams with over 4,500 daily queries. By using Databricks' Platform for serverless compute, scalable architecture, and LLM-powered recommendations, Adobe has significantly improved processing speed and efficiency, resulting in substantial cost savings. We’ll also highlight how OCSF enables advanced cross-tool analytics and automation, streamlining investigations. Finally, we’ll introduce Databricks’ new open-source OCSF toolkit for scalable security data normalization and invite the community to contribute.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: DELTA SHARING, UNITY CATALOG, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
As data-driven companies scale from small startups to global enterprises, managing secure data access becomes increasingly complex. Traditional access control models fall short at enterprise scale, where dynamic, purpose-driven access is essential. In this talk, we explore how our “Just-in-Time” Purpose-Based Access Control (PBAC) platform addresses the evolving challenges of data privacy and compliance, maintaining least privilege while ensuring productivity. Using features like Unity Catalog, Delta Sharing & Databricks Apps, the platform delivers real-time, context-aware data governance. Leveraging JIT PBAC keeps your data secure, your engineers productive, your legal & security teams happy and your organization future-proof in the ever-evolving compliance landscape.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Explore advanced governance and authentication patterns for building secure, enterprise-grade apps with Databricks Apps. Learn how to configure complex permissions and manage access control using Unity Catalog. We’ll dive into “on-behalf-of-user” authentication — allowing agents to enforce user-specific access controls — and cover API-based authentication, including PATs and OAuth flows for external integrations. We’ll also highlight how Addepar uses these capabilities to securely build and scale applications that handle sensitive financial data. Whether you're building internal tools or customer-facing apps, this session will equip you with the patterns and tools to ensure robust, secure access in your Databricks apps.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK
Skill Level: ADVANCED
Duration: 40 MIN
This session explores advanced JSON schema handling (inference and evolution) and event demultiplexing. Topics include:
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: MLFLOW, MOSAIC AI, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
The most common RAG systems rely on a frozen design — a single embedding model and a single vector index. We’ve achieved a modicum of success with that approach, but when it comes to increasing accuracy for production systems, there is only so much it can solve. In this session, we will explore how to move from frozen to adaptive RAG systems, which produce more tailored outputs with higher accuracy. Databricks services: Lakehouse, Unity Catalog, Mosaic, Sweeps, Vector Search, Agent Evaluation, Managed Evaluation, Inference Tables
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Learn how to build sophisticated systems that enable natural language interactions with both your structured databases and unstructured document collections. This session explores advanced techniques for creating unified and governed AI systems that can seamlessly interpret questions, retrieve relevant information and generate accurate answers across your entire data ecosystem. Key takeaways include:
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: EDUCATION, HEALTH AND LIFE SCIENCES
Technologies: DELTA LAKE, AI/BI, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Medical providers often receive less than 15 minutes of instruction on how to interact with patients during emotionally charged end-of-life conversations. Continuing education for clinicians is critical to hone these skills, but traditional approaches that require professional patients and instructors are difficult to scale. Here, we describe a custom chatbot that plays the roles of patient and coach to provide a scalable learning experience. A critical challenge was mitigating the persistently cheerful and helpful tone that results from standard pretraining in the Patient Persona AI. We accomplished this by implementing a multi-agent architecture based upon a graphical model of the conversation. System prompts reflecting the patient’s cognitive state are dynamically updated as the conversation progresses. Future extensions of this work will focus on additional custom model fine-tuning on the Mosaic AI platform to further improve the realism of the conversation.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: RETAIL AND CPG - FOOD
Technologies: MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Marketing professionals build campaigns, create content and use effective copywriting to tell a good story to promote a product or offer. All of this requires a thorough and meticulous process for every individual campaign. To assist marketing professionals at 7-Eleven, we built a multi-purpose assistant. We will walk you through how we created multiple agents as different personas with LangGraph and Mosaic AI to build a chat assistant that assumes a different persona based on the user query. We will also explain our evaluation methodology for choosing models and prompts, and how we implemented guardrails for high reliability with sensitive marketing content. This assistant was showcased at the Databricks booth at NRF earlier this year.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: DELTA LAKE, DLT, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
LLM agents aren’t just answering questions — they’re running entire workflows. In this talk, we’ll show how agents can autonomously ingest, process and structure unstructured data using Unstructured, with outputs flowing directly into Databricks. Powered by the Model Context Protocol (MCP), agents can interface with Unstructured’s full suite of capabilities — discovering documents across sources, building ephemeral workflows and exporting structured insights into Delta tables. We’ll walk through a demo where an agent responds to a natural language request, dynamically pulls relevant documents, transforms them into usable data and surfaces insights — fast. Join us for a sneak peek into the future of AI-native data workflows, where LLMs don’t just assist — they operate.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. Improving healthcare impacts us all. We highlight how Premier Inc. took risk-adjusted patient data from more than 1,300 member hospitals across America, applying a natural language interface using AI/BI Genie, allowing our users to discover new insights. The stakes are high: each new insight surfaced represents potential care improvement and lives positively impacted. Using Genie and our AI-ready data in Unity Catalog, our team was able to stand up a Genie instance in three short days, bypassing the costs and time of custom modeling and application development. Additionally, Genie allowed our internal teams to generate complex SQL as much as 10 times faster than writing it by hand. As Genie and lakehouse apps continue to advance rapidly, we are excited to leverage these features by introducing Genie to as many as 20,000 users across hundreds of hospitals. This will support our members’ ongoing mission to enhance the care they provide to the communities they serve.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: AI/BI, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
No description available.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: AI/BI, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This high-velocity workshop is designed for data and AI leaders seeking to rapidly develop a comprehensive AI strategy tailored to their organization's needs. In just 30 minutes, participants will engage in a focused, interactive session that delivers actionable insights and a strategic framework for AI implementation. Key components of the workshop include: By the end of this intensive session, you will have the foundation of a robust AI strategy and guidance on roadmap execution.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL, MOSAIC AI
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. Integrating AI into existing data workflows can be challenging, often requiring specialized knowledge and complex infrastructure. In this session, we'll share how SQL users can leverage AI/ML to access large language models (LLMs) and traditional machine learning directly from within SQL, simplifying the process of incorporating AI into data workflows. We will demonstrate how to use Databricks SQL for natural language processing, traditional machine learning, retrieval augmented generation and more. You'll learn about best practices and see examples of solving common use cases such as opinion mining, sentiment analysis, forecasting and other common AI/ML tasks.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT
Technologies: DELTA LAKE, MLFLOW, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us to hear about how Epsilon Data Management migrated Epsilon’s unique, AI-powered marketing identity solution from multi-petabyte on-prem Hadoop and data warehouse systems to a unified Databricks Lakehouse platform. This transition enabled Epsilon to further scale its Decision Sciences solution and enable new cloud-based AI research capabilities on time and within budget, without being bottlenecked by the resource constraints of on-prem systems. Learn how Delta Lake, Unity Catalog, MLflow and LLM endpoints powered massive data volume, reduced data duplication, improved lineage visibility, accelerated Data Science and AI, and enabled new data to be immediately available for consumption by the entire Epsilon platform in a privacy-safe way. Using the Databricks platform as the base for AI and Data Science at global internet scale, Epsilon deploys marketing solutions across multiple cloud providers and multiple regions for many customers.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL
Skill Level: BEGINNER
Duration: 20 MIN
Explore how AI is transforming business intelligence and data analytics across the Databricks platform. This session offers a comprehensive overview of AI-assisted capabilities, from generating dashboards and visualizations to integrating Genie on dashboards for conversational analytics. Whether you’re a data engineer, analyst or BI developer, this session will equip you to leverage AI with BI for better, smarter decisions.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES
Technologies: DELTA LAKE, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. In the race to revolutionize healthcare and drug discovery, biopharma companies are turning to AI to streamline workflows and unlock new scientific insights. In this session, we will explore how NVIDIA BioNeMo, combined with the Databricks Delta Lakehouse, can be used to advance drug discovery for critical applications like molecular structure modeling, protein folding and diagnostics. We’ll demonstrate how BioNeMo pre-trained models can run inference on data securely stored in Delta Lake, delivering actionable insights. By leveraging containerized solutions on Databricks’ ML Runtime with GPU acceleration, users can achieve significant performance gains compared to traditional CPU-based computation.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. In today’s data landscape, the challenge isn’t just storing or processing data — it’s enabling every user, from data stewards to analysts, to find and trust the right data, fast. This session explores how Databricks is reimagining data discovery with the new Discover Page Experience — an intuitive, curated interface showcasing key data and workspace assets. We’ll dive into AI-assisted governance and AI-powered discovery features like AI-generated metadata, AI-assisted lineage and natural language data exploration in Unity Catalog. Plus, see how new certifications and deprecations bring clarity to complex data environments. Whether you’re a data steward highlighting trusted assets or an analyst navigating data without deep schema knowledge, this session will show how Databricks is making data discovery seamless for everyone.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA LAKE, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Marketing teams struggle with ‘dirty data’ — incomplete, inconsistent and inaccurate information that limits campaign effectiveness and reduces the accuracy of AI agents. Our AI-powered marketing data management platform, built on Databricks, solves this with anomaly detection, ML-driven transformations and the built-in Acxiom Referential Real ID Graph with Data Hygiene. We’ll showcase how Delta Lake, Unity Catalog and Lakeflow Declarative Pipelines power our multi-tenant architecture, enabling secure governance and 75% faster data processing. Our privacy-first design ensures compliance with GDPR, CCPA and HIPAA through role-based access, encryption key management and fine-grained data controls. Join us for a live demo and Q&A, where we’ll share real-world results and lessons learned in building a scalable, AI-driven marketing data solution with Databricks.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: MANUFACTURING
Technologies: AI/BI, DELTA SHARING
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join this session to hear from two incredible companies, Xylem and Joby Aviation. Xylem shares their successful journey from fragmented legacy systems to a unified Enterprise Data Platform, demonstrating how they integrated complex ERP data across four business segments to achieve breakthrough improvements in parts management and operational efficiency. Following Xylem's story, learn how Joby Aviation leveraged Databricks to automate and accelerate flight test data checks, cutting processing times from over two hours to under thirty minutes. This session highlights how advanced cloud tools empower engineers to quickly build and run custom data checks, improving both speed and safety in flight test operations.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: HEALTH AND LIFE SCIENCES, PROFESSIONAL SERVICES, RETAIL AND CPG - FOOD
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Databricks announced two new features in 2024: AI/BI Dashboards and AI/BI Genie. Dashboards is a redesigned dashboarding experience for your regular reporting needs, while Genie provides a natural language experience for your last-mile analytics. In this session, Databricks Solutions Architect and content creator Youssef Mrini will present alongside Databricks MVP and content creator Josue A. Bogran on how you can get the most value from these tools for your organization. The session is fluff-free, full of practical tips and geared to help you deliver immediate impact with these new Databricks capabilities.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: RETAIL AND CPG - FOOD
Technologies: AI/BI, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Conagra is a global food manufacturer with $12.2B in revenue, 18K+ employees and 45+ plants in the US, Canada and Mexico. Conagra's Supply Chain organization is heavily focused on delivering results in productivity, waste reduction, inventory rationalization, safety and customer service levels. By migrating the Supply Chain reporting suite to Databricks over the past two years, Conagra's Supply Chain Analytics & Data Science team has been able to deliver new AI solutions which complement traditional BI platforms and lay the foundation for additional AI/ML applications in the future. With Databricks Genie integrated within traditional BI reports, Conagra Supply Chain users can now go from insight to action faster and with fewer clicks, enabling speed to value in a complex Supply Chain. The Databricks platform also allows the team to curate data products to be consumed by traditional BI applications today, as well as the ability to rapidly scale for the AI/ML applications of tomorrow.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL
Skill Level: ADVANCED
Duration: 40 MIN
Go beyond the user interface and explore the cutting-edge technology driving AI/BI Genie. This session breaks down the AI/BI Genie architecture, showcasing how LLMs, retrieval-augmented generation (RAG) and finely tuned knowledge bases work together to deliver fast, accurate responses. We’ll also explore how AI agents orchestrate workflows, optimize query performance and continuously refine their understanding. Ideal for those who want to geek out about the tech stack behind Genie, this session offers a rare look at the magic under the hood.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: TRAVEL AND HOSPITALITY
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
American Airlines migrated from Hive Metastore to Unity Catalog using automated processes with Databricks APIs and GitHub Actions. This automation streamlined the migration for many applications within AA, ensuring consistency, efficiency and minimal disruption while enhancing data governance and disaster recovery capabilities.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, PUBLIC SECTOR
Technologies: DELTA LAKE, MLFLOW, LLAMA
Skill Level: INTERMEDIATE
Duration: 40 MIN
Crisis Text Line has been innovating for ten years in text-based mental health crisis intervention and is now leading the next wave of GenAI use cases in the space. With over 300 million messages exchanged since 2013 and a decade of expertise, Crisis Text Line is unlocking the potential of AI to amplify human connection at a global scale. We will discuss how we leveraged our bedrock application to co-navigate crisis care through a set of early AI agent workflows. First, a simulator that reproduces texter behavior to train responders in handling conversations of varying difficulty, including those where the texter is at imminent risk of suicide or self-harm. Second, a tool that automatically monitors the clinical quality of conversations. Third, predictive summarization to capture key context before conversations are transferred. Through the power of suggestion, this compound system aims to reduce burden and drive efficiency, such that our responders can focus on what they do best — support people in need.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: RETAIL AND CPG - FOOD
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Analysts often begin their Databricks journey by running familiar SQL queries in the SQL Editor, but that’s just the start. In this session, I’ll share the roadmap I followed to expand from ad-hoc querying to SQL Editor- and notebook-driven development, scheduled data pipelines and interactive dashboards — all powered by Databricks SQL and Unity Catalog. You’ll learn how to organize tables with primary-key/foreign-key relationships and create table and column comments to form the semantic model, using DBSQL features like RELY constraints. I’ll also show how parameterized dashboards can be set up to empower self-service analytics and feed into Genie Spaces. Attendees will walk away with best practices for building a robust BI platform on Databricks, including tips for table design and metadata enrichment. Whether you’re a data analyst or BI developer, this talk will help you unlock powerful, AI-enhanced analytics workflows.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: RETAIL AND CPG - FOOD
Technologies: APACHE SPARK, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Table formats like Delta Lake and Iceberg have been game changers for pushing lakehouse architecture into modern enterprises. The acquisition of Tabular added Iceberg to the Databricks ecosystem, an open format that was already well supported by processing engines across the industry. At HelloFresh we are building a lakehouse architecture that integrates many touchpoints and technologies across the organization. As such, we chose Iceberg as the table format to bridge the gaps in our decentrally managed tech landscape. We are leveraging Unity Catalog as our Iceberg REST catalog of choice for storing metadata and managing tables. In this talk we will outline our architectural setup between Databricks, Spark, Flink and Snowflake, and explain the native Unity Catalog Iceberg REST catalog as well as catalog federation towards connected engines. We will highlight the impact on our business and discuss the advantages and lessons learned from our early-adopter experience.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: MLFLOW, AI/BI, PYTORCH
Skill Level: INTERMEDIATE
Duration: 40 MIN
We present AT&T AutoClassify, built jointly by AT&T's Chief Data Office (CDO) and Databricks professional services: a novel end-to-end system for automatic multi-head binary classification of unlabeled text data. Our approach automates the creation of labeled datasets and the training of multi-head binary classifiers with minimal human intervention. Starting only from a corpus of unlabeled text and a list of desired labels, AT&T AutoClassify leverages advanced natural language processing techniques to automatically mine relevant examples from raw text, fine-tune embedding models and train individual classifier heads for multiple true/false labels. This solution can reduce LLM classification costs by 1,000x, making it highly efficient in operational cost. The end result is a highly optimized, low-cost model, servable in Databricks, capable of taking raw text and producing multiple binary classifications. An example use case using call transcripts will be examined.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DLT, LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 40 MIN
We’re introducing a new developer experience for Lakeflow Declarative Pipelines designed for data practitioners who prefer a code-first approach and expect robust developer tooling. The new multi-file editor brings an IDE-like environment to declarative pipeline development, making it easy to structure transformation logic, configure pipelines throughout the development lifecycle and iterate efficiently. Features like contextual data previews and selective table updates enable step-by-step development. UI-driven tools, such as DAG previews and DAG-based actions, enhance productivity for experienced users and provide a bridge for those transitioning to declarative workflows. In this session, we’ll showcase the new editor in action, highlighting how these enhancements simplify declarative coding and improve development for production-ready data pipelines. Whether you’re an experienced developer or new to declarative data engineering, join us to see how Lakeflow Declarative Pipelines can enhance your data practice.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: APACHE SPARK, DATABRICKS APPS
Skill Level: BEGINNER
Duration: 20 MIN
The demand for data engineering keeps growing, but data teams are bored by repetitive tasks, stumped by growing complexity and endlessly harassed by an unrelenting need for speed. What if AI could take the heavy lifting off your hands? What if we make the move away from code-generation and into config-generation — how much more could we achieve? In this session, we’ll explore how AI is revolutionizing data engineering, turning pain points into innovation. Whether you’re grappling with manual schema generation or struggling to ensure data quality, this session offers practical solutions to help you work smarter, not harder. You’ll walk away with a good idea of where AI is going to disrupt the data engineering workload, some good tips around how to accelerate your own workflows and an impending sense of doom around the future of the industry!
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: MANUFACTURING, RETAIL AND CPG - FOOD, TRAVEL AND HOSPITALITY
Technologies: LLAMA, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Taxonomy generation is a challenge across industries such as retail, manufacturing and e-commerce. Incomplete or inconsistent taxonomies can lead to fragmented data insights, missed monetization opportunities and stalled revenue growth. In this session, we will explore a modern approach to solving this problem by leveraging the Databricks platform to build a scalable compound AI architecture for automated taxonomy generation. The first half of the session will cover the business significance and implications of taxonomy, followed by a technical deep dive into building an architecture for taxonomy implementation on the Databricks platform using a compound AI approach. We will walk attendees through the anatomy of taxonomy generation, showcasing an innovative solution that combines multimodal and text-based LLMs, internal data sources and external API calls. This ensemble approach ensures more accurate, comprehensive and adaptable taxonomies that align with business needs.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Autonomous AI agents are transforming industries by enabling systems to perform tasks, make decisions and adapt in real time without human intervention. In this talk, I will delve into the architecture and design principles required to build these agents within scalable AI infrastructure. Key topics will include constructing modular, reusable frameworks, optimizing resource allocation and enabling interoperability between agents and data pipelines. I will discuss practical use cases in which attendees will learn how to leverage containerization and orchestration techniques to enhance the flexibility and performance of these agents while ensuring low-latency decision-making. This session will also highlight challenges like ensuring robustness, ethical considerations and strategies for real-time feedback loops. Participants will gain actionable insights into building autonomous AI agents that drive efficiency, scalability and innovation in modern AI ecosystems.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: HEALTH AND LIFE SCIENCES
Technologies: DLT, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
Bayada is transforming its data ecosystem by consolidating Matillion+Snowflake and SSIS+SQL Server into a unified Enterprise Data Platform powered by Databricks. Using Databricks' Medallion architecture, this platform enables seamless data integration, advanced analytics and machine learning across critical domains like general ledger, recruitment and activity-based costing. Databricks was selected for its scalability, real-time analytics and ability to handle both structured and unstructured data, positioning Bayada for future growth. The migration aims to reduce data processing times by 35%, improve reporting accuracy and cut reconciliation efforts by 40%. Operational costs are projected to decrease by 20%, while real-time analytics is expected to boost efficiency by 15%. Join this session to learn how Bayada is leveraging Databricks to build a high-performance data platform that accelerates insights, drives efficiency and fosters innovation organization-wide.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY, MANUFACTURING
Technologies: AI/BI, MOSAIC AI, DATABRICKS APPS
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. Integrating AI agents into business systems requires tailored approaches for different maturity levels (crawl-walk-run) that balance scalability, accuracy and usability. This session addresses the critical challenge of making AI agents accessible to business users. We will explore four key integration methods, comparing their strengths, challenges and ideal use cases to help businesses select the most suitable integration strategy for their specific needs.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENERGY AND UTILITIES, HEALTH AND LIFE SCIENCES, MANUFACTURING
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Are you ready to unlock the full power of Unity Catalog managed tables? This session delivers actionable insights for transitioning to UC managed tables. Learn why managed tables are the default for performance and ease of use, and how automatic feature upgrades future-proof your architecture. Whether you manage thousands of tables or want to streamline operations, you’ll gain the tools and strategies to thrive in the era of intelligent data management. Join us and discover how easy it is to move to UC managed tables!
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: HEALTH AND LIFE SCIENCES, PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: MLFLOW, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. AI is transforming industries, enhancing customer experiences and automating decisions. As organizations integrate AI into core operations, robust security is essential. The Databricks Security team collaborated with top cybersecurity researchers from OWASP, Gartner, NIST, HITRUST and Fortune 100 companies to evolve the Databricks AI Security Framework (DASF) to version 2.0. In this session, we’ll cover an AI security architecture using Unity Catalog, MLflow, egress controls, and AI gateway. Learn how security teams, AI practitioners and data engineers can secure AI applications on Databricks. Walk away with:
• A reference architecture for securing AI applications
• A worksheet with AI risks and controls mapped to industry standards like MITRE, OWASP, NIST and HITRUST
• A DASF AI assistant tool to test your AI security
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: RETAIL AND CPG - FOOD
Technologies: APACHE SPARK, DELTA LAKE, DLT
Skill Level: INTERMEDIATE
Duration: 40 MIN
Traditional streaming works great when your data source is append-only, but what if your data source includes updates and deletes? At 84.51 we used Lakeflow Declarative Pipelines and Delta Lake to build a streaming data flow that consumes inserts, updates and deletes while still taking advantage of streaming checkpoints. We combined this flow with a materialized view and Enzyme incremental refresh for a low-code, efficient and robust end-to-end data flow. We process around 8 million sales transactions each day with 80 million items purchased. This flow not only handles new transactions but also handles updates to previous transactions. Join us to learn how 84.51 combined change data feed, data streaming and materialized views to deliver a “better together” solution. 84.51 is a retail insights, media & marketing company. We use first-party retail data from 60 million households sourced through a loyalty card program to drive Kroger’s customer-centric journey.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: MLFLOW, LLAMA, MOSAIC AI
Skill Level: ADVANCED
Duration: 40 MIN
Generic LLM metrics are useless until they meet your business needs. In this session we will dive deep into creating bespoke, state-of-the-art custom AI metrics that matter to you. We'll discuss best practices for LLM evaluation strategies, when to use an LLM judge vs. statistical metrics, and more. Through a live demo using the Mosaic AI Framework, we will put these practices into action. By the end of this session, you'll be equipped to create AI solutions that are not only powerful but also relevant to your organization's needs. Join us to transform your AI strategy and make a tangible impact on your business!
Type: BREAKOUT
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: MLFLOW, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
The insurance industry is at the crossroads of digital transformation, facing challenges from market competition and customer expectations. While conventional ML applications have historically provided capabilities in this domain, the emergence of Agentic AI frameworks presents a revolutionary opportunity to build truly autonomous insurance applications. We will address issues related to data governance and quality while discussing how to monitor and evaluate fine-tuned models. We'll demonstrate the application of the agentic framework in the insurance context and how these autonomous agents can work collaboratively to handle complex insurance workflows — from submission intake and risk evaluation to expedited quote generation. This session demonstrates how to architect intelligent insurance solutions using Databricks Mosaic AI agentic core components, including Unity Catalog, Playground, model evaluation/guardrails, privacy filters, AI functions and AI/BI Genie.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. Mosaic AI Vector Search is powering high-accuracy retrieval systems in production across a wide range of use cases — including RAG applications, entity resolution, recommendation systems and search. Fully integrated with the Databricks Data Intelligence Platform, it eliminates pipeline maintenance by automatically syncing data from source to index. Over the past year, customers have asked for greater scale, better quality out-of-the-box and cost-efficient performance. This session delivers on those needs — showcasing best practices for implementing high-quality retrieval systems and revealing major product advancements that improve scalability, efficiency and relevance. What you’ll learn:
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: LLAMA, PYTORCH
Skill Level: INTERMEDIATE
Duration: 20 MIN
As organizations increasingly leverage sensitive data for AI applications, generating high-quality synthetic data with mathematical guarantees of privacy has become crucial. This talk explores the use of Gretel Safe Synthetics (now part of NVIDIA) to generate differentially private synthetic data that maintains high fidelity to the source data and high utility on downstream tasks across heterogeneous datasets. Our analysis presents a framework for privacy-preserving synthetic data generation with two use cases: e-commerce reviews and doctor’s notes. We reveal nuanced strategies for:
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. Want to accelerate your team's data science workflow? This session reveals how Databricks Notebooks can transform your productivity through an optimized environment designed specifically for data science and AI work. Discover how notebooks serve as a central collaboration hub where code, visualizations, documentation and results coexist seamlessly, enabling faster iteration and development. Key takeaways: You'll leave with practical techniques to enhance your notebook-based workflow and deliver AI projects faster with higher-quality results.
Type: LIGHTNING TALK
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, TRAVEL AND HOSPITALITY
Technologies: DATA MARKETPLACE, DELTA SHARING
Skill Level: INTERMEDIATE
Duration: 20 MIN
AccuWeather harnesses cutting-edge technology, industry-leading weather data, and expert insights to empower businesses and individuals worldwide. In this session, we will explore how AccuWeather’s comprehensive datasets—ranging from historical and current conditions to forecasts and climate normals—can drive real-world impact across diverse industries. By showcasing scenario-based examples, we’ll demonstrate how AccuWeather’s hourly and daily weather data can address the unique needs of your organization, whether for operational planning, risk management, or strategic decision-making. This session is ideal for both newcomers to AccuWeather’s offerings and experienced users seeking to unlock the full potential of our weather data to optimize performance, improve efficiency, and boost overall success.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: APACHE SPARK, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Building a custom Spark data source connector once required Java or Scala expertise, making it complex and limiting. This left many proprietary data sources without public SDKs disconnected from Spark. Additionally, data sources with Python SDKs couldn't harness Spark’s distributed power. Spark 4.0 changes this with a new Python API for data source connectors, allowing developers to build fully functional connectors without Java or Scala. This unlocks new possibilities, from integrating proprietary systems to leveraging untapped data sources. Supporting both batch and streaming, this API makes data ingestion more flexible than ever. In this talk, we’ll demonstrate how to build a Spark connector for Excel using Python, showcasing schema inference, data reads/writes and streaming support. Whether you're a data engineer or Spark enthusiast, you’ll gain the knowledge to integrate Spark with any data source — entirely in Python.
Type: LIGHTNING TALK
Track: DATA SHARING AND COLLABORATION
Industry: HEALTH AND LIFE SCIENCES
Technologies: APACHE SPARK, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
As data ecosystems grow increasingly complex, the ability to share data securely, seamlessly, and in real time has become a strategic differentiator. In this session, Cigna will showcase how Delta Sharing on Databricks has enabled them to modernize data delivery, reduce operational overhead, and unlock new market opportunities. Learn how Cigna achieved significant savings by streamlining operations, compute, and platform overhead for just one use case. Explore how decentralizing data ownership—transitioning from hyper-centralized teams to empowered product owners—has simplified delivery and accelerated innovation. Most importantly, see how this modern open data-sharing framework has positioned Cigna to win contracts they previously couldn’t, by enabling real-time, cross-organizational data collaboration with external partners. Join us to hear how Cigna is using Delta Sharing not just as a technical enabler, but as a business catalyst.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENERGY AND UTILITIES
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
As data ecosystems grow more complex, organizations often struggle with siloed platforms and fragmented governance. In this session, we’ll explore how our team made Databricks the central hub for cross-platform interoperability, enabling seamless Snowflake integration through Unity Catalog and the Iceberg REST API. By leveraging UniForm, Delta and Iceberg, we created a flexible, vendor-agnostic architecture that bridges Databricks and Snowflake without compromising performance or security.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: MANUFACTURING, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
We’re excited to share with you how SAP Business Data Cloud supports Delta Sharing to share SAP data securely and seamlessly with Databricks—no complex ETL or data duplication required. This enables organizations to securely share SAP data for analytics and AI in Databricks while also supporting bidirectional data sharing back to SAP. In this session, we’ll demonstrate the integration in action, followed by a discussion of how the global beauty group Natura will leverage this solution. Whether you’re looking to bring SAP data into Databricks for advanced analytics or build AI models on top of trusted SAP datasets, this session will show you how to get started — securely and efficiently.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK
Skill Level: ADVANCED
Duration: 40 MIN
This session explains how we've made our Apache Spark™ versionless for end users by introducing a stable client API, environment versioning and automatic remediation. These capabilities have enabled auto-upgrade of hundreds of millions of workloads with minimal disruption for Serverless Notebooks and Jobs. We'll also introduce a new approach to dependency management using environments. Admins will learn how to speed up package installation with Default Base Environments, and users will see how to manage custom environments for their own workloads.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
In the rapidly evolving field of data analytics, AI/BI Dashboards and Power BI stand out as two formidable approaches, each offering unique strengths and catering to specific use cases. Power BI has earned its reputation for delivering user-friendly, highly customisable visualisations and reports for data analysis. AI/BI Dashboards, on the other hand, have gained strong traction due to their seamless integration with the Databricks platform, making them an attractive option for data practitioners. This session will provide a comparison of these two tools, highlighting their respective features, strengths and potential limitations. Understanding the nuances between these tools is crucial for organizations aiming to make informed decisions about their data analytics strategy. This session will equip participants with the knowledge needed to select the most appropriate tool, or combination of tools, to meet their data analysis requirements and drive data-informed decision-making processes.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, AI/BI
Skill Level: BEGINNER
Duration: 20 MIN
PySpark has long been a cornerstone of big data processing, excelling in data preparation, analytics and machine learning tasks within traditional data lakes. However, the rise of multimodal AI and vector search introduces challenges beyond its capabilities. Spark’s new Python data source API enables integration with emerging AI data lakes built on the multimodal Lance format. Lance delivers unparalleled value with its zero-copy schema evolution capability and robust support for large record-size data (e.g., images, tensors and embeddings), simplifying multimodal data storage. Its advanced indexing for semantic and full-text search, combined with rapid random access, enables high-performance AI data analytics on par with SQL. By unifying PySpark's robust processing capabilities with Lance's AI-optimized storage, data engineers and scientists can efficiently manage and analyze the diverse data types required for cutting-edge AI applications within a familiar big data framework.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: HEALTH AND LIFE SCIENCES, PUBLIC SECTOR
Technologies: APACHE SPARK, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
AI is moving from pilots to production, but many organizations still struggle to connect boardroom ambitions with operational reality. Palantir’s Artificial Intelligence Platform (AIP) and the Databricks Data Intelligence Platform now form a single, open architecture that closes this gap by pairing Palantir’s decision-empowering operational Ontology with Databricks’ industry-leading scale, governance and Lakehouse economics. The result: real-time, AI-powered, autonomous workflows that are already powering mission-critical outcomes for the U.S. Department of Defense, bp and other joint customers across the public and private sectors. In this technically grounded but business-focused session you will see the new reference architecture in action. We will walk through how Unity Catalog and Palantir Virtual Tables provide governed, zero-copy access to Lakehouse data and back mission-critical operational workflows on top of Palantir’s semantic ontology and agentic AI capabilities. We will also explore how Palantir’s no-code and pro-code tooling integrates with Databricks compute to orchestrate builds and write tables to Unity Catalog. Come hear from customers currently using this architecture to drive critical business outcomes seamlessly across Databricks and Palantir.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, UNITY CATALOG, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Discover how to build and deploy AI-powered applications natively on the Databricks Data Intelligence Platform. This session introduces best practices and a standard reference architecture for developing production-ready apps using popular frameworks like Dash, Shiny, Gradio, Streamlit and Flask. Learn how to leverage agents for orchestration and explore primary use cases supported by Databricks Apps, including data visualization, AI applications, self-service analytics and data quality monitoring. With serverless deployment and built-in governance through Unity Catalog, Databricks Apps enables seamless integration with your data and AI models, allowing you to focus on delivering impactful solutions without the complexities of infrastructure management. Whether you're a data engineer or an app developer, this session will equip you with the knowledge to create secure, scalable and efficient applications within a Databricks environment.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: EDUCATION, ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: AI/BI
Skill Level: BEGINNER
Duration: 40 MIN
Many studies have indicated that having a strong Data & AI culture helps businesses be more successful. It can lead to better business performance, greater profitability and stronger competitiveness relative to peer companies, as well as attracting and retaining top talent. What does it mean to have a Data & AI culture? It’s an organization’s ability to make data-driven decisions. It means using insights to improve your business results, and using data ultimately allows you to enable AI. It tends to be the people that get in the way of having and sustaining an effective Data & AI culture. Do you have people already in your teams that can help you build your Data & AI culture? Can you attract and retain that talent in your organization? Can you help integrate that great talent into your organization to promote a Data & AI culture? It also means fundamentally changing the way you, your teams and your organization work.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, PUBLIC SECTOR
Technologies: DELTA LAKE, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
There are many challenges to making a data platform actually a platform, something that hides complexity. Data engineers and scientists are looking for a simple and intuitive abstraction to focus on their work, not where it runs to maintain compliance, what credentials it uses to access data or how it generates operational telemetry. At Databricks we’ve developed a data-centric approach to workload development and deployment that enables data workers to stop doing migrations and instead develop with confidence. Attend this session to learn how to run simple, secure and compliant global multi-cloud workloads at scale on Databricks.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Discover how Dodo Brands, a global pizza and coffee business with over 1,200 retail locations and 40k employees, revolutionized their analytics infrastructure by creating a self-service data platform. This session explores the approach to empowering analysts, data scientists and ML engineers to independently build analytical pipelines with minimal involvement from data engineers. By leveraging Databricks as the backbone of their platform, the team developed automated tools like a "job-generator" that uses Jinja templates to streamline the creation of data jobs. This approach minimized manual coding and enabled non-data engineers to create over 1,420 data jobs — 90% of which were auto-generated from user configurations — supporting thousands of weekly active users via tools like Apache Superset. This session provides actionable insights for organizations seeking to scale their analytics capabilities efficiently without expanding their data engineering teams.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES
Technologies: LLAMA
Skill Level: INTERMEDIATE
Duration: 40 MIN
Regulated or restricted fields like health care make collecting training data complicated. We all want to do the right thing, but how? This talk will look at how Fight Health Insurance used de-identified public and proprietary information to create a semi-synthetic training set for fine-tuning machine learning models to power Fight Paperwork. We'll explore how to incorporate the latest "reasoning" techniques in fine-tuning, as well as how to make models that you can afford to serve — think single-GPU inference instead of a cluster of A100s. In addition to the talk, we have the code used in a public GitHub repo — although it is a little rough, so you might want to use it more as a source of inspiration than fork it directly.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES
Technologies: MLFLOW, AI/BI, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Discover how Tahoe Therapeutics (formerly Vevo) is generating gigascale single-cell data that maps how drugs interact with cells from cancer patients. They are using that data to find better therapeutics and to build AI models on Databricks that can predict drug-patient interactions. Their technology enabled the landmark Tahoe-100M atlas, the world’s largest dataset of drug responses, profiling 100 million cells across 60,000 conditions. Learn how they use Databricks to process this massive data, enabling AI models that predict drug efficacy and resistance at the cellular level. Recognized as the Grand Prize Winner of the Databricks Generative AI Startup Challenge, Tahoe sets a new standard for scalable, data-driven drug discovery.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Ready to go beyond the basics of Mosaic AI? This session will walk you through how to architect and scale production-grade AI systems on the Databricks Data Intelligence Platform. We’ll cover practical techniques for building end-to-end AI pipelines — from processing structured and unstructured data to applying Mosaic AI tools and functions for model development, deployment and monitoring. You’ll learn how to integrate experiment tracking with MLflow, apply performance tuning and use built-in frameworks to manage the full AI lifecycle. By the end, you’ll be equipped to design, deploy and maintain AI systems that deliver measurable outcomes at enterprise scale.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. Explore how Anthropic's frontier models power AI agents in the Databricks Mosaic AI Agent Framework. Learn to leverage Claude's state-of-the-art capabilities for complex agentic workflows while benefiting from Databricks' unified governance, credential management and evaluation tools. We'll demonstrate how Anthropic's models integrate seamlessly to create production-ready applications that combine Claude's reasoning with Databricks' data intelligence capabilities.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, AI/BI, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. One of the biggest promises of LLM agents is automating all knowledge work over unstructured data — we call these "knowledge agents". To date, while there are fragmented tools around data connectors, storage and agent orchestration, AI engineers have trouble building and shipping production-grade agents beyond basic chatbots. In this session, we first outline the highest-value knowledge agent use cases we see being built and deployed at various enterprises, then define the core architectural components around knowledge management and agent orchestration required to build them. By the end you'll not only have an understanding of the core technical concepts, but also an appreciation of the ROI you can generate for end users by shipping these use cases to production.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: MEDIA AND ENTERTAINMENT
Technologies: APACHE SPARK, AI/BI, DLT
Skill Level: BEGINNER
Duration: 40 MIN
In the dynamic world of sports betting, precision and adaptability are key. Sports traders must navigate risk management, limitations of data feeds and much more to prevent small model miscalculations from causing significant losses. To ensure accurate real-time pricing of hundreds of interdependent markets, traders provide key inputs such as player skill-level adjustments, whilst maintaining precise correlations. Black-box models aren’t enough; constant feedback loops drive informed, accurate decisions. Join DraftKings as we showcase how we expose real-time metrics from our simulation engine to empower traders with deeper insights into how their inputs shape the model. Using Spark Structured Streaming, Kafka and Databricks dashboards, we transform raw simulation outputs into actionable data. This transparency into our engines enables fine-grained control over pricing, leading to more accurate odds, a more efficient sportsbook and an elevated customer experience.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: FINANCIAL SERVICES
Technologies: APACHE SPARK, DLT, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Barclays' Post Trade real-time trade monitoring platform was historically built on a complex set of legacy technologies including Java, Solace and custom microservices. This session will demonstrate how the power of Lakeflow Declarative Pipelines' new real-time mode, in conjunction with the foreach_batch_sink, can enable simple, cost-effective streaming pipelines that load high volumes of data into Databricks' new serverless OLTP database with very low latency. Once in the OLTP database, this data can be used to update real-time trading dashboards, securely hosted in Databricks Apps, with the latest stock trades, enabling better, more responsive decision-making and alerting. The session will walk through the architecture and demonstrate how simple it is to create and manage the pipelines and apps within the Databricks environment.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 20 MIN
Agentic AI is the next evolution in artificial intelligence, with the potential to revolutionize the industry. However, its potential is matched only by its risk: without high-quality, trustworthy data, agentic AI can be exponentially dangerous. Join Barr Moses, CEO and Co-Founder of Monte Carlo, to explore how to leverage Databricks' powerful platform to ensure your agentic AI initiatives are underpinned by reliable, high-quality data. Barr will share how data quality impacts agentic AI performance at every stage of the pipeline, strategies for implementing data observability to detect and resolve data issues in real time, best practices for building robust, error-resilient agentic AI models on Databricks, and real-world examples of businesses harnessing Databricks' scalability and Monte Carlo's observability to drive trustworthy AI outcomes. Learn how your organization can deliver more reliable agentic AI and turn the promise of autonomous intelligence into a strategic advantage. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: FINANCIAL SERVICES
Technologies: AI/BI, MOSAIC AI, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
This presentation explores how Databricks' Data Intelligence Platform supports the development and deployment of responsible AI in credit decisioning, ensuring fairness, transparency and regulatory compliance. Key areas include bias and fairness monitoring using Lakehouse Monitoring to track demographic metrics, with automated alerts for fairness thresholds. Transparency and explainability are enhanced through the Mosaic AI Agent Framework, SHAP values and LIME for feature-importance auditing. Regulatory alignment is achieved via Unity Catalog for data lineage and AI/BI dashboards for compliance monitoring. Additionally, LLM reliability and security are ensured through AI guardrails and synthetic datasets to validate model outputs and prevent discriminatory patterns. The platform integrates real-time SME and user feedback via Databricks Apps and AI/BI Genie Space.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
GenAI and machine learning are reshaping industries, driving innovation and redefining business strategies. As organizations embrace these technologies, they face significant challenges in managing AI initiatives effectively, such as balancing innovation with ethical integrity, operational resilience and regulatory compliance. This presentation introduces the Databricks AI Governance Framework (DAGF), a practical framework designed to empower organizations to navigate the complexities of AI. It provides strategies for building scalable, responsible AI programs that deliver measurable value, foster innovation and achieve long-term success. By examining the framework's five foundational pillars (AI organization; legal and regulatory compliance; ethics, transparency and interpretability; AI operations and infrastructure; and AI security), this session highlights how AI governance aligns programs with the organization's strategic goals, mitigates risks and builds trust across stakeholders.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Want to create AI agents that can do more than just generate text? Join us to explore how combining Databricks' Mosaic AI Agent Framework with the Model Context Protocol (MCP) unlocks powerful tool-calling capabilities. We'll show you how MCP provides a standardized way for AI agents to interact with external tools, data and APIs, solving the headache of fragmented integration approaches. Learn to build agents that can retrieve both structured and unstructured data, execute custom code and tackle real enterprise challenges. Whether you're building customer service bots or data analysis assistants, you'll leave with practical know-how to create powerful, governed AI agents.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: MLFLOW, LLAMA, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
No description available.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Unlock the truth behind data modeling in Databricks. This session will tackle the top 10 myths surrounding relational and dimensional data modeling. Attendees will gain a clear understanding of what Databricks Lakehouse truly supports today, including how to leverage primary and foreign keys, identity columns for surrogate keys, column-level data quality constraints and much more. This session will talk through the lens of medallion architecture, explaining how to implement data models across bronze, silver, and gold tables. Whether you’re migrating from a legacy warehouse or building new analytics solutions, you’ll leave equipped to fully leverage Databricks’ capabilities, and design scalable, high-performance data models for enterprise analytics.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, DATABRICKS WORKFLOWS, DELTA SHARING
Skill Level: INTERMEDIATE
Duration: 20 MIN
Addepar possesses an enormous private investment dataset, with 40% of the $7T in assets on the platform allocated to alternatives. Leveraging the Addepar Data Lakehouse (ADL), built on Databricks, we have built a scalable data pipeline that assesses millions of private fund investment cash flows and translates them into a private fund benchmarks data offering. Investors on the Addepar platform can leverage this data, seamlessly integrated with their portfolio investments, to obtain actionable investment insights. At a high level, this data offering consists of extensive data aggregation, filtering and construction logic that dynamically updates for clients through Databricks job workflows. This derived dataset has gone through several iterations with investment strategists and academics who leveraged Delta Shared tables. Irrespective of the data source, the data pipeline coalesces all relevant cash flow activity against a unique identifier before constructing the benchmarks.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: APACHE SPARK, LAKEFLOW, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Ingesting data from SaaS systems sounds straightforward—until you hit API limits, miss SLAs, or accidentally ingest PII. Sound familiar? In this talk, we’ll share how Databricks evolved from scrappy ingestion scripts to a unified, secure, and scalable ingestion platform. Along the way, we’ll highlight the hard lessons, the surprising pitfalls, and the tools that helped us level up. Whether you’re just starting to wrangle third-party data or looking to scale while handling governance and compliance, this session will help you think beyond pipelines and toward platform thinking.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: APACHE SPARK, DATABRICKS WORKFLOWS, DLT
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. Databricks Asset Bundles (DABs) provide a way to use the command line to deploy and run a set of Databricks assets, like notebooks, Python code, Lakeflow Declarative Pipelines and workflows. To automate deployments, you create a deployment pipeline that uses the power of DABs along with other validation steps to ensure high-quality deployments. In this session you will learn how to automate CI/CD processes for Databricks while following best practices to keep deployments easy to scale and maintain. After a brief explanation of why Databricks Asset Bundles are a good option for CI/CD, we will walk through a working project including advanced variables, target-specific overrides, linting, integration testing and automatic deployment upon code review approval. You will leave the session clear on how to build your first GitHub Action using DABs.
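A DAB project of the kind described above centers on a bundle definition checked into the repo. As a hedged illustration (the bundle name, paths, job and the workspace host are placeholders, not taken from the session), a minimal `databricks.yml` with a target-specific override might look like:

```yaml
# Hypothetical minimal bundle definition; all names and the host are placeholders.
bundle:
  name: my_pipeline

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://<your-workspace>.cloud.databricks.com
  prod:
    mode: production
    workspace:
      host: https://<your-workspace>.cloud.databricks.com

resources:
  jobs:
    nightly_job:
      name: nightly_job
      tasks:
        - task_key: run_etl
          notebook_task:
            notebook_path: ../src/etl.py
```

A CI pipeline can then run `databricks bundle validate` on every pull request and `databricks bundle deploy -t prod` once the review is approved, which is the automation pattern the session walks through.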
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
ClickHouse is a C++ based, column-oriented database built for real-time analytics. While it has its own internal storage format, the rise of open lakehouse architectures has created a growing need for seamless interoperability. In response, we have developed integrations with your favorite lakehouse ecosystem to enhance compatibility, performance and governance. From integrating with Unity Catalog to embedding the Delta Kernel into ClickHouse, this session will explore the key design considerations behind these integrations, their benefits to the community, the lessons learned and future opportunities for improved compatibility and seamless integration.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: RETAIL AND CPG - FOOD
Technologies: DELTA LAKE
Skill Level: BEGINNER
Duration: 40 MIN
As first-party data becomes increasingly invaluable to organizations, Walmart Data Ventures is dedicated to bringing to life new applications of Walmart’s first-party data to better serve its customers. Through Scintilla, its integrated insights ecosystem, Walmart Data Ventures continues to expand its offerings to deliver insights and analytics that drive collaboration between our merchants, suppliers and operators. Scintilla users can now access Walmart data using Cloud Feeds, based on Databricks Delta Sharing technology. In the past, Walmart used API-based data-sharing models, which required users to possess certain skills and technical capabilities that weren’t always available. Now, with Cloud Feeds, Scintilla users can more easily access data without a dedicated technical team behind the scenes making it happen. Attendees will gain valuable insights into how Walmart built its robust data-sharing architecture, along with strategies to design scalable and collaborative data-sharing architectures in their own organizations.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: FINANCIAL SERVICES
Technologies: DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: AI/BI, LLAMA
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. For most companies, building compound AI systems remains aspirational. LLMs are powerful but imperfect, and their non-deterministic nature makes steering them to high accuracy a challenge. In this session, we’ll demonstrate how to build compound AI systems using SLMs and highly accurate mini-agents that can be integrated into agentic workflows. You'll learn about breakthrough techniques, including memory RAG, an embedding algorithm that reduces hallucinations by using embed-time compute to generate contextual embeddings, improving indexing and retrieval, and memory tuning, a fine-tuning algorithm that reduces hallucinations by using a Mixture of Memory Experts (MoME) to specialize models with proprietary data. We’ll also share real-world examples (text-to-SQL, factual reasoning, function calling, code analysis and more) across various industries. With these building blocks, we’ll demonstrate how to create high-accuracy mini-agents that can be composed into larger AI systems.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Given that data is the new oil, it must be treated as such. Organizations that pursue greater insight into their businesses and their customers must manage, govern, protect and observe the use of the data that drives these insights in an efficient, cost-effective, compliant and auditable manner without degrading access to that data. Azure Data Lake Storage offers many features which allow customers to apply such controls and protections to their critical data assets. Understanding how these features behave, the granularity, cost and scale implications and the degree of control or protection that they apply are essential to implement a data lake that reflects the value contained within. In this session, the various data protection, governance and management capabilities available now and upcoming in ADLS will be discussed. This will include how deep integration with Azure Databricks can provide a more comprehensive, end-to-end coverage for these concerns, yielding a highly efficient and effective data governance solution.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. Databricks has a free, comprehensive solution for migrating legacy data warehouses from a wide range of source systems. See how we accelerate migrations from legacy data warehouses to Databricks SQL, achieving 50% faster migration than traditional methods. We'll cover the tool’s automated migration process, a comprehensive approach that increases the predictability of migration projects, allowing businesses to plan and execute migrations with greater confidence.
Type: DEEP DIVE
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 90 MIN
This in-depth session explores advanced MLOps practices for implementing production-grade machine learning workflows on Databricks. We'll examine the complete MLOps journey from foundational principles to sophisticated implementation patterns, covering essential tools including MLflow, Unity Catalog, Feature Stores and version control with Git. Dive into Databricks' latest MLOps capabilities, including MLflow 3.0, which enhances the entire ML lifecycle from development to deployment with a particular focus on generative AI applications.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this session we’ll dive into the SQL kitchen and use a combination of SQL staples and nouvelle cuisine such as recursive queries, temporary tables, and stored procedures. We’ll leave you with well-scripted recipes to execute immediately or store for later consumption in your Unity Catalog. Think of this session as building your go-to cookbook of SQL techniques. Bon appétit!
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
In this session you'll learn how to onboard to Databricks in a way that ensures you can effectively measure and manage the ROI of using Databricks down the line. We will show you how to set up workspaces and compute, decide on a tagging strategy and utilize policies to enforce best practices and make future you a happy camper.
Type: LIGHTNING TALK
Track: DATA WAREHOUSING
Industry: MEDIA AND ENTERTAINMENT
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
This talk covers FunPlus's journey to building a cost-effective and efficient data platform with Databricks: exploring how FunPlus leveraged Databricks to tackle key challenges and enhance data engineering and ML efficiency, and showcasing best practices and their impact on game development and operations.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: FINANCIAL SERVICES
Technologies: AI/BI, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
In this session, we will share how we are transforming the way organizations process unstructured and non-standard documents using Mosaic AI and agentic patterns within the Databricks ecosystem. We have developed a scalable pipeline that turns complex legal and regulatory content into structured, tabular data. We will walk through the full architecture, which includes Unity Catalog for secure and governed data access, Databricks Vector Search for intelligent indexing and retrieval and Databricks Apps to deliver clear insights to business users. The solution supports multiple languages and formats, making it suitable for teams working across different regions. We will also discuss some of the key technical challenges we addressed, including handling parsing inconsistencies, grounding model responses and ensuring traceability across the entire process. If you are exploring how to apply GenAI and large language models, this session is for you. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
At Haleon, we've leveraged Databricks APIs and serverless compute to develop customer-facing applications for our business. This innovative solution enables us to efficiently deliver SAP invoice and order management data through front-end applications developed and served via our API Gateway. The Databricks lakehouse architecture has been instrumental in eliminating the friction associated with directly accessing SAP data from operational systems, while enhancing our performance capabilities. Our system achieved response times of less than 3 seconds from API call, with ongoing efforts to optimise this performance. This architecture not only streamlines our data and application ecosystem but also paves the way for integrating GenAI capabilities with robust governance measures for our future infrastructure. The implementation of this solution has yielded significant benefits, including a 15% reduction in customer service costs and a 28% increase in productivity for our customer support team.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, TRAVEL AND HOSPITALITY
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS SQL
Skill Level: ADVANCED
Duration: 20 MIN
PySpark supports many data sources out of the box, such as Apache Kafka, JDBC, ODBC and Delta Lake. However, some older systems, such as those that use the JMS protocol, are not supported by default and require considerable extra work for developers to read from. One such example is ActiveMQ for streaming. Traditionally, users of ActiveMQ have had to use a middleman to read the stream with Spark (for example, writing to a MySQL database using Java code and reading that table with Spark JDBC). With PySpark 4.0’s custom data sources (supported in DBR 15.3+), we can cut out the middleman and consume the queues directly from PySpark, in batch or with Spark Streaming, saving developers considerable time and complexity in getting source data into Delta Lake, governed by Unity Catalog and orchestrated with Databricks Workflows.
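To illustrate the shape of the custom data source API the talk relies on, here is a structural sketch in plain Python: the class mirrors the partitions()/read() contract of PySpark 4.0's DataSourceReader without requiring a Spark installation, and the queue names and message bodies are hypothetical stand-ins for JMS messages.

```python
# Structural sketch only: in real code you would subclass
# pyspark.sql.datasource.DataSource / DataSourceReader and register the
# source with spark.dataSource.register(). This plain-Python mock shows
# the same partitions() + per-partition read() shape.
class ActiveMQReaderSketch:
    """Mimics a DataSourceReader: partitions() plus a per-partition read()."""

    def __init__(self, queues):
        self.queues = queues  # queue name -> list of pending messages

    def partitions(self):
        # One logical partition per queue, so Spark could read them in parallel.
        return list(self.queues)

    def read(self, partition):
        # Yield rows as tuples, matching the read(partition) contract.
        for offset, body in enumerate(self.queues[partition]):
            yield (partition, offset, body)

reader = ActiveMQReaderSketch({"orders": ["msg-a", "msg-b"],
                               "refunds": ["msg-c"]})
rows = [row for p in reader.partitions() for row in reader.read(p)]
print(rows)
```

In the real API, Spark calls partitions() on the driver and read(partition) on the executors, so each queue (or queue segment) can be consumed in parallel and land directly in a Delta table.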
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. Measuring the effectiveness of domain-specific AI agents requires specialized evaluation frameworks that go beyond standard LLM benchmarks. This session explores methodologies for assessing agent quality across specialized knowledge domains, tailored workflows and task-specific objectives. We'll demonstrate practical approaches to designing robust LLM judges that align with your business goals and provide meaningful insights into agent capabilities and limitations. Join us to learn how proper evaluation methodologies can transform your domain-specific agents from experimental tools into trusted enterprise solutions with measurable business value.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: MANUFACTURING
Technologies: DELTA LAKE, APACHE ICEBERG, DELTA SHARING
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this presentation, we'll show how we achieved a unified development experience for teams working on Mercedes-Benz Data Platforms in AWS and Azure. We will demonstrate how we implemented Azure to AWS and AWS to Azure data product sharing (using Delta Sharing and Cloud Tokens), integration with AWS Glue Iceberg tables through UniForm and automation to drive everything using Azure DevOps Pipelines and DABs. We will also show how to monitor and track cloud egress costs and how we present a consolidated view of all the data products and relevant cost information. The end goal is to show how customers can offer the same user experience to their engineers and not have to worry about which cloud or region the Data Product lives in. Instead, they can enroll in the data product through self-service and have it available to them in minutes, regardless of where it originates.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: MLFLOW, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
AI for enterprises, particularly in the era of GenAI, requires rapid experimentation and the ability to productionize models and agents quickly and at scale. Compliance, resilience and commercial flexibility drive the need to serve models across regions. As cloud providers struggle with rising demand for GPUs, VM shortages have become commonplace, adding to the pressure of general cloud outages. Enterprises that can quickly leverage GPU capacity in other cloud regions will be better equipped to capitalize on the promise of AI, while staying flexible enough to serve distinct user bases and comply with regulations. In this presentation we will show and discuss how to implement AI deployments across cloud regions, deploying a model across regions and using a load balancer to determine where best to route a user request.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
In today’s fast-evolving crypto landscape, organizations require fast, reliable intelligence to manage risk, investigate financial crime and stay ahead of evolving threats. In this session, discover how Elliptic built a scalable, high-performance data intelligence platform that delivers real-time, actionable blockchain insights to their customers. We’ll walk you through some of the key components of the Elliptic Platform, including the Elliptic Entity Graph and our user-facing analytics. Our focus will be on the evolution of our user-facing analytics capabilities, and specifically how components from the Databricks ecosystem such as Structured Streaming, Delta Lake and SQL Warehouse have played a vital role. We’ll also share some of the optimizations we’ve made to our streaming jobs to maximize performance and ensure data completeness. Whether you’re looking to enhance your streaming capabilities, expand your knowledge of how crypto analytics works or simply discover novel approaches to data processing at scale, this session will provide concrete strategies and valuable lessons learned.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENERGY AND UTILITIES, PUBLIC SECTOR, TRAVEL AND HOSPITALITY
Technologies: DATABRICKS SQL, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
As Databricks transforms data processing, analytics and machine learning, managing platform costs has become crucial for organizations aiming to maximize value while staying within budget. While Databricks offers unmatched scalability and performance, inefficient usage can lead to unexpected cost overruns. This presentation will explore common challenges organizations face in controlling Databricks costs and provide actionable best practices for optimizing resource allocation, preventing over-provisioning and eliminating underutilization. Drawing from NTT DATA’s experience, I'll share how we reduced Databricks costs by up to 50% through strategies like choosing the right compute resources, leveraging managed tables and using Unity Catalog features, such as system tables, to monitor consumption. Join this session to gain practical insights and tools that will empower your team to optimize Databricks without overspending.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Modern data organizations have moved beyond big data analytics to also incorporate advanced AI/ML data workloads. These workflows often involve multimodal datasets containing documents, images, long-form text, embeddings, URLs and more. Unity Catalog is an ideal solution for organizing and governing this data at scale. When paired with the Daft open source data engine, you can build a truly multimodal, AI-ready data lakehouse. In this session, we’ll explore how Daft integrates with Unity Catalog’s core features (such as volumes and functions) to enable efficient, AI-driven data lakehouses. You will learn how to ingest and process multimodal data (images, text and videos), run AI/ML transformations and feature extractions at scale, and maintain full control and visibility over your data with Unity Catalog’s fine-grained governance.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS WORKFLOWS, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 60 MIN
Join cybersecurity leaders from SAP, Anvilogic, Capital One, Wiz, and Databricks to explore how modern data intelligence is transforming security operations. Discover how SAP adopted a modular, AI-powered detection engineering lifecycle using Anvilogic on Databricks. Learn how Capital One built a detection and correlation engine leveraging Delta Lake, Apache Spark Streaming, and Databricks to process millions of cybersecurity events per second. Finally, see how Wiz and Databricks’ partnership enhances cloud security with seamless threat visibility. Through expert insights and live demos, gain strategies to build scalable, efficient cybersecurity powered by data and AI.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: MEDIA AND ENTERTAINMENT
Technologies: MLFLOW, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This talk dives into leveraging GenAI to scale sophisticated decision intelligence. Learn how an AI copilot interface simplifies running complex Bayesian probabilistic models, accelerating insight generation and enabling accurate decision-making at the enterprise level. We'll talk through techniques for deploying AI agents at scale to simulate market dynamics or product feature impacts, providing robust, data-driven foresight for high-stakes innovation and strategy directly within your Databricks environment. For marketing teams, this approach will help you leverage autonomous AI agents to dynamically manage media channel allocation while simulating real-world consumer behavior through synthetic testing environments.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: APACHE SPARK, DATABRICKS SQL, DLT
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. In today’s data-driven world, the Data Lakehouse has emerged as a powerful architectural paradigm that unifies the flexibility of data lakes with the reliability and structure of traditional data warehouses. However, organizations must adopt the right data modeling techniques to unlock its full potential and ensure scalability, maintainability and efficiency. This session is designed for beginners looking to demystify the complexities of data modeling for the lakehouse and make informed design decisions. We’ll break down Medallion Architecture, explore key data modeling techniques and walk through the maturity stages of a successful data platform — transitioning from raw, unstructured data to well-organized, query-efficient models.
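The bronze/silver/gold progression the session breaks down can be sketched in plain Python, with lists standing in for Delta tables. The records and cleaning rules below are invented for illustration; only the layering convention comes from the Medallion Architecture itself:

```python
# Conceptual Medallion sketch: raw data lands in bronze, is cleaned and typed
# into silver, and is aggregated into query-efficient gold tables.

# Bronze: raw, as-ingested records (duplicates and bad rows included).
bronze = [
    {"order_id": 1, "amount": "19.99", "country": "us"},
    {"order_id": 1, "amount": "19.99", "country": "us"},   # duplicate
    {"order_id": 2, "amount": None,    "country": "DE"},   # unusable record
    {"order_id": 3, "amount": "5.00",  "country": "de"},
]

# Silver: deduplicated, typed and standardized.
seen, silver = set(), []
for row in bronze:
    if row["order_id"] in seen or row["amount"] is None:
        continue
    seen.add(row["order_id"])
    silver.append({"order_id": row["order_id"],
                   "amount": float(row["amount"]),
                   "country": row["country"].upper()})

# Gold: a small aggregate shaped for BI consumers (revenue per country).
gold = {}
for row in silver:
    gold[row["country"]] = gold.get(row["country"], 0.0) + row["amount"]
```

The point of the layering is that each stage has a single, auditable responsibility, so downstream consumers never re-implement cleaning logic.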
Type: BREAKOUT
Track: DATA STRATEGY
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS WORKFLOWS, LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 40 MIN
Lakeflow Jobs is the production-ready fully managed orchestrator for the entire Lakehouse with 99.95% uptime. Join us for a dive into how you can orchestrate your enterprise data operations, from triggering your jobs only when your data is ready to advanced control flow with conditionals, looping and job modularity — with demos! Attendees will gain practical insights into optimizing their data operations by orchestrating with Lakeflow Jobs.
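The control-flow concepts named above (conditionals, looping, modularity) can be illustrated in plain Python. Note this is a conceptual sketch only: Lakeflow Jobs expresses these as task types in a declarative job definition, not as Python code, and the task names here are invented:

```python
# Conceptual sketch of orchestration control flow: modular tasks, a "for each"
# loop over inputs, and a conditional gating downstream work.

def ingest(source):
    """Modular task: pull a batch for one source (stubbed)."""
    return {"source": source, "rows": 100}

def validate(batch):
    """Condition that decides whether downstream tasks run."""
    return batch["rows"] > 0

def transform(batch):
    """Downstream task, only reached when validation passes."""
    return {**batch, "rows": batch["rows"] * 2}

run_log = []
for source in ["orders", "customers"]:      # "for each" style looping
    batch = ingest(source)
    if validate(batch):                     # conditional branch
        batch = transform(batch)
        run_log.append((source, "ok", batch["rows"]))
    else:
        run_log.append((source, "skipped", 0))
```

In a managed orchestrator the same shape buys you retries, observability and data-ready triggers on top of the branching logic.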
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Unity Catalog support for Apache Iceberg™ brings open, interoperable table formats to the heart of the Databricks Lakehouse. In this session, we’ll introduce new capabilities that allow you to write Iceberg tables from any REST-compatible engine, apply fine-grained governance across all data, and unify access to external Iceberg catalogs like AWS Glue, Hive Metastore, and Snowflake Horizon. Learn how Databricks is eliminating data silos, simplifying performance with Predictive Optimization, and advancing a truly open lakehouse architecture with Delta and Iceberg side by side.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: DATABRICKS APPS
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. In this session, we present an overview of the GA release of Databricks Apps, the new app hosting platform that integrates all the Databricks services necessary to build production-ready data and AI applications. With Apps, data and developer teams can build new interfaces into the data intelligence platform, further democratizing the transformative power of data and AI across the organization. We'll cover common use cases, including RAG chat apps, interactive visualizations and custom workflow builders, as well as look at several best practices and design patterns when building apps. Finally, we'll look ahead with the vision, strategy and roadmap for the year ahead.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE ICEBERG, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
As machine learning (ML) models scale in complexity and impact, organizations must establish a robust MLOps foundation to ensure seamless model deployment, monitoring and retraining. In this session, we’ll share how we leverage Databricks as the backbone of our MLOps ecosystem — handling everything from workflow orchestration to large-scale inference. We’ll walk through our journey of transitioning from fragmented workflows to an integrated, scalable system powered by Databricks Workflows. You’ll learn how we built an automated pipeline that streamlines model development, inference and monitoring while ensuring reliability in production. We’ll also discuss key challenges we faced, lessons learned and best practices for organizations looking to operationalize ML with Databricks.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
Erste Group's transition to Azure Databricks marked a significant upgrade from a legacy system to a secure, scalable and cost-effective cloud platform. The initial architecture, characterized by a complex hub-spoke design and stringent compliance regulations, was replaced with a more efficient solution. The phased migration addressed high network costs and operational inefficiencies, resulting in a 60% reduction in networking costs and a 30% reduction in compute costs for the central team. This transformation, completed over a year, now supports real-time analytics, advanced machine learning and GenAI while ensuring compliance with European regulations. The new platform features Unity Catalog, separate data catalogs and dedicated workspaces, demonstrating a successful shift to a cloud-based machine learning environment with significant improvements in cost, performance and security.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: HEALTH AND LIFE SCIENCES, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: DATABRICKS WORKFLOWS, DLT, LAKEFLOW
Skill Level: BEGINNER
Duration: 40 MIN
Every analytics, BI and AI project relies on high-quality data. This is why data engineering, the practice of building reliable data pipelines that ingest and transform data, is consequential to the success of these projects. In this session, we'll show how you can use Lakeflow to accelerate innovation in multiple parts of the organization. We'll review real-world examples of Databricks customers using Lakeflow in different industries such as automotive, healthcare and retail. We'll touch on how the foundational data engineering capabilities Lakeflow provides help power initiatives that improve customer experiences, make real-time decisions and drive business results.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, LAKEFLOW, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This presentation outlines the evolution of our marketing data strategy, focusing on how we’ve built a strong foundation using the Databricks Lakehouse. We will explore key advancements across data ingestion, strategy, and insights, highlighting the transition from legacy systems to a more scalable and intelligent infrastructure. Through real-world applications, we will showcase how unified Customer 360 insights drive personalization, predictive analytics enhance campaign effectiveness, and GenAI optimizes content creation and marketing execution. Looking ahead, we will demonstrate the next phase of our CDP, the shift toward an end-user-first analytics model powered by AI/BI, Genie and Matik, and the growing importance of clean rooms for secure data collaboration. This is just the beginning, and we are poised to unlock even greater capabilities in the future.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD
Technologies: MLFLOW, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Databricks is transforming its sales experience with a GenAI agent — built and deployed entirely on Databricks — to automate tasks, streamline data retrieval, summarize content, and enable conversational AI for over 4,000 sellers. This agent leverages the AgentEval framework, AI Bricks, and Model Serving to process both structured and unstructured data within Databricks, unlocking deep sales insights. The agent seamlessly integrates across multiple data sources including Salesforce, Google Drive, and Glean securely via OAuth. This session includes a live demonstration and explores the business impact, architecture as well as agent development and evaluation strategies, providing a blueprint for deploying secure, scalable GenAI agents in large enterprises.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
You shouldn’t have to sacrifice data governance just to leverage the tools your business needs. In this session, we will give practical tips on how you can cut through the data sprawl and get a unified view of your data estate in Unity Catalog without disrupting existing workloads. We will walk through how to set up federation with Glue, Hive Metastore, and other catalogs like Snowflake, and show you how powerful new tools help you adopt Databricks at your own pace with no downtime and full interoperability.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Databricks is the bestest platform ever where everything is perfect and nothing else could ever make it any better, right? …right? You and I both know this is not true. Don’t get me wrong, there are features that I absolutely love, but there are also some that require powering through the papercuts. And then there are those that I pretend don’t exist. I’ll be opening up to give my honest take on three of each category, why I do (or don’t) like them, and then telling you which talks to attend to find out more.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: DELTA LAKE
Skill Level: INTERMEDIATE
Duration: 20 MIN
Quantum Capital Group (QCG) screens hundreds of deals across the global Sustainable Energy Ecosystem, requiring deep technical due diligence. With over 1.5 billion records sourced from public, premium and proprietary datasets, their challenge was how to efficiently curate, analyze and share this data to drive smarter investment decisions. QCG partnered with Databricks & Tiger Analytics to modernize its data landscape. Using Delta tables, Spark SQL, and Unity Catalog, the team built a golden dataset that powers proprietary evaluation models and automates complex workflows. Data is now seamlessly curated, enriched and distributed — both internally and to external stakeholders — in a secure, governed and scalable way. This session explores how QCG’s investment in data intelligence has turned an overwhelming volume of information into a competitive advantage, transforming deal evaluation into a faster, more strategic process.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: RETAIL AND CPG - FOOD
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
No description available.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: EDUCATION, PROFESSIONAL SERVICES, PUBLIC SECTOR
Technologies: APACHE SPARK
Skill Level: BEGINNER
Duration: 20 MIN
Join us for an insightful Ask Me Anything (AMA) session on Declarative Pipelines — a powerful approach to simplify and optimize data workflows. Learn how to define data transformations using high-level, SQL-like semantics, reducing boilerplate code while improving performance and maintainability. Whether you're building ETL processes, feature engineering pipelines, or analytical workflows, this session will cover best practices, real-world use cases and how Declarative Pipelines can streamline your data applications. Bring your questions and discover how to make your data processing more intuitive and efficient!
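The core idea behind declarative pipelines (you declare what each dataset is, and the framework works out execution order and plumbing) can be sketched with a tiny pure-Python registry. The decorator and engine below are invented and only loosely mirror real declarative-pipeline APIs:

```python
# Minimal sketch of the declarative-pipeline idea: datasets are declared as
# named transformations over upstream datasets; a tiny "engine" resolves the
# dependency order instead of the author wiring it by hand.

_definitions = {}

def table(fn):
    """Register a dataset definition; the function name becomes the table name."""
    _definitions[fn.__name__] = fn
    return fn

def materialize(name, built=None):
    """Resolve a dataset, recursively materializing its declared upstreams."""
    built = {} if built is None else built
    if name not in built:
        built[name] = _definitions[name](lambda dep: materialize(dep, built))
    return built[name]

@table
def raw_events(read):
    # In a real pipeline this would be an ingestion source.
    return [{"user": "a", "clicks": 3}, {"user": "b", "clicks": 0}]

@table
def active_users(read):
    # Declared purely in terms of its upstream; no orchestration code needed.
    return [r["user"] for r in read("raw_events") if r["clicks"] > 0]
```

Calling `materialize("active_users")` pulls in `raw_events` automatically, which is the boilerplate reduction the AMA is about: authors declare transformations, the framework owns ordering, retries and incremental execution.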
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, DLT
Skill Level: BEGINNER
Duration: 40 MIN
Lakeflow Declarative Pipelines has made it dramatically easier to build production-grade Spark pipelines, using a framework that abstracts away orchestration and complexity. It’s become a go-to solution for teams who want reliable, maintainable pipelines without reinventing the wheel. But we’re just getting started. In this session, we’ll take a step back and share a broader vision for the future of Spark Declarative Pipelines — one that opens the door to a new level of openness, standardization and community momentum. We’ll cover the core concepts behind Declarative Pipelines, where the architecture is headed, and what this shift means for both existing Lakeflow users and Spark engineers building procedural code. Don’t miss this session — we’ll be sharing something new that sets the direction for what comes next.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT, FINANCIAL SERVICES
Technologies: MLFLOW, AI/BI, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
No description available.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. Get the most out of your AI/BI Dashboards by scaling them across your entire organization. This session covers best practices for automating report distribution, embedding dashboards in external applications, and ensuring secure access across all surfaces. You'll walk away with practical strategies for delivering insights to the right people at the right time—empowering decision-makers at every level with the data they need to drive impactful outcomes.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT, FINANCIAL SERVICES
Technologies: APACHE SPARK, DLT
Skill Level: INTERMEDIATE
Duration: 40 MIN
As enterprise streaming adoption accelerates, more teams are turning to real-time processing to support operational workloads that require sub-second response times. To address this need, Databricks introduced Project Lightspeed in 2022, which recently delivered Real-Time Mode in Apache Spark™ Structured Streaming. This new mode achieves consistent p99 latencies under 300ms for a wide range of stateless and stateful streaming queries. In this session, we’ll define what constitutes an operational use case, outline typical latency requirements and walk through how to meet those SLAs using Real-Time Mode in Structured Streaming.
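The "p99 latencies under 300ms" claim is a statement about the 99th percentile of per-record latencies, not the maximum. A short sketch over synthetic numbers (the latency values are invented) shows why the distinction matters when writing an SLA:

```python
# What a p99 SLA measures: 99% of records finish at or below this latency,
# so a rare straggler does not by itself break the SLA.
import math

def p99(latencies_ms):
    """Nearest-rank 99th percentile of a list of latencies."""
    ranked = sorted(latencies_ms)
    idx = math.ceil(0.99 * len(ranked)) - 1
    return ranked[idx]

# 100 synthetic record latencies: mostly fast, with one 500ms straggler.
samples = [50.0] * 90 + [120.0] * 9 + [500.0]
```

Here `p99(samples)` is 120ms, comfortably inside a 300ms SLA, even though the worst single record took 500ms; operational use cases are usually specified this way precisely because tail maxima are noisy.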
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE
Skill Level: INTERMEDIATE
Duration: 20 MIN
The Delta Lake architecture promises to provide a single, highly functional, and high-scale copy of data that can be leveraged by a variety of tools to satisfy a broad range of use cases. To date, most use cases have focused on interactive data warehousing, ETL, model training, and streaming. Real-time access is generally delegated to costly and sometimes difficult-to-scale NoSQL, indexed storage, and domain-specific specialty solutions, which provide limited functionality compared to Spark on Delta Lake. In this session, we will explore the Delta data-skipping and optimization model and discuss how Capital One leveraged it along with Databricks Photon and Spark Connect to implement a real-time web application backend. We’ll share how we built a highly functional and performant security information and event management user experience (SIEM UX) that is cost effective.
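The data-skipping model the session explores rests on a simple mechanism: each data file carries min/max column statistics in the table's transaction log, so a point lookup can prune files whose range cannot contain the key. A simplified pure-Python sketch (file layout and statistics invented for illustration):

```python
# Simplified Delta-style data skipping: prune files using per-file min/max
# statistics so a point query reads one file instead of scanning the table.

files = [
    {"path": "part-000", "min_id": 0,    "max_id": 999},
    {"path": "part-001", "min_id": 1000, "max_id": 1999},
    {"path": "part-002", "min_id": 2000, "max_id": 2999},
]

def files_to_scan(files, key):
    """Keep only files whose [min, max] range could contain the lookup key."""
    return [f["path"] for f in files if f["min_id"] <= key <= f["max_id"]]

hit = files_to_scan(files, 1500)
```

When data is well clustered on the lookup column, pruning like this is what lets a lakehouse table back a latency-sensitive application without a separate NoSQL copy.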
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE
Skill Level: BEGINNER
Duration: 40 MIN
Delta Kernel makes it easy for engines and connectors to read and write Delta tables. It supports many Delta features and robust connectors, including DuckDB, ClickHouse, Spice AI and delta-dotnet. In this session, we'll cover lessons learned about how to build a high-performance library that lets engines integrate the way they want, while not having to worry about the details of the Delta protocol. We'll talk through how we streamlined the API as well as its changes and underlying motivations. We'll discuss some new highlight features like write support and the ability to do CDF scans. Finally, we'll cover the future roadmap for the Kernel project and what you can expect from the project over the coming year.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA LAKE
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this presentation, we’ll dive into the power of Liquid Clustering—an innovative, out-of-the-box solution that automatically tunes your data layout to scale effortlessly with your datasets. You’ll get a deep look at how Liquid Clustering works, along with real-world examples of customers leveraging it to unlock blazing-fast query performance on petabyte-scale datasets. We’ll also give you an exciting sneak peek into the roadmap ahead, with upcoming features and enhancements to come.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, PUBLIC SECTOR
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Delta Lake has proven to be an excellent storage format. Coupled with the Databricks platform, the storage format has shined as a component of a distributed system on the lakehouse. The pairing of Delta and Spark provides an excellent platform, but users often struggle to perform comparable work outside of the Spark ecosystem. Tools such as delta-rs, Polars and DuckDB have brought access to users outside of Spark, but they are only building blocks of a larger system. In this 40-minute talk we will demonstrate how users can use data products on the Nextdata OS data mesh to interact with the Databricks platform to drive Delta Lake workflows. Additionally, we will show how users can build autonomous data products that interact with their Delta tables both inside and outside of the lakehouse platform. Attendees will learn how to integrate the Nextdata OS data mesh with the Databricks platform as both an external and integral component.
Type: LIGHTNING TALK
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Data sharing doesn’t have to be complicated. In this session, we’ll take a practical look at Delta Sharing in Databricks — what it is, how it works and how it fits into your organization’s data ecosystem. The focus will be on giving an overview of the different ways to share data using Databricks, from direct sharing setups to broader distribution via the Databricks Marketplace and more collaborative approaches like Clean Rooms. This talk is meant for anyone curious about modern, secure data sharing — whether you're just getting started or looking to expand your use of Databricks. Attendees will walk away with a clearer picture of what’s possible, what’s required to get started and how to choose the right sharing method for the right scenario.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Delta Sharing is revolutionizing how enterprises share live data and AI assets securely, openly and at scale. As the industry’s first open data-sharing protocol, it empowers organizations to collaborate seamlessly across platforms and with any partner, whether inside or outside the Databricks ecosystem. In this deep-dive session, you’ll learn best practices and real-world use cases that show how Delta Sharing helps accelerate collaboration and fuel AI-driven innovation. We’ll also unveil the latest advancements. Whether you’re a data engineer, architect, or data leader, you’ll leave with practical strategies to future-proof your data-sharing architecture. Don’t miss the live demos, expert guidance and an exclusive look at what’s next in data collaboration.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us as we introduce Delta-Kernel-RS, a new Rust implementation of the Delta Lake protocol designed for unparalleled interoperability across query engines. In this session, we will explore how maintaining a native implementation of the Delta specification — with native C and C++ FFI support — can deliver consistent benefits across diverse data processing systems, eliminating the need for repetitive, engine-specific reimplementations. We will dive deep into a real-world case study where a query engine harnessed Delta-Kernel-RS to unlock significant data skipping improvements — enhancements achieved “for free” by leveraging the kernel. Attendees will gain insights into the architectural decisions, interoperability strategies and the practical impact of this innovation on performance and development efficiency in modern data ecosystems.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: DELTA LAKE, APACHE ICEBERG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Five years ago, the delta-rs project embarked on a journey to bring Delta Lake's robust capabilities to the Rust & Python ecosystem. In this talk, we'll delve into the triumphs, tribulations and lessons learned along the way. We'll explore how delta-rs has matured alongside the thriving Rust data ecosystem, adapting to its evolving landscape and overcoming the challenges of maintaining a complex data project. Join us as we share insights into the project's evolution, the symbiotic relationship between delta-rs and the Rust community, and the current hurdles and future directions that lie ahead. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: MEDIA AND ENTERTAINMENT
Technologies: AI/BI, DATABRICKS SQL, MOSAIC AI
Skill Level: BEGINNER
Duration: 40 MIN
Ludia, a leading mobile gaming company, is empowering its analysts and domain experts by democratizing data engineering with Databricks and dbt. This talk explores how Ludia enabled cross-functional teams to build and maintain production-grade data pipelines without relying solely on centralized data engineering resources—accelerating time to insight, improving data reliability, and fostering a culture of data ownership across the organization.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join our 2024 Databricks Disruptor award winners for a session on how they leveraged the Databricks and AWS platforms to build an internal technology marketplace in the highly regulated banking industry, empowering end users to innovate and own their data sets while maintaining strict compliance. In this talk, leaders from the J.P. Morgan Payments Data team share how they’ve done it — from keeping customer needs at the center of all decision-making to promoting a culture of experimentation. They’ll also expand upon how the J.P. Morgan Payments products team now leverages the data platform they’ve built to create customer products including Cash Flow Intelligence.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
Databricks Unity Catalog (UC) is the industry’s only unified and open governance solution for data and AI, built into the Databricks Data Intelligence Platform. UC provides a single source of truth for an organization’s data and AI, offering open connectivity to any data source and any format, along with lineage, monitoring and support for open sharing and collaboration. In this session we will discuss the challenges of upgrading to UC from an existing non-UC Databricks setup. We will walk through a few customer use cases, how we overcame difficulties, and how we created a repeatable pattern and reusable assets to replicate the success of upgrading to UC across some of the largest Databricks customers. This session is co-presented with our partner Celebal Technologies.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, DLT, LAKEFLOW
Skill Level: ADVANCED
Duration: 40 MIN
This session is repeated. Managing data and AI workloads in Databricks can be complex. Databricks Asset Bundles (DABs) simplify this by enabling declarative, Git-driven deployment workflows for notebooks, jobs, Lakeflow Declarative Pipelines, dashboards, ML models and more. Join the DABs team for a deep dive and learn about: the basics of Databricks Asset Bundles; how to declare, define and deploy assets, follow best practices, use templates and manage dependencies; CI/CD and governance, including automating deployments with GitHub Actions/Azure DevOps, managing dev vs. prod differences and ensuring reproducibility; and what’s new and what's coming up, including AI/BI Dashboard support, Databricks Apps support, a Pythonic interface and workspace-based deployment. If you're a data engineer, ML practitioner or platform architect, this talk will provide practical insights to improve reliability, efficiency and compliance in your Databricks workflows.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
No description available.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: RETAIL AND CPG - FOOD
Technologies: APACHE SPARK, AI/BI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
“Once an idea has taken hold of the brain it's almost impossible to eradicate. An idea that is fully formed — fully understood — that sticks, right in there somewhere.” The Data Scientists and Engineers at 84.51˚ utilize the Databricks Lakehouse for a wide array of tasks, including data exploration, analysis, machine learning operations, orchestration, automated deployments and collaboration. In this talk, 84.51˚’s Data Science Learning Lead, Michael Carrico, will share their approach to upskilling a diverse workforce to support the company’s strategic initiatives. This approach includes creating tailored learning experiences for a variety of personas using content curated in partnership with Databricks’ educational offerings. Then he will demonstrate how he puts his 11 years of data science and engineering experience to work by using the Databricks Lakehouse not just as a subject, but also as a tool to create impactful training experiences and a learning culture at 84.51˚.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT
Technologies: APACHE SPARK, DELTA LAKE, UNITY CATALOG
Skill Level: ADVANCED
Duration: 20 MIN
Step into the world of Disney Streaming as we unveil the creation of our Foundational Medallion, a cornerstone in our architecture that redefines how we manage data at scale. In this session, we'll explore how we tackled the multi-faceted challenges of building a consistent, self-service surrogate key architecture — a foundational dataset for every ingested stream powering Disney Streaming's data-driven decisions. Learn how we streamlined our architecture and unlocked new efficiencies by leveraging cutting-edge Databricks features such as liquid clustering, Photon with dynamic file pruning, Delta's identity column, Unity Catalog and more — transforming our implementation into a simpler, more scalable solution. Join us on this thrilling journey as we navigate the twists and turns of designing and implementing a new Medallion at scale — the very heartbeat of our streaming business!
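One common way to get the "consistent, self-service surrogate key" property the session describes is to derive keys deterministically from the natural key with a stable hash, so every pipeline computes the same value without coordination. This is a hedged sketch of that general technique, not Disney Streaming's actual design; note that Delta's identity columns, also mentioned above, instead assign monotonically increasing integers at write time:

```python
# Deterministic surrogate keys: a stable 64-bit key derived from the natural
# key, reproducible across runs and pipelines with no central key service.
import hashlib

def surrogate_key(*natural_key_parts):
    """Hash the natural key into a stable 64-bit integer surrogate key."""
    joined = "||".join(str(p) for p in natural_key_parts)
    digest = hashlib.sha256(joined.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

k1 = surrogate_key("account-42", "2025-06-01")
k2 = surrogate_key("account-42", "2025-06-01")   # same inputs, same key
```

The trade-off between the two approaches is coordination versus collision risk: hashed keys need no shared state but are only probabilistically unique, while identity columns guarantee uniqueness but require the write path to assign them.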
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: RETAIL AND CPG - FOOD
Technologies: APACHE SPARK, DELTA LAKE
Skill Level: INTERMEDIATE
Duration: 20 MIN
The "Doordash Customer 360 Data Store" represents a foundational step in centralizing and managing customer profiles to enable targeting and personalized customer experiences, built on Delta Lake. This presentation will explore the initial goals and architecture of the Customer 360 Data Store, its journey to becoming a robust entity management framework, and the challenges and opportunities encountered along the way. We will discuss how the evolution addressed scalability, data governance and integration needs, enabling the system to support dynamic and diverse use cases, including customer lifecycle analytics and marketing campaign targeting using segmentation. Attendees will gain insight into key design principles, technical innovations and strategic decisions that transformed the system into a flexible platform for entity management, positioning it as a critical enabler of data-driven growth at DoorDash. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, LAKEFLOW, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Demonstrating a real ROI is key to driving executive and stakeholder buy-in for major technology changes. At Veeam, we aligned our Databricks Platform change with projects to increase sales pipeline and improve customer retention. By delivering targeted improvements on those critical business metrics, we created positive ROI in short order while at the same time setting the foundation for long term Databricks Platform success. This session targets data and business leaders looking to understand how they can turn their infrastructure change into a business revenue driver.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
As enterprises adopt AI and Large Language Models (LLMs), securing and governing these models, and the data used to train them, is essential. In this session, learn how Databricks Partner PointGuard AI helps organizations implement the Databricks AI Security Framework to manage AI-specific risks, ensuring security, compliance, and governance across the entire AI lifecycle. Then, discover how Obsidian Security provides a robust approach to AI security, enabling organizations to confidently scale AI applications.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
Deliver trusted, high-performance insights by incorporating Unity Catalog metric views and business semantics into your AI/BI workflows. This session dives into the architecture and best practices for defining reusable metrics, implementing governance and enhancing query performance in AI/BI Dashboards and Genie. Learn how to manage business semantics effectively to ensure data consistency while empowering business users with governed, self-service analytics. Ideal for teams looking to streamline analytics at scale, this session provides practical strategies for driving data accuracy and governance.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, PROFESSIONAL SERVICES
Technologies: MLFLOW, DSPY, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
The DSPy OSS team at Databricks and beyond is excited to present DSPy 3.0, targeted for release close to DAIS 2025. We will present what DSPy is and how it evolved over the past year. We will discuss greatly improved prompt optimization and finetuning/RL capabilities, improved productionization and observability via thorough and native integration with MLflow, and lessons from usage of DSPy in various Databricks R&D and professional services contexts.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: DATABRICKS SQL, DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us to hear how National Australia Bank (NAB) successfully completed a significant milestone in its data strategy by decommissioning its 26-year-old Teradata environment and migrating to a new strategic data platform called 'Ada'. This transition marks a pivotal shift from legacy systems to a modern, cloud-based data and AI platform powered by Databricks. The migration process, which spanned two years, involved ingesting 16 data sources, transferring 456 use cases, and collaborating with hundreds of users across 12 business units. This strategic move positions NAB to leverage the full potential of cloud-native data analytics, enabling more agile and data-driven decision-making across the organization. The successful migration to Ada represents a significant step forward in NAB's ongoing efforts to modernize its data infrastructure and capitalize on emerging technologies in the rapidly evolving financial services landscape.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: DELTA LAKE, AI/BI, DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
In this session, we will explore effective strategies for optimizing costs on the Databricks platform, a leading solution for handling large-scale data workloads. Databricks, known for its open and unified approach, offers several tools and methodologies to ensure users can maximize their return on investment (ROI) while managing expenses efficiently. By the end of this session, you will have a comprehensive understanding of how to leverage Databricks' built-in tools for cost optimization, ensuring that your data and AI projects not only deliver value but do so in a cost-effective manner. This session is ideal for data engineers, financial analysts, and decision-makers looking to enhance their organization’s efficiency and financial performance through strategic cost management on Databricks.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
Writing SQL is a core part of any data analyst’s workflow, but small inefficiencies can add up, slowing down analysis and making it harder to iterate quickly. In this session, we’ll explore powerful features in the Databricks SQL editor and notebook that help you be more productive when writing SQL on Databricks. We’ll demo the new features and the customer use cases that inspired them.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENERGY AND UTILITIES, MANUFACTURING, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Join us for an introductory session on Databricks DQX, a Python-based framework designed to validate the quality of PySpark DataFrames. Discover how DQX can empower you to proactively tackle data quality challenges, enhance pipeline reliability and make more informed business decisions with confidence. Traditional data quality tools often fall short by providing limited, actionable insights, relying heavily on post-factum monitoring, and being restricted to batch processing. DQX overcomes these limitations by enabling real-time quality checks at the point of data entry, supporting both batch and streaming data validation and delivering granular insights at the row and column level. If you’re seeking a simple yet powerful data quality framework that integrates seamlessly with Databricks, this session is for you.
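The abstract above highlights row- and column-level quality checks. As a rough illustration of that idea only (plain Python with hypothetical names, not DQX's actual API), a row-level validator that annotates each record with its failed checks, rather than rejecting a whole batch after the fact, might look like:

```python
# Illustrative sketch of row-level data quality checks (hypothetical names,
# not the DQX API): each row is annotated with the checks it failed, giving
# granular, per-row/per-column insight instead of a single pass/fail verdict.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Check:
    name: str        # identifier surfaced in the results
    column: str      # column the predicate inspects
    predicate: Callable[[object], bool]

def apply_checks(rows, checks):
    """Return each row annotated with the names of the checks it failed."""
    annotated = []
    for row in rows:
        failures = [c.name for c in checks if not c.predicate(row.get(c.column))]
        annotated.append({**row, "_failed_checks": failures})
    return annotated

checks = [
    Check("not_null_id", "id", lambda v: v is not None),
    Check("positive_amount", "amount", lambda v: v is not None and v > 0),
]

rows = [{"id": 1, "amount": 10.0}, {"id": None, "amount": -5.0}]
result = apply_checks(rows, checks)
# result[1]["_failed_checks"] lists both failing checks for the bad row
```

The same shape applied per-partition over a PySpark DataFrame is, in spirit, what a framework like DQX automates for batch and streaming data.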
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this session, we’ll introduce Zerobus Direct Write API, part of Lakeflow Connect, which enables you to push data directly to your lakehouse and simplify ingestion for IoT, clickstreams, telemetry, and more. We’ll start with an overview of the ingestion landscape to date. Then, we'll cover how you can “shift left” with Zerobus, embedding data ingestion into your operational systems to make analytics and AI a core component of the business, rather than an afterthought. The result is a significantly simpler architecture that scales your operations, using this new paradigm to skip unnecessary hops. We'll also highlight one of our early customers, Joby Aviation, and how they use Zerobus. Finally, we’ll provide a framework to help you understand when to use Zerobus versus other ingestion offerings—and we’ll wrap up with a live Q&A so that you can hit the ground running with your own use cases.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Bagelcode, a leader in the social casino industry, has utilized Databricks since 2018 and manages over 10,000 tables via Hive Metastore. In 2024, we embarked on a transformative journey to resolve inefficiencies and unlock new capabilities. Over five months, we redesigned ETL pipelines with Delta Lake, optimized partitioned table logs and executed a seamless migration with minimal disruption. This effort improved governance, simplified management and unlocked Unity Catalog’s advanced features. Post-migration, we integrated the Genie Room with Slack to enable natural language queries, accelerating decision-making and operational efficiency. Additionally, a lineage-powered internal tool allowed us to quickly identify and resolve issues like backfill needs or data contamination. Unity Catalog has revolutionized our data ecosystem, elevating governance and innovation. Join us to learn how Bagelcode unlocked its data’s full potential and discover strategies for your own transformation.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENERGY AND UTILITIES, RETAIL AND CPG - FOOD, TRAVEL AND HOSPITALITY
Technologies: AI/BI
Skill Level: ADVANCED
Duration: 40 MIN
In this session, we'll explore how Rooms To Go enhances organizational collaboration by integrating AI/BI Genie with Microsoft Teams. Genie enables warehouse employees and members of the sales team to interact with data using natural language, simplifying data exploration and analysis. By connecting Genie to Microsoft Teams, we bring real-time data insights directly to a user’s phone. We'll provide a comprehensive overview on setting up this integration as well as a demo of how the team uses it daily. Attendees will gain practical knowledge to implement this integration, empowering their teams to access and interact with data seamlessly within Microsoft Teams.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: PUBLIC SECTOR
Technologies: MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
Artificial Intelligence (AI) is more than a corporate tool; it’s a force for good. At Doctors Without Borders/Médecins Sans Frontières (MSF), we use AI to optimize fundraising, ensuring that every dollar raised directly supports life-saving medical aid worldwide. With Databricks, Mosaic AI and Unity Catalog, we analyze donor behavior, predict giving patterns and personalize outreach, increasing contributions while upholding ethical AI principles. This session will showcase how AI maximizes fundraising impact, enabling faster crisis response and resource allocation. We’ll explore predictive modeling for donor engagement, secure AI governance with Unity Catalog and our vision for generative AI in fundraising, leveraging AI-assisted storytelling to deepen donor connections. AI is not just about efficiency; it’s about saving lives. Join us to see how AI-driven fundraising is transforming humanitarian aid on a global scale.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: HEALTH AND LIFE SCIENCES
Technologies: DELTA LAKE, DATA MARKETPLACE, DELTA SHARING
Skill Level: INTERMEDIATE
Duration: 40 MIN
NHS England is revolutionizing healthcare research by enabling secure, seamless access to de-identified patient data through the Federated Data Platform (FDP). Despite vast data resources spread across regional and national systems, analysts struggle with fragmented, inconsistent datasets. Enter Databricks: powering a unified, virtual data lake with Unity Catalog at its core — integrating diverse NHS systems while ensuring compliance and security. By bridging AWS and Azure environments with a private exchange and leveraging the Iceberg connector to interface with Palantir, analysts gain scalable, reliable and governed access to vital healthcare data. This talk explores how this innovative architecture is driving actionable insights, accelerating research and ultimately improving patient outcomes.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES
Technologies: APACHE SPARK, MLFLOW, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Tonal is the ultimate strength training system, giving you the expertise of a personal trainer and a full gym in your home. Through user interviews and social media feedback, we identified a consistent challenge: members found it difficult to measure their progress in their fitness journey. To address this, we developed the Training Goal (TG) ecosystem, a four-part solution that introduced new preference options to capture users' fitness aspirations, implemented weekly metrics that accumulate as members complete workouts, defined personalized weekly targets to guide progress, and enhanced workout details to show how each session contributes toward individual goals. We present how we leveraged Spark, MLflow, and Workflows within the Databricks ecosystem to compute TG metrics, manage model development, and orchestrate data pipelines. These tools allowed us to launch the TG system on schedule, supporting scalability, reliability, and a more meaningful, personalized way for members to track their progress.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: PROFESSIONAL SERVICES, PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: APACHE SPARK, AI/BI, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
The new Budget Execution Validation process has transformed how the Navy reviews unspent funds. Powered by Databricks Workflows, MLflow, Delta Lake and Apache Spark™, this data-driven model predicts which financial transactions are most likely to have errors, streamlining reviews and increasing accuracy. In FY24, it helped review $40 billion, freeing $1.1 billion for other priorities, including $260 million from active projects. By reducing reviews by 80%, cutting job runtime by over 50% and lowering costs by 60%, it saved 218,000 work hours and $6.7 million in labor costs. With automated workflows and robust data management, this system exemplifies how advanced tools can improve financial decision-making, save resources and ensure efficient use of taxpayer dollars.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: HEALTH AND LIFE SCIENCES, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Leveraging Databricks as a platform, we facilitate the sharing of anonymized datasets across various Databricks workspaces and accounts, spanning multiple cloud environments such as AWS, Azure, and Google Cloud. This capability, powered by Delta Sharing, extends both within and outside Sleep Number, enabling accelerated insights while ensuring compliance with data security and privacy standards. In this session, we will showcase our architecture and implementation strategy for data sharing, highlighting the use of Databricks’ Unity Catalog and Delta Sharing, along with integration with platforms like Jira, Jenkins, and Terraform to streamline project management and system orchestration.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session will showcase Bosch’s journey in consolidating supply chain information using the Databricks platform. It will dive into how Databricks not only acts as the central data lakehouse but also integrates seamlessly with transformative components such as dbt and Large Language Models (LLMs). The talk will highlight best practices, architectural considerations, and the value of an interoperable platform in driving actionable insights and operational excellence across complex supply chain processes.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENERGY AND UTILITIES
Technologies: AI/BI, DELTA SHARING, UNITY CATALOG
Skill Level: BEGINNER
Duration: 60 MIN
Join us for a compelling forum exploring how energy leaders are harnessing data and AI to build a more sustainable future. As the industry navigates the complex balance between rising global energy demands and ambitious decarbonization goals, innovative companies are discovering that intelligence-driven operations are the key to success. From optimizing renewable energy integration to revolutionizing grid management, learn how energy pioneers are using AI to transform traditional operations while accelerating the path to net zero. This session reveals how Databricks is empowering energy companies to turn their sustainability aspirations into reality, proving that the future of energy is both clean and intelligent.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: HEALTH AND LIFE SCIENCES, MANUFACTURING, FINANCIAL SERVICES
Technologies: DATABRICKS WORKFLOWS, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Morgan Stanley, a highly regulated financial institution, needs to meet stringent security and regulatory requirements around data storage and processing. Traditionally, this has necessitated maintaining control over data and compute within their own accounts with the associated management overhead. In this session, we will cover how Morgan Stanley has partnered with Databricks on a fully-managed compute and storage solution that allows them to meet their regulatory obligations with significantly reduced effort. This innovative approach enables rapid onboarding of new projects onto the platform, improving operational efficiency while maintaining the highest levels of security and compliance.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session shows you how to gain visibility into your Databricks SQL spend and ensure cost efficiency. Learn about the latest features to gain detailed insights into Databricks SQL expenses so you can easily monitor and control your costs. Find out how you can enable attribution to internal projects, understand the Total Cost of Ownership, set up proactive controls and find ways to continually optimize your spend.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, MLFLOW, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
We will present a framework for FinCrime detection leveraging Databricks lakehouse architecture, specifically how institutions can achieve both the data flexibility and the ACID transaction guarantees essential for FinCrime monitoring. The framework incorporates advanced ML models for anomaly detection, pattern recognition and predictive analytics, while maintaining the clear data lineage and audit trails required by regulatory bodies. We will also discuss specific improvements: reduced false positives, faster detection and faster regulatory reporting. We will then delve into how the architecture addresses specific FATF recommendations, Basel III risk management requirements and BSA compliance obligations, particularly in transaction monitoring and SAR filing. The ability to handle structured and unstructured data while maintaining data quality and governance makes it particularly valuable for large financial institutions dealing with complex, multi-jurisdictional compliance requirements.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, MANUFACTURING
Technologies: MLFLOW, MOSAIC AI, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
There are many ways to implement an entity resolution (ER) system, both using vendor software and using open-source libraries that enable DIY entity resolution. Generally, however, we see common challenges with any approach: scalability, being bound to a single model architecture, a lack of metrics and explainability, and stagnant implementations that do not "learn" with experience. Recent experiments with transformer-based approaches, fast lookups with vector search and Databricks components such as Databricks Apps and Agent Evaluation provide the foundations for a composable ER system that gets better over time on your data. In this presentation, we include a demo of how to use these components to build a composable ER system that delivers the best outcomes for your data.
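To make the ER problem concrete, here is a minimal, hedged sketch of the two classic stages such systems compose: candidate generation (blocking) followed by pairwise scoring. All names are illustrative; in the approach described above, transformer embeddings and vector search would replace the toy string similarity used here.

```python
# Toy entity resolution sketch (illustrative only): block records so only
# plausible pairs are compared, then score each pair within a block.
from collections import defaultdict
from difflib import SequenceMatcher

def block_by_key(records, key_fn):
    """Group records into blocks so comparisons stay tractable at scale."""
    blocks = defaultdict(list)
    for r in records:
        blocks[key_fn(r)].append(r)
    return blocks

def match_pairs(records, key_fn, threshold=0.8):
    """Return (id_a, id_b, score) for same-block pairs above the threshold."""
    matches = []
    for block in block_by_key(records, key_fn).values():
        for i in range(len(block)):
            for j in range(i + 1, len(block)):
                score = SequenceMatcher(
                    None, block[i]["name"], block[j]["name"]
                ).ratio()
                if score >= threshold:
                    matches.append((block[i]["id"], block[j]["id"], score))
    return matches

people = [
    {"id": 1, "name": "Jon Smith", "zip": "94105"},
    {"id": 2, "name": "John Smith", "zip": "94105"},
    {"id": 3, "name": "Ann Lee", "zip": "10001"},
]
pairs = match_pairs(people, key_fn=lambda r: r["zip"])
# only the two 94105 records are compared, and they match
```

Swapping the scorer for a learned model (and logging scores as metrics) is what lets such a system "learn" with experience rather than stagnate.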
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
In enterprise AI, Evaluation-Driven Development (EDD) ensures reliable, efficient systems by embedding continuous assessment and improvement into the AI development lifecycle. High-quality evaluation datasets are created using techniques like document analysis, synthetic data generation via Mosaic AI’s synthetic data generation API, SME validation and relevance filtering, reducing manual effort and accelerating workflows. EDD focuses on metrics such as context relevance, groundedness and response accuracy to identify and address issues like retrieval errors or model limitations. Custom LLM judges, tailored to domain-specific needs like PII detection or tone assessment, enhance evaluations. By leveraging tools like the Mosaic AI Agent Framework, Agent Evaluation and MLflow, EDD automates data tracking, streamlines workflows and quantifies improvements, transforming AI development to deliver scalable, high-performing systems that drive measurable organizational value.
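The evaluation loop behind EDD can be sketched in a few lines. This is a hedged illustration with hypothetical names, not the Mosaic AI or MLflow API: a stand-in "judge" scores each response, and an aggregate threshold gates release so regressions surface before deployment.

```python
# Minimal evaluation-driven loop (illustrative names, not a real API):
# score every example on a metric, aggregate, and gate on a threshold.
def judge_groundedness(response, context):
    # Stand-in for an LLM judge: here "grounded" naively means the fraction
    # of response tokens that appear in the retrieved context.
    claims = set(response.lower().split())
    support = set(context.lower().split())
    return len(claims & support) / max(len(claims), 1)

def evaluate(dataset, thresholds):
    scores = [judge_groundedness(ex["response"], ex["context"]) for ex in dataset]
    mean = sum(scores) / len(scores)
    return {"groundedness": mean, "passed": mean >= thresholds["groundedness"]}

dataset = [
    {"response": "paris is the capital",
     "context": "paris is the capital of france"},
    {"response": "berlin is the capital",
     "context": "paris is the capital of france"},
]
report = evaluate(dataset, thresholds={"groundedness": 0.5})
# per-example scores: 1.0 and 0.75 -> mean 0.875, gate passes
```

In a real EDD pipeline the token-overlap judge would be replaced by a custom LLM judge, and the report logged as run metrics so improvements are quantified over time.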
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: FINANCIAL SERVICES
Technologies: MLFLOW, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session dives into building multi-agent systems on the Mosaic AI Platform, exploring the techniques, architectures and lessons learned from experiences building Greenlight’s real-world agent applications. This presentation is well suited for executives, product managers and engineers alike, breaking down AI Agents into easy-to-understand concepts, while presenting an architecture for building complex systems. We’ll examine the core components of generative AI Agents and different ways to assemble them into agents, including different prompting and reasoning techniques. We’ll cover how the Mosaic AI Platform has enabled our small team to build, deploy and monitor our AI Agents, touching on vector search, feature and model serving endpoints, and the evaluation framework. Finally, we’ll discuss the pros and cons of building a multi-agent system consisting of specialized agents vs. a single large agent for Greenlight’s AI Assistant, and the challenges we encountered.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: HEALTH AND LIFE SCIENCES, PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: DELTA SHARING, UNITY CATALOG, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Mastercard is a global technology company whose role is anchored in trust. It supports 3.4 billion cards and over 143 billion transactions annually. To address customers’ increasing data volume and complex privacy needs, Mastercard has developed a novel service atop Databricks’ Clean Rooms and broader Data Intelligence Platform. This service combines several Databricks components with Mastercard’s IP, providing an evolved method for data-driven insights and value-added services while ensuring a unique standalone turnkey service. The result is a secure environment where multiple parties can collaborate on sensitive data without directly accessing each other’s information. After this session, attendees will understand how Mastercard used its expertise in privacy-enhancing technologies to create collaboration tools powered by Databricks’ Clean Rooms, AI/BI, Apps, Unity Catalog, Workflows and DatabricksIQ — as well as how to take advantage of this new privacy-enhancing service directly.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: EDUCATION, PUBLIC SECTOR
Technologies: MLFLOW, DATABRICKS SQL
Skill Level: BEGINNER
Duration: 20 MIN
No description available.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
The lakehouse is built for storage flexibility, but what about compute? In this session, we’ll explore how Unity Catalog enables you to connect and govern multiple compute engines across your data ecosystem. With open APIs and support for the Iceberg REST Catalog, UC lets you extend access to engines like Trino, DuckDB, and Flink while maintaining centralized security, lineage, and interoperability. Through some exciting demos, we’ll show how you can get started today with engines like Apache Spark and Starburst reading and writing UC managed tables. Learn how to bring flexibility to your compute layer—without compromising control.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: APACHE SPARK, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Are you struggling to keep up with rapid business changes that demand constant updates to your data pipelines? Is your data engineering team growing rapidly just to manage this complexity? Databricks was not immune to this challenge either. Managing our BI with contributions from hundreds of Product Engineering Teams across the company while maintaining central oversight and quality posed significant hurdles. Join us to learn how we developed a config-driven data pipeline framework using Metric Store and UC Metrics that helped us reduce engineering effort — achieving the work of 100 classical data engineers with just two platform engineers.
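As a hedged sketch of what "config-driven" can mean in practice (illustrative names only, not Databricks' internal Metric Store framework): metric definitions live as declarative config, and a small engine expands each entry into the query a pipeline run executes, so adding a metric becomes a config change rather than a hand-written job.

```python
# Illustrative config-driven metric pipeline (hypothetical schema): each
# metric is declared as data, and one renderer turns any entry into SQL.
METRICS = {
    "weekly_active_users": {
        "table": "events",
        "agg": "COUNT(DISTINCT user_id)",
        "grain": "date_trunc('week', event_ts)",
    },
    "revenue": {
        "table": "orders",
        "agg": "SUM(amount)",
        "grain": "date_trunc('day', order_ts)",
    },
}

def render_sql(name):
    """Expand one declarative metric entry into an executable SQL string."""
    m = METRICS[name]
    return (
        f"SELECT {m['grain']} AS period, {m['agg']} AS {name} "
        f"FROM {m['table']} GROUP BY 1"
    )

sql = render_sql("weekly_active_users")
```

Because contributors only touch the config, a small platform team can own the renderer and its quality checks centrally, which is the leverage the session describes.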
Type: BREAKOUT
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: DATA MARKETPLACE, AI/BI, MOSAIC AI
Skill Level: BEGINNER
Duration: 60 MIN
Overflow available: first come, first served; no reservation required. Where: Moscone South, Level 3, Room 302. Join the 60-minute kickoff session at the Financial Services Forum to explore how data and AI transform finance. Featuring keynotes from top innovators in banking, capital markets and insurance, plus exciting announcements from Databricks, this event offers invaluable insights. What to expect: connect with C-suite executives and industry pioneers shaping financial services, and leave with actionable strategies to drive growth, ensure compliance and transform your organization through intelligence-driven decisions!
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Unstructured data — images, documents, videos and more — is growing in importance with AI and ML. Yet managing access control at scale is challenging. Unity Catalog Volumes offer a secure foundation, but access control has remained volume-level until now. This session introduces Volume Path Permissions, a new feature enabling fine-grained access within volumes. Expanding on Unity Catalog’s robust permission model, they let you grant privileges to users and groups based on path prefixes. We’ll cover the governance model, share examples and demonstrate how to enforce least-privilege access. By the end, you’ll know how to manage file-level access with Unity Catalog’s flexibility and control.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. You’ve seen your usage grow on Databricks, across departments, use cases, product lines and users. What can you do to ensure your end-users (data practitioners) of the platform remain cost-efficient and productive, while staying accountable to your budget? We’ll discuss spend monitoring, chargeback models and developing a culture of cost efficiency by using Databricks tools.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
Westat, a leader in data-driven research for 60+ years, has implemented a centralized Databricks platform to support hundreds of research projects for government, foundations, and private clients. This initiative modernizes Westat’s technical infrastructure while maintaining rigorous statistical standards and streamlining data science. The platform enables isolated project environments with strict data boundaries, centralized oversight, and regulatory compliance. It allows project-specific customization of compute and analytics, and delivers scalable computing for complex analyses. Key features include config-driven Infrastructure as Code (IaC) with Terragrunt, custom tagging and AWS cost integration for ROI tracking, budget policies with alerts for proactive cost management, and a centralized dashboard with row-level security for self-service cost analytics. This unified approach provides full financial visibility and governance while empowering data teams to deliver value. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: MEDIA AND ENTERTAINMENT
Technologies: DELTA LAKE, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
Join us as we explore how KRAFTON optimized data governance for the PUBG IP, enhancing cost efficiency and scalability. KRAFTON operates a massive data ecosystem, processing tens of terabytes daily. As real-time analytics demands increased, traditional batch-based processing faced scalability challenges. To address this, we redesigned data pipelines and governance models, improving performance while reducing costs. Learn more: https://www.databricks.com/customers/krafton
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS WORKFLOWS, LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 40 MIN
This is an overview of migrating from Apache Airflow to Lakeflow Jobs for modern data orchestration. It covers key differences, best practices and practical examples of transitioning from traditional Airflow DAGs orchestrating legacy systems to declarative, incremental ETL pipelines with Lakeflow. Attendees will gain actionable tips on how to improve efficiency, scalability and maintainability in their workflows.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI
Skill Level: BEGINNER
Duration: 40 MIN
As language models have advanced, they have moved beyond code completion and are beginning to tackle software engineering tasks in a more autonomous, agentic way. However, evaluating agentic capabilities is challenging. To address this, we first introduce SWE-bench, a benchmark built from real GitHub issues that has become the standard for assessing AI’s ability to resolve complex software tasks in large codebases. We will discuss the current state of the field, the limitations of today’s models, and how far we still are from truly autonomous AI developers. Next, we will explore the fundamentals of agents based on hands-on demonstrations with SWE-agent, a simple yet powerful agent framework designed for software engineering but adaptable to a variety of domains. By the end of this session, you will have a clear understanding of the current frontier of agentic AI in software engineering, the challenges ahead and how you can experiment with AI agents in your own workflows.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: HEALTH AND LIFE SCIENCES
Technologies: DELTA LAKE, LAKEFLOW, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: MLFLOW, DATABRICKS WORKFLOWS, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Imagine performing complex regulatory checks in minutes instead of days. We made this a reality using GenAI on the Databricks Data Intelligence Platform. Join us for a deep dive into our journey from POC to a production-ready AI audit tool. Discover how we automated thousands of legal requirement checks in annual reports with remarkable speed and accuracy. This session provides actionable insights and our blueprint for deploying impactful, compliant GenAI in the enterprise.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENERGY AND UTILITIES, PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
The Global Water Security Center translates environmental science into actionable insights for the U.S. Department of Defense. Prior to incorporating Databricks, responding to such requests required querying approximately five hundred thousand raster files representing over five hundred billion points. By leveraging lakehouse architecture, Databricks Auto Loader, Spark Streaming, Databricks Spatial SQL, H3 geospatial indexing and Databricks Liquid Clustering, we were able to drastically reduce our “time to analysis” from multiple business days to a matter of seconds. Now, our data scientists execute queries on pre-computed tables in Databricks, resulting in a “time to analysis” that is 99% faster, giving our teams more time for deeper analysis of the data. Additionally, we’ve incorporated Databricks Workflows, Databricks Asset Bundles, Git and GitHub Actions to support CI/CD across workspaces. We completed this work in close partnership with Databricks.
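The precomputation pattern described above can be illustrated with a toy stand-in for H3 (a plain lat/lon grid; every name here is hypothetical): aggregate the raw points into indexed cells once, so later queries become key lookups instead of scans over hundreds of thousands of files.

```python
# Illustrative sketch: toy spatial index standing in for H3 cells.
# Precompute per-cell aggregates once; query time becomes a dict lookup.
from collections import defaultdict

def cell_id(lat, lon, res=1.0):
    """Toy stand-in for an H3 index: snap coordinates to a res-degree grid."""
    return (int(lat // res), int(lon // res))

def precompute(points, res=1.0):
    """Aggregate (lat, lon, value) points into count/total per grid cell."""
    agg = defaultdict(lambda: {"count": 0, "total": 0.0})
    for lat, lon, value in points:
        c = agg[cell_id(lat, lon, res)]
        c["count"] += 1
        c["total"] += value
    return dict(agg)

points = [(33.2, -87.5, 1.1), (33.4, -87.9, 2.3), (40.7, -74.0, 0.5)]
table = precompute(points)

# Query: mean value for the cell containing (33.2, -87.5) -- a lookup, not a scan
cell = table[cell_id(33.2, -87.5)]
mean = cell["total"] / cell["count"]
```

At production scale the same idea runs as streaming Spark jobs writing liquid-clustered Delta tables keyed by H3 cell, which is what collapses "time to analysis" from days to seconds.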
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: MEDIA AND ENTERTAINMENT
Technologies: APACHE SPARK, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Building and deploying PySpark pipelines to Databricks should be effortless. However, our team at FreeWheel has, for the longest time, struggled with a convoluted and hard-to-maintain CI/CD infrastructure. It followed an imperative paradigm, demanding that every project implement custom scripts to build artifacts and deploy resources, resulting in redundant boilerplate code and awkward interactions with the Databricks REST API. We set our minds on rebuilding it from scratch, following a declarative paradigm instead. We will share how we eliminated thousands of lines of code from our repository, created a fully configuration-driven infrastructure where projects can be easily onboarded, and improved the quality of our codebase using Hatch and Databricks Asset Bundles as our tools of choice. In particular, DABs have made deploying across our three environments a breeze and have allowed us to quickly adopt new features as soon as they are released by Databricks.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us as we unveil how we transformed the largest Databricks workspace into the best-in-class lakehouse through Unity Catalog. Discover how we harnessed lineage and unified access management to build ultimate governance automation.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT
Technologies: AI/BI, DATABRICKS SQL, DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
Media enterprises generate vast amounts of visual content, but unlocking its full potential requires multimodal AI at scale. Coactive AI and NBCUniversal’s Corporate Decision Sciences team are transforming how enterprises discover and understand visual content. We explore how Coactive AI and Databricks — from Delta Sharing to Genie — can revolutionize media content search, tagging and enrichment, enabling new levels of collaboration. Attendees will see how this AI-powered approach fuels AI workflows, enhances BI insights and drives new applications — from automating cut sheet generation to improving content compliance and recommendations. By structuring and sharing enriched media metadata, Coactive AI and NBCU are unlocking deeper intelligence and laying the groundwork for agentic AI systems that retrieve, interpret and act on visual content. This session will showcase real-world examples of these AI agents and how they can reshape future content discovery and media workflows.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: EDUCATION, ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, MLFLOW, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
No description available.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: AI/BI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Protecting insurers against emerging threats is critical. This session reveals how leading companies use the Databricks Data Intelligence Platform to transform risk management, enhance fraud detection and ensure compliance. Learn how advanced analytics, AI and machine learning process vast data in real time to identify risks and mitigate threats. Industry leaders will share strategies for building resilient operations that protect against financial losses and reputational harm. The key takeaway: discover how data intelligence is revolutionizing insurance risk management and safeguarding the industry's future.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: HEALTH AND LIFE SCIENCES
Technologies: DATABRICKS WORKFLOWS, DLT, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
In today's data-driven world, the ability to efficiently manage and transform data is crucial for any organization. This presentation will explore the process of converting a complex and messy Alteryx workflow into a clean and simple Lakeflow Declarative Pipelines implementation at a large integrated health system, Intermountain Health. Alteryx is a powerful tool for data preparation and blending, but as workflows grow in complexity, they can become difficult to manage and maintain. Lakeflow Declarative Pipelines, on the other hand, offers a more democratized, streamlined and scalable approach to data engineering, leveraging the power of Apache Spark and Delta Lake. We will begin by examining a typical legacy workflow, identifying common pain points such as tangled logic, performance bottlenecks and maintenance challenges. Next, we will demonstrate how to translate this workflow into Lakeflow Declarative Pipelines, highlighting key steps such as data transformation, validation and delivery.
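The declarative style at the heart of this migration can be sketched in miniature. The toy registry below is an invented illustration, not the actual Lakeflow Declarative Pipelines API: each table is declared as a pure function of its upstream tables, and a tiny runner resolves the execution order, which is what makes tangled imperative logic easier to untangle.

```python
# Toy illustration of the declarative pipeline style (NOT the actual
# Lakeflow Declarative Pipelines API): each table is declared as a pure
# function of its upstream tables; a tiny runner resolves the order.
registry = {}

def table(*deps):
    """Register a function as a named table with declared dependencies."""
    def wrap(fn):
        registry[fn.__name__] = (deps, fn)
        return fn
    return wrap

def materialize(name, cache=None):
    """Resolve a table by recursively materializing its dependencies."""
    cache = {} if cache is None else cache
    if name not in cache:
        deps, fn = registry[name]
        cache[name] = fn(*(materialize(d, cache) for d in deps))
    return cache[name]

@table()
def raw_events():
    # Ingestion stand-in: one well-formed row, one malformed row.
    return [{"id": 1, "amount": "12.5"}, {"id": 2, "amount": "bad"}]

@table("raw_events")
def clean_events(rows):
    # Validation step: drop rows whose amount does not parse.
    out = []
    for r in rows:
        try:
            out.append({"id": r["id"], "amount": float(r["amount"])})
        except ValueError:
            pass
    return out

@table("clean_events")
def daily_total(rows):
    # Delivery step: a simple aggregate over the cleaned data.
    return sum(r["amount"] for r in rows)
```

Because dependencies are declared rather than scripted, adding or reordering steps is a matter of editing declarations, not rewiring an orchestration script.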
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES, TRAVEL AND HOSPITALITY
Technologies: MLFLOW, AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
No description available.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Unlock the hidden potential in your image data without specialized computer vision expertise! This session explores how to leverage Databricks' multi-modal Foundation Model APIs to analyze, classify and extract insights from visual content. Learn how Databricks provides a unified API for understanding images using powerful foundation models within your data workflows. Whether you're analyzing product images, processing visual documents or building content moderation systems, you'll discover how to extract valuable insights from your image data within the Databricks ecosystem.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT
Technologies: AI/BI, DATABRICKS SQL, MOSAIC AI
Skill Level: BEGINNER
Duration: 60 MIN
Come hear from some of the biggest names in games about how data and AI are helping them shape their future, build better games and create player-centric experiences. You'll first hear what Databricks is hearing from game studios globally about their key priorities, then customers will share their stories and perspectives. You'll leave invigorated by the impact data and AI can have on games and global players, with new ideas for furthering your own impact.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, DATABRICKS SQL, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Enterprises generate massive amounts of unstructured data — from support tickets and PDFs to emails and product images. But extracting insight from that data traditionally requires brittle pipelines and complex tools. Databricks AI Functions make this simpler. In this session, you’ll learn how to apply powerful language and vision models directly within your SQL and ETL workflows — no endpoints, no infrastructure, no rewrites. We’ll explore practical use cases and best practices for analyzing complex documents, classifying issues, translating content and inspecting images — all in a way that’s scalable, declarative and secure.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL, DLT
Skill Level: INTERMEDIATE
Duration: 40 MIN
Customer support is going through the GenAI revolution, but how can we use AI to foster deeper empathy with our end users? To enable this, Earnin has built its GenAI observability platform on Databricks, leveraging Lakeflow Declarative Pipelines, Kafka and Databricks AI/BI. This session covers how we use Lakeflow Declarative Pipelines to monitor our customer care chatbot in near real time and how we leverage Databricks to better anticipate our customers' needs.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: EDUCATION, MEDIA AND ENTERTAINMENT
Technologies: AI/BI, DATABRICKS SQL, PYTORCH
Skill Level: BEGINNER
Duration: 20 MIN
Nondeterministic AI models, like large language models (LLMs), offer immense creative potential but require new approaches to testing and scalability. Drawing from her experience running New York Times-featured Generative AI comedy shows, Erin uncovers how traditional benchmarks may fall short and how embracing unpredictability can lead to innovative, laugh-inducing results. This talk will explore methods like multi-tiered feedback loops, chaos testing and exploratory user testing, where AI outputs are evaluated not by rigid accuracy standards but by their adaptability and resonance across different contexts — from comedy generation to functional applications. Erin will emphasize the importance of establishing a root source of truth — a reliable dataset or core principle — to manage consistency while embracing creativity. Whether you’re looking to generate a few laughs of your own or explore creative uses of Generative AI, this talk will inspire and delight enthusiasts of all levels.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, LLAMA, PYTORCH
Skill Level: INTERMEDIATE
Duration: 20 MIN
We present a novel framework for designing and inducing controlled hallucinations in long-form content generation by LLMs across diverse domains. The purpose is to create fully-synthetic benchmarks and mine hard cases for iterative refinement of zero-shot hallucination detectors. We will first demonstrate how Gretel Data Designer (now part of NVIDIA) can be used to design realistic, high-quality long-context datasets across various domains. Second, we will describe our reasoning-based approach to hard-case mining. Specifically, our methodology relies on chain-of-thought-based generation of both faithful and deceptive question-answer pairs based upon long-context samples. Subsequently, a consensus labeling & detector framework is employed to filter synthetic examples to zero-shot hard cases. The result of this process is a fully-automated system, operating under open data licenses such as Apache-2.0, for the generation of hallucinations at the edge-of-capabilities for a target LLM to detect.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: FINANCIAL SERVICES
Technologies: APACHE SPARK, LLAMA, MOSAIC AI
Skill Level: ADVANCED
Duration: 20 MIN
Our project demonstrates building enterprise AI systems cost-effectively, focusing on matching merchant descriptors to known businesses. Using fine-tuned LLMs and advanced search, we created a solution rivaling alternatives at minimal cost. The system works in three steps: (1) a fine-tuned Llama 3 8B model parses merchant descriptors into standardized components; (2) a hybrid search system uses these components to find candidate matches in our database; (3) a Llama 3 70B model then evaluates the top candidates, with an AI judge reviewing results for hallucination. We achieved a 400% latency improvement while maintaining accuracy and keeping costs low; each fine-tuning round cost only hundreds of dollars. Through careful optimization and a simple architecture that balances cost, speed and accuracy, we show that small teams with modest budgets can tackle complex problems effectively. We share key insights on prompt engineering, fine-tuning, and cost and latency management.
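The three-step flow might be wired together roughly as below. The LLM stages are replaced with deterministic stand-ins: `parse_descriptor`, `candidate_search`, `judge_match` and the sample business list are all invented for illustration and are not the session's actual models, prompts or data.

```python
# Sketch of the three-stage matching flow, with LLM calls replaced by
# deterministic stubs. All names and data here are hypothetical.
import difflib

KNOWN_BUSINESSES = ["STARBUCKS COFFEE", "SHELL OIL", "AMAZON MARKETPLACE"]

def parse_descriptor(raw):
    """Stage 1 stub: normalize a raw merchant descriptor into components
    (the real system uses a fine-tuned Llama 3 8B model here)."""
    name = raw.upper().replace("*", " ").split("#")[0]
    return {"name": " ".join(name.split())}

def candidate_search(components, k=2):
    """Stage 2 stub: hybrid search approximated by fuzzy string matching."""
    return difflib.get_close_matches(
        components["name"], KNOWN_BUSINESSES, n=k, cutoff=0.0)

def judge_match(components, candidates):
    """Stage 3 stub: pick the candidate the 'judge' scores highest,
    rejecting weak matches (the real system uses Llama 3 70B plus an
    AI judge checking for hallucination)."""
    scored = [(difflib.SequenceMatcher(None, components["name"], c).ratio(), c)
              for c in candidates]
    score, best = max(scored)
    return best if score > 0.5 else None

def match(raw):
    components = parse_descriptor(raw)
    return judge_match(components, candidate_search(components))
```

The staged design is what keeps costs down: the cheap parser and search narrow the field so the expensive judge model only sees a handful of candidates per descriptor.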
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: MANUFACTURING
Technologies: APACHE SPARK, DELTA LAKE, DLT
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this session, we will explore how Genie, an AI-driven platform, transformed HVAC operational insights by leveraging Databricks offerings like Apache Spark, Delta Lake and the Databricks Data Intelligence Platform. By analyzing real-time data from HVAC installations, Genie identified discrepancies between design specs and field performance, allowing engineers to optimize algorithms, reduce inefficiencies and improve customer satisfaction. Discover how Genie revolutionized HVAC management and how you can apply the same approach to your own projects.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENERGY AND UTILITIES, PUBLIC SECTOR, RETAIL AND CPG - FOOD
Technologies: DATABRICKS SQL, UNITY CATALOG, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this presentation, we will explore how to leverage Databricks' SQL engine to efficiently ingest and transform geospatial data. We'll demonstrate the seamless process of connecting to external systems such as ArcGIS to retrieve datasets, showcasing the platform's versatility in handling diverse data sources. We'll then delve into the power of Databricks Apps, illustrating how you can create custom geospatial dashboards using various frameworks like Streamlit and Flask, or any framework of your choice. This flexibility allows you to tailor your visualizations to your specific needs and preferences. Furthermore, we'll highlight the Databricks Lakehouse's integration capabilities with popular dashboarding tools such as Tableau and Power BI. This integration enables you to combine the robust data processing power of Databricks with the advanced visualization features of these specialized tools.
Type: LIGHTNING TALK
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 20 MIN
No description available.
Type: VIRTUAL
Track: DATA ENGINEERING AND STREAMING
Industry: N/A
Technologies: N/A
Skill Level: N/A
Duration: N/A
No description available.
Type: VIRTUAL
Track: DATA WAREHOUSING
Industry: N/A
Technologies: N/A
Skill Level: N/A
Duration: N/A
This course provides a comprehensive overview of Databricks’ modern approach to data warehousing, highlighting how a data lakehouse architecture combines the strengths of traditional data warehouses with the flexibility and scalability of the cloud. You’ll learn about the AI-driven features that enhance data transformation and analysis on the Databricks Data Intelligence Platform. Designed for data warehousing practitioners, this course provides you with the foundational information needed to begin building and managing high-performance, AI-powered data warehouses on Databricks. This course is designed for those starting out in data warehousing and those who would like to execute data warehousing workloads on Databricks. Participants may also include data warehousing practitioners who are familiar with traditional data warehousing techniques and concepts and are looking to expand their understanding of how data warehousing workloads are executed on Databricks.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, MANUFACTURING
Technologies: DELTA LAKE
Skill Level: BEGINNER
Duration: 20 MIN
No description available.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENERGY AND UTILITIES
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Genie Rooms have played an integral role in democratizing important datasets like Cell Tower and Lease Information. However, to ensure that this exciting new release from Databricks was configured as optimally as possible from development to deployment, we needed additional scaffolding around governance. In this talk we will describe the four main components we used in conjunction with the Genie Room to build a successful product, and will provide generalizable lessons to help others get the most out of Genie Rooms. At the core is a declarative, metadata-driven approach to creating UC tables, deployed on a robust framework. Second, a platform that efficiently crowdsourced targeted feedback from different user groups. Third, a tool that balances the LLM’s creativity with human wisdom. And finally, a platform that enforces our principle of separating storage from compute to manage access to the room at a fine-grained level and enables a whole host of interesting use cases.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: LAKEFLOW
Skill Level: BEGINNER
Duration: 40 MIN
Hundreds of customers are already ingesting data with Lakeflow Connect from SQL Server, Salesforce, ServiceNow, Google Analytics, SharePoint, PostgreSQL and more to unlock the full power of their data. Lakeflow Connect introduces built-in, no-code ingestion connectors from SaaS applications, databases and file sources to help unlock data intelligence. In this demo-packed session, you’ll learn how to ingest ready-to-use data for analytics and AI with a few clicks in the UI or a few lines of code. We’ll also demonstrate how Lakeflow Connect is fully integrated with the Databricks Data Intelligence Platform for built-in governance, observability, CI/CD, automated pipeline maintenance and more. Finally, we’ll explain how to use Lakeflow Connect in combination with downstream analytics and AI tools to tackle common business challenges and drive business impact.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: APACHE SPARK, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
I have metrics, you have metrics — we all have metrics. But the real problem isn’t having metrics, it’s that the numbers never line up, leading to endless cycles of reconciliation and confusion. Join us as we share how our Data Team at Databricks tackled this fundamental challenge in Business Intelligence by building an internal Metric Store — creating a single source of truth for all business metrics using the newly-launched UC Metric Views. Imagine a world where numbers always align, metric definitions are consistently applied across the organization and every metric comes with built-in ML-based forecasting, AI-powered anomaly detection and automatic explainability. That’s the future we’ve built — and we’ll show you how you can get started today.
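The "define once, compute everywhere" idea behind a metric store can be illustrated with a toy registry. This is an invented sketch, not the UC Metric Views API: each metric is defined exactly once, and every consumer evaluates the same definition, so two reports can no longer disagree about what "net revenue" means.

```python
# Toy metric registry (NOT the UC Metric Views API): a single canonical
# definition per metric name, shared by every consumer.
METRICS = {}

def define_metric(name, fn):
    """Register a metric; refuse to silently overwrite a definition."""
    if name in METRICS and METRICS[name] is not fn:
        raise ValueError(f"metric {name!r} already defined")
    METRICS[name] = fn

def evaluate(name, rows):
    """Every report computes the metric through the same registry."""
    return METRICS[name](rows)

# One canonical revenue definition used by finance and sales alike.
define_metric("net_revenue",
              lambda rows: sum(r["gross"] - r["refund"] for r in rows))

orders = [{"gross": 100.0, "refund": 10.0}, {"gross": 50.0, "refund": 0.0}]
```

The refusal to overwrite is the governance point: a second team proposing a conflicting "net_revenue" fails loudly at registration time instead of quietly producing numbers that never line up.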
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: APACHE SPARK
Skill Level: INTERMEDIATE
Duration: 40 MIN
Spark Connect, first included for the SQL/DataFrame API in Apache Spark 3.4 and recently extended to MLlib in 4.0, introduced a new way to run Spark applications over a gRPC protocol. This has many benefits, including easier adoption for non-JVM clients, version independence from applications and increased stability and security of the associated Spark clusters. The recent Spark Connect extension for ML also included a plugin interface for configuring enhanced server-side implementations of the MLlib algorithms when launching the server. In this talk, we will demonstrate how this new interface, together with Spark SQL’s existing plugin interface, can be used with NVIDIA GPU-accelerated plugins for ML and SQL to enable no-code-change, end-to-end GPU acceleration of Spark ETL and ML applications over Spark Connect, with speedups of up to 9x at an 80% cost reduction compared to CPU baselines.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: DELTA LAKE, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Observability data — logs, metrics, and traces — captures the complex interactions within modern distributed systems. A graph query engine on top of Databricks enables complex traversal of massive observability data, helping users trace service dependencies, analyze upstream/downstream impacts, and uncover recurring error patterns, making it easier to diagnose issues and optimize system performance. A critical challenge in handling observability data is managing dynamic RBAC for sensitive system telemetry. This session explains how Coinbase leverages credential vending, a method for issuing short-lived credentials, to enable fine-grained, secure access to observability data stored in Databricks without long-lived secrets. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
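Credential vending can be sketched with a toy scheme: the vault mints a token that embeds its own expiry and an HMAC signature, so the client never receives a long-lived secret and access lapses automatically. This is an invented format for illustration only, not Coinbase's or Unity Catalog's actual mechanism.

```python
# Illustrative credential-vending sketch: short-lived, HMAC-signed tokens.
# The token format and names here are hypothetical.
import hashlib, hmac, time

SIGNING_KEY = b"vault-held-secret"  # stays server-side; never vended

def vend(subject, scope, ttl_s=300, now=None):
    """Issue a token that carries its own expiry and a signature."""
    expiry = int(now if now is not None else time.time()) + ttl_s
    payload = f"{subject}|{scope}|{expiry}"
    sig = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"

def verify(token, now=None):
    """Return the granted scope, or None if the token is forged or expired."""
    payload, _, sig = token.rpartition("|")
    want = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, want):
        return None  # signature mismatch: forged or tampered
    subject, scope, expiry = payload.split("|")
    now = now if now is not None else time.time()
    return scope if now < int(expiry) else None  # None once expired
```

Because every grant expires on its own, dynamic RBAC reduces to deciding what to vend next rather than hunting down and revoking standing credentials.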
Type: HANDS-ON LEARNING
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: DATABRICKS WORKFLOWS, DLT, LAKEFLOW
Skill Level: BEGINNER
Duration: 90 MIN
This session is repeated. This introductory workshop caters to data engineers seeking hands-on experience and data architects looking to deepen their knowledge. It is structured to provide a solid understanding of core data engineering and streaming concepts. We believe you can only become an expert if you work on real problems and gain hands-on experience, so we will equip you with your own lab environment and guide you through practical exercises like using GitHub, ingesting data from various sources, creating batch and streaming data pipelines, and more.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: EDUCATION, ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS WORKFLOWS, DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Discover how Stack Overflow optimized its data engineering workflows using Databricks Asset Bundles (DABs) for scalable and efficient pipeline deployments. This session explores the structured pipeline architecture, emphasizing code reusability, modular design and bundle variables to ensure clarity and data isolation across projects. Learn how the data team leverages enterprise infrastructure to streamline deployment across multiple environments. Key topics include DRY-principled modular design, essential DAB features for automation and data security strategies using Unity Catalog. Designed for data engineers and teams managing multi-project workflows, this talk offers actionable insights on optimizing pipelines with Databricks' evolving toolset.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES
Technologies: DELTA LAKE, MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
This research introduces a groundbreaking method for healthcare time-series forecasting using a Large Language Model (LLM) foundation model. By leveraging a comprehensive dataset of over 50 million IQVIA time-series trends, which includes data on procedure demands, sales and prescriptions (TRx), alongside publicly available data spanning two decades, the model aims to significantly enhance predictive accuracy in various healthcare applications. The model's transformer-based architecture incorporates self-attention mechanisms to effectively capture complex temporal dependencies within historical time-series trends, offering a sophisticated approach to understanding patterns, trends and cyclical variations.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, DATABRICKS WORKFLOWS, MOSAIC AI
Skill Level: ADVANCED
Duration: 40 MIN
This talk explores using advanced data processing and generative AI techniques to revolutionize the retail industry. Using Databricks, we will discuss how cutting-edge technologies enable real-time data analysis and machine learning applications, creating a powerful ecosystem for large-scale, data-driven retail solutions. Attendees will gain insights into architecting scalable data pipelines for retail operations and implementing advanced analytics on streaming customer data. Discover how these integrated technologies drive innovation in retail, enhancing customer experiences, streamlining operations and enabling data-driven decision-making. Learn how retailers can leverage these tools to gain a competitive edge in the rapidly evolving digital marketplace, ultimately driving growth and adaptability in the face of changing consumer behaviors and market dynamics.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: HEALTH AND LIFE SCIENCES
Technologies: DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Building scalable, reliable ETL pipelines is a challenge for organizations managing large, diverse data sources. Theseus, our custom ETL framework, streamlines data ingestion and transformation by fully leveraging Databricks-native capabilities, including Lakeflow Declarative Pipelines, Auto Loader and event-driven orchestration. By decoupling supplier logic and implementing structured bronze, silver, and gold layers, Theseus ensures high-performance, fault-tolerant data processing with minimal operational overhead. The result? Faster time-to-value, simplified governance and improved data quality — all within a declarative framework that reduces engineering effort. In this session, we’ll explore how Theseus automates complex data workflows, optimizes cost efficiency and enhances scalability, showcasing how Databricks-native tools drive real business outcomes.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: HEALTH AND LIFE SCIENCES
Technologies: AI/BI, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 60 MIN
Join us for an engaging 60-minute Healthcare and Life Sciences Industry Forum at the year’s premier Databricks event! You’ll hear directly from Databricks experts and industry leaders about how unifying data, governing AI models and empowering teams with data intelligence can drive meaningful change across the healthcare and life sciences continuum. Discover how data and AI are transforming the industry — helping streamline healthcare operations, personalize patient care and accelerate breakthroughs in research and development. Don’t miss this opportunity to learn about the future of data-driven healthcare.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: HEALTH AND LIFE SCIENCES
Technologies: DELTA LAKE, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Healthcare and life sciences organizations are exploring AI Agents, driving transformation from intelligent supply chains to virtual assistants that up-level the patient experience. This session explores how you can get started with AI Agents, powered by Databricks and robust data governance, and tap into the full potential of all your data. You’ll learn practical steps for getting started: unifying data with Databricks, ensuring compliance with Unity Catalog, and rapidly deploying AI Agents to drive operational efficiency, improve care, and foster innovation across healthcare and life sciences.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: HEALTH AND LIFE SCIENCES
Technologies: AI/BI, DATABRICKS WORKFLOWS, DLT
Skill Level: INTERMEDIATE
Duration: 40 MIN
The direct Redox and Databricks integration can streamline your interoperability workflows, from responding in record time to preauthorization requests to letting attending physicians know about a change in risk for sepsis and readmission in near real time from ADTs. Data engineers will learn how to create fully streaming ETL pipelines for ingesting, parsing and acting on insights from Redox FHIR bundles delivered directly to Unity Catalog volumes. Once the data is available in the lakehouse, AI/BI Dashboards and agentic frameworks help write FHIR messages back to Redox for direct push-down to EMR systems. Parsing FHIR bundle resources has never been easier with SQL combined with the new VARIANT data type in Delta and streaming table creation against serverless DBSQL warehouses. We'll also use the Databricks accelerators dbignite and redoxwrite for writing and posting FHIR bundles back to Redox-integrated EMRs, and we'll extend AI/BI with Unity Catalog SQL UDFs and the Redox API for use in Genie.
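The parsing step can be illustrated outside Spark with plain Python: pulling typed resources out of a serialized FHIR-style bundle, analogous to what the VARIANT type enables in SQL. The sample bundle below is invented for illustration; only the field names follow FHIR conventions.

```python
# Pure-Python analogue of extracting typed resources from a FHIR bundle.
# The bundle content is a made-up sample, not real patient data.
import json

bundle_json = json.dumps({
    "resourceType": "Bundle",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "p1",
                      "name": [{"family": "Rivera"}]}},
        {"resource": {"resourceType": "Observation", "id": "o1",
                      "code": {"text": "heart-rate"},
                      "valueQuantity": {"value": 72}}},
    ],
})

def resources_of(bundle_str, resource_type):
    """Extract all resources of one type from a serialized bundle,
    much like filtering on a VARIANT column's resourceType field."""
    bundle = json.loads(bundle_str)
    return [e["resource"] for e in bundle.get("entry", [])
            if e["resource"].get("resourceType") == resource_type]

observations = resources_of(bundle_json, "Observation")
```

In the streaming pipeline described above, the same filter-and-project shape applies per micro-batch, with the semi-structured payload held in a VARIANT column instead of a Python dict.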
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Ever wondered how industry leaders handle thousands of ML predictions per second? This session reveals the architecture behind high-performance model serving systems on Databricks. We'll explore how to build inference pipelines that efficiently scale to handle massive request volumes while maintaining low latency. You'll learn how to leverage Feature Store for consistent, low-latency feature lookups and implement auto-scaling strategies that optimize both performance and cost. Whether you're serving recommender systems or real-time fraud detection models, you'll gain practical strategies for building enterprise-grade ML serving systems.
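The serving-time lookup pattern can be sketched as follows; the class and names are illustrative, not the Databricks Feature Store API. Precomputed features live in a fast key-value store, so the inference path is a constant-time lookup plus the model call (here a dot product stands in for the model).

```python
# Toy sketch of low-latency feature lookup at serving time.
# Names are hypothetical, not the Databricks Feature Store API.
class FeatureStore:
    def __init__(self):
        self._rows = {}  # entity_id -> {feature_name: value}

    def write(self, entity_id, features):
        """Offline/streaming job publishes precomputed features."""
        self._rows[entity_id] = dict(features)

    def lookup(self, entity_id, names):
        """Online path: constant-time read; missing features default
        to 0.0 so a cold entity still gets a prediction."""
        row = self._rows.get(entity_id, {})
        return [row.get(n, 0.0) for n in names]

def score(store, entity_id, weights):
    """Inference: fetch the features the model was trained on, then
    apply the model (a linear dot product stands in here)."""
    feats = store.lookup(entity_id, list(weights))
    return sum(w * f for w, f in zip(weights.values(), feats))

store = FeatureStore()
store.write("user-42", {"txn_count_7d": 3.0, "avg_amount": 20.0})
```

The consistency point is that both training and serving read the same published feature values, so there is no train/serve skew from re-deriving features in request handlers.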
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: PUBLIC SECTOR
Technologies: APACHE SPARK, DATABRICKS SQL, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
The problem of matching GPS locations to roads and local government areas (LGAs) involves handling large datasets and numerous geospatial operations. In this deep dive, we will outline the challenges of developing scalable solutions for these tasks. We will discuss our multi-step approach, first focusing on the use of H3 indexing to isolate matches with a single candidate, then explaining the use of different geospatial computational techniques to accurately match points with multiple candidates. From a technical perspective, the talk will showcase the use of broadcasting and partitioning techniques and their effect on autoscaling, memory usage and effective data parallelization. This session is for anyone interested in geospatial data, Spark performance optimization and the real-world challenges of large-scale data engineering.
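The multi-step approach can be sketched in simplified form: a coarse square grid cell stands in for H3, cells with a single candidate region resolve immediately, and an exact ray-casting point-in-polygon test runs only for ambiguous points. This is a toy single-machine illustration of the idea, not the session's Spark implementation, and the vertex-based cell index is a deliberately crude stand-in for a real spatial index.

```python
# Simplified two-step spatial match: coarse grid index (standing in for
# H3), then exact geometry only where the index is ambiguous.

def cell(x, y, size=1.0):
    """Coarse grid cell for a coordinate (H3 stand-in)."""
    return (int(x // size), int(y // size))

def point_in_polygon(x, y, poly):
    """Ray-casting containment test against a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
                inside = not inside
    return inside

def match_point(x, y, regions, size=1.0):
    """regions: name -> polygon vertex list. Returns a region name or None."""
    # Step 1: cheap index lookup (here: which polygons have a vertex in
    # this point's cell -- a crude stand-in for a proper cell covering).
    candidates = [name for name, poly in regions.items()
                  if cell(x, y, size) in {cell(px, py, size) for px, py in poly}]
    if len(candidates) == 1:
        return candidates[0]
    # Step 2: exact geometry, only over the (small) ambiguous pool.
    pool = candidates if candidates else list(regions)
    for name in pool:
        if point_in_polygon(x, y, regions[name]):
            return name
    return None
```

The payoff mirrors the talk's framing: the index resolves the bulk of points for the cost of a hash lookup, leaving the expensive geometric test to a small minority.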
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: HEALTH AND LIFE SCIENCES
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Hinge Health faced challenges in hiring global data teams due to the complexities of PHI (Protected Health Information) governance. Unable to hire without compliant data sharing and unable to scale PHI governance without a larger team, we overcame this chicken-and-egg challenge by adopting Unity Catalog's fine-grained access control. In this session, we will share our journey migrating to Unity Catalog, securing PHI with row filters/column masks, lessons learned and how our efforts surpassed our own expectations. This session equips data teams with strategies for HIPAA compliance without compromising flexibility and collaboration. Hinge Health is the leading digital MSK clinic, serving 11M+ members and 500+ employer health plans offering virtual physical therapy to reduce pain, surgeries and opioid use.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: MEDIA AND ENTERTAINMENT
Technologies: AI/BI, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Discover how leading ad techs and agencies — including Magnite, LG Ads, MiQ and Publicis Influential — leverage Databricks to power the advertising and marketing ecosystem.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: MEDIA AND ENTERTAINMENT
Technologies: MLFLOW, DATABRICKS SQL, MOSAIC AI
Skill Level: BEGINNER
Duration: 40 MIN
Discover how Adobe is redefining its Data Supply Chain through an AI-first, agentic solution that transforms the entire Data Development and Delivery Lifecycle (DDLC). This next-generation engineering workbench empowers data engineers, analysts, and practitioners with intelligent automation, context-aware assistance, and seamless collaboration to accelerate and streamline every phase of the data supply chain — from capturing business intent and sourcing data to building pipelines, validating quality, and delivering trusted, actionable insights.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: RETAIL AND CPG - FOOD
Technologies: APACHE SPARK, DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: MLFLOW, AI/BI, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
No description available.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: APACHE SPARK, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this session, you’ll gain actionable insights to modernize your security operations and strengthen cyber resilience. Arctic Wolf will highlight how they eliminated data silos and enhanced their MDR pipeline to investigate suspicious threat actors for customers using Databricks.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MANUFACTURING, PUBLIC SECTOR
Technologies: DATABRICKS SQL, DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Blue Origin is revolutionizing space exploration with a mission-critical data strategy powered by Databricks on AWS GovCloud. Learn how they leverage Databricks to meet ITAR and FedRAMP High compliance, streamline manufacturing and accelerate their vision of a 24/7 factory. Key use cases include predictive maintenance, real-time IoT insights and AI-driven tools that transform CAD designs into factory instructions. Discover how Delta Lake, Structured Streaming and advanced Databricks functionalities like Unity Catalog enable real-time analytics and future-ready infrastructure, helping Blue Origin stay ahead in the race to adopt generative AI and serverless solutions.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: MANUFACTURING
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
We will explore how leveraging Databricks' Unity Catalog has accelerated our FinOps maturity, enabling us to optimize platform utilization and achieve significant cost reductions. By implementing Unity Catalog, we've gained comprehensive visibility and governance over our data assets, leading to more informed decision-making and efficient resource allocation. Learn how Corning discovered actionable insights and applied best practices for using Unity Catalog to streamline data management and enhance financial operations, and how you can drive substantial savings within your own organization.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Learn how Danone, a global leader in the food industry, improved its data-sharing processes using Delta Sharing, an open protocol developed by Databricks. This session will explore how Danone migrated from a traditional hub-and-spoke model to a more efficient and scalable data-sharing approach that works seamlessly across regions and platforms. We’ll discuss practical concepts such as in-region and cross-region data sharing, fine-grained access control, data discovery, and the implementation of data contracts. You’ll also hear about the strategies Danone uses to deliver governed data efficiently while maintaining compliance with global regulations. Additionally, we’ll discuss a cost comparison between direct data access and replication. Finally, we’ll share insights into the challenges faced by global organizations in managing data sharing at scale and how Danone addressed these issues. Attendees will gain practical knowledge on building a reliable and secure data-sharing framework for international collaboration.
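The cost comparison between direct data access and replication mentioned above can be framed with simple arithmetic. A back-of-the-envelope sketch (every rate and volume below is an illustrative placeholder, not a Danone figure):

```python
# Back-of-the-envelope monthly cost model for sharing a dataset either by
# replicating it into each consumer region or by direct cross-region reads
# via Delta Sharing. All rates and volumes are illustrative placeholders.

def replication_cost(tb_stored, regions, storage_per_tb=23.0, copy_per_tb=20.0):
    # Pay to store a full copy per region, plus transfer to refresh each copy
    return regions * (tb_stored * storage_per_tb + tb_stored * copy_per_tb)

def direct_access_cost(tb_read, egress_per_tb=20.0):
    # Pay egress only for the data consumers actually read
    return tb_read * egress_per_tb

# 10 TB dataset, 3 consumer regions, but consumers only read 2 TB/month
rep = replication_cost(10, 3)
direct = direct_access_cost(2)
print(rep, direct)  # direct access wins when read volume << dataset size
```

The crossover point depends on how much of the dataset consumers actually scan each month, which is why the session's in-region versus cross-region distinction matters.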
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: HEALTH AND LIFE SCIENCES
Technologies: DATA MARKETPLACE, DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, DATABRICKS SQL, DLT
Skill Level: INTERMEDIATE
Duration: 40 MIN
As cybersecurity threats grow in volume and complexity, organizations must efficiently process security telemetry for best-in-class detection and mitigation. Barracuda’s XDR platform is redefining security operations by layering advanced detection methodologies over a broad range of supported technologies. Our vision is to deliver unparalleled protection through automation, machine learning and scalable detection frameworks, ensuring threats are identified and mitigated quickly. To achieve this, we have adopted Databricks as the foundation of our security analytics platform, providing greater control and flexibility while decoupling from traditional SIEM tools. By leveraging Lakeflow Declarative Pipelines, Spark Structured Streaming and detection-as-code CI/CD pipelines, we have built a real-time detection engine that enhances scalability, accuracy and cost efficiency. This session explores how Databricks is shaping the future of XDR through real-time analytics and cloud-native security.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: RETAIL AND CPG - FOOD
Technologies: DATABRICKS SQL, DATABRICKS WORKFLOWS, DLT
Skill Level: INTERMEDIATE
Duration: 40 MIN
Feastables, founded by YouTube sensation MrBeast, partnered with Engine to build a modern, AI-enabled BI ecosystem that transforms complex, disparate data into actionable insights, driving smarter decision-making across the organization. In this session, learn how Engine, a Built-On Databricks Partner, brought expertise combined with strategic partnerships that enabled Feastables to rapidly stand up a secure, modern data estate to unify complex internal and external data sources into a single, permissioned analytics platform. Feastables unlocked the power of cross-functional collaboration by democratizing data access throughout their enterprise and seamlessly integrating financial, retailer, supply chain, syndicated, merchandising and e-commerce data. Discover how a scalable analytics framework combined with advanced AI models and tools empower teams with Smarter BI across sales, marketing, supply chain, finance and executive leadership to enable real-time decision-making at scale.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
FedEx, a global leader in transportation and logistics, faced a common challenge in the era of big data: how to democratize data and foster data-driven decision making, with thousands of data practitioners at FedEx wanting to build models, get real-time insights, explore enterprise data and build enterprise-grade solutions to run the business. This breakout session will highlight how FedEx overcame challenges in data governance and security using Unity Catalog, ensuring that sensitive information remains protected while still allowing appropriate access across the organization. We'll share their approach to building intuitive self-service interfaces, including the use of natural-language processing to enable non-technical users to query data effortlessly. The tangible outcomes of this initiative are numerous, but chiefly: increased data literacy across the company, faster time-to-insight for business decisions, and significant cost savings through improved operational efficiency.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This talk takes you through the Nationwide Security and Infrastructure data team's journey of migrating from the Hive Metastore (HMS) to Unity Catalog (UC). Discover how HMS federation simplified our transition to UC, allowing for an incremental migration that minimized disruption to data consumers while optimizing our data layout. We’ll share the key technical decisions, challenges faced and lessons learned along the way. The migration process wasn’t without its hurdles, so we’ll walk you through our detailed, step-by-step approach covering planning, execution and validation. We will also showcase the benefits realized, such as improved data governance, more efficient data access and enhanced operational performance. Join us to gain practical insights into executing complex data migrations with a focus on security, flexibility and long-term scalability.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: MANUFACTURING
Technologies: AI/BI, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
HP’s 3D Print division empowers manufacturers with telemetry data to optimize operations and streamline maintenance. Using Delta Sharing, Unity Catalog and AI/BI dashboards, HP provides a secure, scalable solution for data sharing and analytics. Delta Sharing D2O enables seamless data access, even for customers not on Databricks. Apigee masks private URLs, and Unity Catalog enhances security by managing data assets. Predictive maintenance with Mosaic AI boosts uptime by identifying issues early and alerting support teams. Custom dashboards and sample code let customers run analytics using any supported client, while Apigee simplifies access by abstracting complexity. Insights from AI/BI dashboards help HP refine its data strategy, aligning solutions with customer needs despite the complexity of diverse technologies, fragmented systems and customer-specific requirements. This fosters trust, drives innovation, and strengthens HP as a trusted partner for scalable, secure data solutions.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: FINANCIAL SERVICES
Technologies: DATABRICKS SQL, LAKEFLOW, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Navy Federal Credit Union has 200+ enterprise data sources in its enterprise data lake. These data assets are used for training 100+ machine learning models and hydrating a semantic layer that serves an average of 4,000 business users daily across the credit union. The only option for extracting data from the analytic semantic layer was to allow consuming applications to access it via an already-overloaded cloud data warehouse. Visualizing data lineage for 1,000+ data pipelines and their associated metadata was impossible, and understanding the granular cost of running data pipelines was a challenge. Implementing Unity Catalog opened an alternate path for accessing analytic semantic data from the lake. It also opened the door to removing duplicate data assets stored across multiple lakes, which will save hundreds of thousands of dollars in data engineering effort, compute and storage costs.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: APACHE SPARK, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
At Nubank, we successfully migrated to Unity Catalog, addressing the needs of our large-scale data environment with 3,000 active users, over 4,000 notebooks and jobs and 1.1 million tables, including sensitive PII data. Our primary objectives were to enhance data governance, security and user experience. This migration significantly improved our data governance capabilities, enhanced security measures and provided a more user-friendly experience for our large user base, ultimately leading to better control and utilization of our vast data resources.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: FINANCIAL SERVICES
Technologies: DATABRICKS SQL, UNITY CATALOG, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 20 MIN
Databricks’ Serverless compute streamlines infrastructure setup and management, delivering unparalleled performance and cost optimization for Data and BI workflows. In this presentation, we will explore how Nationwide is leveraging Databricks’ serverless technology and unified governance through Unity Catalog to build scalable, world-class BI solutions. Key features like AI/BI Dashboards, Genie, Materialized Views, Lakehouse Federation and Lakehouse Apps, all powered by serverless, have empowered business teams to deliver faster, scalable and smarter insights. We will show how Databricks’ serverless technology is enabling Nationwide to unlock new levels of efficiency and business impact, and how other organizations can adopt serverless technology to realize similar benefits.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Deploying AI in production is getting more complex — with different model types, tighter timelines, and growing infrastructure demands. In this session, we’ll walk through how Mosaic AI Model Serving helps teams deploy and scale both traditional ML and generative AI models efficiently, with built-in monitoring and governance. We’ll also hear from Skyscanner on how they’ve integrated AI into their products, scaled to 100+ production endpoints, and built the processes and team structures to support AI at scale.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT
Technologies: MLFLOW, DATABRICKS WORKFLOWS, DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
Don't miss this session where we demonstrate how the Texas Rangers baseball team is staying one step ahead of the competition by going back to the basics. After the Rangers implemented a modern data strategy with Databricks and won the 2023 World Series, the rest of the league quickly followed suit. Now more than ever, data and AI are a central pillar of every baseball team's strategy, driving profound insights into player performance and game dynamics. With a 'fundamentals win games' back-to-basics focus, join us as we explain our commitment to world-class data quality, engineering and MLOps by taking full advantage of the Databricks Data Intelligence Platform. From system tables to federated querying, find out how the Rangers use every tool at their disposal to stay one step ahead in the hypercompetitive world of baseball.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Building an open data lakehouse? Start with the right blueprint. This session walks through common reference architectures for interoperable lakehouse deployments across AWS, Google Cloud, Azure and tools like Snowflake, BigQuery and Microsoft Fabric. Learn how to design for cross-platform data access, unify governance with Unity Catalog and ensure your stack is future-ready — no matter where your data lives.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENERGY AND UTILITIES, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: DATABRICKS SQL, DELTA SHARING, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Unlock the full potential of your BI tools with Databricks. This session explores how features like Photon, Databricks SQL, Liquid Clustering, AI/BI Genie and Publish to Power BI enhance performance, scalability and user experience. Learn how Databricks accelerates query performance, optimizes data layouts and integrates seamlessly with BI tools. Gain actionable insights and best practices to improve analytics efficiency, reduce latency and drive better decision-making. Whether migrating from a data warehouse or optimizing an existing setup, this talk provides the strategies to elevate your BI capabilities.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
Migrating your legacy Oracle data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. In this session, learn the top strategies for completing this data migration. We will cover data type conversion, basic to complex code conversions, and validation and reconciliation best practices. Discover the pros and cons of exporting CSV files and loading them with PySpark versus building pipelines that write directly to Databricks tables. See before-and-after architectures of customers who have migrated, and learn about the benefits they realized.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
Migrating your Snowflake data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. Though a cloud platform-to-cloud platform migration should be relatively easy, the breadth of the Databricks Platform provides flexibility and hence requires careful planning and execution. In this session, we present the migration methodology, technical approaches, automation tools, product/feature mapping, a technical demo and best practices using real-world case studies for migrating data, ELT pipelines and warehouses from Snowflake to Databricks.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
Storage and processing costs of your legacy Teradata data warehouse impact your ability to deliver. Migrating your legacy Teradata data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. In this session, learn the top strategies for completing this data migration. We will cover data type conversion, basic to complex code conversions, and validation and reconciliation best practices, as well as how to use Databricks natively hosted LLMs to assist with migration activities. See before-and-after architectures of customers who have migrated, and learn about the benefits they realized.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: TRAVEL AND HOSPITALITY
Technologies: DATABRICKS WORKFLOWS, PARTNER CONNECT, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Discover how United Airlines, in collaboration with Databricks and Impetus Technologies, has built a next-generation data intelligence platform leveraging System Wide Information Management (SWIM) to deliver mission-critical, real-time insights for flight disruption prediction, situational analysis, and smarter, faster decision-making. In this session, United Airlines experts will share how their Databricks-based SWIM architecture enables near real-time operational awareness, enhances responsiveness during irregular operations (IRROPs), and drives proactive actions to minimize disruptions. They will also discuss how United efficiently processes and manages the large volume and variety of SWIM data, ensuring seamless integration and actionable intelligence across their operations.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this talk, we will discuss the lessons learned and future vision of transforming two business units to a modern financial data platform at Nasdaq. We'll highlight the transition from disjointed systems to a unified platform using Databricks. Our target audience includes financial engineers, data architects and technical leaders. The agenda covers challenges of legacy systems, reasons for choosing Databricks and key architectural decisions.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
AI/BI Genie has transformed self-service analytics for the Databricks Marketing team. This user-friendly conversational AI tool empowers marketers to perform advanced data analysis using natural language — no SQL required. By reducing reliance on data teams, Genie increases productivity and enables faster, data-driven decisions across the organization. But realizing Genie’s full potential takes more than just turning it on. In this session, we’ll share the end-to-end journey of implementing Genie for over 200 marketing users, including lessons learned, best practices and the real business impact of this Databricks-on-Databricks solution. Learn how Genie democratizes data access, enhances insight generation and streamlines decision-making at scale.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: MANUFACTURING
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
HP Print's data platform team took on a migration from a monolithic, shared AWS Redshift deployment to a modular and scalable data ecosystem on the Databricks lakehouse. The result was 30–40% cost savings, scalable and isolated resources for different data consumers and ETL workloads, and performance optimization for a variety of query types. Along the way, there were technical challenges and learnings relating to ETL migrations with dbt and to new Databricks features like Liquid Clustering, predictive optimization, Photon and SQL serverless warehouses, as well as managing multiple teams on Unity Catalog. This presentation dives into both the business and technical sides of this migration. Come along as we share our key takeaways from this journey.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, TRAVEL AND HOSPITALITY
Technologies: APACHE SPARK, APACHE ICEBERG
Skill Level: INTERMEDIATE
Duration: 40 MIN
The Apache Iceberg™ community is introducing native geospatial type support, addressing key challenges in managing geospatial data at scale, including fragmented formats and inefficiencies in storing large spatial datasets. This talk will delve into the origins of the Iceberg geo type, its specification design and future goals. We will examine the impact on both the geospatial and Iceberg communities, in introducing a standard data warehouse storage layer to the geospatial community, and enabling optimized geospatial analytics for Iceberg users. We will also present a live demonstration of the Iceberg geo data type with Apache Sedona™ and Apache Spark™, showcasing how it simplifies and accelerates geospatial analytics workflows and queries. Finally, we will also provide an in-depth look at its current capabilities and outline the roadmap for future developments, and offer a perspective on its role in advancing geospatial data management in the industry.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: HEALTH AND LIFE SCIENCES
Technologies: AI/BI, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
The convergence of cloud, data and AI is revolutionizing the pharmaceutical industry, creating a powerful ecosystem that drives innovation at scale across the entire value chain. At Gilead, teams harness these technologies on a unified cloud, data and AI platform, accelerating business processes in the pre-clinical and clinical stages, enabling smarter manufacturing and commercial processes, and delivering AI initiatives by reusing data products. Gilead will discuss how they have leveraged AWS, Databricks and Data Mesh to manage vast amounts of heterogeneous data. They will also showcase use cases of traditional AI/ML and generative AI, and a marketplace approach to driving adoption of AI agents, demonstrating how this cloud-based, AI-powered platform is transforming the entire value chain. Gilead will also discuss how they are exploring the future of pharmaceutical innovation through agentic AI, where the synergy of cloud, data and AI is unlocking new possibilities for a healthier world. In the second part, Muddu Sudhakar, founder and investor, will discuss how organizations can build and buy solutions for AI and agents with data platforms. AWS and Databricks provide industry-leading platforms for building agentic AI solutions. We will also cover the agentic AI platform, agent orchestration, agent interoperability, agent guardrails and agentic workflows, along with the challenges of deploying and managing agentic AI platforms. Enterprises need impactful AI initiatives and agents to realize the promise and vision of AI and drive significant ROI.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us on a technical journey into GreenOps at ABN AMRO Bank using Databricks system tables. We'll explore security, implementation challenges and best-practice verification, with practical examples and actionable reports. Discover how to optimize resource usage, ensure compliance and maintain agility. We'll discuss best practices, potential pitfalls and the nuanced 'it depends' scenarios, offering a comprehensive guide for intermediate to advanced practitioners.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: APACHE SPARK, AI/BI, LLAMA
Skill Level: INTERMEDIATE
Duration: 40 MIN
A big challenge in LLM development and synthetic data generation is ensuring data quality and diversity. While data incorporating varied perspectives and reasoning traces consistently improves model performance, procuring such data remains impossible for most enterprises. Human-annotated data struggles to scale, while purely LLM-based generation often suffers from distribution clipping and low entropy. In a novel compound AI approach, we combine LLMs with probabilistic graphical models and other tools to generate synthetic personas grounded in real demographic statistics. The approach allows us to address major limitations in bias, licensing, and persona skew of existing methods. We release the first open-source dataset aligned with real-world distributions and show how enterprises can leverage it with Gretel Data Designer (now part of NVIDIA) to bring diversity and quality to model training on the Databricks platform, all while addressing model collapse and data provenance concerns head-on.
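The idea of grounding synthetic personas in real demographic statistics can be sketched with a tiny marginal-distribution sampler. The fields and probabilities below are invented placeholders, not Gretel's model, which combines LLMs with probabilistic graphical models:

```python
import random

# Toy persona sampler grounded in (made-up) demographic marginals.
# A real system would sample from joint distributions fit to census-style
# data, then use an LLM to flesh each persona out into text.

AGE_BANDS = [("18-34", 0.3), ("35-54", 0.4), ("55+", 0.3)]
REGIONS = [("urban", 0.6), ("rural", 0.4)]

def sample(marginal, rng):
    # Inverse-CDF sampling over a discrete marginal distribution
    r, acc = rng.random(), 0.0
    for value, p in marginal:
        acc += p
        if r < acc:
            return value
    return marginal[-1][0]

def sample_personas(n, seed=0):
    rng = random.Random(seed)
    return [{"age": sample(AGE_BANDS, rng), "region": sample(REGIONS, rng)}
            for _ in range(n)]

personas = sample_personas(1000)
urban_share = sum(p["region"] == "urban" for p in personas) / len(personas)
print(round(urban_share, 2))  # tracks the 0.6 urban marginal
```

Because every persona is drawn from stated distributions, the generated population's statistics can be audited against the real-world marginals, which is the provenance property the session emphasizes.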
Type: LIGHTNING TALK
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
To scale Databricks SQL to 2,000 users efficiently and cost-effectively, we adopted serverless, ensuring dynamic scalability and resource optimization. During peak times, resources scale up automatically; during low demand, they scale down, preventing waste. Additionally, we implemented a strong content governance model. We created continuous monitoring to assess query and dashboard performance, notifying users about adjustments and ensuring only relevant content remains active. If a query exceeds time or impact limits, access is reviewed and, if necessary, deactivated. This approach brought greater efficiency, cost reduction and an improved user experience, keeping the platform well-organized and high-performing.
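The continuous monitoring described above boils down to checking each query against agreed thresholds and flagging violators for review. A minimal sketch, assuming made-up thresholds and record fields rather than actual Databricks system-table columns:

```python
# Sketch of the content-governance check described above: flag queries that
# exceed runtime or cost thresholds so their access can be reviewed.
# Thresholds and record fields are illustrative, not Databricks schema.

MAX_RUNTIME_S = 300       # hypothetical per-query runtime budget
MAX_MONTHLY_COST = 50.0   # hypothetical per-query monthly cost budget

def review_queries(queries):
    flagged = []
    for q in queries:
        if q["runtime_s"] > MAX_RUNTIME_S or q["monthly_cost"] > MAX_MONTHLY_COST:
            flagged.append(q["id"])
    return flagged

queries = [
    {"id": "daily_sales", "runtime_s": 42, "monthly_cost": 3.10},
    {"id": "full_scan", "runtime_s": 1200, "monthly_cost": 180.0},
]
print(review_queries(queries))  # ['full_scan']
```

In practice a job like this would run on a schedule, notify the owners of flagged content, and deactivate items that remain out of bounds after review.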
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, APACHE ICEBERG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Apache Iceberg is a popular table format for managing large analytical datasets. But replicating Iceberg tables at scale can be a daunting task — especially when dealing with its hierarchical metadata. In this talk, we present an end-to-end workflow for replicating Apache Iceberg tables, leveraging Apache Spark to ensure that backup tables remain identical to their source counterparts. More excitingly, we have contributed these libraries back to the open-source community. Attendees will gain a comprehensive understanding of how to set up replication workflows for Iceberg tables, as well as practical guidance on how to manage and maintain replicated datasets at scale. This talk is ideal for data engineers, platform architects and practitioners looking to apply replication and disaster recovery for Apache Iceberg in complex data ecosystems.
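The hierarchical metadata that makes Iceberg replication tricky runs table metadata, then snapshots, then manifests, then data files, and a replica is only identical when every level is copied consistently. A simplified sketch of that walk in pure Python, with dicts standing in for metadata files (this is an illustration, not the contributed libraries):

```python
# Simplified model of replicating an Iceberg table's metadata hierarchy:
# table metadata -> snapshots -> manifests -> data files. Real replication
# copies files between object stores; here a dict stands in for each file.

def replicate(table):
    # Walk the hierarchy top-down, copying each level so the replica never
    # shares mutable state with the source
    replica_snapshots = []
    for snap in table["snapshots"]:
        replica_manifests = []
        for manifest in snap["manifests"]:
            replica_manifests.append({
                "path": manifest["path"],
                "data_files": list(manifest["data_files"]),
            })
        replica_snapshots.append({
            "snapshot_id": snap["snapshot_id"],
            "manifests": replica_manifests,
        })
    return {"name": table["name"], "snapshots": replica_snapshots}

source = {
    "name": "orders",
    "snapshots": [{
        "snapshot_id": 1,
        "manifests": [{"path": "m1.avro", "data_files": ["f1.parquet"]}],
    }],
}

backup = replicate(source)
print(backup == source)      # identical content...
print(backup is not source)  # ...held in an independent copy
```

The hard parts the talk covers sit on top of this shape: rewriting the paths embedded in each metadata file for the target store, and keeping the copy consistent while the source table continues to commit new snapshots.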
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: RETAIL AND CPG - FOOD
Technologies: DATABRICKS WORKFLOWS, DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Retail data is expanding at an unprecedented rate, demanding a scalable, cost-efficient and near real-time architecture. At Unilever, we transformed our data management approach by leveraging Databricks Lakeflow Declarative Pipelines, achieving approximately $500K in cost savings while accelerating computation speeds by 200–500%. By adopting a streaming-driven architecture, we built a system where data flows continuously across processing layers, enabling real-time updates with minimal latency. Lakeflow Declarative Pipelines' serverless simplicity replaced complex dependency management, reducing maintenance overhead and improving pipeline reliability. Lakeflow Declarative Pipelines Direct Publishing further enhanced data segmentation, concurrency and governance, ensuring efficient and scalable data operations while simplifying workflows. This transformation empowers Unilever to manage data with greater efficiency, scalability and reduced costs, creating a future-ready infrastructure that evolves with the needs of our retail partners and customers.
Type: LIGHTNING TALK
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT
Technologies: DELTA LAKE, DATA MARKETPLACE
Skill Level: INTERMEDIATE
Duration: 20 MIN
No description available.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 40 MIN
In the modern business landscape, AI and data strategies can no longer operate in isolation. To drive meaningful outcomes, organizations must align these critical components within a unified framework tied to overarching business objectives. This presentation explores the necessity of integrating AI and data strategies, emphasizing the importance of high-quality data, scalable architectures and robust governance. Attendees will learn three essential steps that need to be taken. Additionally, the talk will highlight the cultural shift required to bridge IT and business silos, fostering roles that combine technical and business expertise. We’ll dive into specific practical steps that can be taken to ensure an organization has a cohesive and blended AI and data strategy, using specific case examples.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, DATABRICKS SQL, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Most enterprise data is trapped in unstructured formats — documents, PDFs, scanned images and tables — making it difficult to access, analyze and use. This session shows how to unlock that hidden value by building intelligent document processing workflows on the Databricks Data Intelligence Platform. You’ll learn how to ingest unstructured content using Lakeflow Connect, extract structured data with AI Parse — even from complex tables and scanned documents — and apply analytics or AI to this newly structured data.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: BEGINNER
Duration: 40 MIN
Ever struggled with getting AI to work effectively for your specific domain needs? Join us to discover how establishing the right data foundation transforms the way you build and deploy specialized AI agents. This session demonstrates how to prepare and structure your information assets to enable more powerful, efficient AI applications. Learn proven approaches for extracting value from both structured and unstructured data, creating knowledge bases that serve as the backbone for domain-specific agents, and implementing optimization techniques that balance quality and resource constraints.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS WORKFLOWS, DLT, LAKEFLOW
Skill Level: BEGINNER
Duration: 40 MIN
Join us to explore Lakeflow, Databricks' end-to-end solution for simplifying and unifying the most complex data engineering workflows. This session builds on keynote announcements, offering an accessible introduction for newcomers while emphasizing the transformative value Lakeflow delivers. Discover how Lakeflow equips data teams with a seamless experience for ingestion, transformation and orchestration, reducing complexity and driving productivity. By unifying these capabilities, Lakeflow lays the groundwork for scalable, reliable and efficient data pipelines in a governed and high-performing environment.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: MANUFACTURING, MEDIA AND ENTERTAINMENT, FINANCIAL SERVICES
Technologies: APACHE SPARK
Skill Level: ADVANCED
Duration: 20 MIN
This presentation will review the new change feed and snapshot capabilities in Apache Spark™ Structured Streaming’s State Reader API. The State Reader API enables users to access and analyze Structured Streaming's internal state data. Attendees will learn how to leverage the new features to debug, troubleshoot and analyze state changes efficiently, making streaming workloads easier to manage at scale.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. If you are brand new to Databricks SQL and want a lightning tour of this intelligent data warehouse, this session is for you. Learn about the architecture of Databricks SQL. Then we’ll show how simple, streamlined interfaces make it easier for analysts, developers, admins and business users to get their jobs done and their questions answered. We’ll show how easy it is to create a warehouse, get data, transform it and build queries and dashboards. By the end of the session, you’ll be able to build a Databricks SQL warehouse in 5 minutes.
Type: DEEP DIVE
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENERGY AND UTILITIES, MANUFACTURING, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, APACHE ICEBERG
Skill Level: BEGINNER
Duration: 90 MIN
In this session, learn why modern open table formats like Delta Lake and Apache Iceberg are a big deal and how they work with catalogs. We’ll cover what motivated their creation, how they work under the hood and what benefits they can bring to your data and AI platform. Hear how these formats are becoming increasingly interoperable and what our vision is for their future.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Today’s organizations need faster, more reliable insights — but metric sprawl and inconsistent KPIs make that difficult. In this session, you’ll learn how Unity Catalog Metrics helps unify business semantics across your organization. Define your KPIs once, apply enterprise-grade governance with fine-grained access controls, auditing and lineage, and use them across any Databricks tool — from AI/BI Dashboards and Genie to notebooks and Lakeflow. You’ll learn how to eliminate metric chaos by centrally defining and governing metrics with Unity Catalog. You’ll walk away with strategies to boost trust through built-in governance and empower every team — regardless of technical skill — to work from the same certified metrics.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: FINANCIAL SERVICES
Technologies: DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Intuit leverages Databricks Clean Rooms to create a secure, privacy-safe lending marketplace, enabling small business lending partners to perform analytics and deploy ML/AI workflows on sensitive data assets. This session explores the technical foundations of building isolated clean rooms across multiple partners and cloud providers, differentiating Databricks Clean Rooms from market alternatives. We'll demonstrate our automated approach to clean room lifecycle management using APIs, covering creation, collaborator onboarding, data asset sharing, workflow orchestration and activity auditing. The integration with Unity Catalog for managing clean room inputs and outputs will also be discussed. Attendees will gain insights into harnessing collaborative ML/AI potential, supporting various languages and workloads, and enabling complex computations without compromising sensitive information in Clean Rooms.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: HEALTH AND LIFE SCIENCES
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This presentation will explore the transformation of IQVIA's decade-old patient support platform through the implementation of Databricks Data Intelligence Platform. Facing scalability challenges, performance bottlenecks and rising costs, the existing platform required significant redesign to handle growing data volumes and complex analytics. Key issues included static metrics limiting workflow optimization, fragmented data governance and heightened compliance and security demands. By partnering with Customertimes (a Databricks Partner) and adopting Databricks' centralized, scalable analytics solution with enhanced self-service capabilities, IQVIA achieved improved query performance, cost efficiency and robust governance, ensuring operational effectiveness and regulatory compliance in an increasingly complex environment.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: HEALTH AND LIFE SCIENCES, PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: DATABRICKS SQL, DATABRICKS WORKFLOWS, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Your data and AI use cases are multiplying. At the same time, there is increased focus and scrutiny on meeting sophisticated security and regulatory requirements. IQVIA runs serverless workloads across data engineering, data analytics, and ML and AI to empower their customers to make informed decisions, support their R&D processes and improve patient outcomes. By leveraging native controls on the platform, serverless enables them to streamline their use cases while maintaining a strong security posture, top performance and optimized costs. This session will cover IQVIA’s journey to serverless, how they met their security and regulatory requirements, and the latest and upcoming enhancements to the Databricks Platform.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENERGY AND UTILITIES
Technologies: DATABRICKS SQL, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
At Italgas, Europe’s leading gas distributor by both network size and number of customers, we are spearheading digital transformation through a state-of-the-art, fully fledged Databricks intelligence platform. The future of gas distribution is data-driven: predictive maintenance, automated operations and real-time decision-making are now realities. Our AI Factory isn't just digitizing infrastructure — it's creating a more responsive, efficient and sustainable gas network that anticipates needs before they arise.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
At OpenAI, Kafka fuels real-time data streaming at massive scale, but traditional consumers struggle under the burden of partition management, offset tracking, error handling, retries, Dead Letter Queues (DLQ), and dynamic scaling — all while racing to maintain ultra-high throughput. As deployments scale, complexity multiplies. Enter Kafka Forwarder — a game-changing Kafka Consumer Proxy that flips the script on traditional Kafka consumption. By offloading client-side complexity and pushing messages to consumers, it ensures at-least-once delivery, automated retries, and seamless DLQ management via Databricks. The result? Scalable, reliable and effortless Kafka consumption that lets teams focus on what truly matters. Curious how OpenAI simplified self-service, high-scale Kafka consumption? Join us as we walk through the motivation, architecture and challenges behind Kafka Forwarder, and share how we structured the pipeline to seamlessly route DLQ data into Databricks for analysis.
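The push-based pattern described above (retry on failure, dead-letter on exhaustion, at-least-once delivery) can be sketched in a few lines. This is a conceptual illustration, not OpenAI's actual Kafka Forwarder; the function name, message shape and retry policy are invented for this example:

```python
def forward(messages, handler, max_retries=3):
    """Push each message to `handler`; retry failures, then dead-letter.

    Returns (delivered, dlq). A message only lands in the DLQ after
    exhausting retries, so delivery is at-least-once: a handler that fails
    after side effects may see the same message again.
    """
    delivered, dlq = [], []
    for msg in messages:
        for attempt in range(1, max_retries + 1):
            try:
                handler(msg)          # push to the downstream consumer
                delivered.append(msg)
                break
            except Exception:
                if attempt == max_retries:
                    dlq.append(msg)   # exhausted retries; keep for analysis
    return delivered, dlq
```

The key design point is that offset tracking, retries and DLQ routing all live in the proxy, so consumer teams only write the `handler`.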
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, DELTA LAKE
Skill Level: INTERMEDIATE
Duration: 20 MIN
Delta Lake is redesigning its Spark connector through the combination of three key technologies: First, we're updating our Spark APIs to DSv2 to achieve deeper catalog integration and improved integration with the Spark optimizer. Second, we're fully integrating on top of Delta Kernel to take advantage of its simplified abstraction of Delta protocol complexities, accelerating feature adoption and improving maintainability. Third, we are transforming Delta to become a catalog-aware lakehouse format with Catalog Commits, enabling more efficient metadata management, governance and query performance. Join us to explore how we're advancing Delta Lake's architecture, pushing the boundaries of metadata management and creating a more intelligent, performant data lakehouse platform.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT, PROFESSIONAL SERVICES
Technologies: DATABRICKS WORKFLOWS, MOSAIC AI, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 20 MIN
In an era where cloud costs can spiral out of control, Sportsbet achieved a remarkable 49% reduction in Total Cost of Ownership (TCO) through an innovative AI-powered solution called 'Kill Bill.' This presentation reveals how we transformed Databricks' consumption-based pricing model from a challenge into a strategic advantage through intelligent automation and optimization. Attendees will leave with a clear understanding of how to implement AI within Databricks solutions to address similar cost challenges in their environments.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS APPS
Skill Level: BEGINNER
Duration: 40 MIN
Lakebase is a new Postgres-compatible OLTP database designed to support intelligent applications. Lakebase eliminates custom ETL pipelines with built-in lakehouse table synchronization, supports sub-10ms latency for high-throughput workloads, and offers full Postgres compatibility, so you can build applications more quickly. In this session, you’ll learn how Lakebase enables faster development, production-level concurrency, and simpler operations for data engineers and application developers building modern, data-driven applications. We'll walk through key capabilities, example use cases, and how Lakebase simplifies infrastructure while unlocking new possibilities for AI and analytics.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 40 MIN
Lakeflow Connect streamlines the ingestion of incremental data from popular databases like SQL Server and PostgreSQL. In this session, we’ll review best practices for networking, security, minimizing database load, monitoring and more — tailored to common industry scenarios. Join us to gain practical insights into Lakeflow Connect's functionality so that you’re ready to build your own pipelines. Whether you're looking to optimize data ingestion or enhance your database integrations, this session will provide you with a deep understanding of how Lakeflow Connect works with databases.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: MANUFACTURING, TRAVEL AND HOSPITALITY
Technologies: LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 40 MIN
Lakeflow Connect enables you to easily and efficiently ingest data from enterprise applications like Salesforce, ServiceNow, Google Analytics, SharePoint, NetSuite, Dynamics 365 and more. In this session, we’ll dive deep on using connectors for the most popular SaaS applications, reviewing common use cases such as analyzing consumer behavior, predicting churn and centralizing HR analytics. You'll also hear from an early customer about how Lakeflow Connect helped unify their customer data to drive an improved automotive experience. We’ll wrap up with a Q&A so you have the opportunity to learn from our experts.
Type: DEEP DIVE
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DLT, LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 90 MIN
Auto Loader is the definitive tool for ingesting data from cloud storage into your lakehouse. In this session, we’ll unveil new features and best practices that simplify every aspect of cloud storage ingestion. We’ll demo out-of-the-box observability for pipeline health and data quality, walk through improvements for schema management, introduce a series of new data formats and unveil recent strides in Auto Loader performance. Along the way, we’ll provide examples and best practices for optimizing cost and performance. Finally, we’ll introduce a preview of what’s coming next — including a REST API for pushing files directly to Delta, a UI for creating cloud storage pipelines and more. Join us to help shape the future of file ingestion on Databricks.
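At its core, the incremental file discovery Auto Loader provides comes down to checkpointing which files have already been seen, so reruns pick up exactly the new arrivals. A pure-Python sketch of that idea (not Auto Loader's actual implementation, which persists this state in the streaming checkpoint):

```python
def discover_new_files(listing, ledger):
    """Return only files not yet recorded in `ledger`, then record them.

    `listing` is the current cloud-storage directory listing; `ledger`
    stands in for the persisted set of already-ingested files.
    """
    new = [f for f in sorted(listing) if f not in ledger]
    ledger.update(new)
    return new
```

Running discovery twice against the same listing yields nothing the second time, which is what makes restarts and backfills safe.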
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, PUBLIC SECTOR
Technologies: DLT, LAKEFLOW, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
In 2020, Delaware implemented a state-of-the-art, event-driven architecture for EFSA, enabling a highly decoupled system landscape, presented at Data + AI Summit 2021. By centrally brokering events in near real-time, consumer applications react instantly to events from producer applications as they occur, with producers decoupled from consumers via a publish/subscribe mechanism. Over the past years, however, we noticed some drawbacks: the processing of these custom events, aimed primarily at process integration, didn’t cover all edge cases; data quality was not always optimal due to missing events; and we needed complex logic to maintain SCD2 tables. Lakeflow Connect allows us to extract data directly from the source without the complex architecture in between, avoiding data loss and the resulting data quality issues, and with some simple adjustments an SCD2 table is created automatically. Lakeflow Connect allows us to create more efficient and intelligent data provisioning.
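The SCD2 bookkeeping that is now automated looks roughly like this when written by hand (a simplified sketch; field names are invented for illustration):

```python
def scd2_apply(history, change, as_of):
    """Apply one change to a slowly-changing-dimension type 2 history.

    If the key's current (open-ended) row differs, close it by stamping
    valid_to, then append a new open-ended row; identical values are a no-op.
    """
    for row in history:
        if row["id"] == change["id"] and row["valid_to"] is None:
            if row["value"] == change["value"]:
                return history            # nothing changed
            row["valid_to"] = as_of       # close the current row
    history.append({"id": change["id"], "value": change["value"],
                    "valid_from": as_of, "valid_to": None})
    return history
```

The edge cases hinted at in the abstract (missed events, out-of-order arrivals, late closes) are exactly where hand-rolled versions of this logic get complicated.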
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY, MANUFACTURING
Technologies: APACHE SPARK, DLT
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. In this session, you will learn how to integrate Lakeflow Declarative Pipelines with external systems in order to ingest and send data virtually anywhere. Lakeflow Declarative Pipelines is most often used for ingestion and ETL into the lakehouse. New capabilities like the Sinks API and added support for Python Data Source and foreachBatch have opened up Lakeflow Declarative Pipelines to support almost any integration, including popular Apache Spark™ integrations like JDBC, Kafka, external and managed Delta tables, Azure Cosmos DB, MongoDB and more.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 40 MIN
Building robust, production-grade data pipelines goes beyond writing transformation logic — it requires rigorous testing, version control, automated CI/CD workflows and a clear separation between development and production. In this talk, we’ll demonstrate how Lakeflow, paired with Databricks Asset Bundles (DABs), enables Git-based workflows, automated deployments and comprehensive testing for data engineering projects. We’ll share best practices for unit testing, CI/CD automation, data quality monitoring and environment-specific configurations. Additionally, we’ll explore observability techniques and performance tuning to ensure your pipelines are scalable, maintainable and production-ready.
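As a small illustration of the unit-testing practice discussed (names invented, not from the session): keeping transformation logic in pure functions lets CI run tests without a cluster.

```python
def dedupe_latest(rows):
    """Keep only the latest row per id; `rows` are assumed sorted by
    updated_at ascending, so later rows overwrite earlier ones."""
    latest = {}
    for row in rows:
        latest[row["id"]] = row
    return list(latest.values())

def test_dedupe_latest():
    rows = [
        {"id": 1, "updated_at": 1, "status": "new"},
        {"id": 1, "updated_at": 2, "status": "shipped"},
        {"id": 2, "updated_at": 1, "status": "new"},
    ]
    out = dedupe_latest(rows)
    assert len(out) == 2
    assert out[0]["status"] == "shipped"
```

A CI workflow deploying via Databricks Asset Bundles would run tests like this on every commit before promoting the pipeline from development to production.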
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS WORKFLOWS, DLT, LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 40 MIN
Monitoring data pipelines is key to reliability at scale. In this session, we’ll dive into the observability experience in Lakeflow, Databricks’ unified data engineering solution — from intuitive UI monitoring to advanced event analysis, cost observability and custom dashboards. We’ll walk through the revamped UX for Lakeflow observability and help you unlock full visibility into your data workflows.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: PROFESSIONAL SERVICES, RETAIL AND CPG - FOOD
Technologies: AI/BI, UNITY CATALOG, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this presentation, we showcase Reckitt’s journey to develop and implement a state-of-the-art Gen AI platform, designed to transform enterprise operations starting with the marketing function. We will explore the unique technical challenges encountered and the innovative architectural solutions employed to overcome them. Attendees will gain insights into how cutting-edge Gen AI technologies were integrated to meet Reckitt’s specific needs. This session will not only highlight the transformative impacts on Reckitt’s marketing operations but also serve as a blueprint for AI-driven innovation in the Consumer Goods sector, demonstrating a successful model of partnership in technology and business transformation.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: APACHE SPARK, MOSAIC AI, PYTORCH
Skill Level: INTERMEDIATE
Duration: 40 MIN
If you're building AI applications, chances are you're solving a retrieval problem somewhere along the way. This is why vector databases are popular today. But if we zoom out from vector search alone, serving AI applications also requires handling key-value workloads like a traditional feature store, as well as analytical workloads to explore and visualize data. This means that building an AI application often requires multiple data stores — which means multiple data copies, manual syncing and extra infrastructure expense. LanceDB is the first system to support all of these workloads in one place. Powered by the Lance columnar format, LanceDB breaks open the impossible triangle of performance, scalability and cost for AI serving. Serving AI applications is different from previous waves of technology, and a new paradigm demands new tools.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
Discover how the latest innovations in Databricks AI/BI Dashboards and Genie are transforming self-service analytics. This session offers a high-level tour of new capabilities that empower business users to ask questions in natural language, generate insights faster and make smarter decisions. Whether you're a long-time Databricks user or just exploring what's possible with AI/BI, you'll walk away with a clear understanding of how these tools are evolving — and how to leverage them for greater business impact.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, AI/BI, DELTA SHARING
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
The Virtue Foundation uses cutting-edge AI techniques to optimize global healthcare delivery and save lives. With Unity Catalog as a foundation, they are using advanced GenAI with model serving, vector search and MLflow to radically change how they match volunteer health resources with the right locations and facilities. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: MLFLOW, DSPY, MOSAIC AI
Skill Level: BEGINNER
Duration: 20 MIN
Writing prompts for our GenAI applications is long, tedious and unmaintainable. A proper software development lifecycle requires proper testing and maintenance, something incredibly difficult to do on a block of text. Our current prompt engineering best practices have largely been manual trial and error, testing which of our prompts work well in certain situations. This process worsens as our prompts become more complex, adding multiple tasks and pieces of functionality within one long, singular prompt. Enter DSPy, your PROGRAMMATIC way of building GenAI applications. Learn how DSPy allows you to modularize your prompt into modules and enforce typing through signatures. Then, utilize state-of-the-art algorithms to optimize the prompts and weights against your evaluation datasets, just like machine learning! We will compare DSPy to a restaurant to help illustrate and demo DSPy’s capabilities. It's time to start programming, rather than prompting, again!
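To make the "signatures enforce typing" idea concrete without depending on an LLM, here is a plain-Python analogy. This is not the DSPy API itself (DSPy declares typed input/output fields on a `dspy.Signature` class), but it shows why typed, modular prompt pieces are testable where one long prompt string is not:

```python
from dataclasses import dataclass

@dataclass
class Summarize:
    """One prompt 'module' with a typed interface instead of free-form text."""
    document: str        # typed input field
    max_words: int = 50  # a knob an optimizer could tune against an eval set

def render(sig: Summarize) -> str:
    # Each module renders its own small prompt, so it can be unit-tested
    # and swapped independently of the rest of the pipeline.
    return f"Summarize in at most {sig.max_words} words:\n{sig.document}"
```

In DSPy proper, an optimizer would then search over instructions and few-shot examples for each module against your evaluation set, rather than you hand-editing one monolithic prompt.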
Type: BREAKOUT
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: APACHE SPARK, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Data is the backbone of modern decision-making, but centralizing it is only the tip of the iceberg. Entitlements, secure sharing and just-in-time availability are critical challenges to any large-scale platform. Join Goldman Sachs as we reveal how our Legend Lakehouse, coupled with Databricks, overcomes these hurdles to deliver high-quality, governed data at scale. By leveraging an open table format (Apache Iceberg) and open catalog format (Unity Catalog), we ensure platform interoperability and vendor neutrality. Databricks Unity Catalog then provides a robust entitlement system that aligns with our data contracts, ensuring consistent access control across producer and consumer workspaces. Finally, Legend functions, integrating with Databricks User Defined Functions (UDF), offer real-time data enrichment and secure transformations without exposing raw datasets. Discover how these components unite to streamline analytics, bolster governance and power innovation.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT
Technologies: APACHE SPARK, APACHE ICEBERG
Skill Level: BEGINNER
Duration: 40 MIN
Over the past three years, Netflix has built a catalog of 100+ mobile and cloud games across TV, mobile and web platforms. With both internal and external studios contributing to this diverse ecosystem, building a robust game analytics platform became crucial for gaining insights into player behavior, optimizing game performance and driving member engagement. In this talk, we’ll share our journey of building Netflix’s Game Analytics platform from the ground up. We’ll highlight key decisions around data strategy, such as whether to develop an in-house solution or adopt an external service. We’ll discuss the challenges of balancing developer autonomy with data integrity and the complexities of managing data contracts for custom game telemetry, with an emphasis on self-service analytics. Attendees will learn how the Games Data team navigated these challenges, the lessons learned and the trade-offs involved in building a multi-tenant data ecosystem that supports diverse stakeholders.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, PUBLIC SECTOR
Technologies: AI/BI, DSPY, LLAMA
Skill Level: BEGINNER
Duration: 40 MIN
Large Language Models (LLMs) excel at understanding messy, real-world data, but integrating them into production systems remains challenging. Prompts can be unruly to write, vary by model and be difficult to manage in the larger context of a pipeline. In this session, we'll demonstrate incorporating LLMs into a geospatial conflation pipeline using DSPy. We'll discuss how DSPy works under the covers and highlight the benefits it provides to pipeline creators and managers.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: RETAIL AND CPG - FOOD
Technologies: DATA MARKETPLACE, DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
Retailers and suppliers face persistent financial and technical challenges to data sharing — including expensive proprietary platforms, complex data integration hurdles, fragmented governance and more — which currently restrict seamless data exchange primarily to their largest trading partners. In this session, we’ll provide an in-depth explanation of Elevate, an industry alliance focused on building open source standards for data sharing and collaboration to drive greater efficiency across the entire ecosystem. This session will highlight proposed standards for data sharing, data models, business cases on the ROI and potential areas of innovation to democratize data sharing, drastically reduce costs, simplify integration processes and foster transparent, trusted collaboration. Learn about the Elevate industry data-sharing initiative and how your company can participate and help guide standards to improve data sharing with your key partners.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT
Technologies: DELTA LAKE, APACHE ICEBERG
Skill Level: BEGINNER
Duration: 40 MIN
Delta Lake is a fantastic technology for quickly querying massive data sets, but first you need those massive data sets! In this session we will dive into the cloud-native architecture Scribd has adopted to ingest data from AWS Aurora, SQS, Kinesis Data Firehose and more. By using off-the-shelf open source tools like kafka-delta-ingest, oxbow and Airbyte, Scribd has redefined its ingestion architecture to be more event-driven, reliable, and most importantly: cheaper. No jobs needed! Attendees will learn how to use third-party tools in concert with a Databricks and Unity Catalog environment to provide a highly efficient and available data platform. This architecture will be presented in the context of AWS but can be adapted for Azure, Google Cloud Platform or even on-premises environments.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT
Technologies: AI/BI, DATABRICKS WORKFLOWS, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
In the competitive gaming industry, understanding player behavior is key to delivering engaging experiences. Supercell, creators of Clash of Clans and Brawl Stars, faced challenges with fragmented data and limited visibility into user journeys. To address this, they partnered with Snowplow and Databricks to build a scalable, privacy-compliant data platform for real-time insights, leveraging Snowplow’s behavioral data collection and Databricks’ lakehouse architecture. This session explores Supercell’s data journey and AI-driven player engagement strategies.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
In the contemporary landscape of data management, organizations increasingly face the challenges of data segregation, governance and permission management, particularly when operating within complex structures such as holding companies with multiple subsidiaries. Unipol comprises seven subsidiary companies, each with a diverse array of workgroups, adding up to a large number of operational groups. This intricate organizational structure necessitates a meticulous approach to data management, particularly regarding the segregation of data and the assignment of precise read-and-write permissions tailored to each workgroup. The challenge lies in ensuring that sensitive data remains protected while enabling seamless access for authorized users. This talk demonstrates how Unity Catalog has emerged as a pivotal tool in the daily use of the data platform, offering a unified governance solution that supports data management across diverse AWS environments.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD
Technologies: APACHE SPARK, LLAMA, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Testing Spark jobs in local environments is often difficult due to the lack of suitable datasets, especially under tight timelines. This creates challenges when jobs work in development clusters but fail in production, or when they run locally but encounter issues in staging clusters due to inadequate documentation or checks. In this session, we’ll discuss how these challenges can be overcome by leveraging Generative AI to create custom synthetic datasets for local testing. By incorporating variations and sampling, a testing framework can be introduced to solve some of these challenges, allowing for the generation of realistic data to aid in performance and load testing. We’ll show how this approach helps identify performance bottlenecks early, optimize job performance and recognize scalability issues while keeping costs low. This methodology fosters better deployment practices and enhances the reliability of Spark jobs across environments.
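A minimal sketch of such a synthetic-data generator (the schema and distributions here are invented for illustration; in the approach described, a GenAI model would propose the schema and value patterns):

```python
import random

def synth_orders(n, seed=0, null_rate=0.1):
    """Generate synthetic order rows with skewed categories and controlled
    nulls, so local tests exercise realistic edge cases deterministically."""
    rng = random.Random(seed)  # fixed seed gives reproducible test data
    rows = []
    for i in range(n):
        rows.append({
            "order_id": i,
            "region": rng.choices(["US", "EU", "APAC"], weights=[6, 3, 1])[0],
            "amount": round(rng.lognormvariate(3, 1), 2),  # long-tailed amounts
            "coupon": None if rng.random() < null_rate else rng.choice(["A", "B"]),
        })
    return rows
```

Scaling `n` up supports the performance and load testing mentioned above, while knobs like `null_rate` let you deliberately inject the edge cases that usually only surface in production.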
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES
Technologies: MLFLOW, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this session, we will delve into the creation of an infrastructure, CI/CD processes and monitoring systems that facilitate the responsible and efficient deployment of Large Language Models (LLMs) at Intermountain Healthcare. Using the "AI Inventory Agents" project as a case study, we will showcase how an LLM Agent can assist in effort and impact estimates, as well as provide insights into various AI products, both custom-built and third-party hosted. This includes their responsible AI certification status, development status and monitoring status (lights on, performance, drift, etc.). Attendees will learn how to build and customize their own LLMOps infrastructure to ensure seamless deployment and monitoring of LLMs, adhering to responsible AI practices.
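For the drift monitoring mentioned above, one widely used signal is the Population Stability Index over binned score or feature distributions. A minimal sketch (the binning strategy and alert thresholds are deployment-specific assumptions, not details from the session):

```python
import math

def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two binned distributions (each a
    list of proportions summing to 1). A common rule of thumb: below 0.1 is
    stable, 0.1 to 0.2 is a moderate shift, above 0.2 warrants an alert."""
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected, actual))
```

A monitoring job would compute this per model per day against the baseline distribution captured at deployment time and surface breaches on the "drift" status light.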
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES
Technologies: AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join two energy industry leaders as they showcase groundbreaking applications of AI and data solutions in modern oil and gas operations. NOV demonstrates how their Generative AI pipeline revolutionized drilling mud report processing, automating the analysis of 300 reports daily with near-perfect accuracy and real-time analytics capabilities. BP shares how Unity Catalog has transformed their enterprise-wide data strategy, breaking down silos while maintaining robust governance and security. Together, these case studies illustrate how AI and advanced analytics are enabling cleaner, more efficient energy operations while maintaining the reliability demanded by today's market.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: MEDIA AND ENTERTAINMENT
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
As cheat developers evolve, so must detection techniques. This session will explore our methodologies, challenges and future directions, demonstrating how machine learning is transforming anti-cheat strategies and preserving competitive integrity in online gaming — and how Databricks is enabling us to do so. As online gaming grows, maintaining fair play is an ongoing challenge. Call of Duty, a highly competitive first-person action game, faces aimbot usage — cheats that enable near-perfect accuracy, undermining fair play. Additionally, traditional detection methods are increasingly less effective against advanced cheats that mimic human behavior. Machine learning presents a scalable and adaptive solution. We developed a data pipeline that collects features such as angle velocity and acceleration to train a deep neural network, which we then deployed. We are processing 30 million rows of data per hour for this detection on the Databricks Platform.
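The per-tick features named above are essentially finite differences over view-angle samples; a toy sketch (the sampling interval and feature set are illustrative, not the actual pipeline):

```python
def kinematics(angles, dt=1.0):
    """Compute angle velocity and acceleration from successive view-angle
    samples. Inhumanly high or discontinuous values in these series are
    the kind of signal an aimbot detector learns from."""
    velocity = [(b - a) / dt for a, b in zip(angles, angles[1:])]
    acceleration = [(b - a) / dt for a, b in zip(velocity, velocity[1:])]
    return velocity, acceleration
```

At production scale this computation runs over windows of gameplay telemetry before the feature vectors are fed to the neural network.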
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: LLAMA, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
The Databricks Security team led a broad working group that evolved the Databricks AI Security Framework (DASF) to version 2.0, collaborating closely with top cybersecurity researchers at organizations such as OWASP, Gartner, NIST, HITRUST, the FAIR Institute and several Fortune 100 companies to address the evolving risks, and associated controls, of AI systems in enterprises. Join us to learn how the CLEVER GenAI pipeline, an AI-driven innovation in healthcare, processes over 1.5 million clinical notes daily to classify social determinants impacting veteran care while adhering to robust security measures such as NIST 800-53 controls and the Databricks AI Security Framework. We will discuss robust AI security guidelines to help data and AI teams understand how to deploy their AI applications securely. This session offers a security framework for security teams, AI practitioners, data engineers and governance teams.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: MEDIA AND ENTERTAINMENT
Technologies: APACHE SPARK, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
T-Mobile’s leadership in 5G innovation and its rapid growth in the fixed wireless business have led to an exponential increase in data, reaching hundreds of terabytes daily. This session explores how T-Mobile uses Databricks to manage this data efficiently, focusing on scalable architecture with Delta Lake, auto-scaling clusters, performance optimization through data partitioning and caching, and comprehensive data governance with Unity Catalog. It also covers cost management, collaboration features and AI-driven productivity tools, highlighting how these strategies empower T-Mobile to innovate, streamline operations and maximize data impact across network optimization, community support, energy management and more.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING
Technologies: DATA MARKETPLACE, UNITY CATALOG, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
As organizations increasingly adopt Databricks as a unified platform for analytics and AI, ensuring robust data governance becomes critical for compliance, security, and operational efficiency. This presentation will explore the end-to-end framework for governing the Databricks cloud, covering key use cases, foundational governance principles, and scalable automation strategies. We will discuss best practices for metadata, data access, catalog, classification, quality, and lineage, while leveraging automation to streamline enforcement. Attendees will gain insights into best practices and real-world approaches to building a governed data cloud that balances innovation with control.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MANUFACTURING
Technologies: AI/BI, DELTA SHARING, UNITY CATALOG
Skill Level: BEGINNER
Duration: 60 MIN
Join us for an inspiring forum showcasing how manufacturers and transportation leaders are turning today's challenges into tomorrow's opportunities. From automotive giants revolutionizing product development with generative AI to logistics providers optimizing routes for both cost and sustainability, discover how industry pioneers are reshaping the future of industrial operations. Highlighting this session is an exciting collaboration between Heathrow Airport and Virgin Atlantic, demonstrating how partnership and innovation are transforming the air travel experience. Learn how these leaders and other companies are using Databricks to tackle their most pressing challenges — from smart factory transformations to autonomous systems development — proving that the path to profitability and sustainability runs through intelligent operations.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MANUFACTURING
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join industry leaders from Dow and Michelin as they reveal how data intelligence is revolutionizing sustainable manufacturing without compromising profitability. Dow demonstrates how their implementation of Databricks' Data Intelligence Platform has transformed their ability to track and reduce carbon footprints while driving operational efficiencies, resulting in significant cost savings through optimized maintenance and reduced downtime. Michelin follows with their ambitious strategy to achieve 3% energy consumption reduction by 2026, leveraging Databricks to turn this environmental challenge into operational excellence. Together, these manufacturing giants showcase how modern data architecture and AI are creating a new paradigm where sustainability and profitability go hand-in-hand.
Type: MEETUP
Track: DATA WAREHOUSING
Industry: MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 180 MIN
Join us Tuesday, June 10, from 9:10 AM to 12:10 PM PT.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Marketing owns the outcomes, but IT owns the infrastructure that makes those outcomes possible. In today’s data-driven landscape, the success of customer engagement and personalization strategies depends on a tight partnership between marketing and IT. This session explores how leading brands are using Databricks and Epsilon to unlock the full value of first-party data — transforming raw data into rich customer profiles, real-time engagement and measurable marketing ROI. Join Epsilon to see how a unified data foundation powers marketing to drive outcomes — with IT as the enabler of scale, governance and innovation. Key takeaways:
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: PROFESSIONAL SERVICES, TRAVEL AND HOSPITALITY, FINANCIAL SERVICES
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Unity Catalog puts a variety of schemas into a centralized repository; now the developer community wants more productivity and automation for schema inference, translation, evolution and optimization, especially for ingestion and reverse-ETL scenarios with heavy code generation. The Coinbase Data Platform attempts to pave a path with "Schemaster," which interacts with the data catalog through a (proposed) metadata model to make schema translation and evolution more manageable across popular systems such as Delta, Iceberg, Snowflake, Kafka, MongoDB, DynamoDB and Postgres. This Lightning Talk covers four areas. Takeaway: standardize schema lineage and translation.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DLT, LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 40 MIN
Transactional systems are a common source of data for analytics, and Change Data Capture (CDC) offers an efficient way to extract only what’s changed. However, ingesting CDC data into an analytics system comes with challenges, such as handling out-of-order events or maintaining global order across multiple streams. These issues often require complex, stateful stream processing logic. This session will explore how Lakeflow Declarative Pipelines simplifies CDC ingestion using the Apply Changes function. With Apply Changes, global ordering across multiple change feeds is handled automatically — there is no need to manually manage state or understand advanced streaming concepts like watermarks. It supports both snapshot-based inputs from cloud storage and continuous change feeds from systems like message buses, reducing complexity for common streaming use cases.
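The semantics Apply Changes automates can be sketched in plain Python (an illustration of the behavior only, not the Lakeflow API): per key, the event with the highest sequence number wins regardless of arrival order, and deletes drop the row from the final snapshot.

```python
def apply_changes(events):
    """Reduce a CDC feed to current state: highest sequence number per key wins.
    Each event is (key, seq, op, value); op is 'upsert' or 'delete'."""
    latest = {}
    for key, seq, op, value in events:
        if key not in latest or seq > latest[key][0]:
            latest[key] = (seq, op, value)
    # Deletes remove the row from the final snapshot.
    return {k: v for k, (seq, op, v) in latest.items() if op == "upsert"}

# Out-of-order feed: the seq=3 delete for "a" arrives before the seq=2 update.
events = [
    ("a", 1, "upsert", "v1"),
    ("b", 1, "upsert", "x1"),
    ("a", 3, "delete", None),
    ("a", 2, "upsert", "v2"),   # stale: superseded by the seq=3 delete
]
state = apply_changes(events)   # {'b': 'x1'} — "a" was deleted at seq=3
```

Doing the same thing manually in a streaming job would require stateful processing and watermark handling; declaring the key and sequence column is all Apply Changes asks for.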
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: MANUFACTURING, PUBLIC SECTOR
Technologies: DATABRICKS WORKFLOWS, DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Ensuring data security and meeting compliance requirements are critical priorities for businesses operating in regulated industries, where the stakes are high and the standards are stringent. We will showcase how CoorsTek, a global leader in technical ceramics manufacturing, partnered with Databricks to leverage the power of Unity Catalog to address regulatory challenges while achieving significant operational efficiency gains. We'll dive into the migration journey, highlighting the adoption of key features such as role-based access control (RBAC), comprehensive data lineage tracking and robust auditing capabilities. Attendees will gain practical insights into the strategies and tools used to manage sensitive data, ensure compliance with industry standards and optimize cloud data architectures. Additionally, we’ll share real-world lessons learned, best practices for integrating compliance into a modern data ecosystem and actionable takeaways for leveraging Databricks as a catalyst for secure and compliant data innovation.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
No description available.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Ensuring the operational excellence of AI agents in production requires robust monitoring capabilities that span both performance metrics and quality evaluation. This session explores Databricks' comprehensive Mosaic Agent Monitoring solution, designed to provide visibility into deployed AI agents through an intuitive dashboard that tracks critical operational metrics and quality indicators. We'll demonstrate how to use the Agent Monitoring solution to iteratively improve a production agent that delivers a better customer support experience while decreasing the cost of delivering customer support. We will show how to: Key session takeaways include:
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: HEALTH AND LIFE SCIENCES
Technologies: DATA MARKETPLACE, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this session, attendees will learn how to leverage Databricks' system tables to measure user adoption and track key performance indicators (KPIs) for data products. The session will focus on how organizations can use system tables to analyze user behavior, assess engagement with data products and identify usage trends that can inform product development. By measuring KPIs such as user retention, frequency of use and data queries, organizations can optimize their data products for better performance and ROI.
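As a sketch of one such KPI, weekly retention can be computed from query-history-style rows. The data below is mock; in Databricks the rows would come from a system table such as `system.query.history`:

```python
from collections import defaultdict
from datetime import date

# Mock rows shaped like a query log: (user, day the user ran a query).
rows = [
    ("ana", date(2025, 6, 2)), ("ana", date(2025, 6, 3)),
    ("bo",  date(2025, 6, 2)),
    ("ana", date(2025, 6, 9)), ("cy",  date(2025, 6, 9)),
]

def active_users_by_week(rows):
    """Set of active users per ISO (year, week)."""
    weeks = defaultdict(set)
    for user, day in rows:
        year, week, _ = day.isocalendar()
        weeks[(year, week)].add(user)
    return dict(weeks)

def retention(weeks, w1, w2):
    """Share of week-w1 users who came back in week w2."""
    return len(weeks[w1] & weeks[w2]) / len(weeks[w1])

weeks = active_users_by_week(rows)
rate = retention(weeks, (2025, 23), (2025, 24))   # ana returned, bo did not
```

The same aggregation expressed in SQL over the real system tables can feed a dashboard tracking retention, frequency of use and query volume per data product.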
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, AI/BI, DELTA SHARING
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
The Virtue Foundation uses cutting-edge AI techniques to optimize global health care delivery and save lives. With Unity Catalog as a foundation, they are using advanced GenAI with model serving, vector search and MLflow to radically change how they match volunteer health resources with the right locations and facilities. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: MLFLOW, DSPY, MOSAIC AI
Skill Level: BEGINNER
Duration: 20 MIN
Writing prompts for our GenAI applications is slow, tedious and hard to maintain. A proper software development lifecycle requires proper testing and maintenance, something incredibly difficult to do on a block of text. Our current prompt engineering best practice is largely manual trial and error: testing which prompts work well in which situations. This worsens as prompts grow more complex, packing multiple tasks and functions into one long, singular prompt. Enter DSPy, your PROGRAMMATIC way of building GenAI applications. Learn how DSPy lets you modularize your prompt into modules and enforce typing through signatures. Then, use state-of-the-art algorithms to optimize the prompts and weights against your evaluation datasets, just like machine learning! We will compare DSPy to a restaurant to help illustrate and demo DSPy’s capabilities. It's time to start programming, rather than prompting, again!
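DSPy's real API has its own abstractions (signatures, modules, optimizers). As a plain-Python analogy only, not DSPy code, the shift from one monolithic prompt to typed, individually testable modules looks roughly like this:

```python
from dataclasses import dataclass
from typing import Callable

# A "signature" declares typed inputs and outputs for one prompt step,
# instead of burying several tasks in a single monolithic prompt.
@dataclass
class Signature:
    inputs: list[str]
    outputs: list[str]

class Module:
    """One small prompt step. `run` would call an LLM in practice; here it
    is any callable, so each module can be unit-tested in isolation."""
    def __init__(self, signature: Signature, run: Callable[[dict], dict]):
        self.signature = signature
        self.run = run

    def __call__(self, **kwargs):
        assert set(kwargs) == set(self.signature.inputs), "missing/extra inputs"
        out = self.run(kwargs)
        assert set(out) == set(self.signature.outputs), "bad outputs"
        return out

# Two modules replace one long "summarize then classify" prompt.
summarize = Module(Signature(["text"], ["summary"]),
                   lambda x: {"summary": x["text"][:20]})
classify = Module(Signature(["summary"], ["label"]),
                  lambda x: {"label": "long" if len(x["summary"]) > 10 else "short"})

result = classify(**summarize(text="A very long document about streaming."))
```

Because each step has a declared signature, an optimizer can tune one module's prompt against an evaluation set without touching the others, which is the workflow the talk demonstrates with DSPy itself.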
Type: BREAKOUT
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: APACHE SPARK, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Data is the backbone of modern decision-making, but centralizing it is only the tip of the iceberg. Entitlements, secure sharing and just-in-time availability are critical challenges to any large-scale platform. Join Goldman Sachs as we reveal how our Legend Lakehouse, coupled with Databricks, overcomes these hurdles to deliver high-quality, governed data at scale. By leveraging an open table format (Apache Iceberg) and open catalog format (Unity Catalog), we ensure platform interoperability and vendor neutrality. Databricks Unity Catalog then provides a robust entitlement system that aligns with our data contracts, ensuring consistent access control across producer and consumer workspaces. Finally, Legend functions, integrating with Databricks User Defined Functions (UDF), offer real-time data enrichment and secure transformations without exposing raw datasets. Discover how these components unite to streamline analytics, bolster governance and power innovation.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT
Technologies: APACHE SPARK, APACHE ICEBERG
Skill Level: BEGINNER
Duration: 40 MIN
Over the past three years, Netflix has built a catalog of 100+ mobile and cloud games across TV, mobile and web platforms. With both internal and external studios contributing to this diverse ecosystem, building a robust game analytics platform became crucial for gaining insights into player behavior, optimizing game performance and driving member engagement. In this talk, we’ll share our journey of building Netflix’s Game Analytics platform from the ground up. We’ll highlight key decisions around data strategy, such as whether to develop an in-house solution or adopt an external service. We’ll discuss the challenges of balancing developer autonomy with data integrity and the complexities of managing data contracts for custom game telemetry, with an emphasis on self-service analytics. Attendees will learn how the Games Data team navigated these challenges, the lessons learned and the trade-offs involved in building a multi-tenant data ecosystem that supports diverse stakeholders.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, PUBLIC SECTOR
Technologies: AI/BI, DSPY, LLAMA
Skill Level: BEGINNER
Duration: 40 MIN
Large Language Models (LLMs) excel at understanding messy, real-world data, but integrating them into production systems remains challenging. Prompts can be unruly to write, can vary by model, and can be difficult to manage within the larger context of a pipeline. In this session, we'll demonstrate incorporating LLMs into a geospatial conflation pipeline using DSPy. We'll discuss how DSPy works under the covers and highlight the benefits it provides to pipeline creators and managers.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: RETAIL AND CPG - FOOD
Technologies: DATA MARKETPLACE, DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
Retailers and suppliers face persistent financial and technical challenges to data sharing — including expensive proprietary platforms, complex data integration hurdles, fragmented governance and more — which currently restrict seamless data exchange primarily to their largest trading partners. In this session, we’ll provide an in-depth explanation of Elevate, an industry alliance focused on building open source standards for data sharing and collaboration to drive greater efficiency across the entire ecosystem. This session will highlight proposed standards for data sharing, data models, business cases on the ROI and potential areas of innovation to democratize data sharing, drastically reduce costs, simplify integration processes and foster transparent, trusted collaboration. Learn about the Elevate industry data-sharing initiative and how your company can participate and help guide standards to improve data sharing with your key partners.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT
Technologies: DELTA LAKE, APACHE ICEBERG
Skill Level: BEGINNER
Duration: 40 MIN
Delta Lake is a fantastic technology for quickly querying massive data sets, but first you need those massive data sets! In this session we will dive into the cloud-native architecture Scribd has adopted to ingest data from AWS Aurora, SQS, Kinesis Data Firehose and more. By using off-the-shelf open source tools like kafka-delta-ingest, oxbow and Airbyte, Scribd has redefined its ingestion architecture to be more event-driven, reliable and, most importantly, cheaper. No jobs needed! Attendees will learn how to use third-party tools in concert with a Databricks and Unity Catalog environment to provide a highly efficient and available data platform. This architecture will be presented in the context of AWS but can be adapted for Azure, Google Cloud Platform or even on-premises environments.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT
Technologies: AI/BI, DATABRICKS WORKFLOWS, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
In the competitive gaming industry, understanding player behavior is key to delivering engaging experiences. Supercell, creators of Clash of Clans and Brawl Stars, faced challenges with fragmented data and limited visibility into user journeys. To address this, they partnered with Snowplow and Databricks to build a scalable, privacy-compliant data platform for real-time insights. By leveraging Snowplow’s behavioral data collection and Databricks’ Lakehouse architecture, Supercell achieved: This session explores Supercell’s data journey and AI-driven player engagement strategies.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
In the contemporary landscape of data management, organizations increasingly face the challenges of data segregation, governance and permission management, particularly within complex structures such as holding companies with multiple subsidiaries. Unipol comprises seven subsidiary companies, each with a diverse array of workgroups, resulting in a large number of operational groups. This intricate organizational structure necessitates a meticulous approach to data management, particularly regarding the segregation of data and the assignment of precise read-and-write permissions tailored to each workgroup. The challenge lies in ensuring that sensitive data remains protected while enabling seamless access for authorized users. This talk demonstrates how Unity Catalog has become a pivotal tool in the daily use of the data platform, offering a unified governance solution that supports data management across diverse AWS environments.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD
Technologies: APACHE SPARK, LLAMA, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Testing Spark jobs in local environments is often difficult due to the lack of suitable datasets, especially under tight timelines. This creates challenges when jobs work in development clusters but fail in production, or when they run locally but encounter issues in staging clusters due to inadequate documentation or checks. In this session, we’ll discuss how these challenges can be overcome by leveraging Generative AI to create custom synthetic datasets for local testing. By incorporating variations and sampling, a testing framework can be introduced to solve some of these challenges, allowing for the generation of realistic data to aid in performance and load testing. We’ll show how this approach helps identify performance bottlenecks early, optimize job performance and recognize scalability issues while keeping costs low. This methodology fosters better deployment practices and enhances the reliability of Spark jobs across environments.
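A minimal sketch of the idea (the session's actual framework and any GenAI prompt-driven generation are not shown): produce synthetic rows with a deliberately skewed key, occasional nulls and a fixed seed, so a Spark job can be exercised locally and reproducibly before it meets production data. All names here are illustrative.

```python
import random

def synth_orders(n, seed=7, null_rate=0.05, skew_key="BIGCO"):
    """Generate synthetic order rows with realistic warts: a hot key,
    occasional null amounts and varied values, for local load testing."""
    rng = random.Random(seed)                                 # reproducible
    customers = [skew_key] * 5 + [f"cust_{i}" for i in range(20)]  # ~20% hot key
    rows = []
    for i in range(n):
        rows.append({
            "order_id": i,
            "customer": rng.choice(customers),
            "amount": None if rng.random() < null_rate
                      else round(rng.uniform(1, 500), 2),
        })
    return rows

sample = synth_orders(1_000)
hot = sum(r["customer"] == "BIGCO" for r in sample)  # skew triggers shuffle hot spots
```

Feeding rows like these into `spark.createDataFrame` surfaces skew, null-handling bugs and scaling issues locally, which is the early-detection benefit the session describes.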
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES
Technologies: MLFLOW, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this session, we will delve into the creation of an infrastructure, CI/CD processes and monitoring systems that facilitate the responsible and efficient deployment of Large Language Models (LLMs) at Intermountain Healthcare. Using the "AI Inventory Agents" project as a case study, we will showcase how an LLM Agent can assist in effort and impact estimates, as well as provide insights into various AI products, both custom-built and third-party hosted. This includes their responsible AI certification status, development status and monitoring status (lights on, performance, drift, etc.). Attendees will learn how to build and customize their own LLMOps infrastructure to ensure seamless deployment and monitoring of LLMs, adhering to responsible AI practices.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES
Technologies: AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join two energy industry leaders as they showcase groundbreaking applications of AI and data solutions in modern oil and gas operations. NOV demonstrates how their Generative AI pipeline revolutionized drilling mud report processing, automating the analysis of 300 reports daily with near-perfect accuracy and real-time analytics capabilities. BP shares how Unity Catalog has transformed their enterprise-wide data strategy, breaking down silos while maintaining robust governance and security. Together, these case studies illustrate how AI and advanced analytics are enabling cleaner, more efficient energy operations while maintaining the reliability demanded by today's market.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: MEDIA AND ENTERTAINMENT
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
As cheat developers evolve, so must detection techniques. This session will explore our methodologies, challenges and future directions, demonstrating how machine learning is transforming anti-cheat strategies and preserving competitive integrity in online gaming and how Databricks is enabling us to do so. As online gaming grows, maintaining fair play is an ongoing challenge. Call of Duty, a highly competitive first-person action game, faces aimbot usage—cheats that enable near-perfect accuracy, undermining fair play. Additionally, traditional detection methods are increasingly becoming less effective against advanced cheats that mimic human behavior. Machine learning presents a scalable and adaptive solution to this. We developed a data pipeline that collects features such as angle velocity, acceleration, etc. to train a deep neural network and deployed it. We are processing 30 million rows of data per hour for this detection on Databricks Platform.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: LLAMA, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
The Databricks Security team led a broad working group that significantly evolved the Databricks AI Security Framework (DASF) to its 2.0 version since its first release by closely collaborating with the top cyber security researchers at industry organizations such as OWASP, Gartner, NIST, HITRUST, FAIR Institute and several Fortune 100 companies to address the evolving risks and associated controls of AI systems in enterprises. Join us to to learn how The CLEVER GenAI pipeline, an AI-driven innovation in healthcare, processes over 1.5 million clinical notes daily to classify social determinants impacting veteran care while adhering to robust security measures like NIST 800-53 controls and by leveraging Databricks AI Security Framework. We will discuss robust AI security guidelines to help data and AI teams understand how to deploy their AI applications securely. This session will give a security framework for security teams, AI practitioners, data engineers and governance teams.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: MEDIA AND ENTERTAINMENT
Technologies: APACHE SPARK, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
T-Mobile’s leadership in 5G innovation and its rapid growth in the fixed wireless business have led to an exponential increase in data, reaching 100s of terabytes daily. This session explores how T-Mobile uses Databricks to manage this data efficiently, focusing on scalable architecture with Delta Lake, auto-scaling clusters, performance optimization through data partitioning and caching and comprehensive data governance with Unity Catalog. Additionally, it covers cost management, collaborative tools and AI-driven productivity tools, highlighting how these strategies empower T-Mobile to innovate, streamline operations and maximize data impact across network optimization, supporting the community, energy management and more.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING
Technologies: DATA MARKETPLACE, UNITY CATALOG, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
As organizations increasingly adopt Databricks as a unified platform for analytics and AI, ensuring robust data governance becomes critical for compliance, security, and operational efficiency. This presentation will explore the end-to-end framework for governing the Databricks cloud, covering key use cases, foundational governance principles, and scalable automation strategies. We will discuss best practices for metadata, data access, catalog, classification, quality, and lineage, while leveraging automation to streamline enforcement. Attendees will gain insights into best practices and real-world approaches to building a governed data cloud that balances innovation with control.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MANUFACTURING
Technologies: AI/BI, DELTA SHARING, UNITY CATALOG
Skill Level: BEGINNER
Duration: 60 MIN
Join us for an inspiring forum showcasing how manufacturers and transportation leaders are turning today's challenges into tomorrow's opportunities. From automotive giants revolutionizing product development with generative AI to logistics providers optimizing routes for both cost and sustainability, discover how industry pioneers are reshaping the future of industrial operations. Highlighting this session is an exciting collaboration between Heathrow Airport and Virgin Atlantic, demonstrating how partnership and innovation are transforming the air travel experience. Learn how these leaders and other companies are using Databricks to tackle their most pressing challenges — from smart factory transformations to autonomous systems development — proving that the path to profitability and sustainability runs through intelligent operations.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MANUFACTURING
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join industry leaders from Dow and Michelin as they reveal how data intelligence is revolutionizing sustainable manufacturing without compromising profitability. Dow demonstrates how their implementation of Databricks' Data Intelligence Platform has transformed their ability to track and reduce carbon footprints while driving operational efficiencies, resulting in significant cost savings through optimized maintenance and reduced downtime. Michelin follows with their ambitious strategy to achieve 3% energy consumption reduction by 2026, leveraging Databricks to turn this environmental challenge into operational excellence. Together, these manufacturing giants showcase how modern data architecture and AI are creating a new paradigm where sustainability and profitability go hand-in-hand.
Type: MEETUP
Track: DATA WAREHOUSING
Industry: MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 180 MIN
Join us Tuesday June 10th, 9:10-12:10 PM PT
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Marketing owns the outcomes, but IT owns the infrastructure that makes those outcomes possible. In today’s data-driven landscape, the success of customer engagement and personalization strategies depends on a tight partnership between marketing and IT. This session explores how leading brands are using Databricks and Epsilon to unlock the full value of first-party data — transforming raw data into rich customer profiles, real-time engagement and measurable marketing ROI. Join Epsilon to see how a unified data foundation powers marketing to drive outcomes — with IT as the enabler of scale, governance and innovation. Key takeaways:
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: PROFESSIONAL SERVICES, TRAVEL AND HOSPITALITY, FINANCIAL SERVICES
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Unity Catalog puts variety of schemas into a centralized repository, now the developer community wants more productivity and automation for schema inference, translation, evolution and optimization especially for the scenarios of ingestion and reverse-ETL with more code generations.Coinbase Data Platform attempts to pave a path with "Schemaster" to interact with data catalog with the (proposed) metadata model to make schema translation and evolution more manageable across some of the popular systems, such as Delta, Iceberg, Snowflake, Kafka, MongoDB, DynamoDB, Postgres...This Lighting Talk covers 4 areas: Takeaway: standardize schema lineage & translation
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DLT, LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 40 MIN
Transactional systems are a common source of data for analytics, and Change Data Capture (CDC) offers an efficient way to extract only what’s changed. However, ingesting CDC data into an analytics system comes with challenges, such as handling out-of-order events or maintaining global order across multiple streams. These issues often require complex, stateful stream processing logic.This session will explore how Lakeflow Declarative Pipelines simplifies CDC ingestion using the Apply Changes function. With Apply Changes, global ordering across multiple change feeds is handled automatically — there is no need to manually manage state or understand advanced streaming concepts like watermarks. It supports both snapshot-based inputs from cloud storage and continuous change feeds from systems like message buses, reducing complexity for common streaming use cases.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: MANUFACTURING, PUBLIC SECTOR
Technologies: DATABRICKS WORKFLOWS, DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Ensuring data security and meeting compliance requirements are critical priorities for businesses operating in regulated industries, where the stakes are high and the standards are stringent. We will showcase how CoorsTek, a global leader in technical ceramics manufacturing, partnered with Databricks to leverage the power of Unity Catalog for addressing regulatory challenges while achieving significant operational efficiency gains. We'll dive into the migration journey, highlighting the adoption of key features such as role-based access control (RBAC), comprehensive data lineage tracking and robust auditing capabilities. Attendees will gain practical insights into the strategies and tools used to manage sensitive data, ensure compliance with industry standards and optimize cloud data architectures. Additionally, we’ll share real-world lessons learned, best practices for integrating compliance into a modern data ecosystem and actionable takeaways for leveraging Databricks as a catalyst for secure and compliant data innovation.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
No description available.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Ensuring the operational excellence of AI agents in production requires robust monitoring capabilities that span both performance metrics and quality evaluation. This session explores Databricks' comprehensive Mosaic Agent Monitoring solution, designed to provide visibility into deployed AI agents through an intuitive dashboard that tracks critical operational metrics and quality indicators. We'll demonstrate how to use the Agent Monitoring solution to iteratively improve a production agent — delivering a better customer support experience while decreasing the cost of delivering that support.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: HEALTH AND LIFE SCIENCES
Technologies: DATA MARKETPLACE, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this session, attendees will learn how to leverage Databricks' system tables to measure user adoption and track key performance indicators (KPIs) for data products. The session will focus on how organizations can use system tables to analyze user behavior, assess engagement with data products and identify usage trends that can inform product development. By measuring KPIs such as user retention, frequency of use and data queries, organizations can optimize their data products for better performance and ROI.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: MEDIA AND ENTERTAINMENT
Technologies: AI/BI, DATABRICKS SQL, MOSAIC AI
Skill Level: BEGINNER
Duration: 60 MIN
Join us at the Media & Advertising Forum to explore how data and AI are transforming media and advertising — from content to creative and identity to outcomes. Featuring innovators from leading agencies, platforms, streamers and ad tech — plus exciting announcements from Databricks — this session delivers must-have insights for industry leaders and change agents.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD
Technologies: APACHE SPARK, DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
At Plexure, we ingest hundreds of millions of customer activities and transactions into our data platform every day, fuelling our personalisation engine and providing insights into the effectiveness of marketing campaigns. We're on a journey to transition from infrequent batch ingestion to near real-time streaming using Azure Event Hubs and Lakeflow Declarative Pipelines. This transformation will allow us to react to customer behaviour as it happens, rather than hours or even days later. It also enables us to move faster in other ways: by leveraging a Schema Registry, we've created a metadata-driven framework for data producers. Join us to learn more about our journey and see how we're implementing this with Lakeflow Declarative Pipelines meta-programming — including a live demo of the end-to-end process!
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
At PacificSource Health Plans, a health insurance company in the US, we are on a successful multi-year journey to migrate all of our data and analytics ecosystem to a Databricks Enterprise Data Warehouse (lakehouse). A particular obstacle on this journey was a reporting data mart which relied on copious amounts of legacy SAS code that applied sophisticated business logic transformations for membership, claims, premiums and reserves. This core data mart was driving many of our critical reports and analytics. In this session we will share the unique and somewhat unexpected challenges and complexities we encountered in migrating this legacy SAS code, how our partner (T1A) leveraged automation technology (Alchemist) and some unique approaches to reverse engineer (analyze), instrument, translate, migrate, validate and reconcile these jobs, and what lessons we learned and carried forward from this migration effort.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 60 MIN
Ready to streamline your ML lifecycle? Join us to explore MLflow 3.0 on Databricks, where we'll show you how to manage everything from experimentation to production with less effort and better results. See how this powerful platform provides comprehensive tracking, evaluation and deployment capabilities for traditional ML models and cutting-edge generative AI applications. Whether you're an MLOps novice or veteran, you'll walk away with practical techniques to accelerate your ML development and deployment.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: MLFLOW, AI/BI, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Deploying AI models efficiently and consistently is a challenge many organizations face. This session will explore how Vizient built a standardized MLOps stack using Databricks and Azure DevOps to streamline model development, deployment and monitoring. Attendees will gain insights into how Databricks Asset Bundles were leveraged to create reproducible, scalable pipelines and how Infrastructure-as-Code principles accelerated onboarding for new AI projects. By the end of this session, participants will have a roadmap for implementing a scalable, reusable MLOps framework that enhances operational efficiency across AI initiatives.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, DATABRICKS WORKFLOWS, DLT
Skill Level: INTERMEDIATE
Duration: 20 MIN
Adopting MLOps is becoming increasingly important with the rise of AI, and doing MLOps in large organizations requires a wide range of capabilities. In the past, you had to implement these features yourself. Luckily, the MLOps space is maturing, and end-to-end platforms like Databricks now provide most of them. In this talk, I will walk through the MLOps components and how you can simplify your processes using Databricks. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: ENERGY AND UTILITIES
Technologies: AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session showcases how both Westinghouse Electric and Alabama Power Company are leveraging cloud-based tools, advanced analytics, and machine learning to transform operational resilience across the energy sector. In the first segment, we'll explore AI's crucial role in enhancing safety, efficiency, and compliance in nuclear operations through technologies like HiVE and Bertha, focusing on the unique reliability and credibility requirements of the regulated nuclear industry. We’ll then highlight how Alabama Power Company has modernized its grid management and storm preparedness by using Databricks to develop SPEAR and RAMP—applications that combine real-time data and predictive analytics to improve reliability, efficiency, and customer service.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Learn how Data Profiling, Data Quality Monitoring, and Data Classification come together to provide end-to-end visibility into the health of your data and AI pipelines.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: MLFLOW, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
The AI Screening Agent automates the Level 1 (L1) screening process, essential for Know Your Customer (KYC) and compliance due diligence during customer onboarding. This system aims to minimize false positives, significantly reducing human review time and costs. Beyond typical Retrieval-Augmented Generation (RAG) applications like summarization and chat-with-your-data (CWYD), the AI Screening Agent employs a ReAct architecture with intelligent tools, enabling it to perform complex compliance decision-making with human-like accuracy and greater consistency. In this talk, I will explore the screening agent architecture, demonstrating its ability to meet evolving client policies. I will discuss evaluation and configuration management using MLflow LLM-as-judge and Unity Catalog, and address challenges such as data fidelity and customization. This session underscores the transformative potential of AI agents in compliance workflows, emphasizing their adaptability, accuracy and consistency.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
No description available.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Get a first look at multi-statement transactions in Databricks. In this session, we will dive into their capabilities, exploring how multi-statement transactions enable atomic updates across multiple tables in your data pipelines, ensuring data consistency and integrity for complex operations. We will also share how we are enabling unified transactions across Delta Lake and Iceberg with Unity Catalog — powering our vision for an open and interoperable lakehouse.
Type: LIGHTNING TALK
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 20 MIN
Multi-statement transactions bring the atomicity and reliability of traditional databases to modern data warehousing on the lakehouse. In this session, we’ll explore real-world patterns enabled by multi-statement transactions — including multi-table updates, deduplication pipelines and audit logging — and show how Databricks ensures atomicity and consistency across complex workflows. We’ll also dive into demos and share tips for getting started and migrating to this feature in Databricks SQL.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: DELTA LAKE, MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Modern data science teams face the challenge of navigating complex landscapes of languages, tools and infrastructure. Positron, Posit’s next-generation IDE, offers a powerful environment tailored for data science, seamlessly integrating with Databricks to empower teams working in Python and R. Now integrated within Posit Workbench, Positron enables data scientists to efficiently develop, iterate and analyze data with Databricks — all while maintaining their preferred workflows. In this session, we’ll explore how Python and R users can develop, deploy and scale their data science workflows by combining Posit tools with Databricks. We’ll showcase how Positron simplifies development for both Python and R and how Posit Connect enables seamless deployment of applications, reports and APIs powered by Databricks. Join us to see how Posit + Databricks create a frictionless, scalable and collaborative data science experience — so your teams can focus on insights, not infrastructure.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: PROFESSIONAL SERVICES
Technologies: MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Moving data between operational systems and analytics platforms is often painful. Traditional pipelines become complex, brittle and expensive to maintain. Take Kafka and Iceberg: batching on Kafka causes ingestion bottlenecks, while streaming-style writes to Iceberg create too many small Parquet files — cluttering metadata, degrading queries and increasing maintenance overhead. Frequent updates further strain background table operations, causing retries — even before dealing with schema evolution. But much of this complexity is avoidable. What if Kafka topics and Iceberg tables were treated as two sides of the same coin? By establishing a transparent equivalence, we can rethink pipeline design entirely. This session introduces Tableflow — a new approach to bridging streaming and table-based systems. It shifts complexity away from pipelines and into a unified layer, enabling simpler, declarative workflows. We’ll cover schema evolution, compaction, topic-to-table mapping, and how to continuously materialize and optimize thousands of topics as Iceberg tables. Whether modernizing or starting fresh, you’ll leave with practical insights for building resilient, scalable and future-proof data architectures.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: EDUCATION, HEALTH AND LIFE SCIENCES, MEDIA AND ENTERTAINMENT
Technologies: DELTA LAKE, AI/BI, DSPY
Skill Level: INTERMEDIATE
Duration: 20 MIN
Life as a father, tech leader, and fitness enthusiast demands efficiency. To reclaim my time, I’ve built AI-driven solutions that automate everyday tasks—from research agents that prep for podcasts to multi-agent systems that plan meals—all powered by real-time data and automation. This session dives into the technical foundations of these solutions, focusing on event-driven agent design and scalable patterns for robust AI systems. You’ll discover how Databricks technologies like Delta Lake, for reliable and scalable data management, and DSPy, for streamlining the development of generative AI workflows, empower seamless decision-making and deliver actionable insights. Through detailed architecture diagrams and a live demo, I’ll showcase how to design systems that process data in motion to tackle complex, real-world problems. Whether you’re an engineer, architect, or data scientist, you’ll leave with practical strategies to integrate AI-driven automation into your workflows.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK
Skill Level: BEGINNER
Duration: 20 MIN
Apache Spark™ has introduced Arrow-optimized APIs such as Pandas UDFs and the Pandas Functions API, providing high performance for Python workloads. Yet, many users continue to rely on regular Python UDFs due to their simple interface, especially when advanced Python expertise is not readily available. This talk introduces a powerful new feature in Apache Spark that brings Arrow optimization to regular Python UDFs. With this enhancement, users can leverage performance gains without modifying their existing UDFs — simply by enabling a configuration setting or toggling a UDF-level parameter. Additionally, we will dive into practical tips and features for using Arrow-optimized Python UDFs effectively, exploring their strengths and limitations. Whether you’re a Spark beginner or an experienced user, this session will allow you to achieve the best of both simplicity and performance in your workflows with regular Python UDFs.
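Per the abstract, the opt-in is either a session-wide configuration or a per-UDF flag. In open-source PySpark 3.5+, the two toggles look roughly like this (a configuration sketch: it assumes an active `SparkSession` named `spark` and a DataFrame `df` with a `name` column, and would run on a Spark cluster rather than standalone):

```python
from pyspark.sql.functions import udf

# Option 1: enable Arrow optimization for all regular Python UDFs
# in the session via the Spark SQL config.
spark.conf.set("spark.sql.execution.pythonUDF.arrow.enabled", "true")

# Option 2: toggle Arrow per UDF, leaving the session default untouched.
# The UDF body is unchanged -- only the decorator parameter differs.
@udf(returnType="int", useArrow=True)
def slen(s):
    return len(s)

df.select(slen("name")).show()
```

Either way, the UDF logic itself stays exactly as written, which is the point of the feature.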
Type: LIGHTNING TALK
Track: DATA SHARING AND COLLABORATION
Industry: PROFESSIONAL SERVICES
Technologies: DATABRICKS WORKFLOWS, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
In a world where data collaboration is essential but trust is scarce, Databricks Clean Rooms delivers a game-changing model: no data shared, all value gained. Discover how data providers can unlock new revenue streams by launching subscription-based analytics and “Built-on-Databricks” services that run on customer data — without exposing raw data or violating compliance. Clean Rooms integrates Unity Catalog’s governance, Delta Sharing’s secure exchange and serverless compute to enable true multi-party collaboration — without moving data. See how privacy-preserving models like fraud detection, clinical analytics and ad measurement become scalable, productizable and monetizable across industries. Walk away with a proven pattern to productize analytics, preserve compliance and turn trustless collaboration into recurring revenue.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: MEDIA AND ENTERTAINMENT
Technologies: AI/BI, LLAMA
Skill Level: ADVANCED
Duration: 40 MIN
We present Level Infinite AI Translation, a translation engine developed by Tencent, tailored specifically for the gaming industry. The primary challenge in game machine translation (MT) lies in accurately interpreting the intricate context of game texts, effectively handling terminology and adapting to the highly diverse translation formats and stylistic requirements across different games. Traditional MT approaches cannot effectively address these challenges due to their weak context representation ability and lack of common knowledge. Leveraging large language models and related technologies, our engine is crafted to capture the subtleties of localized language expression while ensuring optimization for domain-specific terminology, jargon and required formats and styles. To date, the engine has been successfully implemented in 15 international projects, translating over one billion words across 23 languages, and has demonstrated cost savings exceeding 25% for partners.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Learn how to use Unity Catalog OSS, what features are available, and get an introduction to the ecosystem. We'll dive into the latest release and get hands-on with demos for working with your UC data and AI assets — including tables, volumes, models and AI functions.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: MLFLOW, LLAMA, MOSAIC AI
Skill Level: ADVANCED
Duration: 40 MIN
Each LLM has unique strengths and weaknesses, and there is no one-size-fits-all solution. Companies strive to balance cost reduction with maximizing the value of their use cases by considering various factors such as latency, multi-modality, API costs, user need and prompt complexity. Model routing helps optimize performance and cost along with enhancing scalability and user satisfaction. We'll give an overview of cost-effective model training using AI gateway logs, user feedback, prompts and model features to design an intelligent model-routing AI agent, covering different strategies for model routing, deployment in Mosaic AI, re-training, and evaluation through A/B testing and end-to-end Databricks workflows. Additionally, we'll delve into the details of training data collection, feature engineering, prompt formatting, custom loss functions, architectural modifications, addressing cold-start problems, query embedding generation and clustering through VectorDB, and RL policy-based exploration.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, APACHE ICEBERG, DATABRICKS SQL
Skill Level: ADVANCED
Duration: 40 MIN
This session explores the strategic migration from Snowflake to Databricks, focusing on the journey of transforming a data lake to leverage Databricks’ advanced capabilities. It outlines the assessment of key architectural differences, performance benchmarks, and cost implications driving the decision. Attendees will gain insights into planning and execution, including data ingestion pipelines, schema conversion and metadata migration. Challenges such as maintaining data quality, optimizing compute resources and minimizing downtime are discussed, alongside solutions implemented to ensure a seamless transition. The session highlights the benefits of unified analytics and enhanced scalability achieved through Databricks, delivering actionable takeaways for similar migrations.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES
Technologies: MLFLOW
Skill Level: ADVANCED
Duration: 40 MIN
Accurate charge time estimation is key to vehicle performance and user experience. We developed a scalable ML model that enhances real-time charge predictions in vehicle controls. Traditional rule-based methods struggle with dynamic factors like environment, vehicle state, and charging conditions. Our adaptive ML solution improves accuracy by 10%. We use Unity Catalog for data governance, Delta Tables for storage, and Liquid Clustering for data layout. Job schedulers manage data processing, while AutoML accelerates model selection. MLflow streamlines tracking, versioning, and deployment. A dedicated serving endpoint enables A/B testing and real-time insights. As our data ecosystem grew, scalability became critical. Our flexible ML framework was integrated into vehicle control systems within months. With live accuracy tracking and software-driven blending, we support 50,000+ weekly charge sessions, improving energy management and user experience.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENERGY AND UTILITIES
Technologies: APACHE SPARK, AI/BI, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Octave is a Plotly Dash application used daily by about 1,000 Hydro-Québec technicians and engineers to analyze smart meter load and voltage data from 4.5M meters across the province. As adoption grew, Octave’s back end was migrated to Databricks to address increasingly massive scale (>1T data points) as well as governance and security requirements. This talk will summarize how Databricks was tuned to support performant, at-scale interactive Dash application experiences while in parallel managing complex back-end ETL processes. The talk will outline optimizations targeting query latency and user concurrency, along with plans to increase data update frequency. Non-technology success factors to be reviewed include the value of subject matter expertise, operational autonomy, code quality for long-term maintainability and proactive vendor technical support.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS WORKFLOWS, LAKEFLOW
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. Curious about orchestrating data pipelines on Databricks? Join us for an introduction to Lakeflow Jobs (formerly Databricks Workflows) — an easy-to-use orchestration service built into the Databricks Data Intelligence Platform. Lakeflow Jobs simplifies automating your data and AI workflows, from ETL pipelines to machine learning model training. In this beginner-friendly session, we’ll walk through common use cases, share demos and offer tips to help you get started quickly. If you're new to orchestration or just getting started with Databricks, this session is for you.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES
Technologies: MLFLOW, AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session is repeated. In an era of exponential data growth, organizations across industries face common challenges in transforming raw data into actionable insights. This presentation showcases how Novo Nordisk is pioneering insights generation approaches to clinical data management and AI. Using our clinical trials platform FounData, built on Databricks, we demonstrate how proper data architecture enables advanced AI applications. We'll introduce a multi-agent AI framework that revolutionizes data interaction, combining specialized AI agents to guide users through complex datasets. While our focus is on clinical data, these principles apply across sectors – from manufacturing to financial services. Learn how democratizing access to data and AI capabilities can transform organizational efficiency while maintaining governance. Through this real-world implementation, participants will gain insights on building scalable data architectures and leveraging multi-agent AI frameworks for responsible innovation.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: HEALTH AND LIFE SCIENCES
Technologies: DELTA LAKE, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Payer organizations are rapidly embracing digital transformation, leveraging data and AI to drive operational efficiency, improve member experiences and enhance decision-making. This session explores how advanced analytics, robust data governance and AI-powered insights are enabling payers to streamline claims processing, personalize member engagement, manage pharmacy operations, and optimize care management. Thought leaders will share real-world examples of data-driven innovation, discuss strategies for overcoming interoperability and privacy challenges, and highlight the future potential of AI in reshaping the payer landscape.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: FINANCIAL SERVICES
Technologies: APACHE SPARK, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Databricks Financial Services customers in the GenAI space have a common use case: ingesting and processing unstructured documents — PDFs and images — then performing downstream GenAI tasks such as entity extraction and RAG-based knowledge Q&A. These use cases come with several pain points for customers. In this talk we will present an optimized Structured Streaming workflow for complex PDF ingestion. The key techniques include Apache Spark™ optimization, multi-threading, PDF object extraction, skew handling and auto-retry logic.
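Of the techniques listed, auto-retry is the most self-contained; a generic version can be sketched in plain Python (an illustration with hypothetical names, not the presenters' code — real pipelines would wrap each per-document extraction call this way inside the streaming job):

```python
import random
import time

def with_retry(fn, max_attempts=4, base_delay=0.5):
    """Run a flaky step, retrying with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Back off base_delay * 2^(attempt-1), plus up to 100% jitter
            # so many workers retrying at once don't stampede the source.
            time.sleep(base_delay * 2 ** (attempt - 1) * (1 + random.random()))

# Example: a PDF page extraction that fails transiently twice, then succeeds.
calls = {"n": 0}
def flaky_extract():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("transient parse failure")
    return "extracted text"

print(with_retry(flaky_extract, base_delay=0.01))  # extracted text
```

The same wrapper composes naturally with multi-threaded extraction: each worker thread retries its own document independently.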
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Data warehousing in enterprise and mission-critical environments needs special consideration for price/performance. This session will explain how Databricks SQL addresses the most challenging requirements for high-concurrency, low-latency performance at scale. We will also cover the latest advancements in resource-based scheduling, autoscaling and caching enhancements that allow for seamless performance and workload management.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
We’ll explore how CipherOwl Inc. constructed a near real-time, multi-chain data lakehouse to power anti-money laundering (AML) monitoring at petabyte scale. We will walk through the end-to-end architecture, which integrates cutting-edge open-source technologies and AI-driven analytics to handle massive on-chain data volumes seamlessly. Off-chain intelligence complements this to meet rigorous AML requirements. At the core of our solution is ChainStorage, an open-source project started by Coinbase that provides robust blockchain data ingestion and block-level serving. We enhanced it with Apache Spark™ coupled with Apache Arrow for high-throughput processing and efficient data serialization, backed by Delta Lake and Kafka. For the serving layer, we employ StarRocks to deliver lightning-fast SQL analytics over vast datasets. Finally, our system incorporates machine learning and AI agents for continuous data curation and near real-time insights, which are crucial for tackling on-chain AML challenges.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES, PUBLIC SECTOR
Technologies: MLFLOW, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
As a global energy leader, Petrobras relies on machine learning to optimize operations, but manual model deployment and validation processes once created bottlenecks that delayed critical insights. In this session, we’ll reveal how we revolutionized our MLOps framework using MLflow, Databricks Asset Bundles (DABs) and Unity Catalog. Discover how we enabled data scientists to focus on innovation — not infrastructure — through standardized pipelines while ensuring compliance and scalability in one of the world’s most complex energy ecosystems.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: MANUFACTURING, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. Power BI has long been the dominant BI tool in the market. In this session, we'll discuss how to get the most out of Power BI and Databricks, beginning with high-level architecture and moving down into detailed how-to guides for troubleshooting common failure points. At the end, you'll receive a cheat sheet that summarizes those best practices into an easy-to-reference format.
Type: LIGHTNING TALK
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD
Technologies: DATABRICKS SQL, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Discover how T-Mobile and Deep Sync are redefining personalized marketing through the power of Databricks. Deep Sync, a leader in deterministic identity solutions, has brought its identity spine to the Databricks Lakehouse, which covers over 97% of U.S. households with the most current and accurate attribute data available. T-Mobile is bringing to market for the first time a new data services business that introduces privacy-compliant, consent-based consumer data. Together, T-Mobile and Deep Sync are transforming how brands engage with consumers — enabling bespoke, hyper-personalized workflows, identity-driven insights and closed-loop measurement through Databricks’ Multi-Party Clean Rooms. Join this session to learn how data and identity are converging to solve today’s modern marketing challenges so consumers can rediscover what it feels like to be seen, not targeted.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD, TRAVEL AND HOSPITALITY
Technologies: APACHE SPARK, DELTA LAKE, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
PepsiCo, given its scale, has numerous teams leveraging different tools and engines to access data and perform analytics and AI. To streamline governance across this diverse ecosystem, PepsiCo unifies its data and AI assets under an open and enterprise-grade governance framework with Unity Catalog. In this session, we'll explore real-world examples of how PepsiCo extends Unity Catalog’s governance to all its data and AI assets, enabling secure collaboration even for teams outside Databricks. Learn how PepsiCo architects permissions using service principals and service accounts to authenticate with Unity Catalog, building a multi-engine architecture with seamless and open governance. Attendees will gain practical insights into designing a scalable, flexible data platform that unifies governance across all teams while embracing openness and interoperability.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: MANUFACTURING
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: EDUCATION, PUBLIC SECTOR
Technologies: DELTA LAKE, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 60 MIN
Join the 60-minute kickoff session at the Public Sector Forum for an opportunity to accelerate innovation in your enterprise through governance, compliance and GenAI. Featuring keynotes from data-driven agency leaders and a future-looking journey from Databricks, this event offers invaluable insights. Understand the outcomes of data and AI powering transformation across common areas of government and beyond. You will not want to miss this exclusive opportunity to own your data and eliminate government silos. Discover the Data + AI Company with deep compliance experience and widespread adoption.
Type: LIGHTNING TALK
Track: DATA WAREHOUSING
Industry: HEALTH AND LIFE SCIENCES, MEDIA AND ENTERTAINMENT, FINANCIAL SERVICES
Technologies: DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
No description available.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: MEDIA AND ENTERTAINMENT
Technologies: APACHE SPARK
Skill Level: INTERMEDIATE
Duration: 20 MIN
In today’s digital economy, real-time insights and rapid responsiveness are paramount to delivering exceptional user experiences and lowering TCO. In this session, discover a pioneering approach that leverages a low-latency streaming ETL pipeline built with Spark Structured Streaming and Databricks’ new OLTP-DB—a serverless, managed Postgres offering designed for transactional workloads. Validated in a live customer scenario, this architecture achieves sub-2 second end-to-end latency by seamlessly ingesting streaming data from Kinesis and merging it into OLTP-DB. This breakthrough not only enhances performance and scalability but also provides a replicable blueprint for transforming data pipelines across various verticals. Join us as we delve into the advanced optimization techniques and best practices that underpin this innovation, demonstrating how Databricks’ next-generation solutions can revolutionize real-time data processing and unlock a myriad of new use cases in the data landscape.
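The merge step described above can be pictured as a per-micro-batch upsert into the target table. Below is a minimal pure-Python sketch of that pattern, not the actual pipeline: the real system uses Spark Structured Streaming to merge Kinesis micro-batches into the managed Postgres offering, while here a dict stands in for the target table.

```python
# Toy per-micro-batch MERGE (upsert): insert unmatched rows, overwrite matched
# rows by primary key. A dict keyed by id stands in for the target table.

def merge_micro_batch(target: dict, batch: list, key: str = "id") -> None:
    """Upsert each record of one micro-batch into the target by primary key."""
    for record in batch:
        target[record[key]] = record  # new keys insert, existing keys update

target_table = {}
merge_micro_batch(target_table, [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}])
merge_micro_batch(target_table, [{"id": 2, "v": "b2"}, {"id": 3, "v": "c"}])

assert target_table[2]["v"] == "b2"  # matched row was updated
assert len(target_table) == 3        # unmatched rows were inserted
```

Applying the merge once per micro-batch, rather than per record, is part of what keeps end-to-end latency low while preserving correct upsert semantics.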
Type: BREAKOUT
Track: DATA STRATEGY
Industry: MEDIA AND ENTERTAINMENT
Technologies: MLFLOW, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
At Second Dinner, delivering fast, personalized gameplay experiences is key to player engagement. In this session, Justin Wu shares how the team implemented real-time feature serving using Databricks to power responsive, data-driven game mechanics at scale. He’ll dive into the architecture, technical decisions, and trade-offs behind their solution—highlighting how they balance performance, scalability, and cost. Whether you're building live features or rethinking your game data stack, this session offers practical insights to accelerate your journey.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENERGY AND UTILITIES
Technologies: AI/BI, DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session will show how we implemented a solution to support high-frequency data ingestion from smart meters. We implemented a robust API endpoint that interfaces directly with IoT devices. This API processes messages in real time from millions of distributed IoT devices and meters across the network. The architecture leverages cloud storage as a landing zone for the raw data, followed by a streaming pipeline built on Lakeflow Declarative Pipelines. This pipeline implements a multi-layer medallion architecture to progressively clean, transform and enrich the data. The pipeline operates continuously to maintain near real-time data freshness in our gold layer tables. These datasets connect directly to Databricks Dashboards, providing stakeholders with immediate insights into their operational metrics. This solution demonstrates how modern data architecture can handle high-volume IoT data streams while maintaining data quality and providing accessible real-time analytics for business users.
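The multi-layer medallion flow can be sketched in miniature. This toy pass over made-up smart-meter readings assumes the real pipeline performs analogous clean, transform and aggregate steps continuously with Lakeflow Declarative Pipelines; the field names and values here are illustrative only.

```python
# Bronze -> silver -> gold in miniature for smart-meter readings.

bronze = [  # raw landing-zone records as ingested
    {"meter": "m1", "kwh": "1.2"},
    {"meter": "m1", "kwh": "bad"},  # malformed reading
    {"meter": "m2", "kwh": "3.4"},
]

def to_float(value):
    try:
        return float(value)
    except ValueError:
        return None

# Silver: typed and cleaned, malformed rows filtered out.
silver = [{"meter": r["meter"], "kwh": to_float(r["kwh"])} for r in bronze]
silver = [r for r in silver if r["kwh"] is not None]

# Gold: aggregated per meter, ready for dashboards.
gold = {}
for r in silver:
    gold[r["meter"]] = gold.get(r["meter"], 0.0) + r["kwh"]

assert gold == {"m1": 1.2, "m2": 3.4}
```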
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES
Technologies: AI/BI, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
Botnet attacks mobilize digital armies of compromised devices that continuously evolve, challenging traditional security frameworks with their high-speed, high-volume nature. In this session, we will reveal our advanced system — developed on the Databricks platform — that leverages cutting-edge AI/ML capabilities to detect and mitigate bot attacks in near-real time. We will dive into the system’s robust architecture, including scalable data ingestion, feature engineering, MLOps strategies and production deployment of the system. We will address the unique challenges of processing bulk HTTP traffic data, time-series anomaly detection and attack signature identification. We will demonstrate key business values through downtime minimization and threat response automation. With sectors like healthcare facing heightened risks, ensuring data integrity and service continuity is vital. Join us to uncover lessons learned while building an enterprise-grade solution that stays ahead of adversaries.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: FINANCIAL SERVICES
Technologies: APACHE SPARK, DATABRICKS SQL, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
In the fast-paced world of trading, real-time insights are critical for making informed decisions. This presentation explores how Optiver, a leading high-frequency trading firm, harnesses Databricks apps to power its live trading dashboards. The technology enables traders to analyze market data, detect patterns and respond instantly. In this talk, we will showcase how our system leverages Databricks’ scalable infrastructure, such as Structured Streaming, to efficiently handle vast streams of financial data while ensuring low-latency performance. In addition, we will show how the integration of Databricks apps with Dash has empowered traders to rapidly develop and deploy custom dashboards, minimizing dependency on developers. Attendees will gain insights into our architecture, data processing techniques and lessons learned in integrating Databricks apps with Dash in order to drive rapid, data-driven trading decisions.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENERGY AND UTILITIES, MEDIA AND ENTERTAINMENT, FINANCIAL SERVICES
Technologies: APACHE SPARK, DLT, LAKEFLOW
Skill Level: ADVANCED
Duration: 40 MIN
Real-time mode is a new low-latency execution mode for Apache Spark™ Structured Streaming. It can consistently provide p99 latencies less than 300 milliseconds for a broad set of stateless and stateful streaming queries. Our talk focuses on the technical aspects of making this possible in Spark. We’ll dive into the core architecture that enables these dramatic latency improvements, including a concurrent stage scheduler and a non-blocking shuffle. We’ll explore how we maintained Spark’s fault-tolerance guarantees, and we’ll also share specific optimizations we made to our streaming SQL operators. These architectural improvements have already enabled Databricks customers to build workloads with latencies up to 10x lower than before. Early adopters in our Private Preview have successfully implemented real-time enrichment pipelines and feature engineering for machine learning — use cases that were previously impossible at these latencies.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: HEALTH AND LIFE SCIENCES
Technologies: DELTA LAKE, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Vanderbilt University Medical Center (VUMC) stands at the forefront of health informatics, harnessing the power of data to redefine patient care and make healthcare personal. Join us as we explore how VUMC enables operational and strategic analytics, supports research, and ultimately drives insights into clinical workflow in and around the Epic EHR platform.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, DSPY, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
The rise of GenAI has led to a complete reinvention of how we conceptualize Data + AI. In this breakout, we will recontextualize the rise of GenAI in traditional ML paradigms, and hopefully unite the pre- and post-LLM eras. We will demonstrate when and where GenAI may prove more effective than traditional ML algorithms, and highlight problems for which the wheel is unnecessarily being reinvented with GenAI. This session will also highlight how MLflow provides a unified means of benchmarking traditional ML against GenAI, and lay out a vision for bridging the divide between Traditional ML and GenAI practitioners.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: DELTA LAKE, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
At Kaizen Gaming, data drives our decision-making, but rapid growth exposed inefficiencies in our legacy cloud setup — escalating costs, delayed insights and scalability limits. Operating in 18 countries with 350M daily transactions (1PB+), shared quotas and limited cost transparency hindered efficiency. To address this, we redesigned our cloud architecture with Data Landing Zones, a modular framework that decouples resources, enabling independent scaling and cost accountability. Automation streamlined infrastructure, reduced overhead and enhanced FinOps visibility, while Unity Catalog ensured governance and security. Migration challenges included maintaining stability, managing costs and minimizing latency. A phased approach, Delta Sharing, and DBx Asset Bundles simplified transitions. The result: faster insights, improved cost control and reduced onboarding time, fostering innovation and efficiency. We share our transformation, offering insights for modern cloud optimization.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: EDUCATION, PUBLIC SECTOR
Technologies: DATABRICKS WORKFLOWS, DATABRICKS APPS
Skill Level: BEGINNER
Duration: 40 MIN
Supporting the scale of public sector data while protecting sensitive information is essential to public sector organizations.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, DELTA LAKE
Skill Level: INTERMEDIATE
Duration: 20 MIN
When using ACID-guaranteed transactions on Databricks concurrently, we can run into transaction conflicts. This talk discusses the basics of concurrent transaction functionality in Databricks—what happens when various combinations of INSERT, UPDATE and MERGE INTO happen concurrently. We discuss how table isolation level, partitioning and deletion vectors affect this. We also mention how Asana used an intermediate blind append stage to support several hundred concurrent transaction updates into the same table.
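The intuition behind the blind-append staging approach can be shown with a toy model of optimistic concurrency conflict detection. This is a simplification for illustration, not Delta Lake's actual algorithm; as the talk covers, real behavior also depends on isolation level, partitioning, predicates and deletion vectors.

```python
# Toy conflict rule: transactions that rewrite existing files (UPDATE/MERGE)
# can collide when they touch overlapping partitions, while two blind appends
# only add files and never conflict with each other.

def conflicts(txn_a: dict, txn_b: dict) -> bool:
    """Each txn: {'op': 'append' | 'rewrite', 'partitions': set of values}."""
    if txn_a["op"] == "append" and txn_b["op"] == "append":
        return False  # two blind appends commit independently
    return bool(txn_a["partitions"] & txn_b["partitions"])  # overlap: conflict

append_a = {"op": "append", "partitions": {"2024-01-01"}}
append_b = {"op": "append", "partitions": {"2024-01-01"}}
merge_a = {"op": "rewrite", "partitions": {"2024-01-01"}}
merge_b = {"op": "rewrite", "partitions": {"2024-01-02"}}

assert not conflicts(append_a, append_b)  # concurrent blind appends are safe
assert not conflicts(merge_a, merge_b)    # rewrites of disjoint partitions are safe
assert conflicts(merge_a, append_a)       # a rewrite over an appended partition may fail
```

Under this simplified rule it is clear why funneling several hundred concurrent writers through an intermediate blind-append stage, as described above, sidesteps the conflicts that direct concurrent MERGEs into the same table would hit.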
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: MANUFACTURING
Technologies: MOSAIC AI, UNITY CATALOG, DATABRICKS APPS
Skill Level: BEGINNER
Duration: 20 MIN
No description available.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
Atlassian is rebuilding its central lakehouse from the ground up to deliver a more secure, flexible and scalable data environment. In this session, we’ll share how we leverage Unity Catalog for fine-grained governance and supplement it with Immuta for dynamic policy management, enabling row and column level security at scale. By shifting away from broad, monolithic access controls toward a modern, agile solution, we’re empowering teams to securely collaborate on sensitive data without sacrificing performance or usability. Join us for an inside look at our end-to-end policy architecture, from how data owners declare metadata and author policies to the seamless application of access rules across the platform. We’ll also discuss lessons learned on streamlining data governance, ensuring compliance, and improving user adoption. Whether you’re a data architect, engineer or leader, walk away with actionable strategies to simplify and strengthen your own governance and access practices.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: EDUCATION, PUBLIC SECTOR
Technologies: DATABRICKS WORKFLOWS
Skill Level: BEGINNER
Duration: 40 MIN
To dramatically transform the way citizen services are delivered, organizations must bring all data together — streaming, structured and unstructured — in a secure and governed platform.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: FINANCIAL SERVICES
Technologies: MLFLOW, MOSAIC AI, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: RETAIL AND CPG - FOOD
Technologies: AI/BI, DELTA SHARING, DATABRICKS APPS
Skill Level: BEGINNER
Duration: 60 MIN
Consumer industries are being transformed by AI as physical and digital experiences converge. In this flagship session for retail, travel, restaurants and consumer goods attendees at Data + AI Summit, Databricks and a panel of industry leaders will explore how real-time data and machine learning are enabling brands to gain deeper consumer insights, personalize interactions and move closer to true 1:1 marketing. From AI agents shopping on behalf of consumers to consumer-centric supply chains, discover how the most innovative companies will use AI to reshape customer relationships and drive growth in an increasingly connected world.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Explore how Databricks AI/BI Genie revolutionizes retail analytics, empowering business users to become self-reliant data explorers. This session highlights no-code AI apps that create a conversational interface for retail data analysis. Genie spaces harness NLP and generative AI to convert business questions into actionable insights, bypassing complex SQL queries. We'll showcase retail teams effortlessly analyzing sales trends, inventory and customer behavior through Genie's intuitive interface. Witness real-world examples of AI/BI Genie's adaptive learning, enhancing accuracy and relevance over time. Learn how this technology democratizes data access while maintaining governance via Unity Catalog integration. Discover Retail Genie's impact on decision-making, accelerating insights and cultivating a data-driven retail culture. Join us to see the future of accessible, intelligent retail analytics in action.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
Explore the transformative journey of a regional bank as it modernizes its enterprise data infrastructure amidst the challenges of legacy systems and past mergers and acquisitions. The bank is creating an Enterprise Data Hub using Deloitte's industry experience and the Databricks Data Intelligence Platform to drive growth, efficiency and Large Financial Institution readiness needs. This session will showcase how the new data hub will be a one-stop-shop for LOB and enterprise needs, while unlocking the advanced analytics and GenAI possibilities. Discover how this initiative is going to empower the ambitions of a regional bank to realize their “big bank muscle, small bank hustle.”
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 20 MIN
Learn how Morgan Stanley scaled one of their most significant regulatory calculators (SACCR) by leveraging Databricks for horizontal and vertical scaling. Discover how we harnessed Databricks to improve performance and calculation accuracy, strengthen regulatory compliance and more.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, MLFLOW, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us to explore how Standard Chartered Bank's (SCB) groundbreaking strategy is reshaping the future of the cybersecurity landscape by replacing traditional SIEM with a cutting-edge Databricks solution, achieving remarkable business outcomes. This session unveils SCB's journey to a distributed, multi-cloud lakehouse architecture that unlocks unprecedented performance and commercial optimization. Explore why a unified data and AI platform is becoming the cornerstone of next-generation, self-managed SIEM solutions for forward-thinking organizations in this era of AI-powered banking transformation.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Deloitte and GM Financial (General Motors Financial) have collaborated to design and implement a cutting-edge cloud analytics platform, leveraging Databricks. In this session, we will explore how we overcame challenges including dispersed and limited data capabilities, high-cost hardware and outdated software, with a strategic and comprehensive approach. With the help of Deloitte and Databricks, we were able to develop a unified Customer360 view, integrate advanced AI-driven analytics, and establish robust data governance and cyber security measures. Attendees will gain valuable insights into the benefits realized, such as cost savings, enhanced customer experiences, and broad employee upskilling opportunities. Unlock the impact of cloud data modernization and advanced analytics in the automotive finance industry and beyond with Deloitte and Databricks.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: FINANCIAL SERVICES
Technologies: MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
The insurance industry is rapidly evolving as advances in data and artificial intelligence (AI) drive innovation, enabling more personalized customer experiences, streamlined operations, and improved efficiencies. With powerful data analytics and AI-driven solutions, insurers can automate claims processing, enhance risk management, and make real-time decisions. Leveraging insights from large and complex datasets, organizations are delivering more customer-centric products and services than ever before. Key takeaways: real-world applications of data and AI in claims automation, underwriting and customer engagement; how predictive analytics and advanced data modeling help anticipate risks and meet customer needs; and personalization of policies, optimized pricing and more efficient workflows for greater ROI. Discover how data and AI are fueling growth, improving protection, and shaping the future of the insurance industry!
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES, MANUFACTURING
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
In this session we will explore the revolutionary advancements in nuclear AI capabilities with HiVE and Bertha on Databricks architecture. HiVE, developed by Westinghouse, leverages over a century of proprietary data to deliver unparalleled AI capabilities. At its core is Bertha, a generative AI model designed to tackle the unique challenges of the nuclear industry. This session will delve into the technical architecture of HiVE and Bertha, showcasing how Databricks' scalable environment enhances their performance. We will discuss the secure data infrastructure supporting HiVE, ensuring data integrity and compliance. Real-world applications and use cases will demonstrate the impact of HiVE and Bertha on improving efficiency, innovation and safety in nuclear operations. Discover how the fusion of HiVE and Bertha with Databricks architecture is transforming the nuclear AI landscape and driving the future of nuclear technology.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: RETAIL AND CPG - FOOD
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session will provide an in-depth overview of how PepsiCo, a global leader in food and beverage, transformed its outdated data platform into a modern, unified and centralized data and AI-enabled platform using the Databricks SQL serverless environment. Through three distinct implementations that transpired at PepsiCo in 2024, we will demonstrate how the PepsiCo Data Analytics & AI Group unlocked pivotal capabilities that facilitated the delivery of diverse data-driven insights to the business, reduced operational expenses and enhanced overall performance through the newly implemented platform.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: MLFLOW, DSPY, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
As companies increasingly adopt Generative AI, they're faced with a new challenge: managing multiple AI assistants. What if you could have a single, intuitive interface that automatically directs questions to the best assistant for the task? Join us to discover how to implement a flexible Routing Agent that streamlines working with multiple AI Assistants. We'll show you how to leverage Databricks and DSPy 3.0 to simplify adding this powerful pattern to your system. We'll dive into the essential aspects and share real-world examples that you can apply today. You'll leave with the knowledge to make your AI system run smoothly and efficiently.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: EDUCATION, PROFESSIONAL SERVICES, PUBLIC SECTOR
Technologies: DELTA LAKE
Skill Level: BEGINNER
Duration: 20 MIN
Join us for an in-depth Ask Me Anything (AMA) on how Rust is revolutionizing Lakehouse formats like Delta Lake and Apache Iceberg through projects like delta-rs and iceberg-rs! Discover how Rust’s memory safety, zero-cost abstractions and fearless concurrency unlock faster development and higher-performance data operations. Whether you’re a data engineer, Rustacean or Lakehouse enthusiast, bring your questions on how Rust is shaping the future of open table formats!
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, RETAIL AND CPG - FOOD
Technologies: APACHE SPARK
Skill Level: INTERMEDIATE
Duration: 40 MIN
While Spark offers powerful processing capabilities for massive data volumes, cost efficiency remains a persistent challenge for users operating at large scale. At Pinterest, where we run millions of Spark jobs monthly, maintaining infra cost efficiency is crucial to support our rapid business growth. To tackle this challenge, we have developed several strategies that have saved us tens of millions of dollars across numerous job instances. We will share our analytical methodology for identifying performance bottlenecks, and the technical solutions to overcome various challenges. Our approach includes extracting insights from billions of collected metrics and leveraging remote shuffle services to address shuffle slowness, improve memory utilization and reduce costs while hosting hundreds of millions of pods. The presentation aims to spark broader discussion of Apache Spark cost efficiency in the community and help practitioners tackle this common challenge.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Unlock Genie's full potential with best practices for curating, deploying and monitoring Genie spaces at scale. This session offers a deep dive into the latest enhancements and provides practical guidance on designing high-quality spaces, streamlining deployment workflows and implementing robust monitoring to ensure accuracy and performance in production. Ideal for teams aiming to scale conversational analytics, you’ll leave with actionable strategies to keep your Genie spaces efficient, reliable and aligned with business outcomes.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, MLFLOW, PYTORCH
Skill Level: ADVANCED
Duration: 20 MIN
Coinbase leverages Databricks to scale ML on blockchain data, turning vast transaction networks into actionable insights. This session explores how Databricks’ scalable infrastructure, powered by Delta Lake, enables real-time processing for ML applications like NFT floor price predictions. We’ll show how GraphFrames helps us analyze billion-node transaction graphs (e.g., Bitcoin) for clustering and fraud detection, uncovering structural patterns in blockchain data. But traditional graph analytics has limits. We’ll go further with Graph Neural Networks (GNNs) using Kumo AI, which learn from the transaction network itself rather than relying on hand-engineered features. By encoding relationships directly into the model, GNNs adapt to new fraud tactics, capturing subtle relationships that evolve over time. Join us to see how Coinbase is advancing blockchain ML with Databricks and deep learning on graphs.
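The clustering step mentioned above, GraphFrames' connected components over a transaction graph, can be illustrated without Spark by a toy union-find. The addresses and edges below are made up for illustration; GraphFrames provides the same grouping at billion-node scale.

```python
# Toy connected components via union-find: wallets linked by transactions
# end up in the same cluster, a building block for fraud-ring detection.

def connected_components(edges):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)  # union the two clusters
    return {node: find(node) for node in parent}

edges = [("addr1", "addr2"), ("addr2", "addr3"), ("addr4", "addr5")]
components = connected_components(edges)

assert components["addr1"] == components["addr3"]  # same transaction cluster
assert components["addr1"] != components["addr4"]  # unrelated cluster
```

Hand-engineered structural features like these cluster ids are exactly what the GNN approach described above learns to supersede by encoding relationships directly into the model.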
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
We discuss two real-world use cases in big data engineering, focusing on constructing stable pipelines and managing storage at a petabyte scale. The first use case highlights the implementation of Delta Lake to optimize data pipelines, resulting in an 80% reduction in query time and a 70% reduction in storage space. The second use case demonstrates the effectiveness of the Workflows ‘ForEach’ operator in executing compute-intensive pipelines across multiple clusters, significantly reducing processing time from months to days. This approach involves a reusable design pattern that isolates notebooks into units of work, enabling data scientists to test and develop independently.
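The ForEach pattern fans a parameterized unit of work out over many inputs. This pure-Python thread-pool sketch mimics that shape only; in Databricks Workflows each iteration would be a separate task run, potentially on its own cluster, and `run_unit` is a hypothetical stand-in for one isolated notebook.

```python
# Fan-out of independent units of work, in the spirit of Workflows ForEach.
from concurrent.futures import ThreadPoolExecutor

def run_unit(region: str) -> tuple:
    # A real unit of work would execute a parameterized notebook for this input.
    return region, f"processed-{region}"

regions = ["us-east", "us-west", "eu-central"]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = dict(pool.map(run_unit, regions))  # units run concurrently

assert results["eu-central"] == "processed-eu-central"
assert len(results) == len(regions)
```

Because each unit is isolated, a single failed iteration can be retried without rerunning the whole fan-out, which is part of what makes the design pattern reusable and testable.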
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: FINANCIAL SERVICES
Technologies: LLAMA, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
With massive data volume and complexity, scaling data governance became a significant challenge. Centralizing metadata management, ensuring regulatory compliance and controlling data access across multiple platforms proved critical to maintaining efficiency and trust.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: FINANCIAL SERVICES
Technologies: DATABRICKS SQL, DATABRICKS WORKFLOWS, DLT
Skill Level: ADVANCED
Duration: 40 MIN
In this session, discover how National Australia Bank (NAB) is reshaping its data and AI strategy by positioning data as a strategic enabler. Driven by a vision to unlock data like electricity—continuous and reliable—NAB has established a scalable foundation for data intelligence that balances agility with enterprise-grade control. We'll delve into the key architectural, security, and governance capabilities underpinning this transformation, including Unity Catalog, Serverless, Lakeflow and GenAI. The session will highlight NAB's adoption of Databricks Serverless, platform security controls like private link, and persona-based data access patterns. Attendees will walk away with practical insights into building secure, scalable, and cost-efficient data platforms that fuel innovation while meeting the demands of compliance in highly regulated environments.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK
Skill Level: INTERMEDIATE
Duration: 20 MIN
Zillow has well-established, comprehensive systems for defining and enforcing data quality contracts and detecting anomalies. In this session, we will share how we evaluated Databricks’ native data quality features and why we chose expectations for our Lakeflow Declarative Pipelines, along with a combination of enforced constraints and self-defined queries for other job types. Our evaluation considered factors such as performance overhead, cost and scalability. We’ll highlight key improvements over our previous system and demonstrate how these choices have enabled Zillow to enforce scalable, production-grade data quality. Additionally, we are actively testing Databricks’ latest data quality innovations, including enhancements to lakehouse monitoring and the newly released DQX project from Databricks Labs. In summary, we will cover Zillow’s approach to data quality in the lakehouse, key lessons from our migration and actionable takeaways.
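An expectation pairs a named row-level predicate with a violation policy, such as dropping failing rows or failing the batch. The following is a toy stand-in for that concept only, not the actual expectations API; the rows and constraint are invented for illustration.

```python
# Toy data-quality expectation: apply a predicate to every row and act on
# violations according to a policy ('drop' keeps passing rows, 'fail' aborts).

def apply_expectation(rows, predicate, on_violation="drop"):
    passed = [r for r in rows if predicate(r)]
    failed = len(rows) - len(passed)
    if failed and on_violation == "fail":
        raise ValueError(f"{failed} rows violated the expectation")
    return passed, failed

rows = [{"price": 10.0}, {"price": -1.0}, {"price": 3.5}]
clean, violations = apply_expectation(rows, lambda r: r["price"] >= 0)

assert violations == 1                         # the malformed row was counted
assert all(r["price"] >= 0 for r in clean)     # and dropped from the output
```

Tracking the violation count alongside the cleaned output is what makes the pattern observable: quality metrics can be monitored per pipeline run rather than discovered downstream.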
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: MANUFACTURING
Technologies: DELTA LAKE, MLFLOW, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
At Nikon, camera accessories are essential in meeting the diverse needs of professional photographers worldwide, making their timely availability a priority. Forecasting accessories, however, presents unique challenges including dependencies on parent products, sparse demand patterns, and managing predictions for thousands of items across global subsidiaries. To address this, we leveraged Databricks' unified data and AI platform to develop and deploy an automated, scalable solution for accessory sales planning. Our solution employs a hybrid approach that auto-selects the best algorithm from a suite of ML and time-series models, incorporating anomaly detection and methods to handle sparse and low-demand scenarios. MLflow is utilized to automate model logging and versioning, enabling efficient management and scalable deployment. The framework includes data preparation, model selection and training, performance tracking, prediction generation, and output processing for downstream systems.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: EDUCATION, MEDIA AND ENTERTAINMENT
Technologies: DELTA LAKE, DATA MARKETPLACE, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 20 MIN
This lightning talk dives into real-world GenAI projects that scaled from prototype to production using Databricks’ fully managed tools. Facing cost and time constraints, we leveraged four key Databricks features—Workflows, Model Serving, Serverless Compute, and Notebooks—to build an AI inference pipeline processing millions of documents (text and audiobooks). This approach enables rapid experimentation, easy tuning of GenAI prompts and compute settings, seamless data iteration and efficient quality testing—allowing Data Scientists and Engineers to collaborate effectively. Learn how to design modular, parameterized notebooks that run concurrently, manage dependencies and accelerate AI-driven insights. Whether you're optimizing AI inference, automating complex data workflows or architecting next-gen serverless AI systems, this session delivers actionable strategies to maximize performance while keeping costs low.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, DATABRICKS SQL, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Curious how to apply resource-intensive generative AI models across massive datasets without breaking the bank? This session reveals efficient batch inference strategies for foundation models on Databricks. Learn how to architect scalable pipelines that process large volumes of data through LLMs, text-to-image models and other generative AI systems while optimizing for throughput, cost and quality. You'll discover how to process any scale of data through your generative AI models efficiently.
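One common throughput lever in batch inference is chunking: split the input corpus into batches and push each batch through the model to amortize per-request overhead. A minimal sketch of that pattern, with `call_model` as a stub standing in for a real serving endpoint:

```python
# Chunk a document corpus and run a (stubbed) model over each batch.

def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def call_model(batch):
    # Stub: a real call would hit an LLM or text-to-image serving endpoint.
    return [f"summary of {doc}" for doc in batch]

docs = [f"doc-{i}" for i in range(10)]
outputs = [out for batch in batched(docs, 4) for out in call_model(batch)]

assert len(outputs) == len(docs)
assert outputs[0] == "summary of doc-0"
```

Batch size then becomes a tunable knob: larger batches raise throughput until memory or endpoint limits push latency and failure rates up, which is the trade-off a cost-aware pipeline balances.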
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, DELTA LAKE
Skill Level: INTERMEDIATE
Duration: 40 MIN
Adobe’s Real-Time Customer Data Platform relies on the identity graph to connect over 70 billion identities and deliver personalized experiences. This session will showcase how the platform leverages Databricks, Spark Streaming and Delta Lake, along with 25+ Databricks deployments across multiple regions and clouds — Azure & AWS — to process terabytes of data daily and handle over a million records per second. The talk will highlight the platform’s ability to scale, demonstrating a 10x increase in ingestion pipeline capacity to accommodate peak traffic during events like the Super Bowl. Attendees will learn about the technical strategies employed, including migrating from Flink to Spark Streaming, optimizing data deduplication, and implementing robust monitoring and anomaly detection. Discover how these optimizations enable Adobe to deliver real-time identity resolution at scale while ensuring compliance and privacy.
Type: LIGHTNING TALK
Track: DATA SHARING AND COLLABORATION
Industry: FINANCIAL SERVICES
Technologies: DATABRICKS SQL, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Master Data Management (MDM) is the foundation of a successful enterprise data strategy — delivering consistency, accuracy and trust across all systems that depend on reliable data. But how can organizations integrate trusted third-party data to enhance their MDM frameworks? How can they ensure that this master data is securely and efficiently shared across internal platforms and external ecosystems? This session explores how Dun & Bradstreet’s pre-mastered data serves as a single source of truth for customers, suppliers and vendors — reducing duplication and driving alignment across enterprise systems. With Delta Sharing, organizations can natively ingest Dun & Bradstreet data into their Databricks environment and establish a scalable, interoperable MDM framework. Delta Sharing also enables secure, real-time distribution of master data across the enterprise, ensuring that every system operates from a consistent and trusted foundation.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: MEDIA AND ENTERTAINMENT
Technologies: APACHE SPARK, DELTA LAKE, MLFLOW
Skill Level: INTERMEDIATE
Duration: 40 MIN
At DraftKings, ensuring secure, fair gaming requires detecting fraud in real time with both speed and precision. In this talk, we’ll share how Databricks powers our fraud detection pipeline, integrating real-time streaming, machine learning and rule-based detection within a PySpark framework. Our system enables rapid model training, real-time inference and seamless feature transformation across historical and live data. We use shadow mode to test models and rules in live environments before deployment. Collaborating with Databricks, we push online feature store performance and enhance real-time PySpark capabilities. We'll cover PySpark-based feature transformations, real-time inference, scaling challenges and our migration from a homegrown system to Databricks. This session is for data engineers and ML practitioners optimizing real-time AI workloads, featuring a deep dive, code snippets and lessons from building and scaling fraud detection.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, LLAMA, MOSAIC AI
Skill Level: BEGINNER
Duration: 40 MIN
In this session, discover how Databricks leverages the power of Gen AI, MosaicML, Model Serving and Databricks Apps to revolutionize sales enablement. We’ll showcase how we built an advanced chatbot that equips our go-to-market team with the tools and knowledge needed to excel in customer-facing interactions. This AI-driven solution not only trains our salespeople but also enhances their confidence and effectiveness in demonstrating the transformative potential of Databricks to future customers. Attendees will gain insights into the architecture, development process and practical applications of this innovative approach. The session will conclude with an interactive demo, offering a firsthand look at the chatbot in action. Join us to explore how Databricks is using its own platform to drive sales excellence through cutting-edge AI solutions.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: ADVANCED
Duration: 40 MIN
Learn from the experts on how Databricks’ Mosaic AI Model Serving delivers unparalleled speed and scalability for deploying AI models. This session delves into the architecture and innovations that showcase the impressive improvements in throughput for the AI-serving infrastructure that powers Mosaic AI.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: DATA MARKETPLACE, AI/BI, MOSAIC AI
Skill Level: BEGINNER
Duration: 40 MIN
Growth in banking isn’t just about keeping pace — it’s about setting the pace. This session explores how leading banks leverage Databricks’ Data Intelligence Platform to uncover new revenue opportunities, deepen customer relationships, and expand market reach. Hear from industry leaders who have transformed their growth strategies by harnessing the power of advanced analytics and machine learning. Learn how personalized customer experiences, predictive insights and unified data platforms are driving innovation and helping banks scale faster than ever. Key takeaways: Join us in discovering how data intelligence is redefining growth in banking and helping banks thrive through uncertainty.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD
Technologies: AI/BI, DATABRICKS SQL, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Managing metrics across teams can feel like everyone’s speaking a different language, which often leads to loss of trust in numbers. Based on a real-world use case, we’ll show you how to establish a governed source of truth for metrics that works at scale and builds a solid foundation for AI integration. You’ll explore how Bolt.eu’s data team governs consistent metrics for different data users and leverages Euno’s automations to navigate the overlap between Looker and dbt. We’ll cover best practices for deciding where your metrics belong and how to optimize engineering and maintenance workflows across Databricks, dbt and Looker. For curious analytics engineers, we’ll dive into thinking in dimensions & measures vs. tables & columns and determining when pre-aggregations make sense. The goal is to help you contribute to a self-serve experience with consistent metric definitions, so business teams and AI agents can access the right data at the right time without endless back-and-forth.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: APACHE SPARK
Skill Level: INTERMEDIATE
Duration: 40 MIN
XGBoost is one of the most widely used off-the-shelf gradient-boosting libraries for analyzing tabular datasets. Unlike deep learning, gradient-boosted decision trees require the entire dataset to be in memory for efficient model training. To overcome this limitation, XGBoost features a distributed out-of-core implementation that fetches data in batches, which benefits significantly from the latest NVIDIA GPUs and NVLink-C2C’s ultra-high bandwidth. In this talk, we will share our work on optimizing XGBoost using the Grace Blackwell superchip. The fast chip-to-chip link between the CPU and the GPU enables XGBoost to scale up without compromising performance. Our work has effectively increased XGBoost’s training capacity to over 1.2TB on a single node. The approach is scalable to GPU clusters using Spark, enabling XGBoost to handle terabytes of data efficiently. We will demonstrate combining XGBoost’s out-of-core algorithms with Spark Connect ML from Spark 4.0 for large model training workflows.
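The out-of-core idea — the trainer pulls one batch at a time instead of materializing the full dataset — follows the iterator contract of XGBoost's `DataIter`. The class below is a pure-Python toy of that contract (not the real `xgboost` API; chunks are plain lists purely for illustration):

```python
class ChunkIter:
    """Minimal sketch of the external-memory iterator pattern: the trainer
    pulls one chunk at a time via next() and rewinds with reset(), so the
    full dataset never has to sit in memory at once."""
    def __init__(self, chunks):
        self._chunks = chunks
        self._pos = 0

    def reset(self):
        self._pos = 0

    def next(self, consume):
        if self._pos >= len(self._chunks):
            return False          # signals the end of one pass over the data
        consume(self._chunks[self._pos])
        self._pos += 1
        return True

seen = []
it = ChunkIter([[1, 2], [3, 4], [5]])
while it.next(seen.extend):
    pass                          # all rows streamed batch by batch
print(seen)
```

In the real library the consumer is the booster itself, and the cost of re-streaming each boosting round is what the fast CPU–GPU link helps hide.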
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: PUBLIC SECTOR, TRAVEL AND HOSPITALITY
Technologies: DATABRICKS WORKFLOWS, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Discover how Europe’s third-busiest airport, Schiphol Group, is elevating its data operations by transitioning from a standard Databricks setup to the advanced capabilities of Unity Catalog. In this session, we will share the motivations, obstacles and strategic decisions behind executing a seamless migration in a large-scale environment — one that spans hundreds of workspaces and demands continuous availability. Gain insights into planning and governance, learn how to safeguard data integrity and maintain operational flow, and understand the process of integrating Unity Catalog’s enhanced security and governance features. Attendees will leave with practical lessons from our hands-on experience, proven methods for similar migrations, and a clear perspective on the benefits this transition offers for complex, rapidly evolving organizations.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: EDUCATION, ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: AI/BI
Skill Level: BEGINNER
Duration: 20 MIN
Bryan McCann, You.com’s co-founder and CTO, shares his journey from studying philosophy and meaning to working on groundbreaking AI research alongside Richard Socher in the Stanford Computer Science Department. Right now, AI is reshaping everything we hold dear — our jobs, creativity, and identities. It’s also our greatest source of inspiration. The Age of AI is simultaneously a Renaissance, Enlightenment, Industrial Revolution and likely source of humanity’s greatest existential crisis. To surmount this, Bryan will discuss how he uses AI responses as new starting points rather than answers, building teams like neural networks optimized for learning and how the answer to our meaning crisis may be for humans to be more like AI. Exploring AI’s impact on politics, economics, healthcare, education and culture, Bryan asserts that we must all take part in authoring humanity’s new story — AI can inspire us to become something new, rather than merely replace what we are now.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Bringing AI/BI to every business user starts with getting security, access and governance right. In this session, we’ll walk through the latest best practices for configuring Databricks accounts, setting up workspaces, and managing authentication protocols to enable secure and scalable onboarding. Whether you're supporting a small team or an entire enterprise, you'll gain practical insights to protect your data while ensuring seamless and governed access to AI/BI tools.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: FINANCIAL SERVICES
Technologies: DATA MARKETPLACE, AI/BI, MOSAIC AI
Skill Level: BEGINNER
Duration: 40 MIN
In capital markets, mitigating risk is critical to protecting the firm’s reputation, assets, and clients. This session highlights how firms use technology to enhance risk management, ensure compliance and safeguard operations from emerging threats. Learn how advanced analytics and machine learning models are helping firms detect anomalies, prevent fraud, and manage regulatory complexities with greater precision. Hear from industry leaders who have successfully implemented proactive risk strategies that balance security with operational efficiency. Key takeaways: Don’t miss this session to discover how data intelligence is transforming risk management in capital markets — helping firms secure their future while driving success!
Type: DEEP DIVE
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DATA MARKETPLACE, DELTA SHARING, UNITY CATALOG
Skill Level: ADVANCED
Duration: 90 MIN
This session will focus on the security aspects of Databricks Delta Sharing, Databricks Cleanrooms and Databricks Marketplace, providing an exploration of how these solutions enable secure and scalable data collaboration while prioritizing privacy. Highlights:
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
No description available.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS SQL
Skill Level: BEGINNER
Duration: 20 MIN
Dynamic Insert Overwrite is an important Delta Lake feature that allows fine-grained updates by selectively overwriting specific rows, eliminating the need for full-table rewrites. For example, this capability is essential for: In this lightning talk, we will:
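The semantics can be illustrated without Spark at all. The sketch below is a toy model — not Delta Lake's implementation — mimicking dynamic partition overwrite (the behavior enabled via `spark.sql.sources.partitionOverwriteMode = "dynamic"`): only the partitions that appear in the incoming batch are replaced, and everything else is left untouched.

```python
def dynamic_insert_overwrite(table, incoming):
    """Toy model of dynamic partition overwrite: tables are represented
    as {partition_key: [rows]} dicts purely for illustration. Partitions
    absent from the incoming batch survive unchanged."""
    result = dict(table)
    for part, rows in incoming.items():
        result[part] = list(rows)   # replace just this partition
    return result

table = {"2024-01-01": ["a", "b"], "2024-01-02": ["c"]}
updated = dynamic_insert_overwrite(table, {"2024-01-02": ["c2", "d2"]})
print(updated)
```

Contrast this with static overwrite, which would drop the untouched `2024-01-01` partition as well — the difference that makes dynamic mode safe for incremental backfills.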
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
As autonomous agents become increasingly sophisticated and widely deployed, the ability for these agents to evaluate their own performance and continuously self-improve is essential. However, the growing complexity of these agents amplifies potential risks, including exposure to malicious inputs and generation of undesirable outputs. In this talk, we'll explore how to build resilient, self-improving agents. To drive self-improvement effectively, both the agent and the evaluation techniques must simultaneously improve with a continuously iterating feedback loop. Drawing from extensive real-world experiences across numerous productionized use cases, we will demonstrate practical strategies for combining tools from Arize, Databricks MLflow and Mosaic AI to evaluate and improve high-performing agents.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: RETAIL AND CPG - FOOD
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Assortment and space analytics optimizes product selection and shelf allocation to boost sales, improve inventory management and enhance customer experience. However, challenges like evolving demand, data accuracy and operational alignment hinder success. Older approaches struggled due to siloed tools, slow performance and poor governance. Databricks unified platform resolved these issues, enabling seamless data integration, high-performance analytics and governed sharing. The innovative AI/BI Genie interface empowered self-service analytics, driving non-technical user adoption. This solution helped Walmart cut time to value by 90% and saved $5.6M annually in FTE hours leading to increased productivity. Looking ahead, AI agents will let store managers and merchants execute decisions via conversational interfaces, streamlining operations and enhancing accessibility. This transformation positions retailers to thrive in a competitive, customer-centric market.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: MANUFACTURING
Technologies: MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Marvell’s AI-driven solutions, powered by Databricks’ Data Intelligence Platform, provide a robust framework for secure, compliant and transparent Data and AI workflows leveraging Data & AI Governance through Unity Catalog. Marvell ensures centralized management of data and AI assets with quality, security, lineage and governance guardrails. With Databricks Unity Catalog, Marvell achieves comprehensive oversight of structured and unstructured data, AI models and notebooks. Automated governance policies, fine-grained access controls and lineage tracking help enforce regulatory compliance while streamlining AI development. This governance framework enhances trust and reliability in AI-powered decision-making, enabling Marvell to scale AI innovation efficiently while minimizing risks. By integrating data security, auditability and compliance standards, Marvell is driving the future of responsible AI adoption with Databricks.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
How do you wrangle over 8TB of granular “hit-level” website analytics data with hundreds of columns, all while eliminating the overhead of cluster management, decreasing runtime and saving money? In this session, we’ll dive into how we helped HP Inc. use Databricks serverless compute and Lakeflow Declarative Pipelines to streamline Adobe Analytics data ingestion while making it faster, cheaper and easier to operate. We’ll walk you through our full migration story — from managing unwieldy custom-defined AWS-based Apache Spark™ clusters to spinning up Databricks serverless pipelines and workflows with on-demand scalability and near-zero overhead. If you want to simplify infrastructure, optimize performance and get more out of your Databricks workloads, this session is for you.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS WORKFLOWS, DLT
Skill Level: BEGINNER
Duration: 40 MIN
Discover how Databricks serverless compute revolutionizes data workflows by eliminating infrastructure management, enabling rapid scaling and optimizing costs for Notebooks, Jobs and Lakeflow Declarative Pipelines. This session will delve into the serverless architecture, highlighting its ability to dynamically allocate resources, reduce idle costs and simplify development cycles. Learn about recent advancements, including cost savings and practical strategies for migration and optimization. Tailored for Data Engineers and Architects, this talk will also explore use cases, features, limitations and future roadmap, empowering you to make informed infrastructure decisions while unlocking the full potential of Databricks’ serverless capabilities.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: AI/BI, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
At ServiceNow, we’re not just talking about AI innovation — we’re delivering it. By harnessing the power of Databricks, we’re reimagining Go-To-Market (GTM) strategies, seamlessly integrating AI at every stage of the deal journey — from identifying high-value leads to generating hyper-personalized outreach and pitch materials. In this session, learn how we’ve slashed data processing times by over 90%, reducing workflows from an entire day to just 30 minutes with Databricks. This unprecedented speed enables us to deploy AI-driven GTM initiatives faster, empowering our sellers with real-time insights that accelerate deal velocity and drive business growth. As Agentic AI becomes a game-changer in enterprise GTM, ServiceNow and Databricks are leading the charge — paving the way for a smarter, more efficient future in AI-powered sales.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: AI/BI, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
At Data + AI Summit 2022, Databricks pioneered the term “shift left” to describe how AI workloads would enable less data-science-driven people to create their own apps. In 2025, we take a look at how Experian is doing on that journey. This session highlights the Databricks services that assist with the shift-left paradigm for generative AI, including how AI/BI Genie helps with generative analytics and how Agent Studio helps with synthetic generation of test cases to validate model performance.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, MOSAIC AI
Skill Level: BEGINNER
Duration: 40 MIN
Two industry veterans have been debating data architecture, tearing apart trends and tinkering with tech for decades and they’re bringing the conversation live — and you’re in control. Got a burning question about lake structures or internal performance? Worried about AI taking over the world? Want straight-talking opinions on the latest hype? Need real-world advice from the people who the experts get advice from? Want to get the juicy behind-the-scenes gossip about any announcements and shockwaves from the Keynotes? This is your chance to have your questions answered! Submit your questions ahead of time or bring them on the day — no topic is off-limits (though there's always a risk of side quests into coffee, sci-fi, or the quirks of English weather). Come for the insights, stay for the chaos.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Delta Sharing enables cross-domain sharing of data assets for collaboration. A practical concern providers and recipients face in doing so is the need to manually configure network and storage firewalls. This is particularly challenging for large-scale providers and recipients with strict compliance requirements. In this talk, we will describe our solution to fully eliminate these complexities. This enhances user experience, scalability and security, facilitating seamless data collaboration across diverse environments and cloud platforms.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENERGY AND UTILITIES, MANUFACTURING, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, LAKEFLOW
Skill Level: ADVANCED
Duration: 40 MIN
Data engineering teams are frequently tasked with building bespoke ingest and/or egress solutions for myriad custom, proprietary, or industry-specific data sources or sinks. Many teams find this work cumbersome and time-consuming. Recognizing these challenges, Databricks interviewed numerous companies across different industries to better understand their diverse data integration needs. This comprehensive feedback led us to develop the Python Data Source API for Apache Spark™.
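The shape of the API — plan a set of partitions, then read each partition independently (possibly in parallel) — can be sketched in plain Python. The class below is a toy illustration only; the real interface lives in `pyspark.sql.datasource` (`DataSource`, `DataSourceReader`) and its names and signatures differ:

```python
class RangeSource:
    """Toy illustration of a partitioned source: the engine asks for
    partitions(), then calls read(partition) for each one independently.
    Names here are illustrative, not the real pyspark classes."""
    def __init__(self, start, end, num_partitions):
        self.start, self.end, self.n = start, end, num_partitions

    def partitions(self):
        # Split [start, end) into n contiguous ranges for parallel reads.
        step = (self.end - self.start) // self.n
        bounds = [self.start + i * step for i in range(self.n)] + [self.end]
        return list(zip(bounds[:-1], bounds[1:]))

    def read(self, partition):
        lo, hi = partition
        return [(i,) for i in range(lo, hi)]   # rows as tuples

src = RangeSource(0, 10, 2)
rows = [r for p in src.partitions() for r in src.read(p)]
print(len(rows))
```

Once a custom source follows this contract, Spark handles scheduling the per-partition reads across the cluster, which is what removes most of the bespoke-connector boilerplate the session describes.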
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: DLT, LAKEFLOW
Skill Level: BEGINNER
Duration: 40 MIN
As part of the new Lakeflow data engineering experience, Lakeflow Declarative Pipelines makes it easy to build and manage reliable data pipelines. It unifies batch and streaming, reduces operational complexity and ensures dependable data delivery at scale — from batch ETL to real-time processing. Lakeflow Declarative Pipelines excels at declarative change data capture, batch and streaming workloads, and efficient SQL-based pipelines. In this session, you’ll learn how we’ve reimagined data pipelining with Lakeflow Declarative Pipelines, including: Join us to see how Lakeflow Declarative Pipelines powers better analytics and AI with reliable, unified pipelines.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
The last year has seen the rapid progress of open source GenAI models and frameworks. This talk covers best practices for custom training and OSS GenAI finetuning on Databricks, powered by the newly announced Serverless GPU Compute. We’ll cover how to use Serverless GPU Compute to power AI training/GenAI finetuning workloads and framework support for libraries like LLM Foundry, Composer, HuggingFace, and more. Lastly, we’ll cover how to leverage MLflow and the Databricks Lakehouse to streamline the end-to-end development of these models. Key takeaways include: Join us to learn about the newly announced Serverless GPU Compute and the latest updates to GPU training and finetuning on Databricks!
Type: LIGHTNING TALK
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING
Technologies: APACHE SPARK, DELTA LAKE, DELTA SHARING
Skill Level: INTERMEDIATE
Duration: 20 MIN
At Procore, we're transforming the construction industry through innovative data solutions. This session unveils how we've supercharged our analytics offerings using a unified lakehouse architecture and Delta Sharing, delivering game-changing results for our customers and our business — and showing how data professionals can unlock the full potential of their data assets and drive meaningful business outcomes. Key highlights:
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MANUFACTURING
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, AI/BI, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
As connected vehicles generate vast amounts of personal and sensitive data, ensuring privacy and security in machine learning (ML) processes is essential. This session explores how Trusted Execution Environments (TEEs) and Azure Confidential Computing can enable privacy-preserving ML in cloud environments. We’ll present a method to recreate a vehicle environment in the cloud, where sensitive data remains private throughout model training, inference and deployment. Attendees will learn how Mercedes-Benz R&D North America builds secure, privacy-respecting personalized systems for the next generation of connected vehicles.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: EDUCATION, ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD
Technologies: DATABRICKS WORKFLOWS, UNITY CATALOG, DATABRICKS APPS
Skill Level: BEGINNER
Duration: 40 MIN
A successful data strategy requires the right platform and the ability to empower the broader user community by creating simple, scalable and secure patterns that lower the barrier to entry while ensuring robust data practices. Guided by the belief that everyone is a data person, we focus on breaking down silos, democratizing access and enabling distributed teams to contribute through a federated "data-as-a-product" model. We’ll share the impact and lessons learned in creating a single source of truth on Unity Catalog, consolidated from diverse sources and cloud platforms. We’ll discuss how we streamlined governance with Databricks Apps, Workflows and native capabilities, ensuring compliance without hindering innovation. We’ll also cover how we maximize the value of that catalog by leveraging semantics to enable trustworthy, AI-driven self-service in AI/BI dashboards and downstream apps. Come learn how we built a next-gen data ecosystem that empowers everyone to be a data person.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Do you have users that wear multiple hats over a day? Like working with data from various customers and hoping they don’t inadvertently aggregate data? Or are they working on sensitive datasets, such as clinical trials, that should not be combined, or on data sets that are subject to regulations? We have a solution! In this session, we will present a new capability that allows users wearing multiple hats to switch roles in the Databricks workspace to work exclusively on a dedicated project, the data of a particular client or a clinical trial. When switching to a particular role, the workspace adapts so that only the workspace objects and UC data of that particular role are accessible. We will also showcase the administrative experience of setting up exclusive access using groups and UC permissions.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
AI in healthcare has a data problem. Fragmented data remains one of the biggest challenges, and bottlenecks the development and deployment of AI solutions across life sciences, payers, and providers. Legacy paper-driven workflows and fragmented technology perpetuate silos, making it difficult to create a comprehensive, real-time picture of patient health. Datavant is leveraging Databricks and AWS technology to solve this problem at scale. Through our partnership with Databricks, we are centralizing storage of clinical data from what is arguably the largest health data network so that we can transform it into structured, AI-ready data – and shave off 80 percent of the work of deploying a new AI use case. Learn how we are handling the complexity of this effort while preserving the integrity of source data. We’ll also share early use cases now available to our healthcare customers.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, DATABRICKS SQL, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 20 MIN
Traditionally, spam emails are messages a user does not want, containing some kind of threat like phishing. Because of this, detection systems can focus on malicious content or sender behavior. List bombing upends this paradigm. By abusing public forms such as marketing signups, attackers can fill a user's inbox with high volumes of legitimate mail. These emails don't contain threats, and each sender is following best practices to confirm the recipient wants to be subscribed, but the net effect for an end user is their inbox being flooded with dozens of emails per minute. This talk covers the exploration and implementation behind identifying this attack in our company's anti-spam telemetry: from reading and writing to Kafka, Delta table streaming for ETL workflows, multi-table liquid clustering design for efficient table joins, curating gold tables to speed up critical queries, and using Delta tables as an auditable integration point for interacting with external services.
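Because each individual message looks legitimate, detection has to key on aggregate volume rather than content. A minimal sketch of that idea — the window size and threshold below are illustrative, not the production values — might look like:

```python
from collections import defaultdict, deque

def flag_list_bombing(events, window_seconds=60, threshold=12):
    """Flag recipients receiving more than `threshold` signup-style
    emails inside a sliding time window. events: (recipient, ts) pairs."""
    recent = defaultdict(deque)      # recipient -> timestamps in window
    flagged = set()
    for recipient, ts in sorted(events, key=lambda e: e[1]):
        q = recent[recipient]
        q.append(ts)
        while q and ts - q[0] > window_seconds:
            q.popleft()              # drop timestamps outside the window
        if len(q) > threshold:
            flagged.add(recipient)
    return flagged

# 20 emails to "victim" in 40 seconds; 3 to "normal" over 3 minutes
events = [("victim", t * 2) for t in range(20)] + \
         [("normal", t * 60) for t in range(3)]
print(flag_list_bombing(events))
```

In a streaming setting the same per-recipient windowed count would be maintained incrementally (e.g., over a Kafka feed) rather than recomputed from sorted history.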
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENERGY AND UTILITIES, MANUFACTURING
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Real-time data is one of the most important datasets for any Data and AI Platform across any industry. Spark 4.0 and Delta 4.0 include new features that make ingestion and querying of real-time data better than ever before. Features such as: In this presentation, you will learn how data teams can leverage these latest features to build industry-leading, real-time data products using Spark and Delta, with real-world examples and metrics of the improvements they deliver in performance and processing of real-time data.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK
Skill Level: INTERMEDIATE
Duration: 40 MIN
What if you could run Spark jobs without worrying about clusters, versions and upgrades? Did you know Spark has this functionality built in today? Join us to take a look at this functionality — Spark Connect. We’ll dig into how Spark Connect works — abstracting Spark clusters away in favor of the DataFrame API and unresolved logical plans. You will learn some of the cool things Spark Connect unlocks, including:
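The core idea — the client builds an unresolved logical plan and ships it to a server that resolves and executes it — can be sketched in a few lines of plain Python. This is a toy model only; the actual Spark Connect protocol encodes plans as protobuf messages over gRPC:

```python
# Client side: build an unresolved plan as plain data — no cluster,
# no Spark JVM, nothing to resolve yet.
plan = {"op": "filter", "predicate": ("col", "age", ">", 30),
        "child": {"op": "scan", "table": "people"}}

# Server side: the only place that knows the actual data and can
# resolve table names and column references.
TABLES = {"people": [{"name": "ann", "age": 34}, {"name": "bo", "age": 21}]}

def execute(node):
    if node["op"] == "scan":
        return TABLES[node["table"]]
    if node["op"] == "filter":
        _, col, op, val = node["predicate"]
        rows = execute(node["child"])
        return [r for r in rows if r[col] > val]  # only ">" in this sketch
    raise ValueError(f"unknown op: {node['op']}")

result = execute(plan)
print(result)
```

Because the client holds only a plan description, it can be a thin library in any language and on any version, which is what decouples client upgrades from cluster upgrades.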
Type: LIGHTNING TALK
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT
Technologies: APACHE SPARK
Skill Level: INTERMEDIATE
Duration: 20 MIN
At LinkedIn, we manage over 400,000 daily Spark applications consuming 200+ PBHrs of compute daily. To address the challenges posed by manual configuration of Spark's memory tuning options, which led to low memory utilization and frequent OOM errors, we developed an automated Spark executor memory right-sizing system. Our approach, utilizing a policy-based system with nearline and real-time feedback loops, automates memory tuning, leading to more efficient resource allocation, improved user productivity and increased job reliability. By leveraging historical data and real-time error classification, we dynamically adjust memory, significantly narrowing the gap between allocated and utilized resources while reducing failures. This initiative has achieved a 13% increase in memory utilization and a 90% drop in OOM-related job failures, saving us thousands of PBHrs of compute every year.
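The feedback-loop policy can be sketched abstractly. The function below is an illustrative right-sizing rule under assumed parameters (size to the observed p95 peak, 20% headroom, a 1.5x bump after an OOM) — not LinkedIn's actual policy:

```python
def recommend_executor_memory(peak_usages_mb, had_oom, headroom=1.2,
                              bump=1.5, floor_mb=1024):
    """Illustrative right-sizing rule: target the p95 of observed peak
    usage plus headroom, and apply a multiplicative bump when the
    previous run hit an OOM — mirroring a nearline feedback loop."""
    peaks = sorted(peak_usages_mb)
    p95 = peaks[min(len(peaks) - 1, int(0.95 * len(peaks)))]
    recommended = p95 * headroom
    if had_oom:
        recommended *= bump       # real-time correction after a failure
    return max(int(recommended), floor_mb)

# ten runs peaking around 4 GB, no OOM -> sized near p95 plus headroom
print(recommend_executor_memory([4096 + 100 * i for i in range(10)],
                                had_oom=False))
```

A production system would layer error classification on top (only bumping for genuine memory exhaustion, not unrelated failures), which is the part that makes the loop safe to automate.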
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: MANUFACTURING, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Agentic AI represents a quantum leap beyond generative AI—enabling systems to make autonomous decisions and act independently. While this unlocks transformative potential, it also brings complex governance challenges. This session explores novel risks, practical strategies, and proven Data & AI governance frameworks for governing agentic AI at scale.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 20 MIN
This session presents an intelligent, action-driven approach to bridging Data Engineering and AI/ML workflows, delivering continuous data trust through comprehensive monitoring, validation, and remediation across the entire Databricks data lifecycle. Learn how Acceldata’s Agentic Data Management (ADM) platform:
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: FINANCIAL SERVICES
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Modern insurers require agile, integrated data systems to harness AI. This framework for a global insurer uses Azure Databricks to unify legacy systems into a governed lakehouse medallion architecture (bronze/silver/gold layers), eliminating silos and enabling real-time analytics. By combining Databricks’ distributed infrastructure with Azure’s security, the insurer achieves regulatory compliance while enabling AI-driven innovation (e.g., underwriting, claims). The framework establishes a future-proof foundation for mergers and acquisitions (M&A) and cross-functional data products, balancing governance with agility.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this talk, we will explore the transformative potential of Generative AI and Agentic AI in driving enterprise-scale innovation and delivering substantial business value. As organizations increasingly recognize the power of AI to move beyond automation toward true augmentation and intelligent decision-making, understanding the nuances of scaling these advanced AI paradigms becomes critical. We will delve into practical strategies for deploying, managing, and optimizing Agentic AI frameworks, showcasing how autonomous, goal-directed AI systems can unlock new efficiencies, enhance customer experiences, and foster continuous innovation. Through real-world case studies and actionable insights, attendees will gain a comprehensive understanding of the key considerations in architecting, implementing, and measuring the ROI of large-scale Generative and Agentic AI initiatives, positioning their enterprises for sustained growth and competitive advantage in the AI-first era.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: EDUCATION, ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES
Technologies: DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
There’s never been a more critical time to ensure data and analytics foundations can deliver the value and efficiency needed to accelerate and scale AI. What are the most difficult challenges that organizations face with data transformation, and what technologies, processes, and decisions overcome these barriers to success? Join this session featuring executives from the Gates Foundation, the nonprofit leading change in communities around the globe, and Avanade, the joint venture between Accenture and Microsoft, in a discussion about impactful data strategy. Learn about the Gates Foundation’s approach to its enterprise data platform, which ensures trusted insights at the speed of today’s business. And we’ll share lessons learned from Avanade’s work helping organizations around the globe build with Databricks and seize the AI opportunity.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: HEALTH AND LIFE SCIENCES, PUBLIC SECTOR
Technologies: DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 20 MIN
One of the largest and trailblazing U.S. states is setting a new standard for how governments can harness data and AI to drive large-scale impact. In this session, we will explore how we are using the Databricks Data Intelligence Platform to address two of the state's most pressing challenges: public health and transportation. From vaccine tracking powered by intelligent record linkage and a service-oriented analytics architecture, to Gen AI-driven insights that reduce traffic fatalities and optimize infrastructure investments, this session reveals how scalable, secure, and real-time data solutions are transforming state operations. Join us to learn how data-driven governance is delivering better outcomes for millions—and paving the way for an AI-enabled, data-driven, and more responsive government.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: HEALTH AND LIFE SCIENCES, MANUFACTURING, FINANCIAL SERVICES
Technologies: AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
As organizations scale AI initiatives on platforms like Databricks, one challenge remains: bridging the gap between the data in the lakehouse and the vast, distributed data that lives elsewhere. Turning massive volumes of technical metadata into trusted, business-ready insight requires more than cataloging what's inside the lakehouse—it demands true enterprise-wide intelligence. Actian CTO Emma McGrattan will explore how combining Databricks Unity Catalog with the Actian Data Platform extends visibility, governance, and trust beyond the lakehouse. Learn how leading enterprises are:
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, APACHE ICEBERG, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 20 MIN
In this session, discover how effective data movement is foundational to successful GenAI implementations. As organizations rush to adopt AI technologies, many struggle with the infrastructure needed to manage the massive influx of unstructured data these systems require. Jim Kutz, Head of Data at Airbyte, draws from 20+ years of experience leading data teams at companies like Grafana, CircleCI, and BlackRock to demonstrate how modern data movement architectures can enable secure, compliant GenAI applications. Learn practical approaches to data sovereignty, metadata management, and privacy controls that transform data governance into an enabler for AI innovation. This session will explore how you can securely leverage your most valuable asset—first-party data—for GenAI applications while maintaining complete control over sensitive information. Walk away with actionable strategies for building an AI-ready data infrastructure that balances innovation with governance requirements.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, TRAVEL AND HOSPITALITY, FINANCIAL SERVICES
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
In the era of data-driven enterprises, true democratization requires more than just access—it demands context, trust, and governance at scale. In this session, discover how to seamlessly integrate Databricks Unity Catalog with Alation’s Enterprise Data Catalog to deliver:
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: RETAIL AND CPG - FOOD
Technologies: DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
GoTo Foods, the platform company behind brands like Auntie Anne’s, Cinnabon, Jamba, and more, set out to turn a fragmented data landscape into a high-performance customer intelligence engine. In this session, CTO Manuel Valdes and Director of Marketing Technology Brett Newcome share how they unified data using Databricks Delta Sharing and Amperity’s Customer Data Cloud to speed up time to market. As part of GoTo’s broader strategy to support its brands with shared enterprise tools, the team:
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: FINANCIAL SERVICES
Technologies: DATABRICKS SQL, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
As insurers increasingly leverage IoT data to personalize policy pricing, reconciling disparate datasets across devices, policies, and insurers becomes mission-critical. In this session, learn how Nationwide transitioned from prototype workflows in Dataiku to a hardened data stack on Databricks, enabling scalable data governance and high-impact analytics. Discover how the team orchestrates data reconciliation across Postgres, Oracle, and Databricks to align customer driving behavior with insurer and policy data—ensuring more accurate, fair discounts for policyholders. With Anomalo’s automated monitoring layered on top, Nationwide ensures data quality at scale while empowering business units to define custom logic for proactive stewardship. We’ll also look ahead to how these foundations are preparing the enterprise for unstructured data and GenAI initiatives.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS WORKFLOWS
Skill Level: BEGINNER
Duration: 20 MIN
The role of data teams and data engineers is evolving. No longer just pipeline builders or dashboard creators, today’s data teams must evolve to drive business strategy, enable automation, and scale with growing demands. Best practices seen in the software engineering world (Agile development, CI/CD, and infrastructure-as-code) from the DevOps movement are gradually making their way into data engineering. We believe these changes have led to the rise of DataOps and a new wave of best practices that will transform the discipline of data engineering. But how do you transform a reactive team into a proactive force for innovation? We’ll explore the key principles for building a resilient, high-impact data team—from structuring for collaboration, testing, and automation to leveraging modern orchestration tools. Whether you’re leading a team or looking to future-proof your career, you’ll walk away with actionable insights on how to stay ahead in the rapidly changing data landscape.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, TRAVEL AND HOSPITALITY, FINANCIAL SERVICES
Technologies: AI/BI
Skill Level: BEGINNER
Duration: 40 MIN
Airflow 3 is here, bringing a new era of flexibility, scalability, and security to data orchestration. This release makes building, running, and managing data pipelines easier than ever. In this session, we will cover the key benefits of Airflow 3, including: (1) Ease of use: Airflow 3 rethinks the user experience—from an intuitive, upgraded UI to DAG versioning and scheduler-integrated backfills that let teams manage pipelines more effectively than ever before. (2) Stronger security: By decoupling task execution from direct database connections, Airflow 3 enforces task isolation and minimal-privilege access. This meets stringent compliance standards while reducing the risk of unauthorized data exposure. (3) Ultimate flexibility: Run tasks anywhere, anytime with remote execution and event-driven scheduling. Airflow 3 is designed for global, heterogeneous modern data environments, with an architecture that supports everything from edge and hybrid-cloud to GPU-based deployments.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: MANUFACTURING
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Now the largest automaker in the United States, selling more than 2.7 million vehicles in 2024, General Motors is setting a bold vision for its future, with software-defined vehicles and AI as a driving force. With data as a crucial asset, a transformation of this scale calls for a modern approach to data governance. Join Sherri Adame, Enterprise Data Governance Leader at General Motors, to learn about GM’s novel governance approach, supported by technologies like Atlan and Databricks. Hear how Sherri and her team are shifting governance to the left with automation, implementing data contracts, and accelerating data product discovery across domains, creating a cultural shift that emphasizes data as a competitive advantage.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: MEDIA AND ENTERTAINMENT
Technologies: AI/BI, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
With hundreds of millions viewing broadcasts from news to sports, Fox relies on a sophisticated and trusted architecture ingesting 100+ data sources, carefully governed to improve UX across products, drive sales and marketing, and ensure KPI tracking. Join Oliver Gomes, VP of Enterprise and Data Platform at Fox, and Prukalpa Sankar of Atlan to learn how true partnership helps their team navigate opportunities from Governance to AI. To govern and democratize their multi-cloud data platform, Fox chose Atlan to make data accessible and understandable for more users than ever before. Their team then used a data product approach to create a shared language using context from sources like Unity Catalog at a single point of access, no matter the underlying technology. Now, Fox is defining an ambitious future for Metadata. With Atlan and Iceberg driving interoperability, their team prepares to build a “control plane”, creating a common system of trust and governance.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: ENERGY AND UTILITIES, HEALTH AND LIFE SCIENCES, MANUFACTURING
Technologies: DELTA LAKE, AI/BI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Industrial organizations are unlocking new possibilities through the partnership between AVEVA and Databricks. The seamless, no-code, zero-copy solution—powered by Delta Sharing and CONNECT—enables companies to combine IT and OT data effortlessly. By bridging the gap between operational and enterprise data, businesses can harness the power of AI, data science, and business intelligence at an unprecedented scale to drive innovation. In this session, explore real-world applications of this integration, including how industry leaders are using CONNECT and Databricks to boost efficiency, reduce costs, and advance sustainability—all without fragmented point solutions. You’ll also see a live demo of the integration, showcasing how secure, scalable access to trusted industrial data is enabling new levels of industrial intelligence across sectors like mining, manufacturing, power, and oil and gas.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATA MARKETPLACE, MOSAIC AI
Skill Level: BEGINNER
Duration: 20 MIN
AWS Marketplace is revolutionizing how enterprises worldwide discover, procure, and manage their software solutions. With access to more than 5,000 verified sellers offering software, data, and professional services—including industry leaders like Databricks—organizations can streamline procurement through flexible pricing models and simplified terms. The platform seamlessly integrates with AWS services while providing consolidated billing, centralized governance, and streamlined vendor management. Through innovations like Buy with AWS, customers can purchase directly from Partner websites, making software acquisition more efficient than ever. Join us to learn how AWS Marketplace is driving value for both customers and Partners, helping organizations accelerate their digital transformation while maintaining security and compliance.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
In this session, you’ll see how to build and deploy a GenAI agent and Model Context Protocol (MCP) integration with Databricks, Anthropic, Mosaic External AI Gateway, and Amazon Bedrock. You will learn the architecture and best practices of using Databricks Mosaic AI, Anthropic’s Claude 3.7 Sonnet first-party frontier model, and LangGraph for custom workflow orchestration on the Databricks Data Intelligence Platform. You’ll also see how to use Databricks Mosaic AI for agent evaluation and monitoring, and how an inline agent uses MCP to provide tools and other resources, using Amazon Nova models with an Amazon Bedrock inline agent for deep research. This approach gives you the flexibility of LangGraph, the powerful managed agents offered by Amazon Bedrock, and Databricks Mosaic AI’s operational support for evaluation and monitoring.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL, DATABRICKS WORKFLOWS, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us as we explore the well-architected framework for modern data lakehouse architecture, where AWS's comprehensive data, AI, and infrastructure capabilities align with Databricks' unified platform approach. Building upon core principles of Operational Excellence, Security, Reliability, Performance, and Cost Optimization, we'll demonstrate how Data and AI Governance alongside Interoperability and Usability enable organizations to build robust, scalable platforms. Learn how Ripple modernized its data infrastructure by migrating from a legacy Hadoop system to a scalable, real-time analytics platform using Databricks on AWS. This session covers the challenges of high operational costs, latency, and peak-time bottlenecks—and how Ripple achieved 80% cost savings and 55% performance improvements with Photon, Graviton, Delta Lake, and Structured Streaming.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: AI/BI, DATABRICKS WORKFLOWS
Skill Level: BEGINNER
Duration: 40 MIN
In the age of agentic AI, competitive advantage lies not only in AI models, but in the quality of the data agents reason on and the agility of the tools that feed them. To fully realize the ROI of agentic AI, organizations need a platform that enables high-quality data pipelines and provides scalable, enterprise-grade tools. In this session, discover how a unified platform for integration, data management, MCP server management, API management, and agent orchestration can help you to bring cohesion and control to how data and agents are used across your organization.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
While modern lakehouse architectures and open-table formats provide flexibility, they are often challenging to manage. Data layouts, clustering, and small files need to be managed for efficiency. Qbeast’s platform-independent, patented multi-column indexing optimizes lakehouse data layout, accelerates queries, and sharply reduces compute cost — without disrupting existing architectures. Qbeast also handles high-cardinality clustering and supports incremental updates. Join us to explore how Qbeast enables efficient, scalable, AI-ready data infrastructure — reducing compute costs independent of data platform and compute engine.
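Qbeast's patented index is not described here, but the general idea of multi-column clustering can be illustrated with a classic cousin, Z-order bit interleaving: rows whose sort keys interleave the bits of several columns end up stored near each other when they are close in any of those columns. A toy sketch, purely for intuition:

```python
def z_order_key(x: int, y: int, bits: int = 8) -> int:
    """Interleave the bits of two column values into a single sort key.

    Even bit positions come from x, odd positions from y, so sorting by
    this key co-locates rows that are close in both columns at once.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key

# Sorting rows by the interleaved key clusters them in 2-D value space,
# which is what lets a query on either column skip most data files.
rows = [(3, 7), (3, 6), (0, 1), (7, 7)]
rows.sort(key=lambda r: z_order_key(*r))
print(rows)  # -> [(0, 1), (3, 6), (3, 7), (7, 7)]
```

Real implementations extend this idea to many columns, high cardinalities, and incremental updates, which is where the engineering (and the patents) live.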
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI
Skill Level: BEGINNER
Duration: 40 MIN
Discover how SAP Business Data Cloud and Databricks can transform your business by unifying SAP and non-SAP data for advanced analytics and AI. In this session, we’ll highlight optimizing cash flow with AI: integrated diverse data sources and AI algorithms enable accurate cash flow forecasting, helping businesses identify trends, prevent bottlenecks, and improve liquidity. You’ll also learn about the importance of high-quality, well-governed data as the foundation for reliable AI models and actionable reporting. Key takeaways:
• How to integrate and leverage SAP and external data in Databricks
• Using AI for predictive analytics and better decision-making
• Building a trusted data foundation to drive business performance
Leave this session with actionable strategies to optimize your data, enhance efficiency, and unlock new growth opportunities.
Type: LIGHTNING TALK
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 20 MIN
Companies need a lot of data to build and deploy AI models—and they want it quickly. To meet this demand, platform teams are quickly scaling their Databricks usage, resulting in excess cost driven by inefficiencies and performance anomalies. Capital One has over 4,000 users leveraging Databricks to power advanced analytics and machine learning capabilities at scale. In this talk, we’ll share lessons learned from optimizing our own Databricks usage while balancing lower cost with peak performance. Attendees will learn how to identify top sources of waste, best practices for cluster management, tips for user governance and methods to keep costs in check.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
Modern companies are managing more data than ever before, and the need to derive value from that data is becoming more urgent with AI. But AI adoption is often limited due to data security challenges, and adding to this complexity is the need to remain compliant with evolving regulation. At Capital One, we’ve deployed tokenization to further secure our data without compromising performance. In this talk, we’ll discuss lessons learned from our tokenization journey and show how companies can tokenize the data in their Databricks environment.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Companies need robust data management capabilities to build and deploy AI. Data needs to be easy to find, understandable, and trustworthy. And it’s even more important to secure data properly from the beginning of its lifecycle, otherwise it can be at risk of exposure during training or inference. Tokenization is a highly efficient method for securing data without compromising performance. In this session, we’ll share tips for managing high-quality, well-protected data at scale that are key for accelerating AI. In addition, we’ll discuss how to integrate visibility and optimization into your compute environment to manage the hidden cost of AI — your data.
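As a generic illustration of the technique this session covers (not Capital One's actual design; the key handling and vault here are deliberately simplified), tokenization swaps a sensitive value for a surrogate and keeps the reverse mapping in a secured store, so analytics tables never hold the original:

```python
# Minimal, illustrative tokenization sketch. In production the key lives in a
# KMS/HSM and the vault is a hardened service, not an in-memory dict.
import hashlib
import hmac

SECRET_KEY = b"demo-key"  # placeholder; never hard-code real keys
vault = {}                # token -> original value, accessible only to privileged services

def tokenize(value: str) -> str:
    # HMAC makes the token deterministic: the same input always yields the
    # same token, so joins and group-bys still work on tokenized columns.
    digest = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    token = f"tok_{digest[:16]}"
    vault[token] = value
    return token

def detokenize(token: str) -> str:
    # Only the tightly controlled vault path can recover the original.
    return vault[token]

card = "4111-1111-1111-1111"
t = tokenize(card)
assert t == tokenize(card)       # stable: analytics on tokens stays consistent
assert detokenize(t) == card     # privileged reverse lookup
```

The performance point in the abstract follows from determinism: because tokens are stable surrogates, downstream queries run on them directly with no decryption step.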
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL, DLT
Skill Level: BEGINNER
Duration: 20 MIN
Priorities shift, requirements change, resources fluctuate, and the demands on data teams are only continuing to grow. Join this session, led by Coalesce Sales Engineering Director Michael Tantrum, to hear about the most efficient way to deliver high-quality data to your organization at the speed it needs to consume it. Learn how to sidestep the common pitfalls of data development for maximum data team productivity.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA LAKE, DATABRICKS SQL, DLT
Skill Level: BEGINNER
Duration: 40 MIN
Understanding customer engagement and retention isn’t optional—it’s mission-critical. Join us for a live demo to see how you can build a scalable, governed customer health scoring model by transforming raw signals into actionable insights. Discover how Coalesce’s low-code development platform works seamlessly with Databricks’ lakehouse architecture to unify and operationalize customer data at scale. With built-in governance, automation, and metadata intelligence, you’ll deliver trusted scores that support proactive decision-making across the business. Why Attend?
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: RETAIL AND CPG - FOOD
Technologies: MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Cognizant developed a GenAI-driven market intelligence chatbot for RJR using Dash UI. This chatbot leverages Databricks Vector Search for vector embeddings and semantic search, along with the DBRX-Instruct LLM model to provide accurate and contextually relevant responses to user queries. The implementation involved loading prepared metadata into a Databricks vector database using the GTE model to create vector embeddings, indexing these embeddings for efficient semantic search, and integrating the DBRX-Instruct LLM into the chat system with prompts to guide the LLM in understanding and responding to user queries. The chatbot also generated responses containing URL links to dashboards with requested numerical values, enhancing user experience and productivity by reducing report navigation and discovery time by 30%. This project stands out due to its innovative AI application, advanced reasoning techniques, user-friendly interface, and seamless integration with MicroStrategy.
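The retrieval core of such a chatbot can be sketched generically. The toy 3-d vectors below stand in for real GTE embeddings, and a plain list stands in for the Databricks Vector Search index; none of this reflects the actual RJR implementation:

```python
# Generic semantic-search sketch: embed documents, then rank them by cosine
# similarity to the query embedding and return the top-k matches.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(index, query_vec, k=2):
    """Rank indexed documents by similarity to the query embedding."""
    scored = sorted(index, key=lambda d: cosine(d["vec"], query_vec), reverse=True)
    return [d["doc"] for d in scored[:k]]

# Toy "index": in the real system these vectors come from the GTE model.
index = [
    {"doc": "Q1 market share dashboard", "vec": [0.9, 0.1, 0.0]},
    {"doc": "Payroll policy handbook",   "vec": [0.0, 0.2, 0.9]},
    {"doc": "Quarterly sales report",    "vec": [0.8, 0.3, 0.1]},
]

print(search(index, [1.0, 0.2, 0.0], k=2))
# -> ['Q1 market share dashboard', 'Quarterly sales report']
```

In the production pattern the abstract describes, the top-k documents (with their dashboard URLs) are then passed to the LLM as context so its answer can link directly to the relevant reports.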
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: MANUFACTURING
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Toyota, the world’s largest automaker, sought to accelerate time-to-data and empower business users with secure data collaboration for faster insights. Partnering with Cognizant, they established a Unified Data Lake, integrating SOX principles and Databricks Unity Catalog to ensure compliance and security. Additionally, they developed a Data Scanner solution to automatically detect non-sensitive data and accelerate data ingestion. Join this dynamic session to discover how they achieved it.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENERGY AND UTILITIES, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, APACHE ICEBERG, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Learn how Confluent simplifies real-time streaming of your SAP data into AI-ready Delta tables on Databricks. In this session, you'll see how Confluent’s fully managed data streaming platform—with unified Apache Kafka® and Apache Flink®—connects data from SAP S/4HANA, ECC, and 120+ other sources to enable easy development of trusted, real-time data products that fuel highly contextualized AI and analytics. With Tableflow, you can represent Kafka topics as Delta tables in just a few clicks—eliminating brittle batch jobs and custom pipelines. You’ll see a product demo showcasing how Confluent unites your SAP and Databricks environments to unlock ERP-fueled AI, all while reducing the total cost of ownership (TCO) for data streaming by up to 60%.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: AI/BI
Skill Level: BEGINNER
Duration: 20 MIN
The last major shift in data engineering came during the rise of the cloud, transforming how we store, manage, and analyze data. Today, we stand at the cusp of the next revolution: AI-driven data engineering. This shift promises not just faster pipelines, but a fundamental change in the way data systems are designed and maintained. AI will redefine who builds data infrastructure, automating routine tasks, enabling more teams to contribute to data platforms, and (if done right) freeing up engineers to focus on higher-value work. However, this transformation also brings heightened pressure around governance, risk, and data security, requiring new approaches to control and oversight. For those prepared, this is a moment of immense opportunity—a chance to embrace a future of smarter, faster, and more responsive data systems.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: DATABRICKS WORKFLOWS
Skill Level: BEGINNER
Duration: 20 MIN
With expensive contracts up for renewal, Evri faced the challenge of migrating 1,000 SAP HANA assets and 200+ Talend jobs to Databricks. This talk will cover how we transformed SAP HANA and Talend workflows into modern Databricks pipelines through AI-powered translation and validation—without months of manual coding. We'll cover: techniques for handling SAP HANA's proprietary formats; approaches for refactoring incremental pipelines while ensuring dashboard stability; the technology enabling automated translation of complex business logic; and validation strategies that guarantee migration accuracy. We'll share real examples of SAP HANA stored procedures transformed into Databricks code and demonstrate how we maintained 100% uptime of critical dashboards during the transition. Join us to discover how AI is revolutionizing what's possible in enterprise migrations from GUI-based legacy systems to modern, code-first data platforms.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
While Databricks powers your data lakehouse, DataHub delivers the critical context layer connecting your entire ecosystem. We'll demonstrate how DataHub extends Unity Catalog to provide comprehensive metadata intelligence across platforms. DataHub's real-time platform: cuts AI model time-to-market with unified REST and GraphQL APIs that ensure models train on reliable and compliant data from across platforms, with complete lineage tracking; decreases data incidents by 60% using an event-driven architecture that instantly propagates changes across systems; and transforms data discovery from days to minutes with AI-powered search and natural language interfaces. Leaders use DataHub to transform Databricks data into integrated insights that drive business value. See our demo of syncback technology—detecting sensitive data and enforcing Databricks access controls automatically—plus our AI assistant that enhances LLMs with cross-platform metadata.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: MLFLOW, AI/BI
Skill Level: INTERMEDIATE
Duration: 40 MIN
In regulated industries like finance, agility can't come at the cost of compliance. Morgan Stanley found the answer in combining Dataiku and Databricks to create a governed, collaborative ecosystem for machine learning and predictive analytics. This session explores how the firm accelerated model development and decision-making, reducing time-to-insight by 50% while maintaining full audit readiness. Learn how no-code workflows empowered business users, while scalable infrastructure powered Terabyte-scale ML. Discover best practices for unified data governance, risk automation, and cross-functional collaboration that unlock innovation without compromising security. Ideal for data leaders and ML practitioners in regulated industries looking to harmonize speed, control, and value.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
AI agent systems hold immense promise for automating complex tasks and driving intelligent decision‑making, but only when they are engineered to be both resilient and transparent. In this session we will explore how Dataiku’s LLM Mesh pairs with Databricks Mosaic AI to streamline the entire lifecycle: ingesting and preparing data in the lakehouse, prompt engineering LLMs hosted on Mosaic AI Model Serving endpoints, visually orchestrating multi‑step chains, and monitoring them in real time. We’ll walk through a live demo of a Dataiku flow that connects to a Databricks-hosted model, adds automated validation, lineage, and human‑in‑the‑loop review, then exposes the agent via Dataiku's Agent Connect interface. You’ll leave with actionable patterns for setting guardrails, logging decisions, and surfacing explanations—so your organization can deploy trustworthy, domain‑specific agents faster and more safely.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES
Technologies: DELTA LAKE, MLFLOW, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us as we dive into how Turnpoint Services, in collaboration with DataNimbus, built an Intelligence Platform on Databricks in just 30 days. We'll explore features like MLflow, LLMs, MLOps, Model Registry, Unity Catalog & Dashboard Alerts that powered AI applications such as Demand Forecasting, Customer 360 & Review Automation. Turnpoint’s transformation enabled data-driven decisions, ops efficiency & a better customer experience. Building a modern data foundation on Databricks optimizes resource allocation & drives engagement. We’ll also introduce innovations in DataNimbus Designer: AI Blocks: modular, prompt-driven smart transformers for text data, built visually & deployed directly within Databricks. These capabilities push the boundaries of what's possible on the Databricks platform. Attendees will gain practical insights, whether you're beginning your AI journey or looking to accelerate it.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
The next era of data transformation has arrived. AI is enhancing developer workflows, enabling downstream teams to collaborate effectively through governed self-service. Additionally, SQL comprehension is producing detailed metadata that boosts developer efficiency while ensuring data quality and cost optimization. Experience this firsthand with dbt’s data control plane, a centralized platform that provides organizations with repeatable, scalable, and governed methods to succeed with Databricks in the modern age.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: MEDIA AND ENTERTAINMENT
Technologies: DBT, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 20 MIN
Riot Games reduced its Databricks compute spend and accelerated development cycles by transforming its data engineering workflows—migrating from bespoke Databricks notebooks and Spark pipelines to a scalable, testable, and developer-friendly dbt-based architecture. In this talk, members of the Developer Experience & Automation (DEA) team will walk through how they designed and operationalized dbt to support Riot’s evolving data needs.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT, FINANCIAL SERVICES
Technologies: APACHE SPARK, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Enterprise lakehouse platforms are rapidly scaling – and so are complexity and cost. After monitoring over 1B vCore-hours across Databricks and other Apache Spark™ environments, we consistently saw resource waste, preventable data incidents, and painful troubleshooting. Join this session to discover how definity’s unique full-stack observability provides job-level visibility in-motion, unifying infrastructure performance, pipeline execution, and data behavior, and see how enterprise teams use definity to easily optimize jobs and save millions – while proactively ensuring SLAs, preventing issues, and simplifying RCA.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: HEALTH AND LIFE SCIENCES
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
In the rapidly evolving life sciences and healthcare industry, leveraging data-as-a-product is crucial for driving innovation and achieving business objectives. Join us to explore how Deloitte is revolutionizing data strategy solutions by overcoming challenges such as data silos, poor data quality, and lack of real-time insights with the Databricks Data Intelligence Platform. Learn how effective data governance, seamless data integration, and scalable architectures support personalized medicine, regulatory compliance, and operational efficiency. This session will highlight how these strategies enable biopharma companies to transform data into actionable insights, accelerate breakthroughs and enhance life sciences outcomes.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: MEDIA AND ENTERTAINMENT
Technologies: MLFLOW, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Deloitte is observing a growing trend among cybersecurity organizations to develop big data management and analytics solutions beyond traditional Security Information and Event Management (SIEM) systems. Leveraging Databricks to extend these SIEM capabilities, Deloitte can help clients lower the cost of cyber data management while enabling scalable, cloud-native architectures. Deloitte helps clients design and implement cybersecurity data meshes, using Databricks as a foundational data lake platform to unify and govern security data at scale. Additionally, Deloitte extends clients’ cybersecurity capabilities by integrating advanced AI and machine learning solutions on Databricks, driving more proactive and automated cybersecurity solutions. Attendees will gain insight into how Deloitte is utilizing Databricks to manage enterprise cyber risks and deliver performant and innovative analytics and AI insights that traditional security tools and data platforms aren’t able to deliver.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: PUBLIC SECTOR
Technologies: DELTA LAKE, AI/BI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Analyzing geospatial data has become a cornerstone of tackling many of today’s pressing challenges, from climate change to resource management. However, storing and processing such data can be complex and hard to scale using common GIS packages. This talk explores how Deloitte and Databricks enable horizontally scalable geospatial analysis using Delta Lake, H3 integration and support for geospatial vector and raster data. We demonstrate how we have leveraged these capabilities for real-world applications in environmental monitoring and agriculture. In doing so, we cover end-to-end processing from ingestion, transformation and analysis to production of geospatial data products accessible by scientists and decision makers through standard GIS tools.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: RETAIL AND CPG - FOOD
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Nestlé USA, a division of the world’s largest food and beverage company, Nestlé S.A., has embarked on a transformative journey to unlock GenAI capabilities on their data platform. Deloitte, Databricks, and Nestlé have collaborated on a data platform modernization program to address gaps associated with Nestlé’s existing data platform. This joint effort introduces new possibilities and capabilities, ranging from development of advanced machine learning models, implementing Unity Catalog, and adopting Lakehouse Federation, all while adhering to confidentiality protocols. With help from Deloitte and Databricks, Nestlé USA is now able to meet its advanced enterprise analytics and AI needs with the Databricks Data Intelligence Platform.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: RETAIL AND CPG - FOOD
Technologies: AI/BI
Skill Level: BEGINNER
Duration: 40 MIN
How does one of the world’s fastest-growing beauty brands stay ahead of Amazon’s complexity and scale retail with precision? At Sol de Janeiro, we built a real-time Amazon Operations Hub—powered by Databricks and DOMO—that drives decisions across inventory, profitability, and marketing ROI. See how the Databricks Lakehouse and DOMO dashboards work together to simplify workflows, surface actionable insights, and enable smarter decisions across the business—from frontline operators to the executive suite. In this session, you’ll get a behind-the-scenes look at how we unified trillions of rows from NetSuite, Amazon, Shopify, and carrier systems into a single source of truth. We’ll show how this hub streamlined cross-functional workflows, eliminated manual reporting, and laid the foundation for AI-powered forecasting and automation.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: MANUFACTURING
Technologies: AI/BI
Skill Level: BEGINNER
Duration: 20 MIN
In today’s logistics landscape, operational continuity depends on real-time awareness and proactive decision-making. This session presents an AI-agent-driven solution built on Databricks that transforms real-time fleet IoT data into autonomous workflows. Streaming telemetry such as bearing vibration data is ingested and analyzed using FFT to detect anomalies. When a critical pattern is found, an AI agent diagnoses root causes and simulates asset behavior as a digital twin, factoring in geolocation, routing, and context. The agent then generates a corrective strategy by identifying service sites, skilled personnel, and parts, estimating repair time, and orchestrating reroutes. It evaluates alternate delivery vehicles and creates transfer plans for critical shipments. The system features human-AI collaboration, enabling teams to review and execute plans. Learn how this architecture reduces downtime and drives resilient, adaptive fleet management.
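The FFT-based anomaly check described above can be sketched in a few lines. This is a toy model on synthetic vibration data; the sampling rate, fault frequency, and threshold are illustrative assumptions, not details of the presented system:

```python
import numpy as np

def detect_bearing_anomaly(signal, fs, fault_freq, threshold=5.0):
    # Flag a fault when spectral energy near a known bearing fault
    # frequency stands well above the spectrum's noise floor.
    window = np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(signal * window))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band = (freqs > fault_freq - 2) & (freqs < fault_freq + 2)
    noise_floor = np.median(spectrum)
    return spectrum[band].max() > threshold * noise_floor

# Synthetic telemetry: one second of vibration sampled at 1 kHz.
fs = 1000
t = np.arange(0, 1, 1 / fs)
rng = np.random.default_rng(0)
healthy = rng.normal(0.0, 0.1, fs)                    # background noise only
faulty = healthy + 0.8 * np.sin(2 * np.pi * 120 * t)  # 120 Hz fault tone
```

In a streaming setting, the same per-window check would run inside a Structured Streaming job, with flagged windows handed to the diagnostic agent.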
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI
Skill Level: BEGINNER
Duration: 20 MIN
Domo's Databricks integration seamlessly connects business users to both Delta Lake data and AI/ML models, eliminating technical barriers while maximizing performance. Domo's Cloud Amplifier optimizes data processing through pushdown SQL, while the Domo AI Services layer enables anyone to leverage both traditional ML and large language models directly from Domo. During this session, we’ll explore an AI solution around fraud detection to demonstrate the power of leveraging Domo on Databricks.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENERGY AND UTILITIES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Data residency laws and legal mandates are driving the need for lakehouses across public and private clouds. This sprawl threatens centralized governance and compliance, while impacting cost, performance, and analytics/AI functionality. This session shows how e6data extends Unity Catalog across hybrid environments for consistent policy enforcement and query execution—regardless of data location—with guarantees around network egress, entitlements, performance, scalability, and cost. Learn how e6data’s “zero-data movement” philosophy powers a cost- and latency-optimized, location-aware architecture. We’ll cover onboarding strategies for hybrid fleets that enforce data movement restrictions and stay close to the data for better performance and lower cost. Discover how a location-aware compute strategy enables hybrid lakehouses with four key value metrics: cross-platform functionality, governed access, low latency, and total cost of ownership.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: PROFESSIONAL SERVICES
Technologies: AI/BI, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
In an era where data drives strategic decision-making, organizations must adapt to the evolving landscape of business analytics. This session will focus on three pivotal themes shaping the future of data management and analytics in 2025. Join our panel of experts, including a Business Analytics Leader, Head of Information Governance, and Data Science Leader, as they explore:
- Knowledge-Powered AI: Discover trends in Knowledge-Powered AI and how these initiatives can revolutionize business analytics, with real-world examples of successful implementations.
- Information Governance: Explore the role of information governance in ensuring data integrity and compliance. Our experts will discuss strategies for establishing robust frameworks that protect organizational assets.
- Real-Time Analytics: Understand the importance of real-time analytics in today’s fast-paced environment. The panel will highlight how organizations can leverage real-time data for agile decision-making.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, MANUFACTURING, PROFESSIONAL SERVICES
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
In the rapidly evolving landscape of pharmaceuticals, the integration of AI and GenAI is transforming how organizations operate and deliver value. We will explore the profound impact of the AI program at Takeda Pharmaceuticals and the central role of Databricks. We will delve into eight pivotal AI/GenAI use cases that enhance operational efficiency across commercial, R&D, manufacturing, and back-office functions, including these capabilities:
Type: SPECIAL INTEREST
Track: N/A
Industry: N/A
Technologies: N/A
Skill Level: N/A
Duration: 135 MIN
No description available.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY, MANUFACTURING
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Xoople aims to provide its users with trusted AI-Ready Earth data and accelerators that unlock new insights for enterprise AI. With access to scientific-grade Earth data that provides spatial intelligence on real-world changes, data scientists and BI analysts can increase forecast accuracy for their enterprise processes and models. These improvements drive smarter, data-driven business decisions across various business functions, including supply chain, finance, and risk across industries. Xoople, which has recently introduced their product, Enterprise AI-Ready Earth Data™, on the Databricks Marketplace, will have their CEO, Fabrizio Pirondini, discuss the importance of the Databricks Data Intelligence Platform in making Xoople’s product a reality for use in the enterprise.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: APACHE ICEBERG, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
Open table formats such as Apache Iceberg or Delta Lake have transformed the data landscape. For the first time, we’re seeing a real open storage ecosystem emerging across database vendors. So far, open table formats have found little adoption powering low-latency, high-concurrency analytics use-cases. Data stored in open formats often gets transformed and ingested into closed systems for serving. The reason for this is simple: most modern query engines don’t properly support these workloads. In this talk we take a look under the hood of Firebolt and dive into the work we’re doing to support low-latency and high concurrency on Iceberg: caching of data and metadata, adaptive object storage reads, subresult reuse, and multi-dimensional scaling. After this session, you will know how you can build low-latency data applications on top of Iceberg. You’ll also have a deep understanding of what it takes for modern high-performance query engines to do well on these workloads.
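Subresult reuse, one of the techniques mentioned above, can be illustrated with a toy cache keyed on normalized fragment text. This is a sketch of the idea only, not Firebolt’s engine internals:

```python
import hashlib

class SubresultCache:
    """Toy model of subresult reuse: an identical plan fragment
    (approximated here by normalized SQL text) returns a cached result
    instead of re-reading table data."""

    def __init__(self):
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def _key(self, fragment_sql):
        # Normalize case and whitespace so equivalent fragments share a key.
        normalized = " ".join(fragment_sql.lower().split())
        return hashlib.sha256(normalized.encode()).hexdigest()

    def execute(self, fragment_sql, run_fragment):
        key = self._key(fragment_sql)
        if key in self._cache:
            self.hits += 1           # subresult reuse: no storage scan
            return self._cache[key]
        self.misses += 1
        self._cache[key] = run_fragment()  # simulate running the fragment
        return self._cache[key]
```

A real engine would key on canonicalized plan fragments plus snapshot metadata, so cached entries invalidate when the underlying Iceberg table changes.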
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Retrieval-augmented generation (RAG) has transformed AI applications by grounding responses with external data. It can be better. By pairing RAG with low latency SQL analytics, you can enrich responses with instant insights, leading to a more interactive and insightful user experience with fresh, data-driven intelligence. In this talk, we’ll demo how low latency SQL combined with an AI application can deliver speed, accuracy, and trust.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: MANUFACTURING
Technologies: DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 20 MIN
Dropbox, a leading cloud storage platform, is on a mission to accelerate data insights to better understand customers’ needs and elevate the overall customer experience. By leveraging Fivetran’s data movement platform, Dropbox gained real-time visibility into customer sentiment, marketing ROI, and ad performance, empowering teams to optimize spend, improve operational efficiency, and deliver greater business outcomes. Join this session to learn how Dropbox:
- Cut data pipeline time from 8 weeks to 30 minutes by automating ingestion and streamlining reporting workflows.
- Enabled real-time, reliable data movement across tools like Zendesk Chat, Google Ads, MySQL, and more, at global operations scale.
- Unified fragmented data sources into the Databricks Data Intelligence Platform to reduce redundancy, improve accessibility, and support scalable analytics.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: RETAIL AND CPG - FOOD
Technologies: DATABRICKS WORKFLOWS
Skill Level: BEGINNER
Duration: 40 MIN
Organizations have hundreds of data sources, some of which are very niche or difficult to access. Incorporating this data into your lakehouse requires significant time and resources, hindering your ability to work on more value-add projects. Enter the Fivetran Connector SDK- a powerful new tool that enables your team to create custom pipelines for niche systems, custom APIs, and sources with specific data filtering requirements, seamlessly integrating with Databricks. During this session, Fivetran will demonstrate how to (1) Leverage the Connector SDK to build scalable connectors, enabling the ingestion of diverse data into Databricks (2) Gain flexibility and control over historical and incremental syncs, delete capture, state management, multithreading data extraction, and custom schemas (3) Utilize practical examples, code snippets, and architectural considerations to overcome data integration challenges and unlock the full potential of your Databricks environment.
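The incremental-sync and state-management pattern described above can be sketched generically. The `fetch_page` contract and in-memory `state` dict here are illustrative assumptions; the actual Fivetran Connector SDK defines its own update and state interfaces:

```python
def incremental_sync(fetch_page, state):
    """Pull pages from a cursor-based source, checkpointing the cursor in
    `state` after each page so an interrupted sync can resume."""
    rows = []
    while True:
        page, next_cursor = fetch_page(state.get("cursor"))
        if not page:
            break                      # caught up with the source
        rows.extend(page)
        state["cursor"] = next_cursor  # checkpoint: survives restarts
    return rows

def make_fake_source(data, page_size=2):
    # Simulated API: returns rows with an id greater than the cursor.
    def fetch_page(cursor):
        start = cursor if cursor is not None else 0
        page = [row for row in data if row > start][:page_size]
        return page, (page[-1] if page else cursor)
    return fetch_page
```

Because the cursor is checkpointed after every page, re-running the sync against the same state pulls only rows added since the last run, which is the essence of incremental ingestion.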
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT, FINANCIAL SERVICES
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 40 MIN
LLM agents often drift into failure when prompts, retrieval, external data, and policies interact in unpredictable ways. This technical session introduces a repeatable, metric-driven framework for detecting, diagnosing, and correcting these undesirable behaviors in agentic systems at production scale. We demonstrate how to instrument the agent loop with fine-grained signals—tool-selection quality, error rates, action progression, latency, and domain-specific metrics—and send them into an evaluation layer (e.g. Galileo). This telemetry enables a virtuous cycle of system improvement. We present a practical example of a stock-trading system and show how brittle retrieval and faulty business logic cause undesirable behavior. We refactor prompts, adjust the retrieval pipeline—verifying recovery through improved metrics. Attendees will learn how to: add observability with minimal code change, pinpoint root causes via tracing, and drive continuous, metric-validated improvement.
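The per-step instrumentation described above can be sketched as a small telemetry collector. Metric names and the idea of exporting summaries to an evaluation layer are illustrative assumptions, not a Galileo or Databricks API:

```python
from collections import defaultdict

class AgentTelemetry:
    """Minimal sketch of agent-loop instrumentation: record fine-grained
    per-step signals, then aggregate them for an evaluation layer."""

    def __init__(self):
        self.metrics = defaultdict(list)

    def record_step(self, tool, expected_tool, error=None, latency_s=0.0):
        # Per-step signals: tool-selection quality, errors, latency.
        self.metrics["tool_correct"].append(tool == expected_tool)
        self.metrics["errored"].append(error is not None)
        self.metrics["latency_s"].append(latency_s)

    def summary(self):
        # Aggregate into the metrics an evaluation layer would consume.
        m = self.metrics
        n = len(m["latency_s"])
        return {
            "tool_accuracy": sum(m["tool_correct"]) / n,
            "error_rate": sum(m["errored"]) / n,
            "p50_latency_s": sorted(m["latency_s"])[n // 2],
        }

telemetry = AgentTelemetry()
telemetry.record_step("get_price", "get_price", latency_s=0.2)
telemetry.record_step("place_order", "get_price",
                      error=ValueError("wrong tool"), latency_s=1.1)
```

Wrapping each tool call with `record_step` is the "minimal code change" version of observability: regressions in tool accuracy or latency become visible before they surface as end-user failures.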
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENERGY AND UTILITIES, MANUFACTURING, PROFESSIONAL SERVICES
Technologies: DELTA LAKE, MLFLOW, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
How do you transform legacy data into a launchpad for next-gen innovation? GE Vernova is tackling it by rapidly migrating from outdated platforms to Databricks, building one of the world’s largest cloud data implementations. This overhaul wasn’t optional. Scaling AI, cutting technical debt, and slashing license costs demanded a bold, accelerated approach. Led by strategic decisions from the CDO and powered by Genpact’s AI Gigafactory, the migration spans 35+ business domains and subdomains, 60,000+ data objects, 15,000+ jobs, and 3,000+ reports from 120+ diverse data sources to deliver a multi-tenant platform with unified governance. The anticipated results? Faster insights, seamless data sharing, and a standardized platform built for AI at scale. This session explores how Genpact and Databricks are fueling GE Vernova’s mission to deliver The Energy to Change the World, and what it takes to get there when speed, scale, and complexity are non-negotiable.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
This session unveils Google Cloud's Agent2Agent (A2A) protocol, ushering in a new era of AI interoperability where diverse agents collaborate seamlessly to solve complex enterprise challenges. Join our panel of experts to discover how A2A empowers you to deeply integrate these collaborative AI systems with your existing enterprise data, custom APIs, and critical workflows. Ultimately, learn to build more powerful, versatile, and securely managed agentic ecosystems by combining specialized Google-built agents with your own custom solutions (Vertex AI or no-code). Extend this ecosystem further by serving these agents with Databricks Model Serving and governing them with Unity Catalog for consistent security and management across your enterprise.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA LAKE, AI/BI, DATABRICKS WORKFLOWS
Skill Level: BEGINNER
Duration: 20 MIN
Enterprise customers need a powerful and adaptable data foundation to navigate the demands of AI and multi-cloud environments. This session dives into how Google Cloud Storage serves as a unified platform for modern analytics data lakes, together with Databricks. Discover how Google Cloud Storage provides key innovations like performance optimizations for Apache Iceberg, Anywhere Cache as the easiest way to colocate storage and compute, Rapid Storage for ultra-low-latency object reads and appends, and Storage Intelligence for vital data insights and recommendations. Learn how you can optimize your infrastructure to unlock the full value of your data for AI-driven success.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: FINANCIAL SERVICES
Technologies: AI/BI
Skill Level: BEGINNER
Duration: 20 MIN
Elevate your AI initiatives on Databricks by harnessing the latest advancements in Google Cloud's Gemini models. Learn how to integrate Gemini's built-in reasoning and powerful development tools to build more dynamic and intelligent applications within your existing Databricks platform. We'll explore concrete ideas for agentic AI solutions, showcasing how Gemini can help you unlock new value from your data in Databricks.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA LAKE, DATABRICKS WORKFLOWS, DLT
Skill Level: BEGINNER
Duration: 40 MIN
Maximize the performance of your Databricks Platform with innovations on Google Cloud. Discover how Google’s Arm-based Axion C4A virtual machines (VMs) deliver breakthrough price-performance and efficiency for Databricks, supercharging the Databricks Photon engine. Gain actionable strategies to optimize your Databricks deployments on Google Cloud.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: HEALTH AND LIFE SCIENCES
Technologies: DELTA LAKE, DATABRICKS WORKFLOWS, LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 20 MIN
Global Data at Scale: Powering Front Office Transformation with Databricks. Join KPMG for an engaging session on how we transformed our data platform and built a cutting-edge Global Data Store (GDS), a game-changing data hub for our Front Office Transformation (FOT). Discover how we seamlessly unified data from various member firms, turning it into a dynamic engine that enables our business to leverage our Front Office ecosystem for smarter analytics and decision-making. Learn about our unique approach that rapidly integrates diverse datasets into the GDS, and our hub-and-spoke model connecting member firms’ data lakes and enabling secure, high-speed collaboration via Delta Sharing. Hear how we are leveraging Unity Catalog to help ensure data governance, compliance, and straightforward data lineage. We’ll share strategies for risk management, security (fine-grained access, encryption), and scaling a cloud-based data ecosystem.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 20 MIN
XP is one of the largest financial institutions in Brazil, and they didn’t reach that scale by moving slowly. After getting inspired at Data + AI Summit in 2024, they sprang into action to overhaul their advertising strategy using their first-party data in Databricks. Just a year later, they’ve achieved remarkable success: they’ve unlocked $66 million in incremental revenue from advertising, with the same budget as before. In this session, XP will share the tactical steps they took to bring a first-party data and AI strategy to customer acquisition, including how they built predictive models for customer quality and connected Databricks to their ecosystem through Hightouch’s Composable CDP. If you’re supporting an advertising team or looking for real strategies to take home from this conference that can transform your business, this session is for you.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, AI/BI
Skill Level: BEGINNER
Duration: 40 MIN
With 75M+ Treats Rewards members, PetSmart knows how to build loyalty with pet parents. But recently, traditional email testing and personalization strategies weren’t delivering the engagement and growth they wanted, especially in the Salon business. This year, they replaced their email calendar and A/B testing with AI Decisioning, achieving a +22% incremental lift in bookings. Join Bradley Breuer, VP of Marketing – Loyalty, Personalization, CRM, and Customer Analytics, to learn how his team reimagined CRM using AI to personalize campaigns and dynamically optimize creative, offers, and timing for every unique pet parent. Learn:
- How PetSmart blends human insight and creativity with AI to deliver campaigns that engage and convert.
- How they moved beyond batch-and-blast calendars with AI Decisioning Agents to optimize sends, while keeping control over brand, messaging, and frequency.
- How using Databricks as their source of truth led to surprising learnings and better outcomes.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
As AI adoption accelerates, unstructured data has emerged as a critical—yet often overlooked—asset for building accurate, trustworthy AI agents. But preparing and governing this data at scale remains a challenge. Traditional data integration and RAG approaches fall short. In this session, discover how IBM enables AI agents grounded in governed, high-quality unstructured data. Learn how our unified data platform streamlines integration across batch, streaming, replication, and unstructured sources—while accelerating data intelligence through built-in governance, quality, lineage, and data sharing. But governance doesn’t stop at data. We’ll explore how AI governance extends oversight to the models and agents themselves. Walk away with practical strategies to simplify your stack, strengthen trust in AI outputs, and deliver AI-ready data at scale.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: HEALTH AND LIFE SCIENCES, PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: AI/BI, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
As AI, internal data marketplaces, and self-service access become more popular, data teams must rethink how they securely govern and provision data at scale. Success depends on provisioning data in a way that balances security, compliance, and innovation, and promotes data-driven decision making when decision makers are AI Agents. In this session, we'll discuss how you can:
- Launch and manage effective and secure data provisioning
- Secure your AI initiatives
- Scale your Data Governors through Agentic AI
Join us to learn how to navigate the complexities of modern data environments, and start putting your data to work faster.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENERGY AND UTILITIES
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
HR departments increasingly rely on data to improve workforce planning and experiences. However, managing and getting value from this data can be challenging, especially given the complex technology landscape and the need to ensure data security and compliance. Shell has placed a high priority on safeguarding its people data while empowering its HR department with the tools and access they need to make informed decisions. This session will explore the transformation of Shell's Central Data Platform, starting with their HR use case. You’ll hear about:
- The role of automation and data governance, quality, and literacy in Shell’s strategy.
- Why they chose Databricks and Immuta for enhanced policy-based access control.
- The future for Shell and their vision for a data marketplace to truly embrace a culture of global data sharing.
The result? A robust, scalable HR Data Platform that is securely driving a brighter future for Shell and its employees.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: MANUFACTURING, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Migrating legacy workloads to a modern, scalable platform like Databricks can be complex and resource-intensive. Impetus, an Elite Databricks Partner and the Databricks Migration Partner of the Year 2024, simplifies this journey with LeapLogic, an automated solution for data platform modernization and migration services. LeapLogic intelligently discovers, transforms, and optimizes workloads for Databricks, ensuring minimal risk and faster time-to-value. In this session, we’ll showcase real-world success stories of enterprises that have leveraged Impetus’ LeapLogic to modernize their data ecosystems efficiently. Join us to explore how you can accelerate your migration journey, unlock actionable insights, and future-proof your analytics with a seamless transition to Databricks.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA LAKE, AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
As a leading personalized product retailer, Shutterfly needed a modern, secure, and performant data foundation to power GenAI-driven customer experiences. However, their existing stack was creating roadblocks in performance, governance, and machine learning scalability. In partnership with Impetus, Shutterfly embarked on a multi-phase migration to Databricks Unity Catalog. This transformation not only accelerated Shutterfly’s ability to provide AI-driven personalization at scale but also improved governance, reduced operational overhead, and laid a scalable foundation for GenAI innovation. Join experts from Databricks, Impetus, and Shutterfly to discover how this collaboration enabled faster data-driven decision-making, simplified compliance, and unlocked the agility needed to meet evolving customer demands in the GenAI era. Learn from their journey and take away best practices for your own modernization efforts.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Join this 20-minute session to learn how Informatica CDGC integrates with and leverages Unity Catalog metadata to provide end-to-end governance and security across an enterprise data landscape. Topics covered will include:
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
As enterprises continue their journey to the cloud, data warehouse and data management modernization is essential to optimize analytics and drive business outcomes. Minimizing modernization timelines is important for reducing risk and shortening time to value – and ensuring enterprise data is clean, curated and governed is imperative to enable analytics and AI initiatives. In this session, learn how Informatica's Intelligent Data Management Cloud (IDMC) empowers analytics and AI on Databricks by helping data teams: · Develop no-code/low-code data pipelines that ingest, transform and clean data at enterprise scale · Improve data quality and extend enterprise governance with Informatica Cloud Data Governance and Catalog (CDGC) and Unity Catalog · Accelerate pilot-to-production with Mosaic AI
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: HEALTH AND LIFE SCIENCES, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Supercharge advanced analytics and AI insights on Databricks with accurate and consistent master data. This session explores how Informatica’s Master Data Management (MDM) integrates with Databricks to provide high-quality, integrated golden record data like customer, supplier, product 360 or reference data to support downstream analytics, Generative AI and Agentic AI. Enterprises can accelerate and de-risk the process of creating a golden record via a no-code/low-code interface, allowing data teams to quickly integrate siloed data and create a complete and consistent record that improves decision-making speed and accuracy.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES, MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD
Technologies: AI/BI, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Agentic AI has the power to revolutionize mission-critical domains. Yet this journey is not without its challenges: the inevitable barriers, frustrations, and setbacks that mark all progress. This session dives into how Infosys Topaz helps enterprises strategically implement AI at scale in as little as two months to personalize customer journeys, optimize operations, and unlock new revenue streams. Learn how different enterprises have architected foundational capabilities such as an Agentic AI factory to build and accommodate hundreds or even thousands of intelligent agents, set up data fingerprinting and data harvesting to make enterprise data ready for AI, and ensured interoperability among diverse AI systems.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Agentic AI and multimodal data are the next frontiers for realizing intelligent, autonomous business systems. Learn how Infosys innovates with Databricks to accelerate the data-to-AI-agent journey at scale across the enterprise. Hear about our pragmatic, capability-driven approach, rather than a use case-based one, which brings the data universe, AI foundations, agent management, data and AI governance, and collaboration under unified management.
Type: LIGHTNING TALK
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, UNITY CATALOG, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 20 MIN
Insight will explore a multi-agent system built with LangGraph designed to alleviate the challenges faced by data analysts inundated with requests from business users. This innovative solution empowers users who lack SQL skills to easily access insights from specific Unity Catalog datasets. Discover how the Unity Catalog Agent Assistant streamlines data requests, enhances collaboration, and ultimately drives better decision-making across your organization.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, AI/BI, DELTA SHARING
Skill Level: INTERMEDIATE
Duration: 40 MIN
In highly regulated industries like financial services, maintaining data quality is an ongoing challenge. Reactive measures often fail to prevent regulatory penalties, causing inaccuracies in reporting and inefficiencies due to poor data visibility. Regulators closely examine the origins and accuracy of reporting calculations to ensure compliance, so a robust system for data quality and lineage is crucial. Organizations are utilizing Databricks to proactively improve data quality through rules-based and AI/ML-driven methods. This fosters complete visibility across IT, data management, and business operations, facilitating rapid issue resolution and continuous data quality enhancement. The outcome is quicker, more accurate, and more transparent financial reporting. We will detail a framework for data observability and offer practical examples of implementing quality checks throughout the data lifecycle, specifically focusing on creating data pipelines for regulatory reporting.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: LLAMA, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Today’s supply chains demand more than historical insights; they need real-time intelligence. In this actionable session, discover how leading enterprises are unlocking the full potential of their SAP data by integrating it with Databricks and AI. See how CPG companies are transforming supply chain planning by combining SAP ERP data with external signals like weather and transportation data, enabling them to predict disruptions, optimize inventory, and make faster, smarter decisions. Powered by Databricks, this solution delivers true agility and resilience through a unified data architecture. Join us to learn how: · You can eliminate SAP data silos and make them ML- and AI-ready at scale · External data sources amplify SAP use cases like forecasting and scenario planning · AI-driven insights accelerate time-to-action across supply chain operations. Whether you're just starting your data modernization journey or seeking ROI from SAP analytics, this session will show you what’s possible.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES, MANUFACTURING
Technologies: MLFLOW, AI/BI, MOSAIC AI
Skill Level: ADVANCED
Duration: 40 MIN
Discover how Xcel Energy and Lovelytics leveraged the power of geospatial analytics and GenAI to tackle one of the energy sector’s most pressing challenges—wildfire prevention. Transitioning from manual processes to automated GenAI unlocked transformative business value, delivering over 3x greater data coverage, over 4x improved accuracy, and 64x faster processing of geospatial data. In this session, you'll learn how Databricks empowers data leaders to transform raw data, like location information and visual imagery, into actionable insights that save costs, mitigate risks, and enhance customer service. Walk away with strategies for scaling geospatial workloads efficiently, building GenAI-driven solutions, and driving innovation in energy and utilities.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, PROFESSIONAL SERVICES
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 40 MIN
As enterprises strive to become more data-driven, SAP continues to be central to their operational backbone. However, traditional SAP ecosystems often limit the potential of AI and advanced analytics due to fragmented architectures and legacy tools. In this session, we explore four strategic options for unlocking greater value from SAP data by integrating with Databricks and cloud-native platforms. Whether you're on ECC or S/4HANA, or transitioning from BW, learn how to modernize your data landscape, enable real-time insights, and power AI/ML at scale. Discover how SAP Business Data Cloud and SAP Databricks can help you build a unified, future-ready data and analytics ecosystem—without compromising on scalability, flexibility, or cost-efficiency.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: HEALTH AND LIFE SCIENCES, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, MLFLOW, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
In today's fast-paced digital landscape, context is everything. Decisions made without understanding the full picture often lead to missed opportunities or suboptimal outcomes. Powering contextualized intelligence is at the heart of MathCo’s proprietary platform, NucliOS, a Databricks-native platform that leverages Databricks features across the data lifecycle, like Unity Catalog, Delta Lake, MLflow, and Notebooks. Join this session to discover how NucliOS reimagines the data journey end-to-end: from data discovery and preparation to advanced analysis, dynamic visualization, and scenario modeling, all the way through to operationalizing insights within business workflows. At every step, intelligent agents act in concert, accelerating innovation and delivering speed at scale.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: LLAMA
Skill Level: INTERMEDIATE
Duration: 40 MIN
Dive into the latest Llama 4 models. See for yourself how to unleash the power of Llama models and achieve next-level performance with our curated set of practical tools, techniques, and recipes. Join us as we explore their capabilities, developer tools, and exciting use cases. Discover how these innovative models are transforming industries and improving performance in real-world applications.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS WORKFLOWS, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us for this insightful session to learn how you can leverage the power of the Microsoft ecosystem along with Azure Databricks to take your business to the next level. Azure Databricks is a fully integrated, native, first-party solution on Microsoft Azure. Databricks and Microsoft continue to actively collaborate on product development, ensuring tight integration, optimized performance, and a streamlined support experience. Azure Databricks offers seamless integrations with Power BI, Azure OpenAI, Microsoft Purview, Azure Data Lake Storage (ADLS) and Foundry. In this session, you’ll learn how you can leverage deep integration between Azure Databricks and Microsoft solutions to empower your organization to do more with your data estate. You’ll also get an exclusive sneak peek into the product roadmap.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: TRAVEL AND HOSPITALITY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
American Airlines, one of the largest airlines in the world, processes a tremendous amount of data every single minute. With a data estate of this scale, accountability for the data goes beyond the data team; the business organization has to be equally invested in championing the quality, reliability, and governance of data. In this session, Andrew Machen, Senior Manager, Data Engineering at American Airlines will share how his team maximizes resources to deliver reliable data at scale. He'll also outline his strategy for aligning business leadership with an investment in data reliability, and how leveraging Monte Carlo's data + AI observability platform enabled them to reduce time spent resolving data reliability issues from 10 weeks to 2 days, saving millions of dollars and driving valuable trust in the data.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Your model is trained. Your pilot is live. Your data looks AI-ready. But for most teams, the toughest part of building successful AI starts after deployment. In this talk, Shane Murray and Ethan Post share lessons from the development of Monte Carlo’s Troubleshooting Agent – an AI assistant that helps users diagnose and fix data issues in production. They’ll unpack what it really takes to build and operate trustworthy AI systems in the real world, including: The Illusion of Done – Why deployment is just the beginning, and what breaks in production; Lessons from the Field – A behind-the-scenes look at the architecture, integration, and user experience of Monte Carlo’s agent; Operationalizing Reliability – How to evaluate AI performance, build the right team, and close the loop between users and model. Whether you're scaling RAG pipelines or running LLMs in production, you’ll leave with a playbook for building data and AI systems you—and your users—can trust.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: AI/BI, PARTNER CONNECT
Skill Level: INTERMEDIATE
Duration: 20 MIN
Learn how visionaries from the world’s leading organizations use Moveworks to give employees a single place to find information, automate tasks, and be more productive. See the Moveworks AI Assistant in action and experience how its reasoning-based architecture allows it to be a one-stop-shop for all employee requests (across IT, HR, finance, sales, and more), how Moveworks empowers developers to easily build new AI agents atop this architecture, and how we give stakeholders tools to implement effective AI governance. Finally, experience how customers and partners alike leverage information in Databricks to supplement their employees' AI journeys.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Enterprise-grade GenAI needs a unified data strategy for accurate, reliable results. Learn how knowledge graphs make structured and unstructured data AI-ready while enabling governance and transparency. See how GraphRAG (retrieval-augmented generation with knowledge graphs) drives real success: Learn how companies like Klarna have deployed GenAI to build chatbots grounded in knowledge graphs, improving productivity and trust, while a major gaming company achieved 10x faster insights. We’ll share real examples and practical steps for successful GenAI deployment.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: APACHE SPARK, AI/BI, APACHE ICEBERG
Skill Level: INTERMEDIATE
Duration: 20 MIN
You already see the value of the lakehouse. But are you truly maximizing its potential across all workloads, from BI to AI? In this session, Onehouse unveils how our open lakehouse architecture unifies your entire stack, enabling true interoperability across formats, catalogs, and engines. From lightning-fast ingestion at scale to cost-efficient processing and multi-catalog sync, Onehouse helps you go beyond trade-offs. Discover how Apache XTable (Incubating) enables cross-table-format compatibility, how OpenEngines puts your data in front of the best engine for the job, and how OneSync keeps data consistent across Snowflake, Athena, Redshift, BigQuery, and more. Meanwhile, our purpose-built lakehouse runtime slashes ingest and ETL costs. Whether you’re delivering BI, scaling AI, or building the next big thing, you need a lakehouse that’s open and powerful. Onehouse opens everything—so your data can power anything.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: HEALTH AND LIFE SCIENCES, MEDIA AND ENTERTAINMENT, FINANCIAL SERVICES
Technologies: MLFLOW, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Customer data is an organization's most valuable asset. It is also the hardest to govern and use in a dynamic business environment. Consumers can revoke their consent in an instant, regulations continue to grow, and internal data policies change. Most troubling is when cross-functional teams question whether, when, and how they can use customer data. How does an organization—let alone a data governance team and its stakeholders—manage this data and policy fragmentation, while enabling data use? Join product leaders from OneTrust as they explore new data governance practices and technologies for delivering AI-ready data. We’ll demo an integration that orchestrates data policy enforcement through Unity Catalog and the OneTrust Data Use Governance solution. Understand how this new offering, together with OneTrust’s solutions for Consent & Preferences and AI Governance, aligns your data governance and compliance initiatives for AI innovation.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, TRAVEL AND HOSPITALITY
Technologies: AI/BI
Skill Level: BEGINNER
Duration: 20 MIN
Behind every powerful AI system lies a critical foundation: fresh, high-quality web data. This session explores the symbiotic relationship between web scraping and artificial intelligence that's transforming how technical teams build data-intensive applications. We'll showcase how this partnership enables crucial use cases: analyzing trends, forecasting behaviors, and enhancing AI models with real-time information. Technical challenges that once made web scraping prohibitively complex are now being solved through the very AI systems they help create. You'll learn how machine learning revolutionizes web data collection, making previously impossible scraping projects both feasible and maintainable, while dramatically reducing engineering overhead and improving data quality. Join us to explore this quiet but critical partnership that's powering the next generation of AI applications.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY, MANUFACTURING
Technologies: AI/BI, DATABRICKS WORKFLOWS, DLT
Skill Level: INTERMEDIATE
Duration: 20 MIN
Join Sandy Steiger, Head of Advanced Analytics & Automation (formerly at TQL), as she walks through how her team tackled one of the most common and least talked about problems in data teams: report bloat, data blind spots, and broken trust with the business. You’ll learn how TQL went from 3,000 reports to fewer than 500 while gaining better visibility, faster data issue resolution, and cloud agility through practical use of lineage, automated detection, and surprising outcomes from implementing Pantomath (an automated data operations platform). Sandy will share how her team identified upstream issues (before Microsoft did), avoided major downstream breakages, and built the credibility every data team needs to earn trust from the business. Walk away with a playbook for using automation to drive smarter, faster decisions across your organization.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: ENERGY AND UTILITIES, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: DELTA LAKE, AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
AI initiatives often stall when data teams can’t keep up with business demand for ad hoc, self-service data. Whether it’s AI agents, BI tools, or business users, everyone needs data immediately, but the pipeline-centric modern data stack is not built for this scale of agility. Promethium enables data teams to generate instant, contextual data products called Data Answers based on rapid, exploratory questions from the business. Data Answers empower data teams for AI-scale collaboration with the business. We will demo Promethium’s new agent capability to build Data Answers on Databricks for self-service data. The Promethium agent leverages and extends Genie with context from other enterprise data and applications to ensure accuracy and relevance.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
Are data teams ready for AI? Prophecy’s exclusive survey, “The Impact of GenAI on Data Teams”, gives the clearest picture yet of GenAI’s potential in data management, and what’s standing in the way. The top two obstacles? Poor governance and slow access to high-quality data. The message is clear: modernizing your data platform with Databricks is essential. But it’s only the beginning. To unlock the power of AI and analytics, organizations must deliver governed, self-service access to clean, trusted data. Traditional data prep tools introduce risks around security, quality, and cost. It’s no wonder data leaders cited data transformation as the area where GenAI will make the biggest impact. To deliver what’s needed, teams must shift to governed self-service, where data analysts and scientists can move fast while staying within IT’s guardrails. Join us to learn more details from the survey and how leading organizations are ahead of the curve, using GenAI to reshape how data gets done.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Still coding data transformations by hand? Struggling with rigid, proprietary data prep tools? AI agents are flipping the script, reshaping data teams and delivering production-ready data preparation. Join this session to see how analysts, data scientists, and data engineers can build powerful, production-ready data pipelines simply by describing their intent in natural language. All in under 7 minutes, with no complex UI or coding required. Select datasets, join tables, apply filters, and perform calculations, all just by chatting, and watch the pipeline materialize in real time, ready for deployment with documentation, testing, lineage, and versioning. Ready to leave slow, traditional data prep behind and be part of the next wave of innovation? You won’t want to miss this session.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: HEALTH AND LIFE SCIENCES
Technologies: APACHE SPARK, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Industry-standard data formats can streamline data exchange, but they come with significant complexity and can seem ‘impossible’. One example is FHIR (Fast Healthcare Interoperability Resources), the healthcare data standard with a broad scope of 180 entities. With its widespread adoption, FHIR promises to speed the delivery of insights that improve patient experiences, optimize operations, and drive better outcomes. Yet working with FHIR data is no small task. Its complexity and extensions push it beyond the reach of most analysts. Instead, it lands on data engineers, who must load, transform, and keep up with changes. This creates bottlenecks, slows down insights, and places a heavy maintenance burden on already backlogged data engineering teams. Learn how to tame FHIR for analysts. If you want to speed time-to-insight, reduce engineering effort, and enable true cross-functional collaboration on industry-specific data, this is a session you don’t want to miss.
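To make the engineering burden concrete: a large share of the FHIR work described above is flattening deeply nested resources into analyst-friendly tables. The sketch below is illustrative only and is not from the session; the Patient example and the chosen columns are assumptions, and real FHIR bundles span roughly 180 resource types plus extensions.

```python
# Hedged sketch: flattening one nested FHIR Patient resource into a
# flat row an analyst could query. Column choices are illustrative.
import json

def flatten_patient(resource: dict) -> dict:
    """Pull a few analyst-friendly columns out of a nested FHIR Patient."""
    name = (resource.get("name") or [{}])[0]  # take the first recorded name
    return {
        "id": resource.get("id"),
        "family": name.get("family"),
        "given": " ".join(name.get("given", [])),
        "birthDate": resource.get("birthDate"),
        "gender": resource.get("gender"),
    }

# A minimal, hypothetical Patient resource for demonstration
patient = json.loads("""
{
  "resourceType": "Patient",
  "id": "example",
  "name": [{"family": "Chalmers", "given": ["Peter", "James"]}],
  "gender": "male",
  "birthDate": "1974-12-25"
}
""")
print(flatten_patient(patient))
```

In practice this per-resource flattening is exactly what gets repeated across entities and extensions, which is why the session argues for moving it off the data engineering backlog.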
Type: BREAKOUT
Track: DATA STRATEGY
Industry: MANUFACTURING, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: APACHE ICEBERG, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Explore how to build use case-specific data products designed to power everything from traditional BI dashboards to machine learning and LLM-enabled applications. Gain an understanding of what data products are and why they are essential for delivering AI-ready data that is integrated, timely, high-quality, secure, contextual, and easily consumable. Discover strategies for unlocking business data from source systems to enable analytics and AI use cases, with a deep dive into the three-tiered data product architecture: the Data Product Engineering Plane (where data engineers ingest, integrate, and transform data), the Data Product Management Plane (where teams manage the full lifecycle of data products), and the Data Product Marketplace Plane (where consumers search for and use data products). Discover how a flexible, composable data architecture can support organizations at any stage of their data journey and drive impactful business outcomes.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, FINANCIAL SERVICES
Technologies: MLFLOW, DATABRICKS SQL, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
Join us for this session on how to build AI finance agents with Databricks and LangChain. This session introduces a powerful approach to building AI agents by combining a modular framework that integrates LangChain, retrieval-augmented generation (RAG), and Databricks' unified data platform to build intelligent, adaptable finance agents. We’ll walk through the architecture and key components, including Databricks Unity Catalog, MLflow, and Mosaic AI, involved in building a system tailored for complex financial tasks like portfolio analysis, reporting automation, and real-time risk insights. We’ll also showcase a demo of one such agent in action: a Financial Analyst Agent. This agent emulates the expertise of a seasoned data analyst, delivering in-depth analysis in seconds and eliminating the need to wait hours or days for manual reports. The solution provides organizations with 24/7 access to advanced data analysis, enabling faster, smarter decision-making.
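The heart of a RAG agent like the one described above is the retrieval step: score candidate documents against the user's question and feed the best matches to the model. The session itself uses LangChain and Mosaic AI; the framework-free miniature below is a stand-in sketch, and the sample documents and bag-of-words scoring are assumptions, not the session's code.

```python
# Hedged sketch: top-k retrieval by bag-of-words cosine similarity,
# a toy stand-in for the vector search a real RAG finance agent uses.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

# Hypothetical knowledge-base snippets
docs = [
    "Q3 portfolio returns exceeded the benchmark by 2 percent",
    "The cafeteria menu changes on Mondays",
]
print(retrieve("portfolio returns this quarter", docs))
```

A production agent would swap the word-count vectors for embeddings in a vector index and hand the retrieved context to an LLM, but the ranking contract is the same.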
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: APACHE ICEBERG
Skill Level: INTERMEDIATE
Duration: 20 MIN
In this talk, we’ll walk through a complete real-time IoT architecture—from an economical, high-powered ESP32 microcontroller publishing environmental sensor data to AWS IoT, through Redpanda Connect into a Redpanda BYOC cluster, and finally into Apache Iceberg for long-term analytical storage. Once the data lands, we’ll query it using Python and perform linear regression with Prophet to forecast future trends. Along the way, we’ll explore the design of a scalable, cloud-native pipeline for streaming IoT data. Whether you're tracking the weather or building the future, this session will help you architect with confidence—and maybe even predict it.
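The final step of the pipeline above, forecasting a trend from landed sensor readings, can be sketched in a few lines. The session uses Prophet on Iceberg-backed data; as a hedged stand-in, the sketch below fits a plain linear trend with numpy's polyfit, and the temperature readings are synthetic assumptions.

```python
# Hedged sketch: linear-trend forecasting over sensor readings,
# a minimal stand-in for the Prophet model used in the session.
import numpy as np

def forecast_linear(readings, steps_ahead):
    """Fit a linear trend to a series and extrapolate future points."""
    t = np.arange(len(readings))
    slope, intercept = np.polyfit(t, readings, deg=1)
    future_t = np.arange(len(readings), len(readings) + steps_ahead)
    return slope * future_t + intercept

# Synthetic hourly temperature readings with a rising trend
readings = [20.0, 20.5, 21.1, 21.4, 22.0, 22.6]
print(forecast_linear(readings, 3))
```

Prophet adds seasonality and uncertainty intervals on top of this idea, which matters for weather-like signals with daily cycles.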
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Enterprises need AI agents that are both powerful and production-ready while being scalable and secure. In this lightning session, you’ll learn how to leverage Retool’s platform and Databricks to design, deploy, and manage intelligent agents that automate complex workflows. We’ll cover best practices for integrating real-time Databricks data, enforcing governance, and ensuring scalability, all while avoiding common pitfalls. Whether you’re automating internal ops or customer-facing tasks, walk away with a blueprint for shipping AI agents that actually work in the real world.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 20 MIN
Despite the proliferation of cloud data warehousing, BI tools, and AI, spreadsheets are still the most ubiquitous data tool. Business teams in finance, operations, sales, and marketing often need to analyze data in the cloud data warehouse but don't know SQL and don't want to learn BI tools. AI tools offer a new paradigm but still haven't broadly replaced the spreadsheet. With new AI tools and legacy BI tools providing business teams access to data inside Databricks, security and governance are put at risk. In this session, Row Zero CEO, Breck Fresen, will share examples and strategies data teams are using to support secure spreadsheet analysis at Fortune 500 companies and the future of spreadsheets in the world of AI. Breck is a former Principal Engineer from AWS S3 and was part of the team that wrote the S3 file system. He is an expert in storage, data infrastructure, cloud computing, and spreadsheets.
Type: LIGHTNING TALK
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
See how Agentforce connects with Databricks to create a seamless, intelligent workspace. With zero-copy integration, users can access real-time Databricks data without moving or duplicating it. Explore how Agentforce automatically delegates tasks to a Databricks agent and enables end-to-end execution without leaving the flow of work, eliminating the swivel-chair effect. Together, these capabilities power a unified, cross-platform experience that drives faster decisions and smarter outcomes.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Empower AI and agents with trusted data and metadata from an end-to-end unified system. Discover how Salesforce Data Cloud, Agentforce, and Databricks work together to fuel automation, AI, and analytics through a unified data strategy—driving real-time intelligence, enabling zero-copy data sharing, and unlocking scalable activation across the enterprise.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 20 MIN
The explosion of AI has made the enterprise data landscape more important, and more complex, than ever before. Join us to learn how Databricks’ and Tableau’s platforms come together to empower users of all kinds to see, understand, and act on their data in a secure, governed, and performant way.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA SHARING
Skill Level: BEGINNER
Duration: 20 MIN
SAP and Databricks have formed a landmark partnership that brings together SAP's deep expertise in mission-critical business processes and semantically rich data with Databricks' industry-leading capabilities in AI, machine learning, and advanced data engineering. From curated, SAP-managed data products to zero-copy Delta Sharing integration, discover how SAP Business Data Cloud empowers data and AI professionals to build AI solutions that unlock unparalleled business insights using trusted business data.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
Unlock the power of your SAP data with SAP Business Data Cloud—a fully managed SaaS solution that unifies and governs all SAP data while seamlessly connecting it with third-party data. As part of SAP Business Data Cloud, SAP Databricks brings together trusted, semantically rich business data with industry-leading capabilities in AI, machine learning, and data engineering. Discover how to access curated SAP data products across critical business processes, enrich and harmonize your data without data copies using Delta Sharing, and leverage the results across your business data fabric. See it all in action with a demonstration.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: RETAIL AND CPG - FOOD, TRAVEL AND HOSPITALITY, FINANCIAL SERVICES
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
This session will explore how developers can easily select, extract, filter, and control data pre-ingestion to accelerate safe AI. Learn how the Securiti and Databricks partnership empowers Databricks users by providing the critical foundation for unlocking scalability and accelerating trustworthy AI development and adoption. Key takeaways: ● Understand how to leverage data intelligence to establish a foundation for frameworks like the OWASP Top 10 for LLMs, NIST AI RMF, and Gartner’s AI TRiSM. ● Learn how automated data curation and syncing address specific risks while accelerating AI development in Databricks. ● Discover how leading organizations apply robust access controls across vast swaths of mostly unstructured data. ● Learn how to maintain data provenance and control as data is moved and transformed through complex pipelines in the Databricks platform.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
As global energy demands continue to rise, organizations must boost efficiency while staying environmentally responsible. Flogistix uses Sigma and Databricks to build a unified data architecture for real-time, data-driven decisions in vapor recovery systems. With Sigma on the Databricks Data Intelligence Platform, Flogistix gains precise operational insights and identifies optimization opportunities that reduce emissions, streamline workflows, and meet industry regulations. This empowers everyone, from executives to field mechanics, to drive sustainable resource production. Discover how advanced analytics are transforming energy practices for a more responsible future.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: MANUFACTURING, RETAIL AND CPG - FOOD, TRAVEL AND HOSPITALITY
Technologies: DATABRICKS SQL, DELTA SHARING, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Faced with the limitations of a legacy, on-prem data stack and scalability bottlenecks in MicroStrategy, Saddle Creek Logistics Services needed a modern solution to handle massive data volumes and accelerate insight delivery. By migrating to a cloud-native architecture powered by Sigma and Databricks, the team achieved significant performance gains and operational efficiency. In this session, Saddle Creek will walk through how they leveraged Databricks’ cloud-native processing engine alongside a unified governance layer through Unity Catalog to streamline and secure downstream analytics in Sigma. Learn how embedded dashboards and near real-time reporting—cutting latency from 9 minutes to just 3 seconds—have empowered data-driven collaboration with external partners and driven a major effort to consolidate over 30,000 reports and objects to under 1,000.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
To meet the growing internal demand for accessible, reliable data, TradeStation migrated from fragmented, spreadsheet-driven workflows to a scalable, self-service analytics framework powered by Sigma on Databricks. This transition enabled business and technical users alike to interact with governed data models directly on the lakehouse, eliminating data silos and manual reporting overhead. In brokerage trading operations, the integration supports robust risk management, automates key operational workflows, and centralizes collaboration across teams. By leveraging Sigma’s intuitive interface on top of Databricks’ scalable compute and unified data architecture, TradeStation has accelerated time-to-insight, improved reporting consistency, and empowered teams to operationalize data-driven decisions at scale.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: RETAIL AND CPG - FOOD
Technologies: APACHE SPARK, DELTA LAKE, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
Customer Data Platforms (CDPs) promise better engagement, higher operational efficiency, and revenue growth by centralizing and streamlining access to customer data. However, consolidating sensitive information from a variety of sources creates complex challenges around data governance, security, and privacy. We’ve studied, built, and managed data protection strategies at some of the world’s biggest retailers. We’ll showcase business requirements, common architectural components, and best practices to deploy data protection solutions at scale, protecting billions of sensitive records across regions and countries. Learn how a data vault pattern with granular, policy-based access control and monitoring can improve organizational privacy posture and help meet regulatory requirements (e.g., GDPR, CCPA, e-Privacy). Walk away with a clear framework to deploy such architecture and knowledge of real-world issues, performance optimizations, and design trade-offs.
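The granular, policy-based access control described above can be illustrated with a minimal sketch. All names and policies here are hypothetical; a real deployment would enforce this in the platform (e.g., with Unity Catalog row filters and column masks) rather than in application code:

```python
# Hypothetical column policies: which roles may see a column in the clear,
# and how the value is masked for everyone else.
POLICIES = {
    "email":    {"allowed_roles": {"privacy_officer"},
                 "mask": lambda v: "***@" + v.split("@")[1]},
    "card_pan": {"allowed_roles": set(),            # nobody sees the full PAN
                 "mask": lambda v: "****" + v[-4:]},
    "country":  {"allowed_roles": {"privacy_officer", "analyst"},
                 "mask": lambda v: v},
}

def apply_policies(record: dict, role: str) -> dict:
    """Return a copy of the record with values masked for columns the role may not see."""
    out = {}
    for col, value in record.items():
        policy = POLICIES.get(col)
        if policy is None or role in policy["allowed_roles"]:
            out[col] = value                      # no policy, or role explicitly allowed
        else:
            out[col] = policy["mask"](value)      # policy-based masking
    return out
```

For example, an `analyst` reading `{"email": "ada@example.com", "card_pan": "4111111111111111"}` would get `{"email": "***@example.com", "card_pan": "****1111"}`, while a `privacy_officer` would see the email in the clear but still a masked PAN.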
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Nasdaq’s rapid growth through acquisitions led to fragmented client data across multiple Salesforce instances, limiting cross-sell potential and sales insights. To solve this, Nasdaq partnered with Slalom to build a unified Client Data Hub on the Databricks Lakehouse Platform. This cloud-based solution merges CRM, product usage, and financial data into a consistent, 360° client view accessible across all Salesforce orgs with bi-directional integration. It enables personalized engagement, targeted campaigns, and stronger cross-sell opportunities across all business units. By delivering this 360 view directly in Salesforce, Nasdaq is improving sales visibility, client satisfaction, and revenue growth. The platform also enables advanced analytics like segmentation, churn prediction, and revenue optimization. With centralized data in Databricks, Nasdaq is now positioned to deploy next-gen Agentic AI and chatbots to drive efficiency and enhance sales and marketing experiences.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, MANUFACTURING, FINANCIAL SERVICES
Technologies: MLFLOW, LLAMA
Skill Level: INTERMEDIATE
Duration: 20 MIN
GenAI systems are evolving beyond basic information retrieval and question answering, becoming sophisticated agents capable of managing multi-turn dialogues and executing complex, multi-step tasks autonomously. However, reliably evaluating and systematically improving their performance remains challenging. In this session, we'll explore methods for assessing the behavior of LLM-driven agentic systems, highlighting techniques and showcasing actionable insights to identify performance bottlenecks and create better-aligned, more reliable agentic AI systems.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD, TRAVEL AND HOSPITALITY
Technologies: AI/BI, MOSAIC AI
Skill Level: BEGINNER
Duration: 20 MIN
The web is on the verge of a major shift. Agentic applications will redefine how customers engage with digital experiences—delivering highly personalized, relevant interactions. In this talk, Snowplow CTO Yali Sassoon explores how Snowplow Signals enables agents to perceive users through short- and long-term memory, natively on the Databricks Data Intelligence Platform.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENERGY AND UTILITIES
Technologies: PARTNER CONNECT
Skill Level: INTERMEDIATE
Duration: 20 MIN
Drawing on BDO Canada’s deep expertise in the electricity sector, this session explores how clean energy innovation can be accelerated through a holistic approach to data quality. Discover BDO’s practical framework for implementing data quality and rebuilding trust in data through a structured, scalable approach. BDO will share a real-world example of monitoring data at scale—from high-level executive dashboards to the details of daily ETL and ELT pipelines. Learn how they leveraged Soda’s data observability platform to unlock near-instant insights, and how they moved beyond legacy validation pipelines with built-in checks across their production Lakehouse. Whether you're a business leader defining data strategy or a data engineer building robust data products, this talk connects the strategic value of clean data with actionable techniques to make it a reality.
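The "built-in checks" the session describes can be sketched in plain Python. This is not Soda's actual API, just a minimal illustration of threshold-based quality checks over pipeline output; the meter-reading fields are hypothetical:

```python
def null_rate(rows, column):
    """Fraction of rows where the column is missing or None."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(column) is None)
    return missing / len(rows)

def run_checks(rows, checks):
    """Evaluate each (column, max_null_rate) check; return the failing checks."""
    failures = []
    for column, max_rate in checks:
        rate = null_rate(rows, column)
        if rate > max_rate:
            failures.append((column, rate))
    return failures

# Hypothetical meter readings with one missing voltage value.
readings = [
    {"meter_id": 1, "voltage": 230.1},
    {"meter_id": 2, "voltage": None},
    {"meter_id": 3, "voltage": 229.8},
    {"meter_id": 4, "voltage": 230.4},
]
print(run_checks(readings, [("meter_id", 0.0), ("voltage", 0.1)]))  # [('voltage', 0.25)]
```

Running checks like this inside the pipeline, instead of in a separate legacy validation stage, is what lets problems surface at the ETL step that introduced them.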
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: EDUCATION
Technologies: DATABRICKS WORKFLOWS, PARTNER CONNECT
Skill Level: BEGINNER
Duration: 40 MIN
Join this session to hear how Western Governors University leverages Tealium & Databricks to power their data activation strategy with real-time customer data collection, activation and advanced analytics.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENERGY AND UTILITIES, MANUFACTURING
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Learn how Chevron transitioned their central finance and procurement analytics into the cloud using Databricks and ThoughtSpot’s Agentic Analytics Platform. Explore how Chevron leverages ThoughtSpot to unlock actionable insights, enhance their semantic layer with user-driven understanding, and ultimately drive more impactful strategies for customer engagement and business growth. In this session, Chevron explains the dos, don’ts, and best practices of migrating from outdated legacy business intelligence to real-time, AI-powered insights.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: N/A
Technologies: N/A
Skill Level: N/A
Duration: 20 MIN
Join us to see how the powerful combination of ThoughtSpot's agentic analytics platform and the Databricks Data Intelligence Platform is changing the game for data-driven organizations. We'll demonstrate how DataSpot breaks down technical barriers to insight. You'll learn how to get trusted, real-time answers thanks to the seamless integration between ThoughtSpot's semantic layer and Databricks Unity Catalog. This session is for anyone looking to leverage data more effectively, whether you're a business leader seeking AI-driven insights, a data scientist building models in Python, or a product owner creating intelligent applications.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: MANUFACTURING
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Manufacturers today need efficient, accurate, and flexible integrated planning across supply, demand, and finance. A leading industrial manufacturer is pursuing a competitive edge in Integrated Business Planning through data and AI. Their strategy: a connected, real-time data foundation with democratized access across silos. Using Databricks, we’re building business-centric data products to enable near real-time, collaborative decisions and scaled AI. Unity Catalog ensures data reliability and adoption. Increased data visibility is driving better on-time delivery, inventory optimization, and forecasting, resulting in measurable financial impact. In this session, we’ll share our journey to the north star of “driving from the windshield, not the rearview,” including key data, organization, and process challenges in enabling data democratization; architectural choices for Integrated Business Planning as a data product; and core capabilities delivered with Tiger’s Accelerator.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this session, we take you inside a new kind of data foundation—one that moves beyond traditional tables and metrics to create meaning. We’ll explore how turning metadata into a system of understanding, not just record-keeping, can unlock powerful agentic workflows. You’ll see how business terms like "cost variance" or "lead time" become executable, traceable, and reusable assets that guide AI agents, ensure consistency, and restore trust across decentralized teams. Drawing on real challenges from the CPG world, we’ll walk through how companies are moving from governance battles to collaborative ownership, from static reports to living data definitions, and from disconnected data to decisions that act with confidence and speed. The future of AI isn't just about smarter models—it’s about smarter context. This is your roadmap for transforming your data foundation into a shared language between people and machines.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: RETAIL AND CPG - FOOD
Technologies: DELTA SHARING
Skill Level: INTERMEDIATE
Duration: 40 MIN
In a landscape where customer expectations are evolving faster than ever, the ability to activate real-time, first-party data is becoming the difference between reactive and intelligent businesses. This fireside chat brings together experts from Capgemini, Twilio Segment, and leading marketplace StockX to explore how organizations are building future-proof data foundations that power scalable, responsible AI.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENERGY AND UTILITIES, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: DELTA LAKE, DATABRICKS SQL, DLT
Skill Level: INTERMEDIATE
Duration: 20 MIN
Large Language Models are unleashing a seismic shift on data engineering, challenging traditional workflows. LLMs obliterate inefficiencies and redefine productivity, automating complex tasks like documentation, code translation, and data model development with unprecedented speed and precision. Integrating LLMs into tools promises to reduce offshore dependency, fostering agile onshore innovation. Harnessing LLMs' full potential involves challenges, requiring deep dives into domain-specific data and strategic business alignment. This session will address deploying LLMs effectively, overcoming data management hurdles, and fostering collaboration between engineers and stakeholders. Join us to explore a future where LLMs redefine possibilities, and position your organization as a leader in AI-driven data engineering.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: DATABRICKS SQL, DLT, LAKEFLOW
Skill Level: BEGINNER
Duration: 40 MIN
Using SQL for data transformation is a powerful way for an analytics team to create their own data pipelines. However, relying on SQL often comes with tradeoffs such as limited functionality, hard-to-maintain stored procedures or skipping best practices like version control and data tests. Databricks supports building high-performing SQL ETL workloads. Attend this session to hear how Databricks supports SQL for data transformation jobs as a core part of your Data Intelligence Platform. In this session we will cover 4 options to use Databricks with SQL syntax to create Delta tables:
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL, DLT, LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 40 MIN
This session explores how SQL-based ETL can accelerate development, simplify maintenance, and make data transformation more accessible to both engineers and analysts. We'll walk through how Databricks Lakeflow Declarative Pipelines and Databricks SQL warehouses support building production-grade pipelines using familiar SQL constructs. By the end of the session, you’ll understand how SQL-first approaches can streamline ETL development and support both operational and analytical use cases.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
Organizations face the challenge of managing vast amounts of data to combat emerging threats. The Databricks Data Intelligence Platform represents a paradigm shift in cybersecurity at State Street, providing a comprehensive solution for managing and analyzing diverse security data. Through its partnership with Databricks, State Street has created a capability to: efficiently manage structured and unstructured data; scale up to analyze 50 petabytes of data in real time; ingest and parse critical security data streams; and build advanced cybersecurity data products, using automation and orchestration to streamline cybersecurity operations. By leveraging these capabilities, State Street has positioned itself as a leader in cybersecurity within the financial services industry.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Struggling with runaway cloud costs as your organization grows? Join us for an inside look at how Databricks’ own Data Platform team tackled escalating spend in some of the world’s largest workspaces — saving millions of dollars without sacrificing performance or user experience. We’ll share how we harnessed powerful features like System Tables, Workflows, Unity Catalog, and Photon to monitor and optimize resource usage, all while using data-driven decisions to improve efficiency and ensure we invest in the areas that truly drive business impact. You’ll hear about the real-world challenges we faced balancing governance with velocity and discover the custom tooling and best practices we developed to keep costs in check. By the end of this session, you’ll walk away with a proven roadmap for leveraging Databricks to control cloud spend at scale.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: RETAIL AND CPG - FOOD, TRAVEL AND HOSPITALITY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Unity Catalog (UC) enables governance and security for all data and AI assets within an enterprise’s data lake and is necessary to unlock the full potential of Databricks as a true Data Intelligence Platform. Unfortunately, UC migrations are non-trivial, especially for enterprises such as 7-Eleven that have been using Databricks for more than five years. System Integrators (SIs) offer accelerators, guides, and services to support UC migrations; however, cloud infrastructure changes, anti-patterns within code, and data sprawl can significantly complicate them. There is no “shortcut” to success when planning and executing a complex UC migration. In this session, we will share how UCX by Databricks Labs, a UC migration assistant, allowed 7-Eleven to reorient their UC migration, leveraging its assessments and workflows to characterize the environment and ultimately plan a tenable approach.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Learn how Databricks and Confluent are simplifying the path from real-time data to governed, analytics- and AI-ready tables. This session will cover how Confluent Tableflow automatically materializes Kafka topics into Delta tables and registers them with Unity Catalog — eliminating the need for custom streaming pipelines. We’ll walk through how this integration helps data engineers reduce ingestion complexity, enforce data governance and make real-time data immediately usable for analytics and AI.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL
Skill Level: BEGINNER
Duration: 20 MIN
Earlier this year, we finished migrating all dashboards from a traditional BI system to the Databricks AI/BI ecosystem, resulting in annual savings of approximately $900,000, and unlocked several additional advantages. We will speak about our journey and how you can migrate your dashboards from traditional BI to AI/BI, walking through the migration steps as well as the challenges and surprises (“migration shenanigans”) we faced along the way. We look forward to sharing these lessons learned and insights with you to help you streamline your BI infrastructure and unlock the full potential of Databricks AI/BI.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: HEALTH AND LIFE SCIENCES
Technologies: UNITY CATALOG, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 20 MIN
Think Databricks is just for data and models? Think again. In this session, you’ll see how to build and scale a full-stack AI app capable of handling thousands of queries per second entirely on Databricks. No extra cloud platforms, no patchwork infrastructure. Just one unified platform with native hosting, LLM integration, secure access, and built-in CI/CD. Learn how Databricks Apps, along with services like Model Serving, Jobs, and Gateways, streamline your architecture, eliminate boilerplate, and accelerate development, from prototype to production.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: MLFLOW, DSPY
Skill Level: INTERMEDIATE
Duration: 20 MIN
DSPy is a framework for authoring GenAI applications with automatic prompt optimization, while MLflow provides powerful MLOps tooling to track, monitor, and productize machine learning workflows. In this lightning talk, we demonstrate how to integrate MLflow with DSPy to bring full observability to your DSPy development. We’ll walk through how to track DSPy module calls, evaluations, and optimizers using MLflow’s tracing and autologging capabilities. By the end, you'll see how combining these two tools makes it easier to debug, iterate, and understand your DSPy workflows, then deploy your DSPy program — end to end.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: AI/BI, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Are you striving to build a data-driven culture while managing costs and reducing reporting latency? Are your BI operations bogged down by complex data movements rather than delivering insights? Databricks IT faced these challenges in 2024 and embarked on an ambitious journey to make Databricks AI/BI our enterprise-wide reporting platform. In just two quarters, we migrated 2,000 dashboards from a traditional BI tool — without disrupting business operations. We’ll share how we executed this large-scale transition cost-effectively, ensuring seamless change management and empowering non-technical users to leverage AI/BI. You’ll gain insights into: Join us to learn how your organization can achieve the same transformation with AI-powered enterprise reporting.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, DELTA LAKE, DATABRICKS WORKFLOWS
Skill Level: INTERMEDIATE
Duration: 40 MIN
DigiCert is a digital security company that provides digital certificates, encryption and authentication services and serves 88% of the Fortune 500, securing over 28 billion web connections daily. Our project aggregates and analyzes certificate transparency logs via public APIs to provide comprehensive market and competitive intelligence. Instead of relying on third-party providers with limited data, our project gives full control, deeper insights and automation. Databricks has helped us reliably poll public APIs in a scalable manner that fetches millions of events daily, deduplicate and store them in our Delta tables. We specifically use Spark for parallel processing, structured streaming for real-time ingestion and deduplication, Delta tables for data reliability, pools and jobs to ensure our costs are optimized. These technologies help us keep our data fresh, accurate and cost effective. This data has helped our sales team with real-time intelligence, ensuring DigiCert's success.
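The deduplication step described above can be sketched in plain Python. This is a simplified stand-in for what the pipeline does with Structured Streaming deduplication and Delta tables; the event fields (`log_id`, `entry_index`) are hypothetical:

```python
def dedupe_events(batches, seen=None):
    """Yield each certificate-transparency event exactly once, keyed by log entry.

    `seen` plays the role of streaming state; in a real pipeline this is handled
    by watermarked dropDuplicates or a MERGE into a Delta table instead of an
    in-memory set.
    """
    if seen is None:
        seen = set()
    for batch in batches:
        for event in batch:
            key = (event["log_id"], event["entry_index"])
            if key not in seen:
                seen.add(key)
                yield event

# Overlapping polls of a public CT log API return duplicate entries.
poll_1 = [{"log_id": "argon", "entry_index": 1}, {"log_id": "argon", "entry_index": 2}]
poll_2 = [{"log_id": "argon", "entry_index": 2}, {"log_id": "argon", "entry_index": 3}]
unique = list(dedupe_events([poll_1, poll_2]))
print(len(unique))  # 3 unique entries survive the overlapping polls
```

Keying on a stable identifier from the source API is what makes overlapping poll windows safe: fetchers can re-read ranges for reliability without double-counting events downstream.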
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
This session is repeated. Peek behind the curtain to learn how Databricks processes hundreds of petabytes of data across every region and cloud where we operate. Learn how Databricks leverages data and AI to scale and optimize every aspect of the company, from facilities and legal to sales and marketing and, of course, product research and development. This session is a high-level tour inside Databricks to see how data and AI enable us to be a better company. We will go into the architecture behind internal use cases like business analytics and SIEM, as well as customer-facing features like system tables and the Assistant. We will cover the production flow of our data and how we maintain security and privacy while operating a large multi-cloud, multi-region environment.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Think you know everything AI/BI can do? Think again. This session explores the art of the possible with Databricks AI/BI Dashboards and Genie, going beyond traditional analytics to unleash the full power of the lakehouse. From incorporating AI into dashboards to handling large-scale data with ease to delivering insights seamlessly to end users — we’ll showcase creative approaches that unlock insights and real business outcomes. Perfect for adventurous data professionals looking to push limits and think outside the box.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE, ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Whether you're using OpenAI, Anthropic or open-source models like Meta Llama, the Mosaic AI Gateway is the central control plane across any AI model or agent. Learn how you can streamline access controls, enforce guardrails for compliance, ensure an audit trail and monitor costs across providers — without slowing down innovation. Lastly, we’ll dive even deeper into how AI Gateway works with Unity Catalog to deliver a full governance story for your end-to-end AI agents across models, tools and data. Key takeaways:
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: ADVANCED
Duration: 40 MIN
Curious about the cutting-edge technology that's revolutionizing AI model performance? Join us for an in-depth exploration of TAO and discover how this innovative approach is transforming the capabilities of modern AI systems. This research-focused session peels back the layers of theoretical foundations, implementation challenges, and breakthrough applications that make TAO one of the most promising advancements in AI development. Key takeaways: Whether you're a research scientist, AI engineer, or technical leader, this session will equip you with valuable insights into how TAO can be leveraged to push your AI models beyond current limitations.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 60 MIN
Join us for the Tech Industry Forum, formerly known as the Tech Innovators Summit, now part of Databricks Industry Experience. This session will feature keynotes, panels and expert talks led by top customer speakers and Databricks experts. Tech companies are pushing the boundaries of data and AI to accelerate innovation, optimize operations and build collaborative ecosystems. In this session, we’ll explore how unified data platforms empower organizations to scale their impact, democratize analytics across teams and foster openness for building tomorrow’s products. Key topics include: After the session, connect with your peers during the exclusive Industry Forum Happy Hour. Reserve your seat today!
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATA MARKETPLACE, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us to discover how leading tech companies accelerate growth using open ecosystems and built-on solutions to foster collaboration, accelerate innovation and create scalable data products. This session will explore how organizations use Databricks to securely share data, integrate with partners and enable teams to build impactful applications powered by AI and analytics. Topics include: Hear real-world examples of how open ecosystems empower organizations to widen the aperture on collaboration, driving better business outcomes. Walk away with insights into how open data sharing and built-on solutions can help your teams innovate faster at scale.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us for this session focused on how leading tech companies are enabling data intelligence across their organizations while maintaining cost efficiency and governance. Hear the successes and the challenges when Databricks empowers thousands of users—from engineers to business teams—by providing scalable tools for AI, BI and analytics. Topics include: Hear from customers and Databricks experts, followed by a customer panel featuring industry leaders. Gain insights into how Databricks helps tech innovators scale their platforms while maintaining operational excellence.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, MLFLOW, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
The migration to the Databricks Data Intelligence Platform has enabled Techcombank to more efficiently unify data from over 50 systems, improve governance, streamline daily operational analytics pipelines and use advanced analytics tools and AI to create more meaningful and personalized experiences for customers. With Databricks, Techcombank has also introduced key solutions that are reshaping its digital banking services: AI-driven lead management system: Techcombank's internally developed AI program called 'Lead Allocation Curated Engine' (LACE) optimizes lead management and provides relationship managers with enriched insights for smarter lead allocation to drive business growth. AI-powered program for digital banking inclusion of small businesses: GeoSense, an AI-powered program, assists frontline workers with analytics-driven insights about which small businesses and merchants to engage in the bank's digital ecosystem. Additional examples will be presented in the session.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: MEDIA AND ENTERTAINMENT
Technologies: DATABRICKS SQL, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
How are today’s leading telecom operators transforming customer experience at scale with data and AI? Join us for an inspiring fireside chat with senior leaders from Optus, Plume and AT&T as they share their transformation stories — from the first steps to major milestones and the tangible business impact achieved with Databricks’ Data Intelligence Platform. You’ll hear firsthand how these forward-thinking CSPs are driving measurable outcomes through unified data, machine learning and AI. Discover the high-impact use cases they’re prioritizing — like proactive care and hyper-personalization — and gain insight into their bold vision for the future of customer experience in telecom. Whether you're just beginning your AI journey or scaling to new heights, this session offers an authentic look at what’s working, what’s next and how data and AI are helping telecoms lead in a competitive landscape.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: MEDIA AND ENTERTAINMENT
Technologies: DATABRICKS SQL, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Join us for an interactive breakout session designed to explore scalable, real-world solutions powered by Partners with Databricks. In this high-energy session, you'll hear from three of our leading partners — Accenture, Capgemini and Wipro — as they each deliver rapid-fire, 5-minute demos of their most impactful, production-grade solutions built for the telecom industry. From network intelligence to customer experience to AI-driven automation, these solutions are already driving tangible outcomes at scale. After the demos, you’ll have the unique opportunity to engage directly with each partner in a “speed dating” style format. Dive deep into the solutions, ask your questions and explore how these approaches can be tailored to your organization’s needs. Whether you're solving for churn, fraud, network ops or enterprise AI use cases, this session is your chance to connect, collaborate and walk away with practical ideas you can take back to your teams.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: MEDIA AND ENTERTAINMENT
Technologies: DATABRICKS SQL, MOSAIC AI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 60 MIN
Introducing the first-ever Telecom Industry Forum at DAIS 2025. For the first time, Data + AI Summit (DAIS) will feature a dedicated Telecom Industry Forum — your exclusive opportunity to connect with telecom peers, exchange ideas, and hear directly from visionary leaders who are redefining the future of communications with data and AI. This forum will showcase how telecom operators are using the Databricks Data Intelligence Platform to deliver measurable results — from optimizing operations and enhancing customer experience to enabling new revenue models — all while maintaining the interoperability and agility needed to thrive in an evolving landscape. Join us on Tuesday, June 10 at 4:00 PM for a compelling lineup of speakers: This is a rare opportunity to hear from the leaders at the forefront of telecom innovation. Be inspired, connect with global industry peers, and take away actionable insights to lead your organization's next wave of transformation. The future of telecom is being built now — with Databricks and TM Forum at the center of it.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: HEALTH AND LIFE SCIENCES, PUBLIC SECTOR, FINANCIAL SERVICES
Technologies: AI/BI
Skill Level: INTERMEDIATE
Duration: 40 MIN
The Trump 2 AI agenda prioritizes US AI leadership by opposing AI regulation on bias and frontier AI risks, favoring innovation and AI expansion. With comprehensive federal AI regulation unlikely, states are advancing AI laws addressing bias, harmful content, transparency, frontier model risk and other risks. Meanwhile, the EU AI Act effectively imposes global obligations. The emerging patchwork of state rules will burden US companies more than would a unified federal approach, seemingly undermining White House deregulatory goals. So, ironically, the Trump team AI agenda may accelerate disparate state-level regulation and impede AI innovation. US companies therefore face a fragmented landscape similar to privacy regulation where the EU AI Act — in the role of GDPR — has set the stage, and the states are asserting themselves with various incremental requirements. Other recent developments covered will include the finalization of the EU GPAI Code of Practice, certain newly enacted state laws, and a quick overview of AI regulation outside the U.S. and EU.
Type: DEEP DIVE
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: DATABRICKS SQL, MOSAIC AI, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 90 MIN
In this deep-dive technical session, Ivan Trusov (Sr. SSA @ Databricks) and Giran Moodley (SA @ Databricks) will explore the full-stack development of Databricks Apps, covering everything from frameworks to deployment. We’ll walk through essential topics, including: Expect a highly practical session with several live demos showcasing the development loop, testing workflows and CI/CD automation. Whether you’re building internal tools or AI-powered products, this talk will equip you with the knowledge to ship robust, scalable Databricks Apps.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK
Skill Level: ADVANCED
Duration: 40 MIN
DSv2, Spark's next-generation Catalog API, is gaining traction among data source developers. It shifts complexity to Apache Spark™, improves connector reliability and unlocks new functionality such as catalog federation, MERGE operations, storage-partitioned joins, aggregate pushdown, stored procedures and more. This session covers the design of DSv2, current strengths and gaps and its evolving roadmap. It's intended for Spark users and developers working with data sources, whether custom-built or off-the-shelf.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, APACHE ICEBERG
Skill Level: ADVANCED
Duration: 40 MIN
Open table formats are evolving quickly. In this session, we’ll explore the latest features of Delta Lake and Apache Iceberg™, including a look at the emerging Iceberg v3 specification. Join us to learn about what’s driving format innovation, how interoperability is becoming real, and what it means for the future of data architecture.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: DELTA LAKE
Skill Level: INTERMEDIATE
Duration: 40 MIN
As data engineering continues to evolve, the shift from batch-oriented to streaming-first has become standard across the enterprise. In reality, these changes have been taking shape for the past decade — and we now stand on the precipice of true disruption through automation, the likes of which we could only dream about before. Yes, AI agents and LLMs are already a large part of our daily lives, but we (as data engineers) are ultimately on the front lines, ensuring that the future of AI is powered by consistent, just-in-time data — and Delta Lake is critical to getting us there. This session will provide you with best practices learned the hard way by one of the authors of Delta Lake: The Definitive Guide, including:
Type: LIGHTNING TALK
Track: DATA WAREHOUSING
Industry: PROFESSIONAL SERVICES
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
Databricks Odyssey is JLL’s bespoke training program designed to upskill data professionals and prepare them for the new world of the data lakehouse. Based on the concepts of learn, practice and certify, participants earn points, moving through five levels by completing activities that apply key Databricks features to business problems. Databricks Odyssey facilitates cloud data warehousing migration by providing best-practice frameworks, ensuring efficient use of pay-per-compute platforms. JLL/T Insights and Data fosters a data culture through learning programs that develop in-house talent and create career pathways. Databricks Odyssey offers: Benefits include:
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES, FINANCIAL SERVICES
Technologies: DATABRICKS WORKFLOWS, DLT, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Lakeflow brings excitement, simplicity and unification to the Databricks engineering experience. Databricks’ Bilal Aslam (Sr. Director of Product Management) and Josue A. Bogran (Databricks MVP and content creator) provide an overview of the history of Lakeflow, its current value to your organization and where its capabilities are headed. The session covers: You will also have the opportunity to ask questions of the team behind Lakeflow.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, APACHE ICEBERG, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
What connects your lakehouse to real data intelligence? The answer: the catalog. But not just any catalog. In this session, we break down why Unity Catalog is purpose-built for the lakehouse, and how it goes beyond operational or business catalogs to deliver cross-platform interoperability and a shared understanding of your data. You’ll walk away with a clear view of how the right data foundation unlocks smarter decisions and trusted AI.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: RETAIL AND CPG - FOOD
Technologies: DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
Consumer-facing industries are evolving faster than ever — and in today’s competitive landscape, it’s supply chains, not companies, that are truly competing. While data and AI offer huge potential for optimization, many organizations struggle to turn use cases into real business impact. In this session, leaders from retail, consumer goods, travel and hospitality will share how they’re building strong data foundations to unlock AI-driven supply chain optimization. Learn how they're using generative AI to boost productivity, streamline operations and improve insights through better data collaboration.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: MLFLOW, MOSAIC AI, PARTNER CONNECT
Skill Level: INTERMEDIATE
Duration: 40 MIN
Curious to know how Adidas is transforming customer experience and business impact with agentic workflows, powered by Databricks? By leveraging cutting-edge tools like MosaicML’s deployment capabilities, Mosaic AI Gateway and MLflow, Adidas built a scalable GenAI agentic infrastructure that delivers actionable insights from a growing volume of 2 million product reviews annually. With remarkable results: Join us to explore how Adidas turned agentic workflow infrastructure into a strategic advantage using Databricks and learn how you can do the same!
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, APACHE ICEBERG
Skill Level: BEGINNER
Duration: 40 MIN
Apache Spark has long been recognized as the leading open-source unified analytics engine, combining a simple yet powerful API with a rich ecosystem and top-notch performance. In the upcoming Spark 4.1 release, the community reimagines Spark to excel at both massive cluster deployments and local laptop development. We’ll start with new single-node optimizations that make PySpark even more efficient for smaller datasets. Next, we’ll delve into a major “Pythonizing” overhaul — simpler installation, clearer error messages and Pythonic APIs. On the ETL side, we’ll explore greater data source flexibility (including the simplified Python Data Source API) and a thriving UDF ecosystem. We’ll also highlight enhanced support for real-time use cases, built-in data quality checks and the expanding Spark Connect ecosystem — bridging local workflows with fully distributed execution. Don’t miss this chance to see Spark’s next chapter!
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Building an AI-ready data platform requires strong governance, performance optimization, and seamless adoption of new technologies. At ThredUp, our Databricks journey began with a need for better data management and evolved into a full-scale transformation powering analytics, machine learning, and real-time decision-making. In this session, we’ll cover: Whether you’re new to Databricks or scaling an existing platform, you’ll gain practical insights on navigating the transition, avoiding pitfalls, and maximizing AI and data intelligence.
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: TRAVEL AND HOSPITALITY
Technologies: APACHE SPARK, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
The ability for different AI systems to collaborate is more critical than ever. From traditional ML development to fine-tuning GenAI models, Databricks delivers the stability, cost optimization and productivity Expedia Group (EG) needs. Learn how to unlock the full potential of AI interoperability with Databricks. Join Shiyi Pickrell to understand the future of AI interoperability, how it’s generating business value and driving the next generation of AI-powered travel experiences.
Type: KEYNOTE
Track: N/A
Industry: N/A
Technologies: N/A
Skill Level: N/A
Duration: 180 MIN
Discover the latest advances on the Data Intelligence Platform and hear from the companies who are already enjoying success.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DLT
Skill Level: ADVANCED
Duration: 40 MIN
Lakeflow Declarative Pipelines simplifies pipeline development and management — but how do you optimize for performance and cost? In this session, we’ll explore practical strategies for tuning Lakeflow Declarative Pipelines, including when and how to use autoscaling, Photon and different node types. We'll also cover how to monitor resource usage and decide when serverless is the right choice. You'll learn best practices drawn from real-world customer implementations, along with an overview of the latest performance enhancements available in serverless Lakeflow Declarative Pipelines.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK
Skill Level: INTERMEDIATE
Duration: 40 MIN
Grace-Blackwell is NVIDIA’s most recent GPU system architecture. It addresses a key concern of query engines: fast data access. In this session, we will take a close look at how GPUs can accelerate data analytics by tracing how a row flows through a GPU-enabled query engine. Query engines read large data from CPU memory or from disk. On Blackwell GPUs, a query engine can rely on hardware-accelerated decompression of compact formats. The Grace-Blackwell system takes data access performance even further, by reading data at up to 450 GB/s across its CPU-to-GPU interconnect. We demonstrate full end-to-end SQL query acceleration using GPUs in a prototype query engine using industry-standard benchmark queries. We compare the results to existing CPU solutions. Using Apache Spark™ and the RAPIDS Accelerator for Apache Spark, we demonstrate the impact GPU acceleration has on the performance of SQL queries at the 100TB scale using NDS, a suite that simulates real-world business scenarios.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
As AI becomes more deeply integrated into data platforms, understanding where data comes from — and where it goes — is essential for ensuring transparency, compliance and trust. In this session, we’ll explore the newest advancements in data and AI lineage across the Databricks Platform, including during model training, evaluation and inference. You’ll also learn how lineage system tables can be used for impact analysis and to gain usage insights across your data estate. We’ll cover newly released capabilities — such as Bring Your Own Lineage — that enable an end-to-end view of your data and AI assets in Unity Catalog. Plus, get a sneak peek at what’s coming next on the lineage roadmap!
Type: LIGHTNING TALK
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: DELTA LAKE, DLT, LAKEFLOW
Skill Level: INTERMEDIATE
Duration: 20 MIN
Organizations continue to struggle under the weight of data that still exists across multiple siloed sources, leaving data teams caught between their crumbling legacy data foundations and the race to build new AI and data-driven applications. Modern enterprises are quickly pivoting to data products that simplify and improve reusable data pipelines by joining data at massive scale and publishing it for internal users and the applications that drive business outcomes. Learn how Quantexa with Databricks enables an internal data marketplace to deliver the value that traditional data platforms never could.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Struggling to implement traditional machine learning models that deliver real business value? Join us for a hands-on exploration of classical ML techniques powered by Databricks' Mosaic AI platform. This session focuses on time-tested approaches like regression, classification and clustering — showing how these foundational methods can solve real business problems when combined with Databricks' scalable infrastructure and MLOps capabilities. Key takeaways:
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: HEALTH AND LIFE SCIENCES
Technologies: DELTA LAKE, AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Eli Lilly and Company, a leading bio-pharma company, is revolutionizing manufacturing with next-gen fully digital sites. Lilly and Tredence have partnered to establish a Databricks-powered Global Manufacturing Data Fabric (GMDF), laying the groundwork for transformative data products used by various personas at sites and globally. By integrating data from various manufacturing systems into a unified data model, GMDF has delivered actionable insights across several use cases such as batch release by exception, predictive maintenance, anomaly detection, process optimization and more. Our serverless architecture leverages Databricks Auto Loader for real-time data streaming, PySpark for automation and Unity Catalog for governance, ensuring seamless data processing and optimization. This platform is the foundation for data-driven processes, self-service analytics, AI and more. This session will provide details on the data architecture and strategy and share a few use cases delivered.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: FINANCIAL SERVICES
Technologies: UNITY CATALOG, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
This presentation outlines Rabobank Credit Analytics’ transition to a secure, audit-ready data architecture using Unity Catalog (UC), addressing critical regulatory challenges in credit analytics for IRB and IFRS9 regulatory modeling. Key technical challenges included legacy infrastructure (Hive metastore, ADLS mounts using service principals and credential passthrough) lacking granular access controls and data access auditing, with limited visibility into lineage, creating governance and compliance gaps. The session details a framework for a phased migration to UC. Outcomes include data lineage mapping demonstrating compliance with regulatory requirements, granular role-based access control and unified audit trails. Next steps involve a lineage visualization toolkit (a custom app for impact analysis and reporting) and lineage expansion to incorporate upstream banking systems.
Type: LIGHTNING TALK
Track: DATA STRATEGY
Industry: MANUFACTURING
Technologies: DELTA LAKE, DATABRICKS SQL, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
Rheem's journey from a fragmented data landscape to a robust, scalable data platform powered by Databricks showcases the power of data modernization. In just 1.5 years, Rheem evolved from siloed reporting to 30+ certified data products, integrated with 20+ source systems, including MDM. This transformation has unlocked significant business value across sales, procurement, service and operations, enhancing decision-making and operational efficiency. This session will delve into Rheem's implementation of Databricks, highlighting how it has become the cornerstone of rapid data product development and efficient data sharing across the organization. We will also explore the upcoming enhancements with Unity Catalog, including the full migration from HMS to UC. Attendees will gain insights into best practices for building a centralized data platform, enhancing developer experience, improving governance capabilities as well as tips and tricks for a successful UC migration and enablement.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, HEALTH AND LIFE SCIENCES
Technologies: APACHE SPARK, AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Amgen is advancing its Enterprise Data Fabric to securely manage sensitive multimodal data, such as imaging and research data, across formats. Databricks is already the de facto standard for governance of structured data, and Amgen seeks to extend it to unstructured multimodal data as well. This approach will also allow Amgen to standardize its GenAI projects on Databricks. Key priorities include: Learn strategies for implementing a comprehensive multimodal data governance framework using Databricks, as we share our experience standardizing data governance for GenAI use cases.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: FINANCIAL SERVICES
Technologies: APACHE SPARK, DELTA LAKE, AI/BI
Skill Level: INTERMEDIATE
Duration: 40 MIN
At Capital One, data-driven decision making is paramount to our success. This session explores how a focused proof of concept (POC) accelerated a shift in our data pipeline management strategy, resulting in operational improvements and expanded analytical capabilities. We'll cover the business challenges that motivated POC initiation, including data latency, cost savings and scalability limitations, and real-world results. We'll also dive into an examination of the before-and-after architecture with highlights for key technological levers. This session offers insights for data engineering and machine learning practitioners seeking to optimize their data pipelines for improved performance, scalability and business value.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: FINANCIAL SERVICES
Technologies: DATABRICKS WORKFLOWS, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us to explore the dynamic partnership between FactSet and Databricks, transforming data accessibility and insights. Discover the launch of FactSet’s Structured DataFeeds via Delta Sharing on the Databricks Marketplace, enhancing access to crucial financial data insights. Learn about the advantages of streamlined data delivery and how this integration empowers data ecosystems. Beyond structured data, explore the innovative potential of vectorized data sharing of unstructured content such as news, transcripts, and filings. Gain insights into the importance of seamless vectorized data delivery to support GenAI applications and how FactSet is preparing to simplify client GenAI workflows with AI-ready data. Experience a demo that showcases the complete journey from data delivery to actionable GenAI application responses in a real-world Financial Services scenario. See firsthand how FactSet is simplifying client GenAI workflows with AI-ready data that drives faster, more informed financial decisions.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: PUBLIC SECTOR
Technologies: AI/BI, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
GovTech is an agency in the Singapore Government focused on tech for good. The GovTech Chief Data Office (CDO) has built the GovTech Data Platform with Databricks at the core. As the government tech agency, we safeguard national-level government and citizen data. A comprehensive data strategy is essential to uplifting data maturity. GovTech has adopted the service model approach where data services are offered to stakeholders based on their data maturity. Their maturity is uplifted through partnership, readying them for more advanced data analytics. CDO offers a plethora of data assets in a “data restaurant” ranging from raw data to data products, all delivered via Databricks and enabled through fine-grained access control, underpinned by data management best practices such as data quality, security and governance. Within our first year on Databricks, CDO was able to save 8,000 man-hours, democratize data across 50% of the agency and achieve six-figure savings through BI consolidation.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
Timely and actionable insights are critical for staying competitive in today’s fast-paced environment. At HP Print, manual reporting for executive leadership (ELT) has been labor-intensive, hindering agility and productivity. To address this, we developed the Generative Insights Tool (GenIT) using Databricks Genie and Mosaic AI to create a real-time insights engine automating SQL generation, data visualization, and narrative creation. GenIT delivers instant insights, enabling faster decisions, greater productivity, and improved consistency while empowering leaders to respond to printer trends. With automated querying, AI-powered narratives, and a chatbot, GenIT reduces inefficiencies and ensures quality data and insights. Our roadmap integrates multi-modal data, enhances chatbot functionality, and scales globally. This initiative shows how HP Print leverages GenAI to improve decision-making, efficiency, and agility, and we will showcase this transformation at the Databricks AI Summit.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: FINANCIAL SERVICES
Technologies: MLFLOW, MOSAIC AI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us as we explore how First American Data & Analytics, a leading property-centric information provider, revolutionized its data extraction processes using batch inference on the Databricks Platform. Discover how it overcame the challenges of extracting data from millions of historical title policy images and reduced project timelines by 75%. Learn how First American optimized its data processing capabilities, reduced costs by 70% and enhanced the efficiency of its title insurance processes, ultimately improving the home-buying experience for buyers, sellers and lenders. This session will delve into the strategic integration of AI technologies, highlighting the power of collaboration and innovation in transforming complex data challenges into scalable solutions.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: FINANCIAL SERVICES
Technologies: APACHE SPARK, DATABRICKS SQL, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
In finance, every second counts. That’s why the Data team at J. Goldman & Co. needed to transform trillions of real-time market data records into a single, actionable insight — instantly, and without waiting on development resources. By modernizing their internal data platform with a scalable architecture, they built a streamlined, web-native alternative data interface that puts live market data directly in the hands of investment teams. With Databricks’ computational power and Unity Catalog’s secure governance, they eliminated bottlenecks and achieved the fastest time-to-market possible for critical investor decisions. Learn how J. Goldman & Co. innovates with Databricks and Sigma to:
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: APACHE SPARK, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Do you trust your data? If you’ve ever struggled to figure out which datasets are reliable, well-governed or safe to use, you’re not alone. At Databricks, our own internal lakehouse faced the same challenge — hundreds of thousands of tables, but no easy way to tell which data met quality standards. In this talk, the Databricks Data Platform team shares how we tackled this problem by building the Data Governance Score — a way to systematically measure and surface trust signals across the entire lakehouse. You’ll learn how we leverage Unity Catalog, governed tags and enforcement to drive better data decisions at scale. Whether you're a data engineer, platform owner or business leader, you’ll leave with practical ideas on how to raise the bar for data quality and trust in your own data ecosystem.
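To make the idea of a computed trust signal concrete, here is a toy governance-score function over table metadata in plain Python. This is purely illustrative — the actual Data Governance Score, its signals and its weights are internal to Databricks, and every field name below (`owner`, `has_description`, `tags`, `last_updated_days`) is invented for the sketch.

```python
# Illustrative only: a toy "governance score" computed from table metadata.
# Each signal is a boolean; the score is the fraction of signals satisfied.
def governance_score(table: dict) -> float:
    signals = [
        table.get("owner") is not None,             # clear ownership
        table.get("has_description", False),        # documented
        "pii_reviewed" in table.get("tags", []),    # governed tag applied
        table.get("last_updated_days", 999) <= 30,  # recently refreshed
    ]
    return sum(signals) / len(signals)

tables = [
    {"name": "sales.orders", "owner": "fin-data", "has_description": True,
     "tags": ["pii_reviewed"], "last_updated_days": 2},
    {"name": "tmp.scratch", "owner": None, "tags": []},
]
scored = {t["name"]: governance_score(t) for t in tables}
# scored → {"sales.orders": 1.0, "tmp.scratch": 0.0}
```

Surfacing a score like this next to each table lets consumers see at a glance which datasets meet the bar, which is the core idea behind trust signals at scale.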
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Transform your AI/BI Genie into a text-to-SQL powerhouse using the Genie Conversation APIs. This session explores how Genie functions as an intelligent agent, translating natural language queries into SQL to accelerate insights and enhance self-service analytics. You'll learn practical techniques for configuring agents, optimizing queries and handling errors — ensuring Genie delivers accurate, relevant responses in real time. A must-attend for teams looking to level up their AI/BI capabilities and deliver smarter analytics experiences.
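For teams planning to drive Genie programmatically, a minimal sketch of starting a conversation over the REST API might look like the following. The endpoint path and payload shape reflect the public Genie Conversation API, but should be verified against the current Databricks REST API reference; the workspace host, space ID and question are placeholder assumptions.

```python
import json

def build_start_conversation_request(host: str, space_id: str,
                                     question: str) -> tuple[str, dict]:
    """Build (url, payload) for starting a Genie conversation.

    Assumed endpoint: POST /api/2.0/genie/spaces/{space_id}/start-conversation
    """
    url = f"{host}/api/2.0/genie/spaces/{space_id}/start-conversation"
    payload = {"content": question}
    return url, payload

def send(url: str, payload: dict, token: str):
    # Requires the `requests` package and a valid personal access token;
    # not executed in this sketch.
    import requests
    resp = requests.post(url, headers={"Authorization": f"Bearer {token}"},
                         data=json.dumps(payload), timeout=30)
    resp.raise_for_status()
    return resp.json()  # contains conversation and message IDs to poll

url, payload = build_start_conversation_request(
    "https://example.cloud.databricks.com", "1234abcd",
    "Top 5 products by revenue last quarter?")
```

In practice the response is polled for the generated SQL and result set, which is where the error-handling and query-optimization techniques covered in the session come in.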
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: AI/BI, LAKEFLOW, DATABRICKS APPS
Skill Level: INTERMEDIATE
Duration: 40 MIN
In today’s data-driven landscape, business users expect seamless, interactive analytics without having to switch between different environments. This presentation explores our web application that unifies a Power BI dashboard with Databricks Genie, allowing users to query and visualize insights from the same dataset within a single, cohesive interface. We will compare two integration strategies: one that leverages a traditional webpage enhanced by an Azure bot to incorporate Genie’s capabilities, and another that utilizes Databricks Apps to deliver a smoother, native experience. We use the Genie API to build this solution. Attendees will learn the architecture behind these solutions, key design considerations and challenges encountered during implementation. Join us to see live demos of both approaches, and discover best practices for delivering an all-in-one, interactive analytics experience.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: EDUCATION, PUBLIC SECTOR
Technologies: DELTA LAKE, DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
The Databricks Lakehouse for Public Sector is the only enterprise data platform that lets you leverage all your data, from any source, on any workload to deliver better citizen services, warfighter support and student success — with the best outcomes, at the lowest cost and with the greatest investment protection.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
In today's data landscape, organizations often grapple with fragmented data spread across various databases, data warehouses and catalogs. Lakehouse Federation addresses this challenge by enabling seamless discovery, querying, and governance of distributed data without the need for duplication or migration. This session will explore how Lakehouse Federation integrates external data sources like Hive Metastore, Snowflake, SQL Server and more into a unified interface, providing consistent access controls, lineage tracking and auditing across your entire data estate. Learn how to streamline analytics and AI workloads, enhance compliance and reduce operational complexity by leveraging a single, cohesive platform for all your data needs.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: DELTA LAKE, LAKEFLOW, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
The Databricks Data Intelligence Platform and Lakeflow Connect have transformed how Porsche manages and uses its customer data. By opting to use Lakeflow Connect instead of building a custom solution, the company has reaped the benefits of both operational efficiency and cost management. Internally, teams at Porsche now spend less time managing data integration processes. “Lakeflow Connect has enabled our dedicated CRM and Data Science teams to be more productive as they can now focus on their core work to help innovate, instead of spending valuable time on the data ingestion integration with Salesforce,” says Gruber. This shift in focus is aligned with broader industry trends, where automotive companies are redirecting significant portions of their IT budgets toward customer experience innovations and digital transformation initiatives. This story was also shared as part of a Databricks Success Story — Elise Georis, Giselle Goicochea.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY, PUBLIC SECTOR
Technologies: DATA MARKETPLACE, DELTA SHARING, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
This session will take you on our journey of integrating Databricks as the core serving layer in a large enterprise, demonstrating how you can build a unified data platform that meets diverse business needs. We will walk through the steps for constructing a central serving layer by leveraging Databricks’ SQL Warehouse to efficiently deliver data to analytics tools and downstream applications. To tackle low latency requirements, we’ll show you how to incorporate an interim scalable relational database layer that delivers sub-second performance for hot data scenarios. Additionally, we’ll explore how Delta Sharing enables secure and cost-effective data distribution beyond your organization, eliminating silos and unnecessary duplication for a truly end-to-end centralized solution. This session is perfect for data architects, engineers and decision-makers looking to unlock the full potential of Databricks as a centralized serving hub.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY, MEDIA AND ENTERTAINMENT, FINANCIAL SERVICES
Technologies: AI/BI, LAKEFLOW, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
The GTM team at Databricks recently launched the GTM Analytics Hub—a native AI/BI platform designed to centralize reporting, streamline insights, and deliver personalized dashboards based on user roles and business needs. Databricks Apps also played a crucial role in this integration by embedding AI/BI Dashboards directly into internal tools and applications, streamlining access to insights without disrupting workflows. This seamless embedding capability allows users to interact with dashboards within their existing platforms, enhancing productivity and collaboration. Furthermore, AI/BI Dashboards leverage Databricks' unified data and governance framework. Join us to learn how we’re using Databricks to build for Databricks—transforming GTM analytics with AI/BI Dashboards, and what it takes to drive scalable, user-centric analytics adoption across the business.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: HEALTH AND LIFE SCIENCES
Technologies: DATABRICKS WORKFLOWS, DLT, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Red Stapler is a streaming-native system on Databricks that merges file-based ingestion and real-time user edits into a single Lakeflow Declarative Pipeline for near real-time feedback. Protobuf definitions, managed in the Buf Schema Registry (BSR), govern schema and data-quality rules, ensuring backward compatibility. All records — valid or not — are stored in an SCD Type 2 table, capturing every version for full history and immediate quarantine views of invalid data. This unified approach boosts data governance, simplifies auditing and streamlines error fixes. Running on Lakeflow Declarative Pipelines Serverless and the Kafka-compatible Bufstream keeps costs low by scaling down to zero when idle. Red Stapler’s configuration-driven Protobuf logic adapts easily to evolving survey definitions without risking production. The result is consistent validation, quick updates and a complete audit trail — all critical for trustworthy, flexible data pipelines.
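The SCD Type 2 pattern the abstract describes — never overwrite, always append a new version and close out the old one — can be sketched in a few lines of plain Python. This is an illustrative toy, not Red Stapler's actual implementation; the field names (`valid_from`, `valid_to`) are assumptions.

```python
from datetime import datetime, timezone

def scd2_upsert(history, key, record, now=None):
    """Close the current version for `key` (if any) and append a new one.

    `history` is a list of dicts with `key`, `data`, `valid_from` and
    `valid_to` (None while current). Every version is retained, so full
    history and quarantine views are just filters over this table.
    """
    now = now or datetime.now(timezone.utc)
    for row in history:
        if row["key"] == key and row["valid_to"] is None:
            row["valid_to"] = now  # close out the current version
    history.append({"key": key, "data": record,
                    "valid_from": now, "valid_to": None})
    return history

# Every edit becomes a new version; nothing is overwritten.
h = []
scd2_upsert(h, "survey-1", {"answer": "A"})
scd2_upsert(h, "survey-1", {"answer": "B"})
current = [r for r in h if r["valid_to"] is None]
```

In a real pipeline the same logic is expressed as a streaming `MERGE` into a Delta table rather than list mutation.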
Type: DEEP DIVE
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 90 MIN
Join this deep dive session for practitioners on Unity Catalog, Databricks’ unified data governance solution, to explore its capabilities for managing data and AI assets across workflows. Unity Catalog provides fine-grained access control, automated lineage tracking, quality monitoring, policy enforcement and observability at scale. Whether your focus is data pipelines, analytics or machine learning and generative AI workflows, this session offers actionable insights on leveraging Unity Catalog’s open interoperability across tools and platforms to boost productivity and drive innovation. Learn governance best practices, including catalog configurations, access strategies for collaboration and controls for securing sensitive data. Additionally, discover how to design effective multi-cloud and multi-region deployments to ensure global compliance.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: FINANCIAL SERVICES
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
This presentation outlines the evolution of Databricks and its integration with cloud analytics at Edward Jones. It focuses on the transition from Cloud V1.x to Cloud V2.0, highlighting the challenges faced with the initial setup, the Unity Catalog implementation and the improvements planned for the future, particularly in terms of Data Cataloging, Architecture and Disaster Recovery. Highlights:
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: DLT, LAKEFLOW, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Modern data workloads span multiple sources — data lakes, databases, apps like Salesforce and services like cloud functions. But as teams scale, secure data access and governance across shared compute becomes critical. In this session, learn how to confidently integrate external data and services into your workloads using Spark and Unity Catalog on Databricks. We'll explore compute options like serverless, clusters, workflows and SQL warehouses, and show how Unity Catalog’s Lakeguard enforces fine-grained governance — even when multiple users concurrently share compute. Walk away ready to choose the right compute model for your team’s needs — without sacrificing security or efficiency.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT, TRAVEL AND HOSPITALITY, FINANCIAL SERVICES
Technologies: DELTA LAKE, APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
What if you could simplify data management, boost performance and cut costs, all at once? Join us to discover how Unity Catalog managed tables can slash your storage costs, supercharge query speeds and automate optimizations with AI on the Data Intelligence Platform. Experience seamless interoperability with third-party clients, and be among the first to preview our new game-changing tool that makes moving to UC managed tables effortless. Don’t miss this exciting session that will redefine your data strategy!
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
The Databricks labs project UCX aims to optimize the Unity Catalog (UC) upgrade process, ensuring a seamless transition for businesses. This session will delve into various aspects of the UCX project including the installation and configuration of UCX, the use of the UCX Assessment Dashboard to reduce upgrade risks and prepare effectively for a UC upgrade, and the automation of key components such as group, table and code migration. Attendees will gain comprehensive insights into leveraging UCX and Lakehouse Federation for a streamlined and efficient upgrade process. This session is aimed at customers new to UCX as well as veterans.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Struggling to keep up with data governance at scale? Join us to explore how automated data classification, tag policies and ABAC streamline access control while enhancing security and compliance. Get an exclusive look at the new Governance Hub, built to give your teams deeper visibility into data usage, access patterns and metadata — all in one place. Whether you're managing thousands or millions of assets, discover how to classify, tag and protect your data estate effortlessly with the latest advancements in Unity Catalog.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING, FINANCIAL SERVICES
Technologies: DELTA LAKE, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
In an era of skyrocketing content volumes, companies are sitting on huge libraries — of video, images and audio — just waiting to be leveraged to power targeted advertising and recommendations, as well as reinforce brand safety. Coactive AI will show how fast and accurate AI-driven metadata enrichment, combined with Databricks Unity Catalog and lakehouse, is accelerating and optimizing media workflows. Learn how leading brands are using content metadata to transform content from a static asset into a dynamic engine for growth, engagement and compliance.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: RETAIL AND CPG - FOOD
Technologies: DLT, UNITY CATALOG
Skill Level: ADVANCED
Duration: 40 MIN
With regulations like LGPD (Brazil's General Data Protection Law) and GDPR, managing sensitive data access is critical. This session demonstrates how to leverage Databricks Unity Catalog system tables and data lineage to dynamically propagate classification tags, empowering organizations to monitor governance and ensure compliance. The presentation covers practical steps, including system table usage, data normalization, ingestion with Lakeflow Declarative Pipelines and classification tag propagation to downstream tables. It also explores permission monitoring with alerts to proactively address governance risks. Designed for advanced audiences, this session offers actionable strategies to strengthen data governance, prevent breaches and avoid regulatory fines while building scalable frameworks for sensitive data management.
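The tag-propagation step described above boils down to a graph traversal: downstream tables inherit the classification tags of their upstream sources. Here is a minimal Spark-free sketch of that idea; the table names and tag values are illustrative, and the real work in the session is done against Unity Catalog system tables rather than in-memory dicts.

```python
from collections import deque

def propagate_tags(lineage, tags):
    """Push classification tags downstream through a lineage graph.

    `lineage` maps each table to the tables derived from it; `tags` maps
    a table to its initial classification tags. A table inherits every
    tag of its upstream sources, discovered here via BFS.
    """
    result = {t: set(ts) for t, ts in tags.items()}
    queue = deque(tags)
    while queue:
        src = queue.popleft()
        for dst in lineage.get(src, []):
            inherited = result.setdefault(dst, set())
            if not result[src] <= inherited:
                inherited |= result[src]   # new tags arrived downstream
                queue.append(dst)          # keep propagating from `dst`
    return result

lineage = {"raw.customers": ["silver.customers"],
           "silver.customers": ["gold.kpis"]}
tags = {"raw.customers": {"PII"}}
out = propagate_tags(lineage, tags)
# "PII" flows raw -> silver -> gold
```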
Type: LIGHTNING TALK
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: MLFLOW, AI/BI, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 20 MIN
In an era where insights-driven decision-making is paramount, the insurance industry stands at the cusp of a major technological revolution. This session will delve into how Agentic AI — AI agents that act autonomously to achieve critical goals — can be leveraged to transform insurance operations (underwriting, claims, services), enhance customer experiences and drive strategic growth.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA SHARING, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Tired of data silos and the constant need to move copies of your data across different systems? Imagine a world where all your enterprise data is readily available in Databricks without the cost and complexity of duplication and ingestion. Our vision is to break down these silos by enabling seamless, zero-copy data sharing across platforms, clouds, and regions. This unlocks the true potential of your data for analytics and AI, empowering you to make faster, more informed decisions leveraging your most important enterprise data sets. In this session, you will hear from Databricks, SAP, and Salesforce product leaders on how zero-copy data sharing can unlock the value of enterprise data. Explore how Delta Sharing makes this vision a reality. SAP Business Data Cloud: see Delta Sharing in action to unlock operational reporting, supply chain optimization, and financial planning. Salesforce Data Cloud: enable customer analytics, churn prediction, and personalized marketing.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK
Skill Level: ADVANCED
Duration: 40 MIN
Don’t you just hate telling your customers “No”? “No, I can’t get you the data that quickly”, or “No that logic isn’t possible to implement” really aren’t fun to say. But what if you had a tool that would allow you to implement those use cases? What if it was in a technology you were already familiar with — say, Spark Structured Streaming? There is a brand new arbitrary stateful operations API called TransformWithState, and after attending this deep dive you won’t have to say “No” anymore. During this presentation we’ll go through some real-world use cases and build them step-by-step. Everything from state variables, process vs. event time, watermarks, timers, state TTL, and even how you can initialize state with the checkpoint of another stream. Unlock your use cases with the power of Structured Streaming’s TransformWithState!
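The building blocks the session lists — per-key state variables, timers and state TTL — can be illustrated without Spark. The toy class below is not the TransformWithState API (in PySpark that role is played by a stateful processor handed to the streaming query); it only mirrors the concepts, and all names here are invented for illustration.

```python
class CountProcessor:
    """Toy per-key stateful processor: a running count per key, plus a
    timer callback that expires state past a TTL. This echoes the state
    variable / timer / TTL ideas of TransformWithState, but is plain
    Python, not the Spark API."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.state = {}  # key -> (count, last_seen_event_time)

    def process(self, key, event_time):
        count, _ = self.state.get(key, (0, event_time))
        self.state[key] = (count + 1, event_time)
        return self.state[key][0]

    def on_timer(self, now):
        """Drop state whose last update is older than the TTL."""
        expired = [k for k, (_, t) in self.state.items()
                   if now - t > self.ttl]
        for k in expired:
            del self.state[k]
        return expired

p = CountProcessor(ttl=60)
p.process("user-1", 0)
p.process("user-1", 10)
p.process("user-2", 70)
gone = p.on_timer(now=100)  # user-1 idle for 90s > TTL, user-2 kept
```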
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Effective Identity and Access Management (IAM) is essential for securing enterprise environments while enabling innovation and collaboration. As companies scale, ensuring users have the right access without adding administrative overhead is critical. In this session, we’ll explore how Databricks is simplifying identity management by integrating with customers’ Identity Providers (IDPs). Learn about Automatic Identity Management in Azure Databricks, which eliminates SCIM for Entra ID users and ensures scalable identity provisioning for other IDPs. We'll also cover externally managed groups, PIM integration and upcoming enhancements like a bring-your-own-IDP model for Google Cloud. Through a customer success story and live demo, see how Databricks is making IAM more scalable, secure and user-friendly.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENERGY AND UTILITIES, MANUFACTURING, RETAIL AND CPG - FOOD
Technologies: DELTA SHARING, MOSAIC AI
Skill Level: BEGINNER
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: PUBLIC SECTOR
Technologies: MLFLOW, AI/BI, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join us to learn how the UK's Department for Environment, Food & Rural Affairs (DEFRA) transformed data use with Databricks’ Unity Catalog, enabling nationwide projects through secure, scalable analytics. DEFRA safeguards the UK's natural environment. Historical fragmentation of data, talent and tools across siloed platforms and organizations made it difficult to fully exploit the department’s rich data. DEFRA launched its Data Analytics & Science Hub (DASH), powered by the Databricks Data Intelligence Platform, to unify its data ecosystem. DASH enables hundreds of users to access and share datasets securely. A flagship example demonstrates its power, using Databricks to process aerial photography and satellite data to identify peatlands in need of restoration — a complex task made possible through unified data governance, scalable compute and AI. Attendees will hear about DEFRA’s journey and learn valuable lessons about building a platform that crosses organizational boundaries.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Getting started with data and AI governance in the modern data stack? Unity Catalog is your gateway to secure, discoverable and well-governed data and AI assets. In this session, we’ll break down what Unity Catalog is, why it matters and how it simplifies access control, lineage, discovery, auditing, business semantics and secure, open collaboration — all from a single place. We’ll explore how it enables open interoperability across formats, tools and platforms, helping you avoid lock-in and build on open standards. Most importantly, you’ll learn how Unity Catalog lays the foundation for data intelligence — by unifying governance across data and AI, enabling AI tuned to your business. It helps build a deep understanding of your data and delivers contextual, domain-specific insights that boost productivity for both technical and business users across any workload.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: RETAIL AND CPG - FOOD
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
This session will explore Databricks Unity Catalog (UC) implementation by P&G to enhance data governance, reduce data redundancy and improve the developer experience through the enablement of a Lakehouse architecture. The presentation will cover: The distinction between data treated as a product and standard application data, highlighting how UC's structure maximizes the value of data in P&G's data lake. Real-life examples from two years of using Unity Catalog, demonstrating benefits such as improved governance, reduced waste and enhanced data discovery. Challenges related to disaster recovery and external data access, along with our collaboration with Databricks to address these issues. Sharing our experience can provide valuable insights for organizations planning to adopt Unity Catalog on an enterprise scale.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: MANUFACTURING
Technologies: AI/BI, DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
Industrial data is the foundation for operational excellence, but sharing and leveraging this data across systems presents significant challenges. Fragmented approaches create delays in decision-making, increase maintenance costs, and erode trust in data quality. This session explores how the partnership between AVEVA and Databricks addresses these issues through CONNECT, which integrates directly with Databricks via Delta Sharing. By accelerating time to value, eliminating data wrangling, ensuring high data quality, and reducing maintenance costs, this solution drives faster, more confident decision-making and greater user adoption. We will showcase how Agnico Eagle Mines—the world’s third-largest gold producer with 10 mines across Canada, Australia, Mexico, and Finland—is leveraging this capability to overcome data intelligence barriers at scale. With this solution, Agnico Eagle is making insights more accessible and actionable across its entire organization.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: MEDIA AND ENTERTAINMENT
Technologies: DLT
Skill Level: INTERMEDIATE
Duration: 40 MIN
Streaming data is hard and costly — that's the default opinion, but it doesn’t have to be. In this session, discover how SEGA simplified complex streaming pipelines and turned them into a competitive edge. SEGA sees over 40,000 events per second. That's no easy task, but enabling personalised gaming experiences for over 50 million gamers drives a huge competitive advantage. If you’re wrestling with streaming challenges, this talk is your next checkpoint. We’ll unpack how Lakeflow Declarative Pipelines helped SEGA, from automated schema evolution and simple data quality management to seamless streaming reliability. Learn how Lakeflow Declarative Pipelines drives value by transforming chaos emeralds into clarity, delivering results for a global gaming powerhouse. We'll step through the architecture, approach and challenges we overcame. Join Craig Porteous, Microsoft MVP from Advancing Analytics, and Felix Baker, Head of Data Services at SEGA Europe, for a fast-paced, hands-on journey into Lakeflow Declarative Pipelines’ unique powers.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATA MARKETPLACE
Skill Level: BEGINNER
Duration: 40 MIN
Curious about how to get real value from the Databricks Marketplace—whether you're consuming data or sharing it? This demo-heavy session answers the top 10 questions we hear from both data consumers and providers, with real examples you can put into practice right away. We’ll show consumers how to find the right product listing whether that's tables, files, AI models, solution accelerators, or Partner Connect integrations, try them out using sample notebooks, and access them with ease. You’ll also see how the Private Marketplace helps teams work more efficiently with a curated catalog of approved data. For providers, learn how to list your product in a way that stands out, use notebooks and documentation to help users get started, reach new audiences, and securely share data across your company or with trusted partners using the Private Marketplace. If you’ve ever asked, “How do I get started?” or “How do I make my data available internally or externally?”—this session has the answers, with demos to match.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: ENTERPRISE TECHNOLOGY, MANUFACTURING
Technologies: DATA MARKETPLACE, DELTA SHARING, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Lely, a Dutch company specializing in dairy farming robotics, helps farmers with advanced solutions for milking, feeding and cleaning. This session explores Lely’s implementation of an Internal Data Marketplace, built around Databricks' Private Exchange Marketplace. The marketplace serves as a central hub for data teams and business users, offering seamless access to data, analytics and dashboards. Powered by Delta Sharing, it enables secure, private listing of data products across business domains, including notebooks, views, models and functions. This session covers the pros and cons of this approach, best practices for setting up a data marketplace and its impact on Lely’s operations. Real-world examples and insights will showcase the potential of integrating data-driven solutions into dairy farming. Join us to discover how data innovation drives the future of dairy farming through Lely’s experience.
Type: LIGHTNING TALK
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: MEDIA AND ENTERTAINMENT
Technologies: APACHE ICEBERG, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 20 MIN
This session showcases our journey of adopting Apache Iceberg™ to build a modern lakehouse architecture and leveraging Databricks advanced Iceberg support to take it to the next level. We’ll dive into the key design principles behind our lakehouse, the operational challenges we tackled and how Databricks enabled us to unlock enhanced performance, scalability and streamlined data workflows. Whether you’re exploring Apache Iceberg™ or building a lakehouse on Databricks, this session offers actionable insights, lessons learned and best practices for modern data engineering.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: RETAIL AND CPG - FOOD
Technologies: AI/BI, DLT
Skill Level: BEGINNER
Duration: 40 MIN
Retail Media Networks (RMNs) are transforming how brands engage and connect with consumers throughout the omnichannel. In this session, Databricks and Hightouch will explore how data-driven advertising is reshaping retail promotions and enabling real-time activation of customer insights. Learn how unified data architectures and composable customer stacks are driving hyper-personalized, high-ROI campaigns. Whether you're a retailer monetizing first-party data or a brand optimizing ad spend, this session offers practical strategies and real-world examples to thrive in the evolving RMN landscape.
Type: BREAKOUT
Track: DATA LAKEHOUSE ARCHITECTURE AND IMPLEMENTATION
Industry: ENTERPRISE TECHNOLOGY
Technologies: DELTA LAKE, APACHE ICEBERG, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
As data architectures evolve to meet the demands of real-time GenAI applications, organizations increasingly need systems that unify streaming and batch processing while maintaining compatibility with existing tools. The Ursa Engine offers a Kafka-API-compatible data streaming engine built on Lakehouse (Iceberg and Delta Lake). Designed to seamlessly integrate with data lakehouse architectures, Ursa extends your lakehouse capabilities by enabling streaming ingestion, transformation and processing — using a Kafka-compatible interface. In this session, we will explore how Ursa Engine augments your existing lakehouses with Kafka-compatible capabilities. Attendees will gain insights into Ursa Engine architecture and real-world use cases of Ursa Engine. Whether you're modernizing legacy systems or building cutting-edge AI-driven applications, discover how Ursa can help you unlock the full potential of your data.
Type: BREAKOUT
Track: ARTIFICIAL INTELLIGENCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: MLFLOW, LLAMA, MOSAIC AI
Skill Level: INTERMEDIATE
Duration: 40 MIN
In this session you will learn how to leverage a wide set of GenAI models in Databricks, including external connections to cloud vendors and other model providers. We will cover establishing connections to externally served models via Mosaic AI Gateway, showcasing connections to Azure, AWS & Google Cloud models, as well as model vendors like Anthropic, Cohere, AI21 Labs and more. You will also discover best practices on model comparison, governance and cost control for those model deployments.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY, TRAVEL AND HOSPITALITY, FINANCIAL SERVICES
Technologies: UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
The ability to enforce data management controls at scale and reduce the effort required to manage data pipelines is critical to operating efficiently. Capital One has scaled its data management capabilities and invested in platforms to help address this need. In the past couple of years, the role of “the catalog” in a data platform architecture has transitioned from just providing SQL to providing a full suite of capabilities that can help solve this problem at scale. This talk will give insight into how Capital One is thinking about leveraging Databricks Unity Catalog to help tackle these challenges.
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: MEDIA AND ENTERTAINMENT, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DELTA SHARING
Skill Level: BEGINNER
Duration: 40 MIN
Databricks Clean Rooms make privacy-safe collaboration possible for data, analytics, and AI — across clouds and platforms. Built on Delta Sharing, Clean Rooms enable organizations to securely share and analyze data together in a governed, isolated environment — without ever exposing raw data. In this session, you’ll learn how to get started with Databricks Clean Rooms and unlock advanced use cases. Whether you're a data scientist, engineer or data leader, this session will equip you to drive high-value collaboration while maintaining full control over data privacy and governance.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES
Technologies: DELTA LAKE, LAKEFLOW, UNITY CATALOG
Skill Level: INTERMEDIATE
Duration: 40 MIN
Change data feeds are a common tool for synchronizing changes between tables and performing data processing in a scalable fashion. Serverless architectures offer a compelling solution for organizations looking to avoid the complexity of managing infrastructure. But how can you bring CDFs into a serverless environment? In this session, we'll explore how to integrate Change Data Feeds into serverless architectures using Delta-rs and Delta-kernel-rs—open-source projects that allow you to read Delta tables and their change data feeds in Rust or Python. We’ll demonstrate how to use these tools with Lakestore’s serverless platform to easily stream and process changes. You’ll learn how to:
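Conceptually, consuming a change data feed means replaying a batch of typed change rows against a keyed target. The sketch below shows that core loop in plain Python using Delta's CDF `_change_type` values (`insert`, `update_preimage`, `update_postimage`, `delete`); it is a stand-in for what delta-rs would hand you, not the library's API, and the `id` key column is an assumption.

```python
def apply_cdf(target, changes):
    """Apply a change-data-feed batch to a keyed target table (a dict).

    Preimages are informational and skipped; inserts and postimages
    upsert the row; deletes remove the key.
    """
    for row in changes:
        kind = row["_change_type"]
        key = row["id"]
        if kind in ("insert", "update_postimage"):
            # Store the row minus the CDF metadata columns.
            target[key] = {k: v for k, v in row.items()
                           if not k.startswith("_")}
        elif kind == "delete":
            target.pop(key, None)
    return target

target = {}
apply_cdf(target, [
    {"id": 1, "name": "a", "_change_type": "insert"},
    {"id": 1, "name": "a", "_change_type": "update_preimage"},
    {"id": 1, "name": "b", "_change_type": "update_postimage"},
    {"id": 2, "name": "c", "_change_type": "insert"},
    {"id": 2, "name": "c", "_change_type": "delete"},
])
```

In a serverless function this loop would run per invocation over the change rows read from the Delta table's feed.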
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Managing authentication effectively is key to securing your data platform. In this session, we’ll explore best practices from Databricks for overcoming authentication challenges, including token visibility, MFA/SSO, CI/CD token federation and risk containment. Discover how to map your authentication maturity journey while maximizing security ROI. We'll showcase new capabilities like access token reports for improved visibility, streamlined MFA implementation and secure SSO with token federation. Learn strategies to minimize token risk through TTL limits, scoped tokens and network policies. You'll walk away with actionable insights to enhance your authentication practices and strengthen platform security on Databricks.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: HEALTH AND LIFE SCIENCES, PUBLIC SECTOR
Technologies: DATABRICKS WORKFLOWS, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Clinical Trial Data is undergoing a renaissance with new insights and data sources being added daily. The speed of new innovations and modalities that are found within trials poses an existential dilemma for 21CFR Part 11 compliance. In these validated environments, new components and methods need to be tested for reproducibility and restricted data access. In classical systems, this validation process would often have taken three months or more due to the manual validation process via validation scripts like Installation Qualification (IQ) and Operational Qualification (OQ) scripts. In conjunction with Databricks, Purgo AI has developed a new technology leveraging generative AI to automate the execution of IQ and OQ scripts and has drastically reduced the amount of time for validating Databricks from three months to less than a day. This drastic speedup of validation will enable the continuous flow of new ideas and implementations for clinical trials.
Type: LIGHTNING TALK
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK, DATABRICKS SQL
Skill Level: BEGINNER
Duration: 20 MIN
Dynamic policy enforcement is increasingly critical in today's landscape, where data compliance is a top priority for companies, individuals, and regulators alike. In this talk, Walaa explores how LinkedIn has implemented a robust dynamic policy enforcement engine, ViewShift, and integrated it within its data lake. He will demystify LinkedIn's query engine stack by demonstrating how catalogs can automatically route table resolutions to compliance-enforcing SQL views. These SQL views possess several noteworthy properties: Auto-Generated: Created automatically from declarative data annotations. User-Centric: They honor user-level consent and preferences. Context-Aware: They apply different transformations tailored to specific use cases. Portable: Despite the SQL logic being implemented in a single dialect, it remains accessible across all engines. Join this session to learn how ViewShift helps ensure that compliance is seamlessly integrated into data processing workflows.
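The "auto-generated from declarative annotations" idea can be sketched very simply: given a table's schema and a set of restricted columns, emit a view that masks those columns. This is a hypothetical stand-in for what ViewShift generates (the real views are consent- and context-aware); the function name, masking strategy and SQL shape are all assumptions.

```python
def compliance_view(table, schema, masked):
    """Generate SQL for a view that NULLs out restricted columns,
    so engines resolving `table` can be routed to the view instead.
    `schema` is the ordered column list; `masked` is the restricted set.
    """
    cols = ", ".join(f"NULL AS {c}" if c in masked else c for c in schema)
    return f"CREATE VIEW {table}_compliant AS SELECT {cols} FROM {table}"

sql = compliance_view("events", ["user_id", "email", "page"], {"email"})
```

A catalog layer would then resolve references to `events` to `events_compliant` for callers without clearance.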
Type: KEYNOTE
Track: N/A
Industry: N/A
Technologies: N/A
Skill Level: N/A
Duration: 180 MIN
Be first to witness the latest breakthroughs from Databricks and share the success of innovative data and AI companies.
Type: LIGHTNING TALK
Track: DATA SHARING AND COLLABORATION
Industry: ENERGY AND UTILITIES, ENTERPRISE TECHNOLOGY
Technologies: DELTA SHARING
Skill Level: INTERMEDIATE
Duration: 20 MIN
At DXC, we helped our customer Fastweb with their "Welcome Lakehouse" project - a data warehouse transformation from on-premises to Databricks on AWS. But the implementation became something more. Thanks to features such as Lakehouse Federation and Delta Sharing, from the first day of the Fastweb+Vodafone merger we have been able to connect two different platforms with ease and let the business focus on the value of data rather than on IT integration. This session will feature our customer Alessandro Gattolin of Fastweb, who will speak about the experience.
Type: BREAKOUT
Track: DATA STRATEGY
Industry: RETAIL AND CPG - FOOD
Technologies: AI/BI, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
In this session, Joëlle van der Bijl, Chief Data & Analytics Officer at FrieslandCampina, shares the bold journey of replacing legacy data systems with a single, unified data, analytics, and AI platform built on Databricks. Rather than evolving gradually, the company took a leap: transforming its entire data foundation in one go. Today, this data-centric vision is delivering high-value impact: from optimizing milk demand and supply to enabling commercial AI prediction models and scaling responsible AI across the business. Learn how FrieslandCampina is using Databricks to blend tradition with innovation, and unlock a smarter, more sustainable future for dairy.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
Ready to take your AI/BI dashboards to the next level? This session dives into the latest capabilities in Databricks AI/BI Dashboards and how to maximize impact across your organization. Learn how data authors can tailor visualizations for different audiences, optimize performance and seamlessly integrate with Genie for a unified analytics experience. We’ll also share practical tips on how business users and data teams can better collaborate — ensuring insights are accessible, actionable and aligned to business goals.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: APACHE SPARK
Skill Level: INTERMEDIATE
Duration: 40 MIN
Join this session for a concise tour of Apache Spark™ 4.0’s most notable enhancements. Whether you’re a seasoned Spark user or new to the ecosystem, this talk will prepare you to leverage Spark 4.0’s latest innovations for modern data and AI pipelines.
Type: BREAKOUT
Track: DATA WAREHOUSING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
No description available.
Type: BREAKOUT
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY, PROFESSIONAL SERVICES, FINANCIAL SERVICES
Technologies: APACHE SPARK
Skill Level: INTERMEDIATE
Duration: 40 MIN
PySpark’s DataFrame API is evolving to support more expressive and modular workflows. In this session, we’ll introduce two powerful additions: table-valued functions (TVFs) and the new subquery API. You’ll learn how to define custom TVFs using Python User-Defined Table Functions (UDTFs), including support for polymorphism, and how subqueries can simplify complex logic. We’ll also explore how lateral joins connect these features, followed by practical tools for the PySpark developer experience — such as plotting, profiling, and a preview of upcoming capabilities like UDF logging and a Python-native data source API. Whether you're building production pipelines or extending PySpark itself, this talk will help you take full advantage of the latest features in the PySpark ecosystem.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
In this session, we’ll walk through the latest advancements in platform security and compliance on Databricks — from networking updates to encryption, serverless security and new compliance certifications across AWS, Azure and Google Cloud. We’ll also share our roadmap and best practices for how to securely configure workloads on Databricks SQL Serverless, Unity Catalog, Mosaic AI and more — at scale. If you're building on Databricks and want to stay ahead of evolving risk and regulatory demands, this session is your guide.
Type: BREAKOUT
Track: DATA AND AI GOVERNANCE
Industry: ENTERPRISE TECHNOLOGY
Technologies: UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Join the Unity Catalog product team for an exclusive deep dive into the latest innovations and upcoming features of Unity Catalog! Explore cutting-edge advancements in access control, discovery, lineage and monitoring — plus get a sneak peek at what’s coming next. Packed with live demos, expert insights and best practices from thousands of customers running Unity Catalog in production, this session is also your chance to engage directly with product experts and get answers to your most pressing questions. Don’t miss this opportunity to stay ahead of the curve and elevate your data governance strategy!
Type: BREAKOUT
Track: DATA SHARING AND COLLABORATION
Industry: HEALTH AND LIFE SCIENCES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: DATA MARKETPLACE, DELTA SHARING, UNITY CATALOG
Skill Level: BEGINNER
Duration: 40 MIN
Databricks continues to redefine how organizations securely and openly collaborate on data. With new innovations like Clean Rooms for multi-party collaboration, Sharing for Lakehouse Federation, cross-platform view sharing and Databricks Apps in the Marketplace, teams can now share and access data more easily, cost-effectively and across platforms — whether or not they’re using Databricks. In this session, we’ll deliver live demos of the key capabilities that power this transformation. Join us to see how these tools enable trusted data sharing, accelerate insights and drive innovation across your ecosystem. Bring your questions and walk away with practical ways to put these capabilities into action today.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY
Technologies: AI/BI, DATABRICKS SQL
Skill Level: INTERMEDIATE
Duration: 40 MIN
Databricks Assistant helps you get from initial exploration all the way to production faster and easier than ever. In this session, we'll show you how Assistant simplifies and accelerates common workflows, boosting your productivity across notebooks and the SQL editor. You'll get practical tips, see end-to-end examples in action, and hear about the latest capabilities we're excited about. We'll also discuss how we're continually improving Assistant to make your development experience faster, more contextual and more customizable. Join us to discover how to get the most out of Databricks Assistant and empower your team to build better and faster.
Type: LIGHTNING TALK
Track: DATA ENGINEERING AND STREAMING
Industry: ENTERPRISE TECHNOLOGY
Technologies: DLT
Skill Level: BEGINNER
Duration: 20 MIN
Lakeflow Declarative Pipelines Serverless offers a range of benefits that make it an attractive option for organizations looking to optimize their ETL (extract, transform, load) processes. This talk walks through those key benefits: by moving to Lakeflow Declarative Pipelines Serverless, organizations can achieve faster, more reliable and cost-effective data pipeline management, ultimately driving better business insights and outcomes.
Type: LIGHTNING TALK
Track: ANALYTICS AND BI
Industry: PROFESSIONAL SERVICES, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: AI/BI, DLT, UNITY CATALOG
Skill Level: BEGINNER
Duration: 20 MIN
“I don’t want to spend time filtering through another dashboard — I just need an answer now.” We’ve all experienced the frustration of wading through dashboards, yearning for immediate answers. Traditional reports and visualizations, though essential, often complicate the process for decision-makers. The digital enterprise demands a shift towards conversational, natural language interactions with data. At KPMG, AI|BI Genie is reimagining our approach by allowing users to inquire about data just as they would consult a knowledgeable colleague, obtaining precise and actionable insights instantly. Discover how the KPMG Contract to Cash team leverages AI|BI Genie to enhance data engagement, drive insights and foster business growth. Join us to see AI|BI Genie in action and learn how you can transform your data interaction paradigm.
Type: BREAKOUT
Track: ANALYTICS AND BI
Industry: ENTERPRISE TECHNOLOGY, RETAIL AND CPG - FOOD, FINANCIAL SERVICES
Technologies: AI/BI, DATABRICKS SQL
Skill Level: BEGINNER
Duration: 40 MIN
Picture the scene — you're exploring a deep, dark cave looking for insights to unearth when, in a burst of smoke, Genie appears and offers you not three but unlimited data wishes. This isn't a folk tale; it's the growing wave of Generative BI that is going to be a part of analytics platforms. Databricks Genie is a tool powered by a SQL-writing LLM that redefines how we interact with data. We'll look at the basics of creating a new Genie room, scoping its data tables and asking questions. We'll help it out with some complex pre-defined questions and ensure it has the best chance of success. We'll give the tool a personality, set some behavioral guidelines and prepare some hidden easter eggs for our users to discover. Generative BI is going to be a fundamental part of the analytics toolset used across businesses. If you're using Databricks, you should be aware of Genie; if you're not, you should be planning your Generative BI roadmap — and this session will answer your wishes.