Special Day at the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2025
KDD 2025 | Toronto, ON, Canada
Thursday, August 7, 2025
Reasoning models break complex problems down into smaller subproblems that can be solved systematically. Unlike first-generation LLMs, which might produce a direct answer, reasoning models follow a more structured, step-by-step process: asked whether 391 is prime, for example, a reasoning model might test small divisors and discover that 391 = 17 × 23 before answering, rather than guessing outright.
Examples of reasoning models include OpenAI's o-series models (e.g., o1-mini, o3-mini), DeepSeek-R1, Gemini, Phi-4-Reasoning, and more.
This special KDD day on AI reasoning invites researchers, industry practitioners, and students to share state-of-the-art techniques in AI reasoning, spanning mathematical and code reasoning, commonsense reasoning, automated reasoning, and reasoning with LLMs.
Through a set of keynotes, invited talks, a panel discussion, and lightning talks, the KDD Day on AI Reasoning will be an interactive and exciting forum for discussing AI reasoning and how reasoning models are emerging as a promising approach to solving complex problems across many fields and industries.
Here is the program outline, with details on each event.
AI models are increasingly capable of solving sophisticated tasks that require reasoning. But how do we improve the quality of that reasoning, especially when the models operate as black boxes? In this talk, I’ll share practical strategies for improving AI reasoning in the domain of code and structured tasks. First, we can capture richer forms of user intent. Input-output examples not only enable post-hoc validation, but also guide the model toward correct generations up front. Temporal context (such as recent user actions) can help infer evolving intent and keep users in flow. Second, we can give the model an escape mechanism—allowing it to abstain or initiate collaborative interaction when it lacks sufficient information. This raises new challenges in evaluating interactive workflows, which we address through rubric-based assessments of conversation quality (grounded in principles like the Gricean maxims) and automation using simulated user proxies. Third, we can strengthen reasoning via automated inspection. Symbolic checkers or programmatic validators can uncover hallucinations and inconsistencies in both online and offline settings. These signals can then guide the model through iterative refinement or prompt updates. I’ll illustrate these ideas through real-world applications spanning spreadsheet tasks and software development, highlighting how AI reasoning can be improved using structured intent, collaborative interaction, and systematic inspection.
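To make the third strategy concrete, here is a minimal Python sketch of validator-guided iterative refinement under stated assumptions: `call_model` is a hypothetical stand-in for any code-generating LLM, and the convention that the model defines a function named `solution` is purely illustrative.

```python
# Minimal sketch of validator-guided iterative refinement. `call_model` is a
# hypothetical hook for a code-generating LLM; the validator and the
# refinement loop are the point of the example.

from typing import Callable, List, Tuple

Example = Tuple[tuple, object]  # (args, expected output)

def validate(func: Callable, examples: List[Example]) -> List[str]:
    """Programmatic validation: run the candidate against input-output
    examples and collect human-readable failure messages."""
    failures = []
    for args, expected in examples:
        try:
            got = func(*args)
        except Exception as exc:
            failures.append(f"{args!r} raised {exc!r}")
            continue
        if got != expected:
            failures.append(f"{args!r}: expected {expected!r}, got {got!r}")
    return failures

def refine_loop(task: str, examples: List[Example],
                call_model: Callable[[str], str], max_rounds: int = 3):
    """Iterative refinement: validator failures are appended to the prompt
    so the next generation can correct them. Returns code or None, giving
    the model an escape instead of forcing an unvalidated answer."""
    prompt = task
    for _ in range(max_rounds):
        code = call_model(prompt)
        namespace: dict = {}
        exec(code, namespace)  # assumes the model defines `solution`;
                               # only exec untrusted code inside a sandbox
        failures = validate(namespace["solution"], examples)
        if not failures:
            return code
        prompt = task + "\nPrevious attempt failed on:\n" + "\n".join(failures)
    return None  # abstain rather than return unvalidated code
```

The abstention branch mirrors the escape mechanism discussed in the talk: when the loop cannot satisfy the examples, handing control back to the user is preferable to returning a confident but wrong generation.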
This session introduces KumoRFM, the world's first Foundation Model for relational enterprise data, designed to perform zero-shot predictions directly on structured, multi-table data. Built on a scalable Graph Transformer architecture, KumoRFM generalizes across schemas and tasks, eliminating the need for hand-crafted features, data pipelines, or model training. Developers can try KumoRFM with their own data—no feature engineering required. By transforming relational data into heterogeneous graphs and applying in-context learning, the model delivers high-accuracy predictions across a wide range of applications—from churn and fraud detection to personalized recommendations and forecasting. We'll share key technical design choices, pretraining methodology, and lessons learned from deploying KumoRFM in real-world enterprise environments. This session highlights how the architecture enables fast time-to-value for AI initiatives while maintaining model transparency and performance. Attendees will gain insights into the future of predictive modeling, where structured data can be leveraged with foundation models as directly and flexibly as text or images.
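As a rough illustration of the graph-construction step described above (not Kumo's actual API), the sketch below turns toy relational tables into typed nodes and typed edges derived from foreign keys; the table names, columns, and `fks` map are invented for the example.

```python
# Framework-free sketch: rows become typed nodes and foreign-key references
# become typed edges, yielding a heterogeneous graph from relational tables.

from collections import defaultdict

users  = [{"user_id": 1}, {"user_id": 2}]
orders = [{"order_id": 10, "user_id": 1, "item_id": 100},
          {"order_id": 11, "user_id": 2, "item_id": 101}]
items  = [{"item_id": 100}, {"item_id": 101}]

# table name -> (rows, primary key column)
tables = {"user": (users, "user_id"),
          "order": (orders, "order_id"),
          "item": (items, "item_id")}
# foreign keys: (source table, column) -> target table
fks = {("order", "user_id"): "user", ("order", "item_id"): "item"}

# Typed nodes: one node per row, keyed by (table, primary key value).
nodes = {(t, row[pk]): row for t, (rows, pk) in tables.items() for row in rows}

# Typed edges: one edge per foreign-key reference.
edges = defaultdict(list)
for (src_table, col), dst_table in fks.items():
    rows, pk = tables[src_table]
    for row in rows:
        edges[(src_table, col, dst_table)].append(
            ((src_table, row[pk]), (dst_table, row[col])))

for relation, pairs in edges.items():
    print(relation, pairs)
```

On a graph like this, in-context learning amounts to conditioning predictions on the labeled neighborhoods of related rows rather than on hand-crafted features.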
In recent years, large language models (LLMs) have been widely used to enhance the generalization of graph models, both across graph tasks and domains—a trend known as "LLM for Graph." Recently, especially with the advent of models like DeepSeek-R1, research focus has shifted toward "Graph for LLM": leveraging graph reasoning tasks to fundamentally improve the general reasoning capabilities of LLMs. Graph reasoning, with its inherent structural complexity and multi-step logic, provides an ideal testbed for advancing LLMs' abilities in mathematical, logical, and commonsense inference. This talk will explore why graph reasoning is a key scenario for boosting LLMs' general reasoning skills, and discuss the latest progress and future directions in this emerging field.
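For a concrete sense of what such a graph reasoning task might look like, the sketch below generates a random graph, poses a shortest-path question, and computes a programmatically verifiable ground-truth answer with BFS; the prompt format is illustrative, not drawn from any particular paper.

```python
# Sketch of a "Graph for LLM" training instance: a random graph, a multi-step
# question, and a verifiable answer (via BFS) that can reward correct
# multi-step reasoning during fine-tuning or RL.

import random
from collections import deque

def random_graph(n: int, p: float, seed: int = 0):
    """Sample an undirected Erdos-Renyi-style edge set on nodes 0..n-1."""
    rng = random.Random(seed)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p}

def shortest_path_len(edges, n, src, dst):
    """Breadth-first search; returns hop count or None if unreachable."""
    adj = {v: [] for v in range(n)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        if v == dst:
            return dist[v]
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None

n = 6
edges = random_graph(n, p=0.4)
question = (f"Graph with nodes 0..{n - 1} and edges {sorted(edges)}. "
            "What is the length of the shortest path from 0 to 5?")
answer = shortest_path_len(edges, n, 0, 5)
print(question, "->", answer)  # ground truth for checking the LLM's reasoning
```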
Graph learning is often studied under a closed-world assumption, limiting its reliability when encountering noisy inputs or novel classes. In this talk, I will present EviNet, a framework for open-world graph learning that leverages symbolic reasoning and uncertainty estimation. EviNet integrates Beta embeddings with subjective logic, enabling two complementary modules: Dissonance Reasoning for misclassification detection and Vacuity Reasoning for out-of-distribution detection. Experiments show that EviNet achieves strong performance across classification and uncertainty tasks, offering a principled solution for graph learning in open and noisy environments.
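The two uncertainty signals can be illustrated with the standard subjective-logic quantities that this line of work builds on (the textbook formulation, not EviNet's released code): vacuity grows when total evidence is scarce, flagging likely out-of-distribution inputs, while dissonance grows when strong evidence conflicts, flagging likely misclassifications.

```python
# Worked example of subjective-logic uncertainty. Per-class evidence e_k gives
# Dirichlet parameters alpha_k = e_k + 1 with strength S = sum(alpha_k);
# beliefs are b_k = e_k / S and vacuity is u = K / S.

def belief_and_vacuity(evidence):
    K = len(evidence)
    S = sum(e + 1 for e in evidence)          # Dirichlet strength
    beliefs = [e / S for e in evidence]
    vacuity = K / S                           # high when evidence is scarce
    return beliefs, vacuity

def dissonance(beliefs):
    """High when comparable, conflicting belief masses coexist."""
    def bal(bj, bk):                          # relative mass balance
        return 1 - abs(bj - bk) / (bj + bk) if bj + bk > 0 else 0.0
    total = 0.0
    for k, bk in enumerate(beliefs):
        others = [bj for j, bj in enumerate(beliefs) if j != k]
        denom = sum(others)
        if denom > 0:
            total += bk * sum(bj * bal(bj, bk) for bj in others) / denom
    return total

for name, ev in [("confident",   [20, 1, 1]),   # low vacuity, low dissonance
                 ("conflicting", [10, 10, 1]),  # low vacuity, high dissonance
                 ("no evidence", [0, 0, 0])]:   # vacuity = 1, likely OOD
    b, u = belief_and_vacuity(ev)
    print(f"{name}: vacuity={u:.2f}, dissonance={dissonance(b):.2f}")
```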
Large Language Models (LLMs) and AI coding agents are fundamentally changing how software is built—assisting with tasks like generation, testing, documentation, and complex reasoning. In this talk, we’ll explore how LLMs can be harnessed across the software development lifecycle, and share practical techniques that unlock their full potential, including prompt engineering, agentic planning, iterative refinement, test-time scaling, and fine-tuning. We’ll also highlight how these approaches enable the creation of collaborative AI agents that act as intelligent teammates—accelerating development, improving code quality, and transforming how developers work.
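As one concrete instance of the test-time scaling mentioned above, the sketch below ranks several sampled candidate patches by unit-test pass rate; `sample_patch` and `run_tests` are hypothetical hooks, not a specific agent framework's API.

```python
# Best-of-n sampling with a test-based verifier: a simple form of test-time
# scaling, trading extra inference compute for higher reliability.

from typing import Callable, List

def best_of_n(task: str,
              sample_patch: Callable[[str, float], str],
              run_tests: Callable[[str], List[bool]],
              n: int = 8) -> str:
    """Generate n candidates at nonzero temperature and keep the one with
    the highest test pass rate."""
    best_patch, best_score = "", -1.0
    for _ in range(n):
        patch = sample_patch(task, 0.8)             # diverse sampling
        results = run_tests(patch)
        score = sum(results) / max(len(results), 1)  # fraction of tests passed
        if score > best_score:
            best_patch, best_score = patch, score
    return best_patch

# Toy stubs so the sketch runs end to end.
print(best_of_n("fix the off-by-one bug",
                lambda task, temp: f"candidate for: {task}",
                lambda patch: [True, True, False]))
```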
RAG (Retrieval-Augmented Generation) has demonstrated strong capabilities in enhancing LLM performance by leveraging external knowledge. However, most current methods focus on flat or graph-based representations. Hierarchical structures are prevalent in real-world data such as code, JSON files, and scientific documents, underscoring the need for LLMs to effectively understand and reason over such organization. Our work shows that LLMs still struggle with complex hierarchical reasoning—highlighting key challenges and opportunities for future progress in this area.
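To illustrate why hierarchy matters for retrieval, the following sketch indexes each leaf of a nested JSON document together with its full path, so that a retrieved chunk keeps its structural context; this is a simple baseline for the problem the abstract raises, not the method from the talk.

```python
# Path-annotated chunking: flat chunking loses where a value sits in the
# hierarchy, so index each leaf with the JSON path that leads to it.

import json

def path_chunks(node, path="$"):
    """Yield (json_path, value) pairs for every leaf of a nested structure."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from path_chunks(value, f"{path}.{key}")
    elif isinstance(node, list):
        for i, value in enumerate(node):
            yield from path_chunks(value, f"{path}[{i}]")
    else:
        yield path, node

doc = json.loads('{"service": {"name": "api", '
                 '"endpoints": [{"route": "/users"}, {"route": "/orders"}]}}')
for chunk_path, value in path_chunks(doc):
    print(chunk_path, "->", value)
# $.service.name -> api
# $.service.endpoints[0].route -> /users
# $.service.endpoints[1].route -> /orders
```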
Find out more about KDD Special Days – https://kdd2025.kdd.org/special-days/
Contact us at AIReasoning2025@kdd.org
Thank you to SciForDL'24