Department of Industrial Engineering & Decision Analytics [Joint IEDA/ISOM] seminar - Calibration Error for Decision Making
In modern AI workflows where prediction and decision-making are separated, an upstream prediction team must serve unknown or heterogeneous downstream decision makers. Calibration, a property of predictive models, enables this separation. It requires reported probabilities to be conditionally unbiased, so they can be trusted and directly used by any downstream decision maker without task-specific correction.
We propose Calibration Decision Loss (CDL) as a decision-theoretic calibration error. CDL is defined as the largest improvement in expected payoff obtainable by recalibrating a predictor across all bounded decision problems. By definition, CDL captures the worst-case decision loss from miscalibration and therefore directly measures whether a predictive model is trustworthy for heterogeneous downstream users.
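One way to write the verbal definition above in symbols is sketched below. The notation is an assumption made for illustration and is not taken from the talk: p is the reported probability, y is the binary outcome, \hat{p}(p) is the recalibrated prediction, and (A, u) is a decision problem with action set A and payoffs bounded in [0, 1].

```latex
% Illustrative notation only; the symbols are assumptions, not the speaker's.
\begin{align*}
  \hat{p}(p) &= \mathbb{E}[\, y \mid p \,]
    && \text{(recalibrated prediction)} \\
  a_u(q) &\in \arg\max_{a \in A} \; q\,u(a,1) + (1-q)\,u(a,0)
    && \text{(best response to belief } q\text{)} \\
  \mathrm{CDL} &= \sup_{(A,\,u)\,:\; u \in [0,1]}
    \mathbb{E}\bigl[\, u\bigl(a_u(\hat{p}(p)),\, y\bigr) - u\bigl(a_u(p),\, y\bigr) \,\bigr]
    && \text{(worst-case gain from recalibrating)}
\end{align*}
```

Under this reading, a perfectly calibrated predictor has \hat{p}(p) = p, so the two best responses coincide and the quantity is zero.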
Our results highlight a fundamental difference between decision-theoretic error and known calibration errors. CDL counts only biases that can change decisions, whereas standard metrics such as Expected Calibration Error (ECE) average over all prediction biases, regardless of their impact on decisions. In an online setting, this perspective leads to a prediction algorithm with near-optimal expected CDL, improving upon decision-making guarantees obtained by optimizing canonical calibration errors.
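The contrast between ECE and a decision-based error can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration rather than the construction from the talk: the toy predictor, the helper names, and the restriction to a one-parameter family of decision problems in which the decision maker either takes a safe payoff c or a risky payoff equal to the outcome. Because only this family is searched, the result is a lower bound on the worst-case decision loss, not CDL itself.

```python
# Toy predictor (hypothetical data): each entry is
# (reported probability, true conditional frequency, probability mass).
GROUPS = [
    (0.6, 0.8, 0.5),  # under-predicts by 0.2
    (0.3, 0.1, 0.5),  # over-predicts by 0.2
]


def ece(groups):
    """Expected Calibration Error: mass-weighted average of |report - truth|."""
    return sum(w * abs(p - q) for p, q, w in groups)


def threshold_loss(groups, c):
    """Payoff lost by trusting the raw reports in one decision problem.

    Assumed problem: action 0 pays c for sure, action 1 pays the outcome y,
    so a decision maker with belief b takes action 1 exactly when b >= c.
    """
    loss = 0.0
    for p, q, w in groups:
        achieved = q if p >= c else c  # expected payoff when acting on the report
        best = max(q, c)               # expected payoff after recalibration
        loss += w * (best - achieved)
    return loss


def max_threshold_loss(groups, grid_size=1001):
    """Largest loss over a grid of thresholds c in [0, 1]."""
    return max(
        threshold_loss(groups, i / (grid_size - 1)) for i in range(grid_size)
    )


if __name__ == "__main__":
    print("ECE:", ece(GROUPS))
    print("max threshold loss:", max_threshold_loss(GROUPS))
```

In this toy case both groups contribute their bias to ECE, but any single threshold c can be crossed by at most one of the two biases, so the largest loss within this family is about half the ECE.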
Yifan Wu is a postdoctoral researcher in the EconCS group at Microsoft Research New England. She received her Ph.D. in Computer Science from Northwestern University, advised by Jason Hartline, and her B.S. in Computer Science from Peking University (Turing Class). Her research lies at the intersection of theoretical computer science, machine learning, and economics. Her recent research focuses on developing a decision-theoretic foundation for trustworthy AI, with topics spanning calibration, information elicitation, and the interface between prediction and decision-making.