Machine Learning and Data Science PhD Student Forum Series (Session 87): A Regularized Online Newton Method for Stochastic Convex Bandits with Linear Vanishing Noise
Speaker: Zhan Jingxin (詹景昕)
Time: 2025-04-17, 16:00-17:00
Venue: Tencent Meeting 531-8098-3912
Abstract:
Bandit convex optimization is the online version of zeroth-order optimization, a fundamental problem in optimization with many applications in operations research and other fields. Recently, Lumbreras and Tomamichel [2024] proposed a vanishing noise model in which the subgaussian noise parameter is assumed to decrease linearly as the learner selects actions closer to the minimizer of the convex loss function.
In this talk, we introduce a Regularized Online Newton Method (RONM), which is based on the Online Newton Method (ONM) of Fokkema et al. [2024]. The method achieves regret that is polylogarithmic in the time horizon n when the loss function grows quadratically on the constraint set, recovering the results of Lumbreras and Tomamichel [2024] for linear bandits.
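To make the vanishing noise model in the abstract concrete, here is a minimal toy sketch of a noisy zeroth-order oracle. It is not the RONM algorithm or the exact model of the talk. Everything specific is an assumption made for illustration: the quadratic loss f(x) = ||x - x*||^2, Gaussian noise, the reading that the variance proxy equals c times the suboptimality gap f(x) - f(x*), and the hypothetical names `make_vanishing_noise_oracle`, `noise_std`, `query`, and `c`.

```python
import numpy as np


def make_vanishing_noise_oracle(x_star, c=1.0, seed=0):
    """Toy bandit feedback with noise that vanishes at the minimizer.

    Assumptions (not from the talk): quadratic loss f(x) = ||x - x*||^2 and
    Gaussian noise whose variance is c * (f(x) - f(x*)).
    """
    rng = np.random.default_rng(seed)
    x_star = np.asarray(x_star, dtype=float)

    def loss(x):
        # Noiseless loss; its minimum value is 0, attained at x_star.
        return float(np.sum((np.asarray(x, dtype=float) - x_star) ** 2))

    def noise_std(x):
        # Assumed noise model: variance proportional to the suboptimality gap.
        return float(np.sqrt(c * loss(x)))

    def query(x):
        # The learner only observes this noisy value, never the gradient.
        return loss(x) + noise_std(x) * rng.standard_normal()

    return loss, noise_std, query


if __name__ == "__main__":
    loss, noise_std, query = make_vanishing_noise_oracle([0.0, 0.0], c=4.0)
    for x in ([1.0, 0.0], [0.1, 0.0], [0.0, 0.0]):
        print(x, "loss:", loss(x), "noise std:", noise_std(x), "observed:", query(x))
```

In this toy setting the observations become exact at the minimizer, which gives some intuition for why regret far below the usual square-root-of-n rates is plausible under such a noise model.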
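About the forum: This online forum is organized by Professor Zhihua Zhang's machine learning lab and is held every two weeks (except on public holidays). Each session invites a PhD student to give a systematic, in-depth introduction to a frontier topic. Topics include, but are not limited to, machine learning, high-dimensional statistics, operations research and optimization, and theoretical computer science.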