  1. ABSTRACT Bilevel optimization has recently attracted considerable attention due to its abundant applications in machine learning problems. However, existing methods rely on prior knowledge …

  2. We optimize the model with the Adan optimizer (Xie et al., 2022) and a base learning rate of 0.0008. The total training time is about 3 days. Self-correction. The teacher-student mutual …

  3. Xingyu Xie, Pan Zhou, Huan Li, Zhouchen Lin, and Shuicheng Yan. Adan: Adaptive nesterov momentum algorithm for faster optimizing deep models, 2023. Zhangchen Xu, Fengqing …

  4. Shuicheng Yan. Adan: Adaptive nesterov momentum algorithm for faster optimizing deep models. IEEE Transactions on Pattern Analysis and Mach hen. Seeing and hearing: Open-domain …