RecNet
Generalized Preference Optimization: A Unified Approach to Offline Alignment
Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Rémi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Ávila Pires, Bilal Piot
Feb, 2024
Anne Wu recommended on 2/13/2024
A unified formulation for offline alignment: the authors cast recent methods (DPO, IPO, SLiC, etc.) as special cases that differ only in the loss used. They frame reward modeling as a binary classification problem and also analyze the connection between offline regularization and the KL regularization used in RLHF.
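To make the unification concrete, here is a minimal sketch of the loss family as I read it: DPO, IPO, and SLiC are recovered by swapping the convex loss applied to the policy-vs-reference log-ratio margin. The function and argument names are mine, not the paper's code.

```python
# Hypothetical sketch of the generalized preference loss E[f(beta * rho)],
# where rho is the difference of policy-vs-reference log-ratios between the
# preferred (w) and rejected (l) responses. Not the authors' implementation.
import torch
import torch.nn.functional as F

def gpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1, f="logistic"):
    rho = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    z = beta * rho
    if f == "logistic":   # logistic (log-sigmoid) loss -> DPO
        return -F.logsigmoid(z).mean()
    if f == "hinge":      # hinge loss -> SLiC
        return torch.clamp(1.0 - z, min=0.0).mean()
    if f == "squared":    # squared loss around 1/2 -> IPO (up to scaling)
        return ((z - 0.5) ** 2).mean()
    raise ValueError(f)
```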
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper, Xander Davies et al.
Sep, 2023
Anne Wu recommended on 1/2/2024
A survey of challenges in RLHF, categorized by human feedback, reward model, and policy. The authors also distinguish tractable limitations from fundamental ones, and discuss safety, governance, and transparency. Sometimes I wonder about the problem specification in RLHF.
LLF-Bench: Benchmark for Interactive Learning from Language Feedback
Ching-An Cheng, Andrey Kolobov, Dipendra Misra, Allen Nie, Adith Swaminathan
Dec, 2023
Anne Wu recommended on 12/19/2023
The benchmark includes several sequential decision-making tasks (recommendation, navigation, etc.), and the agent must solve them using natural-language instructions plus (synthetic) language feedback. Paraphrasing of the feedback is used to avoid overfitting. Some design choices could be relevant, but no baselines are provided.
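For intuition, a generic sketch of the interaction pattern described above (an instruction plus per-step language feedback, with paraphrased feedback templates). All names here (run_episode, paraphrase, the env/agent interface) are hypothetical; this is not LLF-Bench's actual API.

```python
import random

FEEDBACK_TEMPLATES = [
    "Good move: {detail}.",
    "That helped because {detail}.",
]

def paraphrase(detail: str) -> str:
    # Randomly vary the surface form so an agent cannot overfit to one template.
    return random.choice(FEEDBACK_TEMPLATES).format(detail=detail)

def run_episode(env, agent, max_steps=20):
    obs = env.reset()                        # obs carries a natural-language instruction
    feedback = ""
    for _ in range(max_steps):
        action = agent.act(obs, feedback)    # agent conditions on instruction + past feedback
        obs, done, detail = env.step(action) # hypothetical env interface
        feedback = paraphrase(detail)        # language feedback instead of a scalar reward
        if done:
            break
```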
Prediction-Oriented Bayesian Active Learning
Freddie Bickford Smith, Andreas Kirsch, Sebastian Farquhar, Yarin Gal, Adam Foster, Tom Rainforth
Apr, 2023
Anne Wu recommended on 11/28/2023
Many information-theoretic approaches to active learning optimize the BALD score. This work argues that BALD can be suboptimal and proposes the expected predictive information gain (EPIG), which measures information gain in prediction space rather than in parameter space.
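A minimal NumPy sketch of the BALD vs. EPIG distinction, assuming Monte Carlo posterior-predictive samples are already available; the array names are illustrative, not from the paper's code.

```python
# probs_pool[k, c]: p(y=c | x, theta_k) for a candidate pool point
# probs_targ[k, m, c]: p(y*=c | x*_m, theta_k) for m sampled target inputs
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    return -np.sum(p * np.log(p + eps), axis=axis)

def bald(probs_pool):
    # Expected information gain about the *parameters*: H[E_k p] - E_k H[p].
    return entropy(probs_pool.mean(0)) - entropy(probs_pool).mean()

def epig(probs_pool, probs_targ):
    # Expected information gain about *predictions* at target inputs,
    # I(y; y*) with joint p(y, y*) = E_k[ p(y | theta_k) p(y* | theta_k) ].
    k = probs_pool.shape[0]
    joint = np.einsum("kc,kmd->mcd", probs_pool, probs_targ) / k   # [M, C, C]
    marg_pool = joint.sum(-1)                                      # [M, C]
    marg_targ = joint.sum(-2)                                      # [M, C]
    mi = (entropy(marg_pool) + entropy(marg_targ)
          - entropy(joint.reshape(joint.shape[0], -1)))
    return mi.mean()                                               # average over targets
```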
Vision Transformers Need Registers
Timothee Darcet, Maxime Oquab, Julien Mairal & Piotr Bojanowski
2023
Anne Wu recommended on 11/21/2023
Artifacts are identified in both supervised and self-supervised ViTs: redundant tokens with high norms that appear during inference, primarily in low-informative background areas of images, suggesting that these tokens are repurposed to store global information. The work proposes adding register tokens.
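A rough PyTorch sketch of the register idea, assuming a standard ViT encoder and patch embedding are given; the class and attribute names are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ViTWithRegisters(nn.Module):
    def __init__(self, patch_embed, encoder, dim=768, num_registers=4):
        super().__init__()
        self.patch_embed = patch_embed        # images -> patch tokens [B, N, dim]
        self.encoder = encoder                # standard transformer encoder
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        # Extra learnable tokens that give the model scratch space for global
        # information, so it stops hijacking background patch tokens as artifacts.
        self.registers = nn.Parameter(torch.zeros(1, num_registers, dim))

    def forward(self, images):
        x = self.patch_embed(images)                     # [B, N, dim]
        b = x.shape[0]
        cls = self.cls_token.expand(b, -1, -1)
        reg = self.registers.expand(b, -1, -1)
        x = self.encoder(torch.cat([cls, reg, x], dim=1))
        # Registers are discarded at the output; only [CLS] and patch tokens are used.
        return x[:, 0], x[:, 1 + reg.shape[1]:]
```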
Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?
Ahmed Alajrami, Katerina Margatina, Nikolaos Aletras
2023
Anne Wu recommended on 11/14/2023
Interesting paper probing how information loss in input token characters affects the performance of pretrained LMs. Pretraining with only subsets of token characters, e.g. a single character per token in the extreme case, can still reach around 90% on SuperGLUE; some character subsets appear more informative than others.
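A toy illustration of the extreme one-character-per-token setting: a hypothetical helper truncates each (sub)word token before pretraining so the model only ever sees that reduced character subset. This is not the authors' implementation.

```python
def reduce_tokens(tokens, n_chars=1):
    """Keep only the first n_chars characters of every token; leave special tokens intact."""
    return [tok if tok.startswith("[") else tok[:n_chars] for tok in tokens]

tokens = ["the", "cat", "sat", "on", "the", "mat", "[SEP]"]
print(reduce_tokens(tokens))   # ['t', 'c', 's', 'o', 't', 'm', '[SEP]']
```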