Reinforcement Learning from Human Feedback (arxiv.org)
23 points by onurkanbkrc 2 hours ago | 1 comment
klelatti 52 minutes ago
Web version with links, etc:

https://rlhfbook.com/
