(Q115570683)

English

Reinforcement Learning from Human Feedback

variant of reinforcement learning that uses a reward model trained on human feedback

  • RLHF
  • Reinforcement learning from human feedback
  • reinforcement learning from human preferences

Statements