Other FAQs about RLHF#
How many humans are involved in RLHF?#
It varies by company, but we're talking about hundreds or thousands of human reviewers for major models like ChatGPT. These aren't just random people off the street — they're often trained reviewers who understand the guidelines and can provide consistent feedback.
Is RLHF expensive?#
Very. Paying humans to carefully read and rate AI responses is time-consuming and costly. This is one reason why RLHF is typically the final step in training — you want to get the model as good as possible through cheaper methods first.
Can RLHF go wrong?#
Absolutely. If the human feedback is biased or inconsistent, the model learns those biases. If reviewers prefer responses that sound confident even when wrong, the model might become overconfident. This is why companies spend so much effort on training their human reviewers and establishing clear guidelines.
Do all AI models use RLHF?#
Not all, but most consumer-facing AI products do some form of it. Any AI that's designed to interact with humans conversationally has probably been through RLHF or similar human feedback processes. It's become the industry standard for making AI helpful rather than just knowledgeable.
What happens after RLHF?#
RLHF isn't a one-time thing. Companies continuously collect feedback from real users and periodically retrain their models with new human feedback data. It's an ongoing process of refinement to keep the AI aligned with human preferences and expectations.