Fine-Tuning

Course module.

Module Contents

1. Fine-Tuning Basics

2. LoRA and PEFT

3. RLHF

Chapter 03

RLHF

Learn how reinforcement learning from human feedback (RLHF) works, from training the reward model to optimizing the policy with PPO, and how it aligns language models with human preferences.
