machine-learning
an archive of posts in this category
| Date | Post |
|---|---|
| Jun 15, 2026 | Safety Reasoning |
| May 29, 2026 | DPO and Safety Fine Tuning |
| May 26, 2026 | Reinforcement Learning from Human Feedback (RLHF) |
| May 22, 2026 | Supervised Fine Tuning (SFT) |
| May 21, 2026 | Notes on Transformers |