reinforcement learning
  Reinforcement Learning with Human Feedback
    Notes on alignment by preference: RLHF to DPO

combinatorics
  Ramsey Theory
    Early steps in Additive Combinatorics
  Roth's Theorem
    Fourier analytic proof of Roth's Theorem on 3APs
  Szemerédi's Regularity Lemma

nlp
  Language Modeling Foundations
    An Introduction to Language Modeling

algebra
  Semidirect Product
    An abstraction of the direct product