Week 3 - Research Papers

These notes were developed using lectures/material/transcripts from the DeepLearning.AI & AWS - Generative AI with Large Language Models course

Reinforcement Learning from Human Feedback (RLHF)

Training language models to follow instructions with human feedback - Paper by OpenAI introducing a human-in-the-loop process to create a model that is better at following instructions (InstructGPT).
Learning to summarize from human feedback - This paper presents a method for improving language model-generated summaries by training a reward model on human preferences and optimizing against it, producing summaries that human evaluators prefer over the human-written reference summaries.

Proximal Policy Optimization Algorithms - The paper from researchers at OpenAI that first proposed the PPO algorithm. The paper discusses the performance of the algorithm on a number of benchmark tasks including robotic locomotion and game play.
Direct Preference Optimization: Your Language Model is Secretly a Reward Model - This paper presents a simpler yet effective method for aligning large-scale language models with human preferences. It optimizes the model directly on preference data, without training a separate reward model or running a reinforcement learning loop.
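
As a rough illustration of the two objectives named above, the sketch below computes the PPO clipped surrogate for a single action and the DPO loss for a single preference pair. It is a minimal sketch using plain Python floats, not code from either paper: the function names, argument names, and the default values epsilon=0.2 and beta=0.1 are assumptions chosen for illustration.

```python
import math


def ppo_clipped_objective(ratio, advantage, epsilon=0.2):
    """PPO clipped surrogate for one sample (a quantity to maximize).

    ratio     -- new-policy probability / old-policy probability of the action
    advantage -- estimated advantage of the action
    epsilon   -- clip range (illustrative default)
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Taking the minimum removes any incentive to move the ratio
    # outside [1 - epsilon, 1 + epsilon].
    return min(ratio * advantage, clipped_ratio * advantage)


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) completion pair (to minimize).

    Each argument is the summed log-probability of a completion under
    either the policy being trained or the frozen reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) written as log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))


if __name__ == "__main__":
    print(ppo_clipped_objective(ratio=1.5, advantage=1.0))
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

When the policy and the reference model assign identical log-probabilities, the DPO margin is zero and the loss equals log 2; the loss falls below that as the policy favors the chosen completion more strongly than the reference does.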

Constitutional AI: Harmlessness from AI Feedback - This paper introduces a method for training a harmless AI assistant without human labels, allowing better control of AI behavior with minimal human input.

Chain-of-thought Prompting Elicits Reasoning in Large Language Models - Paper by researchers at Google exploring how chain-of-thought prompting improves the ability of LLMs to perform complex reasoning.
PAL: Program-aided Language Models - This paper proposes an approach that uses the LLM to read natural language problems and generate programs as the intermediate reasoning steps.
ReAct: Synergizing Reasoning and Acting in Language Models - This paper presents an advanced prompting technique that allows an LLM to make decisions about how to interact with external applications.
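
The PAL idea of using a generated program as the intermediate reasoning step can be sketched as follows. No model is called here: the string GENERATED_PROGRAM is a hand-written stand-in for what an LLM might return, and the helper run_generated_program and the convention of storing the result in a variable named answer are assumptions made for this illustration.

```python
QUESTION = (
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?"
)

# Hypothetical model output: the reasoning is expressed as code, and the
# arithmetic is left to the Python interpreter rather than to the model.
GENERATED_PROGRAM = """
tennis_balls = 5
bought_balls = 2 * 3
answer = tennis_balls + bought_balls
"""


def run_generated_program(program):
    """Execute the program text and return its `answer` variable."""
    namespace = {}
    # exec on untrusted model output is unsafe; a real system would sandbox it.
    exec(program, namespace)
    return namespace["answer"]


if __name__ == "__main__":
    print(QUESTION)
    print(run_generated_program(GENERATED_PROGRAM))
```

The design point is the division of labor: the model only has to translate the problem into correct steps, while the interpreter carries out the calculation.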

LangChain Library (GitHub) - This library is aimed at assisting in the development of LLM-powered applications, such as question answering systems, chatbots, and other agents. You can read the documentation here.
Who Owns the Generative AI Platform? - The article examines the market dynamics and business models of generative AI.