Andrej Karpathy, the GOAT who gifted us the Neural Networks from Zero To Hero videos, released a 1hr Intro to LLMs video (slides)
Below are my takeaways.
- LLMs are neutal networks with billions of parameters dispersed through them, we know how to iteratively adjust the parameters to make them better, but we don't know how they collaborate to do it
- training LLMs = lossy compression (100x compression ratio), there is close relationship between compression and performance
- hallucinations: LLMs dreams, think of them as mostly inscrutable artifacts, develop correspondingly sophisticated evaluations
- how LLMs are build
- 1) pre-training for knowledge
- get large amounts of text (~10 TB), and get expensive compute (~6000 GPUs)
- compress text into neural network (pay ~$2m and wait ~12 days)
- get a base model
- 2) fine-tuning for alignment
- write labelling instructions
- hire people (scale.ai) to collect 100k high quality ideal Q&A instructions or comparisons
- finetune base model on this data (wait ~1 day)
- obtain assistant model
- run lots of evaluations
- deploy
- monitor, collect misbehaviours, go back to step 1
- 3) RLHF with comparisons data, train model on good/bad outputs
- 1) pre-training for knowledge
- scaling laws:
- performance of LLMs is a smooth, well-behaved, predictable function of
- N: no. of parameters in network
- D: amount of text
- expect a lot of "general capability" across all areas of knowledge
- performance of LLMs is a smooth, well-behaved, predictable function of
- system 2 thinking
- LLMs currently only have system 1 fast thinking, just next word prediction
- the goal is system 2, where they take time to think through a problem, providing more accurate answers
- how? create tree of thought and reflect on question before answering
- self-improvement
- what is equivalent of AlphaGO self-play for LLMs?
- main challenge is the lack of reward criterion (language is a large space and not well defined)
- LLM OS
- "LLMs is the kernel process of an emergent operating system"
- RAM = working memory = context window
- OS systems had closed and open source (llama2 vs gpt4)
- llm security
- LLMs are vulnerable to jailbreaks, prompt injection, data poisoning
Reading list
- Transformers and Language Models
- Running Language Models Locally
- Reinforcement Learning and Optimization in LLMs
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- Training Compute Optimal Language Models
- Scaling Laws for Neural Language Models
- Sparks of Artificial General Intelligence: Early experiments with GPT-4
- System One vs System Two Thinking
- LLM Operating System and Tool Use
- Retrieval Augmented Generation (RAG)
- Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP
- Toolformer: Language Models Can Teach Themselves to Use Tools
- Large Language Models as Tool Makers
- ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
- Multimodal Interaction and Peripheral Device I/O
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
- CLIP - Learning Transferable Visual Models From Natural Language Supervision
- ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding
- NExT-GPT: Any-to-any multimodal large language models
- LLaVA - Visual Instruction Tuning
- LaVIN - Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models
- CoCa: Contrastive Captioners are Image-Text Foundation Models
- Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning
- Security and Ethical Challenges in LLMs
- Jailbroken: How Does LLM Safety Training Fail?
- Universal and Transferable Adversarial Attacks on Aligned Language Models
- Visual Adversarial Examples Jailbreak Aligned Large Language Models
- Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
- Hacking Google Bard - From Prompt Injection to Data Exfiltration
- Poisoning Language Models During Instruction Tuning
- Poisoning Web-Scale Training Datasets is Practical