Intro to LLMs

Andrej Karpathy, the GOAT who gifted us the Neural Networks from Zero To Hero videos, released a 1hr Intro to LLMs video (slides)

Below are my takeaways.

LLMs are neutal networks with billions of parameters dispersed through them, we know how to iteratively adjust the parameters to make them better, but we don't know how they collaborate to do it
training LLMs = lossy compression (100x compression ratio), there is close relationship between compression and performance
hallucinations: LLMs dreams, think of them as mostly inscrutable artifacts, develop correspondingly sophisticated evaluations
how LLMs are build
- 1) pre-training for knowledge
  - get large amounts of text (~10 TB), and get expensive compute (~6000 GPUs)
  - compress text into neural network (pay ~$2m and wait ~12 days)
  - get a base model
- 2) fine-tuning for alignment
  - write labelling instructions
  - hire people (scale.ai) to collect 100k high quality ideal Q&A instructions or comparisons
  - finetune base model on this data (wait ~1 day)
  - obtain assistant model
  - run lots of evaluations
  - deploy
  - monitor, collect misbehaviours, go back to step 1
- 3) RLHF with comparisons data, train model on good/bad outputs
scaling laws:
- performance of LLMs is a smooth, well-behaved, predictable function of
  - N: no. of parameters in network
  - D: amount of text
- expect a lot of "general capability" across all areas of knowledge
system 2 thinking
- LLMs currently only have system 1 fast thinking, just next word prediction
- the goal is system 2, where they take time to think through a problem, providing more accurate answers
- how? create tree of thought and reflect on question before answering
self-improvement
- what is equivalent of AlphaGO self-play for LLMs?
- main challenge is the lack of reward criterion (language is a large space and not well defined)
LLM OS
- "LLMs is the kernel process of an emergent operating system"
- RAM = working memory = context window
- OS systems had closed and open source (llama2 vs gpt4)
llm security
- LLMs are vulnerable to jailbreaks, prompt injection, data poisoning

Reading list

Transformers and Language Models
Running Language Models Locally
Reinforcement Learning and Optimization in LLMs
System One vs System Two Thinking
LLM Operating System and Tool Use
Multimodal Interaction and Peripheral Device I/O
Security and Ethical Challenges in LLMs

BENEDICT NEO 梁耀恩

Intro to LLMs

Reading list