ray summit day 1

a dump of notes i took for ray summit day 1

ion stoica

mid 2000s : big data and classical ML with hadoop and spark
mid 2010s: deep learning and RL. GPUs started to become indispensable
mid 2020s: GenAI

5 trends

Scale
- growing 5x every year
- cost of training 10x every 2 years
- this happens on inference too: the o1 model takes tens of seconds to respond and uses more context
massive unstructured data
- text, audio, image, video
sophisticated post-training
- pruning & distillation
AI powering the AI stack
- ai is used to optimize model development
compound AI and agentic systems
- involves 100s of models
- exploring LLM based intelligent agents paper
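
to get a feel for the Scale numbers above, here is a quick compounding sketch. it assumes the two rates from the talk (5x per year, 10x every 2 years) hold steady, which is purely an assumption for illustration; the helper name `compound` is made up.

```python
def compound(factor_per_period: float, period_years: float, years: float) -> float:
    """Growth multiple after `years`, given `factor_per_period` every `period_years`."""
    return factor_per_period ** (years / period_years)

# Rates as quoted in the notes; the 4-year horizon is an arbitrary choice.
scale_growth_4y = compound(5, 1, 4)        # 5x per year, over 4 years
cost_growth_4y = compound(10, 2, 4)        # 10x per 2 years, over 4 years
cost_growth_per_year = compound(10, 2, 1)  # same cost rate, expressed per year

print(f"scale after 4 years: {scale_growth_4y:.0f}x")
print(f"training cost after 4 years: {cost_growth_4y:.0f}x")
print(f"training cost per year: {cost_growth_per_year:.2f}x")
```

at those rates scale outpaces cost, since 10x every 2 years works out to roughly 3.2x per year against 5x per year.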

these trends spur innovation in

hardware accelerators by NVIDIA, AMD, AWS, Intel, etc.
GPU pods (clusters)

CPU-centric -> accelerator centric world

AI clouds (lambda, aws)
Frameworks (SGLang, vLLM, TensorRT-LLM)
Tools for monitoring
Models (hugging face)

engineers spend time writing yaml files and troubleshooting kubernetes

we need a software engine

support any ml workload
any data types and model architecture
fully utilize any accelerators
scale from laptop to thousands of GPUs
abstract away complexity of infra from end developer
serve as flexible and unifying platform for entire AI ecosystem

AI compute engine

3 core problems

managing compute resources
- autoscaling, spot instance support, hardware failure handling
managing data
- distributed object store, shared memory, futures, optimized data movement (NCCL, RDMA)
executing workloads
- scheduling, fault tolerance, management of stateless and stateful tasks, dynamic and compiled graphs
instacart training on 100x more data
niantic cut LOC by 85%
canva cut cloud cost by 50%
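
the "futures" and "scheduling" items in the core problems above can be illustrated on a single machine with python's standard library. this is only an analogy using `concurrent.futures`, not ray's API; the functions `load_shard` and `total` are hypothetical stand-ins for real work.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

def load_shard(shard_id: int) -> List[int]:
    """Hypothetical data-loading task: returns three fake records for a shard."""
    return [shard_id * 10 + i for i in range(3)]

def total(values: List[int]) -> int:
    """Hypothetical downstream task that consumes a shard."""
    return sum(values)

with ThreadPoolExecutor(max_workers=4) as pool:
    # submit() returns immediately with a future, a placeholder for the result.
    shard_futures = [pool.submit(load_shard, s) for s in range(4)]
    # result() blocks until that particular shard is ready, then feeds the next task.
    sum_futures = [pool.submit(total, f.result()) for f in shard_futures]
    shard_totals = [f.result() for f in sum_futures]

print(shard_totals)
```

a distributed engine does the same kind of thing across machines, which is where the object store and data movement items come in.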

announcements

default execution in ray

dynamic memory allocation
expensive GPU-to-CPU memory copies
expensive transfer over slow network
pass args and references

solution: compiled graphs

create and compile a static graph to execute repeated tasks
pre-allocate static buffers; reuse them in many places
no need to pass args and result references
direct GPU-to-GPU transfer
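
a toy sketch of the "static graph plus pre-allocated buffers" idea, in plain python. this is not ray's compiled graphs API and it does no GPU work; the class `CompiledPipeline` is hypothetical and only shows why fixing the graph up front lets you allocate buffers once and reuse them on every call.

```python
from typing import Callable, List, Sequence

class CompiledPipeline:
    """Toy static graph: a fixed chain of stages with buffers allocated once."""

    def __init__(self, stages: Sequence[Callable[[float], float]], width: int):
        self.stages = list(stages)
        self.width = width
        # One input buffer plus one output buffer per stage, created up front.
        self.buffers: List[List[float]] = [
            [0.0] * width for _ in range(len(self.stages) + 1)
        ]

    def execute(self, values: Sequence[float]) -> List[float]:
        if len(values) != self.width:
            raise ValueError("input must match the compiled width")
        # Slice assignment copies into the existing list instead of making a new one.
        self.buffers[0][:] = values
        for i, stage in enumerate(self.stages):
            src, dst = self.buffers[i], self.buffers[i + 1]
            for j in range(self.width):
                dst[j] = stage(src[j])
        # The returned list is the reused output buffer; copy it to keep a result.
        return self.buffers[-1]

pipeline = CompiledPipeline([lambda x: x * 2, lambda x: x + 1], width=3)
buffer_ids = [id(b) for b in pipeline.buffers]

first = list(pipeline.execute([1, 2, 3]))   # copy before the buffer is reused
second = list(pipeline.execute([0, 0, 0]))
same_buffers = buffer_ids == [id(b) for b in pipeline.buffers]
print(first, second, same_buffers)
```

the trade-off is the usual one for static graphs: repeated calls skip allocation and argument passing, but the shape of the work has to be known ahead of time.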

ray data

unstructured data is the fastest growing use case

they require mixed CPU and GPU compute

ray handles

streaming ingest
last-mile preprocessing
ingest for training
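
a rough sketch of what "streaming" through mixed stages means, using plain python generators. this is not the ray data API; `read_batches`, `cpu_preprocess` and `accelerator_stage` are hypothetical names, and the "accelerator" stage is just a stand-in that returns string lengths.

```python
from typing import Iterable, Iterator, List

def read_batches(records: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Streaming ingest: yield fixed-size batches without loading everything first."""
    batch: List[str] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

def cpu_preprocess(batches: Iterable[List[str]]) -> Iterator[List[str]]:
    """Last-mile preprocessing that would run on CPU workers."""
    for batch in batches:
        yield [text.strip().lower() for text in batch]

def accelerator_stage(batches: Iterable[List[str]]) -> Iterator[List[int]]:
    """Stand-in for a GPU stage (e.g. model inference); here it only measures length."""
    for batch in batches:
        yield [len(text) for text in batch]

records = ["  Hello ", "RAY", " Data  ", "summit", "x "]
# Each batch flows through all stages before the next batch is read.
results = list(accelerator_stage(cpu_preprocess(read_batches(records, 2))))
print(results)
```

because the stages are chained lazily, only one batch is in flight at a time here; a real engine would overlap the CPU and GPU stages across many workers.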

spark and hadoop are CPU-based, and work best on structured, tabular data

AI workloads are GPU-centric and require unstructured data

amazon cut cost by 82% moving from spark to ray data, saving $120 mil a year

runway

runway ML is focused on world modeling with visual data.

many aspects of the world are not captured through language; it's a lossy representation. trained on video data, their gen-3 alpha has emergent capabilities of understanding physics, like how liquid flows and water splashes, even though it was not explicitly trained on it

they referenced "Scalable Diffusion Models with Transformers", which sparked SOTA image generation

other research papers on their website

gen-3 alpha challenges

samples are orders of magnitude larger than for language models, which creates network challenges; communication-computation overlap has to be handled well
data preprocessing challenges, dataset curation, quality of data is important

science is about modeling the distribution

art is going out of distribution

the more you can model reality, the more accurate a distribution of the world you can build, and the more you can go out of distribution.

the future involves AI in filmmaking. they've hosted an AI film festival and support professionals working on AI-augmented film projects through the hundred film fund

marc andreessen

it took 70-80 years to prove AI was possible: ImageNet (2012) -> self-driving -> transformers (2017)
ai systems = new kinds of computers
traditional computers are deterministic systems, you always get the same output. ai systems are probabilistic computers
there's a fine line between hallucination and creation, people hallucinate too
why is AI better now? moore's law provided the compute power and the internet provided the data
are people in the future going to use video or photo editing software? or will they just speak to tell the computer what they want?
adding AI to your product is like adding flour to a finished cake, it doesn't really work well. if you want to build a good product, the flour has to be in the recipe
the "bullet point no. 6" phenomenon: a 5 year old company adding AI as a sixth bullet to their 5 bullet point slides
biotech: challenge of data, gathering all human genome data, in china it is fine, but in the US it is illegal
ai and geopolitics: mid 2010s AI and autonomy is the third offset in military. 1st was nuclear, and 2nd was maneuver warfare (advancement of GPS).
DARPA self-driving challenge was in 2005. once you take the pilot out of the plane, you can do all kinds of things
ukraine war, russians have guys in tanks, whereas ukrainians have autonomous drones and javelins
iranian war, USA using tomahawk missiles costing millions of dollars to destroy drones costing only thousands of dollars, it's like a slippage of time, these technologies are in the same era
strongest military force: who has the best technology and money
will you still have human soldiers at risk in planes and submarines in 20 years?
having two kinds of conversations in D.C.
- tuesday conversations : US vs CHINA, what can SV do to advance technology
- thursday convo: focused on US, technology is freaking us out, we need to regulate and shutdown, we have to slow down, etc.
why has technology gone political?
- it's all our fault
- the dog that caught the bus, it catches on the tailpipe and just keeps being dragged across the street. we are the dog
- people in hollywood are freaked out about AI being able to generate full movies
- people are going on strike against automation and technology
can AI level up this discourse?
- computing used to be a 30 mil technology that slowly trickled down
- today AI is released to everyone, there's a general uplift of intelligence for everyone, access to intelligence at their fingertips
open source
- people in California are lobbying to slow down, existential threat
- EU implemented a stifling blanket of regulation
robotics
- the whole history of AI and robotics assumed you get low-level first: robots that pack your suitcase and clean, and only later robots that can play songs and draw
- today it's the reverse, it can be creative, but we don't have robotics yet
- Unitree in china has a huge supply chain of robotics
- robotics is very close, we might be a few years from humanoids gathering data like tesla cars
who will be the AI winners
- you need a strategy
- a lot of questions are on the economic side: where is the value going to be?
- are LLMs going to be a question of who has the best model? that's what happened to google search
- or is it the race to the bottom? where intelligence is like selling rice? anyone can use an open-source model, anyone can buy GPUs to get the same result.
  - google paper: anyone who has the same data, can get the same results
  - evidence: price per token cost has dropped 100x in the last year
- having full competitive open source changes things, llama models release changed things
NVIDIA gpu
- the other argument: they hold a huge profit lead, which draws competition and other startups who want a piece of the pie
- chips developed from scratch might do better than GPUs, which were originally made for graphics
- NVIDIA might do well for 5 years, and then other competitors take over
advice for founders
- big thing is it rarely makes sense to just start a company and go search for an idea
- it's usually deep domain experts who've been in an industry for 5-15 years, deep in the trenches trying to figure out better ways to solve problems
- how to operate in a rapidly changing environment: always be running experiments
- doing a new thing is always scary, what if it doesn't work?
- run the smallest experiment, smallest customer segment, learn as you go, without having huge downsides to risk

some thoughts from talking to people at booths

questions to ask people at sponsor booths

what does your company do?
- follow up questions if possible
who are your competitors? what makes you different?
who are your customers?
what are some interesting use cases you've seen for your product? success stories?
what is the future roadmap?
if consumer product, do you personally use it? have you built anything interesting?
how long have you been there?
what gets you excited about the company?

thank them for answering your questions, get swag, move on.

BENEDICT NEO 梁耀恩
