ray summit day 1

October 1, 2024


a dump of notes i took for ray summit day 1

ion stoica

  • mid 2000s : big data and classical ML with hadoop and spark
  • mid 2010s: deep learning and RL. GPU started to become indispensible
  • mid 2020s: GenAI

5 trends

  1. Scale
    • growing 5x every year
    • cost of training 10x every 2 years
    • this happens on inference too: o1 model takes 10s seconds and more context
  2. massive unstructured data
    • text, audio, image, video
  3. sophisticated post-training
    • pruning & distillation
  4. AI powering teh AI stack
    • ai is used to optimize model development
  5. compound AI and agentic system
    • involves 100s of models
    • exploring LLM based intelligent agents paper

these trends spurs innovation in

  • hardware accelerators by NVDA, AMD, Aws, Intel, etc.
  • GPU pods (clusters)

CPU-centric -> accelerator centric world

  • AI clouds (lambda, aws)
  • Frameworks (SGL, VLLM, TensorRT-LLM)
  • Tools for monitoring
  • Models (hugging face)

engineers spend time writing yaml files and troubleshooting kubernetes

we need a software engine

  • support any ml workload
  • any data types and model architecture
  • fully utilize any accelerators
  • scale from laptop to thousands of GPUs
  • abstract away complexity of infra from end developer
  • serve as flexible and unifying platform for entire AI ecosystem

AI compute engine

3 core problems

  • managing compute resources

    • autoscaling, spot instance support, hardware failure handling
  • managing data

    • distributed object store, shared memory, futures, optimized data movement (NCCL, RDMA)
  • executing workloads

    • scheduling, fault tolerance, management of stateless and stateful tasks, dynamic and compiled graphs
  • instacart training on 100x more data

  • niantic cut LOC by 85%

  • canva cut cloud cost by 50%

announcements

default execution to ray

  • dynamic memory allocation
  • expensive copy GPU-to-CPU memory
  • expensive transfer over slow network
  • pass args and references

solution: compiled graphs

  • create and compile a static graph to execute repeated tasks
  • pre-allocate static buffers; reuse them in many places
  • no need to pass args and result references
  • direct GPU-to-GPU transfer

ray data

unstructured data is the fastest growing use case

they require mixed CPU and GPU compute

ray handles

  • streaming ingest
  • last-mile preprocessing
  • ingest for training

spark, hadoop are all CPU-based, and works best on structured, tabular data

AI workloads are GPU-centric and requires unstructured data

amazon cut cost by 82% moving from spark to ray data, cutting $120 mil a year

runway

runway ML is focused on world modeling with visual data.

many aspects of the world are not captured through language, it's a lossy state. using video data, their gen-3 alpha has emergent capabilities of understanding physics like how liquid flows, water splashes, even though not trained on it

they referenced Scalable Diffusion Models with Transformers which sparked SOTA image generation

other research papers on their website

gen-3 alpha challenges

  • size of samples are orders of magnitude greater than language models, network challenges, handle communication computation overlap well
  • data preprocessing challenges, dataset curation, quality of data is important

science is about modeling the distribution

art is going out of distribution

the more you can model reality, the more you can build very accurate distribution of the world, and the more you can out of distribution.

the future involves AI in film making, they've hosted an AI film festival and supports professionals working on AI-augmented film projects with the hundred film fund

marc andreessen

  • it took 70-80 years to prove AI was possible, (2012) image net -> self-driving -> transformers (2017)
  • ai systems = new kinds of computers
  • traditional computers are deterministic systems, you always get the same output. ai systems are probabilistic computers
  • there's a fine line between hallucination and creation, people hallucinate too
  • why AI is better now? moore's law provided compute power and internet provided the data
  • are people in the future going to use video or photo editing software? or will they just speak to get what they want the computers to do
  • adding AI to your product is like adding flowers to a cake, it doesn't really work well. if you want to build a good product, the flower has to be in the recipe
  • bullet point no 6. phenomenon a 5 year old company adding AI to their 5 bullet points slides
  • biotech: challenge of data, gathering all human genome data, in china it is fine, but in the US it is illegal
  • ai and geopolitics: mid 2010s AI and autonomy is the third offset in military. 1st was nuclear, and 2nd was maneuver warfare (advancement of GPS).
  • DARPA self-driving challenge was in 2005. once you talk the pilot out of the plane, you can do all kinds of things
  • ukraine war, russians have guys in tanks, whereas ukranians have autonomous drones and javelins
  • iranian war, USA using millions of dollars of tomahawk missles to destroy drones costing only thousands of dollars, it's like a slippage of time, these technologies are in the same era
  • strongest military force: who has the best technology and money
  • will you still have human soldiers at risk in planes and submarines in 20 years?
  • having two kinds of conversations in D.C.
    • tuesday conversations : US vs CHINA, what can SV do to advance technology
    • thursday convo: focused on US, technology is freaking us out, we need to regulate and shutdown, we have to slow down, etc.
  • why has technology gone political?
    • it's all our fault
    • the dog that caught the bus, it catches on the tailpipe and just keeps being dragged across the street. we are the dog
    • people in holywood are freaked out about AI being able to generate full movies
    • people are going on strike against automation and technology
  • can AI level up this discourse?
    • computing used to be a 30 mil technology that slowly trickled down
    • today AI is released to everyone, there's a general uplift of intelligence for everyone, access to intelligence on their fingertips
  • open source
    • people in California are lobbying to slow down, existential threat
    • EU implemented a stifling blanket of regulation
  • robotics
    • the whole history of AI and robotics is you get low-level first, robots to pack your suitcase and clean, and robots that can play songs and draws
    • today it's the reverse, it can be creative, but we don't have robotics yet
    • Unitree in china has a huge supply chain of robotics
    • robotics is very close, we might be a few years fromm humanoids gathering data like tesla cars
  • who will be the AI winners
    • you need a strategy
    • a lot of questions are on the economic side, where the value is going to be?
    • are LLMs going to be a question of who has the best model? that's what happened to google search
    • or is it the race to the bottom? where intelligence is like selling rice? anyone can use an open-source model, anyone can buy GPUs to get the same result.
      • google paper: anyone who has the same data, can get the same results
      • evidence: price per token cost has dropped 100x in the last year
    • having full competitive open source changes things, llama models release changed things
  • NVIDIA gpu
    • the other argument, they draw a huge profit lead, which draws competition and other startups who wants the piece of the pie
    • developing chips from scratch might do better than GPUs who were originally made for graphics
    • NVIDIA might do well for 5 years, and other competitions take over
  • advice for founders
    • big thing is it rarely makes sense to just start a company and go search an idea
    • it's usually deep domain experts who's been in an industry for 5-15 years, deep in the trenches trying to figure out better ways to solve problems
    • how to operate in a rapidly changing environment: always be running experiments
    • doing a new thing is always scary, what if it doesn't work?
    • run the smallest experiment, smallest customer segment, learn as you go, without having huge downsides to risk

some thoughts from talking to people at booths

questions to ask people at sponsor booths

  • what does your company do?
    • follow up questions if possible
  • who are your competitors? what makes you different?
  • who are your customers?
  • what are some interesting use cases you've seen for your product? success stories?
  • what is the future roadmap?
  • if consumer product, do you personally use it? have you built anything interesting?
  • how long have you been there?
  • what gets you excited about the company?

thank them for answering your questions, get swag, move on.