sunday links #3

Sunflowers along the Seine, ca. 1885-1886, Gustave Caillebotte

first time playing the piano at a church in SF. felt a ton of performance anxiety but i reminded myself that i'm playing for God and not for people. was still able to enjoy it most of the time. grateful i found this opportunity to play again.

Figma: The best advice to bring into work in 2025
- look beyond design for inspiration, draw from emotions, memories, scientific discoveries, or nature.
- life constantly throws things at you to slow you down. the important thing is to keep moving
- embrace in-between moments, let your mind wander to get creativity flowing again
The Qualities of Great Design
- what is quality?
  - what we agree is good
  - not random
  - a lot of care and time went into it
  - feels like someone has thought of you
  - not painful to use in any way
  - feels state of the art
  - easier to get things done
  - all your concentration goes to the task at hand.
- the design process
  - "It's impossible to rush the creative process...You can't hire more people to get there faster. You just have to sit with it and continue working on it until the best solutions arise"
  - "After you've used your own work for some amount of time, you naturally develop some blind spots"
  - "You have to be very mindful that how you think about things or your opinions are not always, first of all, correct ones, nor the only ones"
Funny thing about curiosity
- true curiosity comes from emotional engagement (fun, rage) rather than intellectually identifying what "should" be interesting
  
  "Everyone has something unrepeatable to bring into the world. And we can manifest it by going in our direction of maximal interestingness. The pattern we make as our curiosity pulls us hither and thither like a dog chasing a smell across a field—that pattern is a gift we give the world"
- follow your genuine interest rather than falling into the trap of seeking ideas that would impress others
  
  "It is easy to mistake status-seeking for curiosity"
On The Edge
- "many of the most influential people are on the edge of insanity"
how to ask good questions
- "questions came from knowledge that your soul had but forgot and is trying to remember"
- "it's harder to ask questions in a void, because without some basic knowledge, there are no expectations. And with no expectations, there's no surprise; with no surprise, there's no question (of the clarifying type)"
- "a bad question is better than no question"
- reliable question templates
  - asking for concrete examples/illustration of an abstract concept
  - asking the inverse, i.e. what something is not
  - going meta (ask how it came to be, why it matters)
self-adaptive LLMs – Sakana.ai
- two-pass inference for dynamic adaptation
  - first pass: task identification via prompt engineering, a classifier, or few-shot examples to select/compose expert vectors
  - second pass: adjust model weights by scaling singular values (SVD) using task-specific expert vectors
- singular value fine-tuning (SVF)
  - decompose weight matrices into UDV^T and train expert vectors via RL to scale the singular values (D)
  - full-rank adaptation: modifies all singular values (vs. low-rank updates in LoRA)
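
a minimal sketch of the two passes above in plain python, not Sakana's actual code. everything named here is an assumption for illustration: the keyword-based `pick_expert` stands in for the first pass, the toy 2x2 weight is built directly from known factors U, D, V (a real model would get them from an SVD, e.g. `numpy.linalg.svd`), and the expert vectors are hand-picked rather than trained with RL.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def rotation(theta):
    """2x2 rotation matrix, used here as a simple orthogonal factor."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Assumed factors of a toy weight matrix W = U diag(D) V^T.
U = rotation(0.3)
V = rotation(-1.1)
D = [2.0, 0.5]

# Hypothetical expert vectors: one scale per singular value.
EXPERTS = {
    "math": [1.5, 1.0],
    "code": [1.0, 0.2],
    "other": [1.0, 1.0],  # identity: leaves the weights unchanged
}

def pick_expert(prompt):
    """First pass (toy): keyword dispatch standing in for a classifier."""
    text = prompt.lower()
    if "solve" in text or "integral" in text:
        return "math"
    if "function" in text or "bug" in text:
        return "code"
    return "other"

def adapt_weights(expert_name):
    """Second pass: rescale every singular value, then rebuild W."""
    z = EXPERTS[expert_name]
    scaled = [d * s for d, s in zip(D, z)]
    return matmul(matmul(U, diag(scaled)), transpose(V))

W_base = adapt_weights("other")
W_math = adapt_weights(pick_expert("solve this integral"))
```

note that the expert vector has one entry per singular value, so it touches the full rank of the matrix while staying tiny compared to the matrix itself.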
pitfalls when building genai apps - Chip
- use genai when you don't need genai
- confuse bad product with bad AI (AI is the easy part, product is the hard part)
- start too complex (don't start with new frameworks and finetuning)
- over-index on early success (initial success is misleading, demo -> production is difficult)
- forgo human eval (AI judges should be validated and correlated with systematic human eval)
- crowdsource use cases (have a big-picture strat to maximize return on investment)
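
one way to act on the human-eval point: score the same outputs with humans and with the AI judge, then check how well the two agree. a minimal sketch, assuming made-up 1 to 5 ratings and plain Pearson correlation as the agreement measure (the choice of metric is an assumption, not something from the source).

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 items)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical ratings for the same six model outputs.
human_scores = [5, 4, 2, 1, 3, 4]
judge_scores = [4, 4, 3, 1, 2, 5]

agreement = pearson(human_scores, judge_scores)
```

a low `agreement` would suggest the judge prompt needs work before its scores are trusted.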
LLMs are good at k-order thinking
- don't start by telling an LLM to "give me a good idea". start with hunches, and use LLMs to flesh out and validate
How does batching work on modern GPUs
- batching should theoretically take more time, but on modern GPUs it's often "free" up to a certain batch size
- the efficiency comes from GPU concurrency and memory handling. GPUs load matrix blocks into small shared memory units and reuse them across the batch
- performance is governed by two main bottlenecks
  - memory bandwidth when transferring weights
  - GPU FLOPS (floating-point operations per second) for calculations
- the optimal batch size depends on the ratio between GPU FLOPS and memory bandwidth, e.g. on a TF GPU, performance starts degrading around batch size 128
- batching affects different architectures differently
  - MLPs and transformers benefit from batching (kv cache)
  - CNNs benefit less because they already reuse weights
  - MoE models work similarly to transformers but require careful handling of batch splitting across experts
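
a back-of-envelope sketch of the FLOPS vs memory bandwidth point above. it assumes a very simple model: each step costs whichever is slower, loading the weights once or doing 2 floating-point operations per parameter per batch item. the hardware numbers are made up for illustration and don't describe any real GPU.

```python
def step_time(batch_size, n_params, flops, bandwidth, bytes_per_param=2):
    """Seconds per forward step under a simple max(memory, compute) model."""
    memory_time = n_params * bytes_per_param / bandwidth  # load weights once
    compute_time = 2 * n_params * batch_size / flops      # multiply + add
    return max(memory_time, compute_time)

def break_even_batch(flops, bandwidth, bytes_per_param=2):
    """Batch size at which compute time catches up with memory time."""
    return flops * bytes_per_param / (2 * bandwidth)

# Hypothetical hardware: 100 TFLOPS, 1 TB/s, fp16 weights (2 bytes each).
FLOPS = 100e12
BANDWIDTH = 1e12
N_PARAMS = 1_000_000_000

crossover = break_even_batch(FLOPS, BANDWIDTH)
times = {b: step_time(b, N_PARAMS, FLOPS, BANDWIDTH) for b in (1, 32, 100, 400)}
```

under these assumptions the step time stays flat while the batch is below `crossover` (batching is "free"), then grows linearly once compute becomes the bottleneck.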
vision language models
- two big decisions in training VLM
  - 1) how to combine image/text inputs
  - 2) whether or not to train the visual and language features separately
- standard recipe right now
  - use ViT for image encoder, initialized from large open source model (SigLip or CLIP)
  - use pretrained LLM as base model
  - finetune the combined model
- how image/text data is fed into the model doesn't seem to matter much: Qwen and DeepSeek do well with image features simply concatenated to the text embeddings ('early fusion')
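
a toy sketch of what 'early fusion' means in terms of shapes, in plain python. the projection matrix that maps image features into the text embedding size is an assumption added so the two can be concatenated; all the numbers are made up and no real model is involved.

```python
def project(features, weight):
    """Map each image feature vector into the text embedding dimension."""
    return [[sum(f * w for f, w in zip(vec, col)) for col in zip(*weight)]
            for vec in features]

def early_fusion(image_features, text_embeddings, projection):
    """Prepend projected image tokens to the text tokens as one sequence."""
    image_tokens = project(image_features, projection)
    if image_tokens and text_embeddings:
        if len(image_tokens[0]) != len(text_embeddings[0]):
            raise ValueError("image and text tokens must share an embedding size")
    return image_tokens + text_embeddings

# Toy inputs: 2 image patches (dim 3), 3 text tokens (dim 2).
image_features = [[1.0, 0.0, 2.0], [0.5, 1.0, 0.0]]
projection = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # hypothetical 3 -> 2 map
text_embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

sequence = early_fusion(image_features, text_embeddings, projection)
```

the language model then sees `sequence` as one list of tokens, with no separate pathway for the image.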

BENEDICT NEO 梁耀恩
