How to Win Kaggle Competitions

BENEDICT NEO 梁耀恩

November 10, 2023

Christof Henkel, currently ranked #2 on Kaggle, shares in an interview how to succeed in a competition.

His track record: first place in 7 out of 75 challenges.

How to pick a challenge

First you have to decide what challenge to work on.

Success is positively correlated with how much time you spend on the challenge.

Pick a challenge that motivates you, not the one that seems easiest.

Also, investigate the data of a challenge before deciding on it.

Once you have one in mind, here is his 3-step approach.

1. Create an end-to-end pipeline

Create a very simple pipeline that reads in the data, creates features, trains a (simple) model, and computes the competition-specific metric.
Make your validation setup replicate the leaderboard evaluation as closely as possible.
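Here is a minimal sketch of what such a pipeline could look like in plain Python. The toy CSV, the single `area` feature, the one-variable linear model, and RMSE as the metric are all assumptions for illustration; a real competition defines its own data and metric.

```python
import csv
import io
import math

# Hypothetical toy data standing in for a competition's training file.
RAW_CSV = """area,rooms,price
50,2,150
80,3,240
65,2,200
120,4,360
95,3,280
40,1,120
"""


def read_data(text):
    """Step 1: read rows into dicts of floats."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: float(value) for key, value in row.items()} for row in reader]


def make_feature(row):
    """Step 2: create features (here just one, chosen arbitrarily)."""
    return row["area"]


def fit_line(xs, ys):
    """Step 3: train a simple model, least squares for y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(y_true, y_pred):
    """Step 4: compute the metric (RMSE is an assumed stand-in)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def run_pipeline(text, n_valid=2):
    """Run all four steps and return the score on a held-out validation split."""
    rows = read_data(text)
    train, valid = rows[:-n_valid], rows[-n_valid:]
    slope, intercept = fit_line(
        [make_feature(r) for r in train], [r["price"] for r in train]
    )
    preds = [slope * make_feature(r) + intercept for r in valid]
    return rmse([r["price"] for r in valid], preds)


score = run_pipeline(RAW_CSV)
print(f"validation RMSE: {score:.2f}")
```

The point is not the model but the loop: once `run_pipeline` returns a single number, every later idea can be judged by whether it improves that number.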

2. Experiment and research

start with a simple model and iterate through many ideas
read research papers, check other competitions
look at the data, consider external data, reduce noise, augment the data
use different losses
post-process the predictions
more experimentation = better; evaluate every experiment with your validation setup
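As a rough sketch of running several ideas through the same validation setup, the example below compares two constant predictors under 3-fold cross-validation. The data, the fold scheme, the MAE metric, and the two "ideas" (fitting with a squared-error loss versus an absolute-error loss) are assumptions chosen to keep the example small.

```python
import statistics


def k_fold_indices(n, k):
    """Yield (train_idx, valid_idx) pairs for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        valid_idx = list(range(start, stop))
        train_idx = [i for i in range(n) if i < start or i >= stop]
        yield train_idx, valid_idx
        start = stop


def mae(y_true, y_pred):
    """Mean absolute error, the assumed competition metric."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Each idea maps the training targets to one constant prediction.
# The mean minimizes squared error; the median minimizes absolute error.
IDEAS = {
    "mean (squared-error loss)": statistics.mean,
    "median (absolute-error loss)": statistics.median,
}


def evaluate(idea, targets, k=3):
    """Score one idea with the shared validation setup."""
    fold_scores = []
    for train_idx, valid_idx in k_fold_indices(len(targets), k):
        prediction = idea([targets[i] for i in train_idx])
        valid = [targets[i] for i in valid_idx]
        fold_scores.append(mae(valid, [prediction] * len(valid)))
    return statistics.mean(fold_scores)


# Hypothetical targets with one outlier.
targets = [10, 12, 11, 13, 95, 12, 11, 10, 13]

results = {name: evaluate(fn, targets) for name, fn in IDEAS.items()}
best = min(results, key=results.get)
for name, score in results.items():
    print(f"{name}: MAE = {score:.2f}")
print("best idea:", best)
```

Because every idea goes through the same `evaluate` function, the scores are directly comparable, which is what makes fast iteration possible.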

3. Scale your approach

now you've converged on a final model, and are no longer experimenting
it's time to use all of the data and "scale up" the model
this means hyperparameter tuning, using a deeper model, etc.
this happens in the last 2-3 weeks of the competition
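A small sketch of this last phase, under heavy simplification: tune one hyperparameter on a held-out split, then refit on all of the data with the chosen value. The one-dimensional ridge model, the toy numbers, and the grid of `lam` values are assumptions for illustration only.

```python
def fit_ridge(xs, ys, lam):
    """1-D ridge regression through the origin: w = sum(x*y) / (sum(x*x) + lam)."""
    numerator = sum(x * y for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs) + lam
    return numerator / denominator


def mse(xs, ys, w):
    """Mean squared error of the predictions w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data, roughly y = 2 * x with noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

# Hold out the last two points for validation while tuning.
train_x, valid_x = xs[:4], xs[4:]
train_y, valid_y = ys[:4], ys[4:]

# Hyperparameter tuning: try each regularization strength on the validation split.
grid = [0.0, 0.1, 1.0, 10.0]
scores = {
    lam: mse(valid_x, valid_y, fit_ridge(train_x, train_y, lam)) for lam in grid
}
best_lam = min(scores, key=scores.get)

# Scale up: refit on all of the data with the chosen hyperparameter.
final_w = fit_ridge(xs, ys, best_lam)
print("best lam:", best_lam, "final weight:", round(final_w, 3))
```

Note that the final refit has no held-out score of its own, which is why the experimentation phase needs to be finished before this step.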

Summary

Pick something that motivates you. First goal: create a feedback loop that allows you to experiment. Then experiment. Optimize at the end.

For toolkits for challenges, check out this winning toolkit article.
