Christof Henkel, the current #2 on Kaggle shares in an interview how to succeed in a competition.
His track record: first place in 7 out of 75 challenges
How to pick a challenge
First you have to decide what challenge to work on.
Success is positively correlated with how much time you spend on the challenge.
Pick a challenge that motivates you, not what seems the easiest.
Also, investigate the data of a challenge before deciding on it.
Once you have one in mind, here's his 3 step approach.
1. Create an End-to-end pipeline
- Create a very simple pipeline of reading in data, creating features, training a (simple) model, and computing the competition-specific metric
- replicate as closely as possible to the validation setup for leaderboard
2. Experiment and research
- start with a simple model and iterate through many ideas
- read research papers, check other comps
- look at data, even external data, reduce noise, augment the data
- use different losses
- post-process the predictions
- more experimentation = better, evaluate them with validation setup
3. Scale your approach
- now you've converged on a final model, and are no longer experimenting
- it's time to use all of the data and "scale up" the model
- this means hyperparameter turning, using a deeper model, etc.
- happens at the last 2-3 weeks of comp
Summary
Pick something that motivates you. First goal: create a feedback loop that allows you to experiment. Then experiment. Optimize in the end.
For toolkits for challenges, check out this winning toolkit article.