Saturday, June 27, 2009

10% already?!

I have not made much progress in the past four months. As my "new" ideas no longer made a noticeable contribution to the final result, I stopped working on improvements and instead tried to develop some theory about the methods used in the Netflix competition. It really bugs me to fiddle with hundreds of predictors and billions of parameters without understanding exactly what I am doing and why.

Thinking is surely more enjoyable than computing the same unchanging RMSEs. And I thought I could keep doing that for a long time; after all, the 10% seemed to lie in the distant future. It's kinda funny in retrospect, because part of my theory was to measure the signal-to-noise ratio, thinking it might show that the data is simply too noisy to achieve a 10% improvement. The team "BellKor's Pragmatic Chaos" just proved how wrong I was. It's like watching a marathon runner finish the race with a 15-meter triple jump. Amazing! Kudos to them!

It will be interesting to watch what happens in the next 30 days.

7 comments:

  1. Perhaps the most interesting element of this contest is not the final algorithms, but the methodology for getting to them. It was here that I was finally getting significant traction, and then the contest accelerates to a close. Too bad. When two different models, or the same model with variations, produce similar RMSEs but show some blending merit, it seems the next necessary step is to isolate the properties of the ratings that each model excels at. This kind of analysis didn't get any treatment in the Netflix-specific publications that I saw.

    In the end, it's all one algorithm. Blending is just a clumsy way to put it together when you don't understand the pieces. I suspect there's even more room for improvement beyond 10% for the reason that blending is still such an important piece of the solution.

  2. Hi Aron,
    Exactly the same feeling I have! In essence, it's all one algorithm. Everything is about how to get there.
    I saw you have developed a 0.8818 (probe) single model. That's a VERY impressive result. If you are interested in more discussion, could you please email me at netdace at gmail dot com? Sorry, I failed to find your email.

  3. Did you think about joining the Grand Prize Team? It would be interesting to see how GPT and Dace perform together.

  4. I agree with urod. Now it makes sense for *everyone* else to join grand prize team for a cut of the prize. If GPT beats 10.05, then even the leaders will have to join, and it'll be one big party!

  5. I third the last two comments. Join GPT or one of the other teams, and reap the reward. Hurry!

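
The first comment's point about two models with similar RMSE that still blend well can be made concrete with a toy example. This is only a rough sketch on synthetic data, not anything taken from an actual Netflix Prize solution: the "heavy user" segment, the noise levels, the 50/50 blend weight, and all function names are assumptions invented for illustration.

```python
import math
import random


def rmse(preds, truth):
    """Root mean squared error between two equal-length lists."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, truth)) / len(truth))


def make_toy_data(n=2000, seed=0):
    """Synthetic ratings plus two hypothetical models with complementary errors."""
    rng = random.Random(seed)
    truth, model_a, model_b, heavy = [], [], [], []
    for i in range(n):
        r = rng.uniform(1.0, 5.0)
        is_heavy = (i % 2 == 0)  # assumed segment, e.g. users with many ratings
        # Model A is accurate on the heavy segment, model B on the other one.
        sd_a, sd_b = (0.5, 1.2) if is_heavy else (1.2, 0.5)
        truth.append(r)
        model_a.append(r + rng.gauss(0.0, sd_a))
        model_b.append(r + rng.gauss(0.0, sd_b))
        heavy.append(is_heavy)
    return truth, model_a, model_b, heavy


def blend(a, b, w=0.5):
    """Fixed-weight linear blend of two prediction lists."""
    return [w * x + (1.0 - w) * y for x, y in zip(a, b)]


def rmse_by_segment(preds, truth, flags):
    """RMSE computed separately for flagged and unflagged ratings."""
    out = {}
    for flag in (True, False):
        idx = [i for i, f in enumerate(flags) if f == flag]
        out[flag] = rmse([preds[i] for i in idx], [truth[i] for i in idx])
    return out


truth, model_a, model_b, heavy = make_toy_data()
rmse_a = rmse(model_a, truth)
rmse_b = rmse(model_b, truth)
rmse_blend = rmse(blend(model_a, model_b), truth)
seg_a = rmse_by_segment(model_a, truth, heavy)
seg_b = rmse_by_segment(model_b, truth, heavy)

print("overall RMSE  A: %.4f  B: %.4f  blend: %.4f" % (rmse_a, rmse_b, rmse_blend))
print("heavy segment A: %.4f  B: %.4f" % (seg_a[True], seg_b[True]))
print("light segment A: %.4f  B: %.4f" % (seg_a[False], seg_b[False]))
```

In this toy setup the two models look nearly identical by overall RMSE, and the blend beats both. Only the per-segment breakdown shows where each model is actually strong, which is the kind of analysis the comment asks for.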