RapidMiner Data Science Competition 4: DrivenData's "Pover-T" - $15,000 prize for charity
Hello RapidMiners,
Greetings and happy new year. First of all, huge congratulations again to @maros_plsik for winning the Fantasy Football challenge, and kudos to @florian_ziegler and @yzan for their great efforts as well. Unfortunately, those were the ONLY submissions to this challenge, so I'm going to change gears for the next one in hopes that we can get more people involved.
Setup: DrivenData is a data science competition website (similar to Kaggle) that only hosts challenges benefiting the public good. They have a new challenge called "Pover-T", sponsored by the World Bank, where the goal is to use survey data to predict whether a household is classified as "poor". It's pretty straightforward. The evaluation metric is logarithmic loss on the "poor" class predictions, and lower is better. They have standard training and testing data sets, a scoreboard, and so on. Here's the link where you can read all about it: https://www.drivendata.org/competitions/50/worldbank-poverty-prediction/page/97/
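For anyone who hasn't worked with log loss before, here is a minimal Python sketch of the standard binary version, just to show why it punishes confident wrong answers so heavily. The function name `binary_log_loss`, the clipping value `eps`, and the toy numbers are my own illustration and not anything from DrivenData; the competition page is the authority on exactly how submissions are scored.

```python
import math

def binary_log_loss(y_true, p_poor, eps=1e-15):
    """Standard binary log loss (hypothetical helper, not DrivenData's scorer).

    y_true: list of 0/1 labels, where 1 means the household is "poor".
    p_poor: list of predicted probabilities that each household is "poor".
    eps:    clipping value so that log(0) never occurs (an assumption here).
    """
    if len(y_true) != len(p_poor):
        raise ValueError("y_true and p_poor must have the same length")
    if not y_true:
        raise ValueError("need at least one prediction")

    total = 0.0
    for y, p in zip(y_true, p_poor):
        # Keep the probability strictly inside (0, 1).
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Toy example with made-up labels and predictions.
labels = [1, 0, 0, 1]
hedged = [0.7, 0.2, 0.4, 0.6]            # reasonably calibrated guesses
overconfident = [0.99, 0.01, 0.99, 0.6]  # third guess is confidently wrong

print("hedged:       ", binary_log_loss(labels, hedged))
print("overconfident:", binary_log_loss(labels, overconfident))
```

The overconfident set gets three of four households nearly perfect, yet its one confident miss makes its loss worse than the hedged set. In practice that means well-calibrated confidences matter as much as getting the class right.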
Challenge: I have formed a "RapidMiners" team on DrivenData, and I am inviting ANY and ALL RapidMiner data scientists who wish to join me. If we win (I have no doubt we will!), we all agree to donate the full $15,000 prize money to Doctors Without Borders / Médecins Sans Frontières. Why? First, because I think it's a very good cause. Second, because splitting prize money among n RapidMiners in k countries would be horribly difficult. Third, because I want to show those folks at SAS that they are not the only ones doing #data4good (look it up on Twitter).
RapidMiner processes: I worked on it for a couple of hours last night and we're already in 29th place. Let's keep trading processes in this thread and go from there. I have a folder on OneDrive that I am sharing here; probably the "right" way to do this is on GitHub but I'm a git-idiot. Anyone want to get this going?
Deadline: The "Pover-T" challenge is going on now and ends Wednesday, February 28, 2018, 11:59 p.m. UTC.
Who's in? Sign up at DrivenData, send me your username, and let's show the world our data science chops!
Scott
(image source: Wikimedia Commons)