My idea was simply to build one model per flag and predict the remaining 20,000 answers, which would give me the percentage of employees who value each flag.
1. I was wondering whether there is another approach to this, and what the advantage would be over simply drawing a sample from the 20,000 answers, computing the percentages and extrapolating them statistically, without any text-based predictive models.
2. Another valid question is what the difference is between text mining and a simple tag cloud, but that remains to be seen and I guess it depends on the individual problem. For example, a more neutral question like "What do you think about your job?" may draw both positive and negative sentiments that use the same words, but right now I'm working on a question that is biased towards receiving positive sentiments.
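
To make the sampling alternative from point 1 concrete, here is a minimal sketch of how a percentage from a hand-labelled sample could be extrapolated with an uncertainty range. It assumes a simple random sample and uses the Wilson score interval for a proportion. The function name `wilson_interval` and the numbers (500 labelled answers, 140 mentioning a flag) are made up for illustration and are not from my data.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    successes: number of sampled answers that mention the flag
    n:         number of sampled answers that were labelled
    z:         normal quantile (1.96 corresponds to roughly 95% confidence)
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Hypothetical example: 500 answers labelled by hand, 140 mention a given flag.
labelled = 500
mentions = 140
low, high = wilson_interval(mentions, labelled)
print(f"estimated share: {mentions / labelled:.1%}, "
      f"approx. 95% interval: {low:.1%} to {high:.1%}")
```

This ignores the finite population correction, so the interval is slightly wider than strictly necessary when the sample is a noticeable fraction of the 20,000 answers. A per-flag model would have to beat this kind of interval (after accounting for its own classification errors) to be worth the extra effort.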