Analyze 77000 tweets

User: "vittorio_confuo"
New Altair Community Member
Updated by Jocelyn

Dear community,

I have to work with a dataset of 77,000 tweets with the following attributes: post_id, username, hash_tag, sent_time, text, user_id, source, is_retweet, is_reply, lang, retweet_count, reply_count, latitude, longitude. I must analyze it using association rules and clustering, but I'm new to RM and I hope someone can give me advice on how to proceed.

My first problem is the free license: I can read only 10,000 rows. Is there an operator that draws a representative sample of that size from the full dataset?
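To make clear what kind of sample I mean, here is a minimal sketch of drawing it outside RM before import. It assumes the tweets are exported as a UTF-8 CSV file with a header row. The function name `sample_rows` and the default seed are my own placeholders, not anything from RM.

```python
import csv
import random


def sample_rows(path, k=10000, seed=42):
    """Return (header, rows): a uniform random sample of k data rows.

    Uses reservoir sampling, so the file is read once and only k rows
    are kept in memory. Assumes a CSV file with a header line.
    """
    rng = random.Random(seed)
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        reservoir = []
        for i, row in enumerate(reader):
            if i < k:
                # Fill the reservoir with the first k rows.
                reservoir.append(row)
            else:
                # Replace a kept row with probability k / (i + 1).
                j = rng.randint(0, i)
                if j < k:
                    reservoir[j] = row
    return header, reservoir
```

The returned rows could then be written to a new CSV file with `csv.writer` and loaded into RM in place of the full dataset. A purely random sample like this ignores attributes such as lang or is_retweet, so it may not be what is wanted if those groups need to stay balanced.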

Second problem: what kind of association rules can I use? I'm thinking of a "manual" sentiment analysis (I have seen that there is an Aylien extension, but it has limitations and it doesn't work with the Italian language): is there a way to find the most important words in the tweets in order to do a positive/negative classification?

 

Can you suggest some association rule and/or clustering algorithms that I could use? How should I interpret their results?

 

I apologize for all these questions, and I would be very grateful if someone is kind enough to help me!

Regards, 

Vittorio Confuorto
