Dear Miners,
I am a total noob in the field of data mining. I have spent the past few days googling definitions and possibilities, and here is my project (and my question).
As part of a "digital humanities" research project, I would like to analyze an electronic mailing list archive in three ways:
1) a text mining project: twenty years of email exchange may hold valuable information. I have no doubt that RapidMiner can do that.
2) an "evolution of topics over time" project: could the "time series extension" (or another RapidMiner function) be used to extract information and plot it over time?
3) a "social network" analysis: in the same way that Twitter or Facebook networks are mapped, would it be possible to show the relations between participants in the mailing list?
Is an electronic mailing list an easy (and known) corpus to extract information from for the three aspects mentioned above? (I could not find anything on Google about mining mailing lists with RapidMiner.)
My target mailing list is available as an archive on the web (either as downloadable text files or as HTML pages):
http://www.ccl.net/chemistry/resources/messages/index.shtml
Is processing this kind of corpus easy? Has it already been done elsewhere?
Thanks for your comments, and apologies for my English,
Alexandre Hocquet