[SOLVED] Getting TF-IDF from unpivoted data
louism
New Altair Community Member
Hi, I am trying to do text mining. I don't have the original documents, but my words are already in a database. For example:
Doc A: How are you?
Doc B: I am fine
What I have is a MySQL table like:
A How
A are
A you
B I
B am
B fine
Since I am a total newbie and rely heavily on text mining tutorials, it would probably be easier for me to go back to the document form, so I can "plug it" into what I see in most tutorials and then generate my TF-IDF word vectors after my data cleanup.
Answers
Solved this by using the GROUP_CONCAT function in MySQL to rebuild a table with one row per document, with a text field containing all of that document's words joined one after the other.
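For anyone landing here later, this is a rough sketch of the same idea in Python, in case it helps to see the regrouping and the TF-IDF step side by side. The table and column names in the comment (words, doc_id, word) are made up for illustration and are not from my real schema. The weighting is one plain variant (term count divided by document length, times log(N / df)), so a text mining tool may well produce different numbers.

```python
import math
from collections import Counter, defaultdict

# Rows as they might come out of the unpivoted table: (document id, word).
# In MySQL the regrouping could look roughly like this (hypothetical names):
#   SELECT doc_id, GROUP_CONCAT(word SEPARATOR ' ') AS text
#   FROM words GROUP BY doc_id;
rows = [
    ("A", "How"), ("A", "are"), ("A", "you"),
    ("B", "I"), ("B", "am"), ("B", "fine"),
]

def rebuild_documents(rows):
    """Group words by document, keeping row order, and join them with spaces."""
    words_by_doc = defaultdict(list)
    for doc_id, word in rows:
        words_by_doc[doc_id].append(word)
    return {doc_id: " ".join(words) for doc_id, words in words_by_doc.items()}

def tf_idf(documents):
    """Return {doc_id: {term: weight}} with tf = count / doc length, idf = log(N / df)."""
    tokens = {doc_id: text.lower().split() for doc_id, text in documents.items()}
    n_docs = len(tokens)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for terms in tokens.values():
        df.update(set(terms))
    weights = {}
    for doc_id, terms in tokens.items():
        counts = Counter(terms)
        weights[doc_id] = {
            term: (count / len(terms)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return weights

documents = rebuild_documents(rows)
weights = tf_idf(documents)
print(documents)
```

If you stay in MySQL, keep in mind that GROUP_CONCAT output is cut off at group_concat_max_len (1024 by default), so longer documents may need that setting raised. Adding an ORDER BY inside GROUP_CONCAT only matters if word order is important to you; for plain TF-IDF it is not.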