Any Ideas?
Scotty
Hi All,
I am trying to convert the following output from
Link           cluster    able  adsl  adsl_faceplate  alarms
http://test1   cluster_2  .0    .0    .0              .0
http://test2   cluster_2  .0    .0    .0              .0
http://test3   cluster_0  .1    .0    .0              .0
http://test4   cluster_2  .0    .0    .0              .0
http://test5   cluster_1  .0    .1    .0              .0
http://test6   cluster_1  .0    .0    .0              .0
http://test7   cluster_0  .0    .0    .0              .0
http://test8   cluster_2  .0    .0    .0              .0
http://test9   cluster_1  .0    .0    .0              .0
http://test10  cluster_0  .1    .0    .0              .0
to
Link           Cluster    Word  Score
http://test1   cluster_2  able  .0
http://test2   cluster_2  able  .0
http://test3   cluster_0  able  .1
http://test4   cluster_2  able  .0
http://test5   cluster_1  able  .0
http://test6   cluster_1  able  .0
http://test7   cluster_0  able  .0
http://test8   cluster_2  able  .0
http://test9   cluster_1  able  .0
http://test10  cluster_0  able  .1
http://test1   cluster_2  adsl  .0
http://test2   cluster_2  adsl  .0
http://test3   cluster_0  adsl  .0
http://test4   cluster_2  adsl  .0
http://test5   cluster_1  adsl  .1
http://test6   cluster_1  adsl  .0
http://test7   cluster_0  adsl  .0
http://test8   cluster_2  adsl  .0
http://test9   cluster_1  adsl  .0
Any ideas how this could be done?
There are thousands of rows and columns.
Thanks
S
StaryVena
Hi,
maybe if you describe the rules used for the conversion, it will be easier to help you, because I don't see any. Look at the operators for generating attributes (Generate Attributes, Generate Aggregation, ...).
Cheers,
Vaclav
Scotty
Hi Vaclav,
Sorry, I will explain a bit more.
I use the k-means clustering operator to cluster text from a web crawl that has been pre-processed (split into tokens, stop words removed, etc.).
The cluster set result consists of 3500 examples, each giving the URL, the cluster assignment and the 8500 attributes from the text. It looks like this:
Link             cluster    able  adsl  adsl_faceplate  alarms  ...(8500)...  z
http://test1     cluster_2  .0    .0    .0              .0      ...           .0
http://test2     cluster_2  .0    .0    .0              .0      ...           .0
http://test3     cluster_0  .1    .0    .0              .0      ...           .0
http://test4     cluster_2  .0    .0    .0              .0      ...           .0
http://test5     cluster_1  .0    .1    .0              .0      ...           .0
http://test6     cluster_1  .0    .0    .0              .0      ...           .0
http://test7     cluster_0  .0    .0    .0              .0      ...           .0
http://test8     cluster_2  .0    .0    .0              .0      ...           .0
http://test9     cluster_1  .0    .0    .0              .0      ...           .0
http://test10    cluster_0  .1    .0    .0              .0      ...           .0
....
.... (3500)
....
http://test3500  cluster_0  .1    .0    .0              .0      ...           .0
I am looking to try and get the data into the following format.
Link             Cluster    Word            TF-IDF Score
http://test1     cluster_2  able            .0
http://test1     cluster_2  adsl            .0
http://test1     cluster_2  adsl_faceplate  .0
http://test1     cluster_2  alarms          .0
http://test1     cluster_2  .......         .0
http://test1     cluster_2  z               .0
http://test2     cluster_2  able            .0
http://test2     cluster_2  adsl            .0
http://test2     cluster_2  adsl_faceplate  .0
http://test2     cluster_2  alarms          .0
http://test2     cluster_2  .......         .0
http://test2     cluster_2  z               .0
http://test3     cluster_0  able            .0
http://test3     cluster_0  adsl            .0
http://test3     cluster_0  adsl_faceplate  .0
http://test3     cluster_0  alarms          .0
http://test3     cluster_0  .......         .0
http://test3     cluster_0  z               .0
....
....
http://test3500  cluster_0  able            .0
http://test3500  cluster_0  adsl            .0
http://test3500  cluster_0  adsl_faceplate  .0
http://test3500  cluster_0  alarms          .0
http://test3500  cluster_0  .......         .0
http://test3500  cluster_0  z               .0
Does this make a bit more sense?
Thanks
Scott
IngoRM
Hi,
you can use the operators "Pivot" and "De-Pivot" for tasks like this. You can find examples on myexperiment.org:
http://www.myexperiment.org/search?filter=TYPE_ID%28%2262%22%29&query=pivoting
Simply install the Community Extension for RapidMiner to access and directly download the processes uploaded there (search the forum for more information about the Community Extension).
Cheers,
Ingo
Scotty
Hi Ingo,
Thanks for the advice. Could you point me to the example that is closest to what I am trying to do? Although the examples are similar, I think the output I am after is very different.
I suspect de-pivot is somehow involved.
Many Thanks
Scott
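
For anyone who wants to see the reshape itself spelled out, the wide-to-long ("de-pivot") step Scott describes can be sketched in plain Python. This is only an illustration of the idea, not the RapidMiner De-Pivot operator: the sample rows, the `ID_COLUMNS` tuple and the `de_pivot` function are hypothetical names made up for this example, and it assumes the cluster output has already been loaded as one dictionary per URL.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical sample mirroring a few rows of the cluster output in the thread.
wide_rows = [
    {"Link": "http://test1", "cluster": "cluster_2",
     "able": 0.0, "adsl": 0.0, "adsl_faceplate": 0.0, "alarms": 0.0},
    {"Link": "http://test3", "cluster": "cluster_0",
     "able": 0.1, "adsl": 0.0, "adsl_faceplate": 0.0, "alarms": 0.0},
    {"Link": "http://test5", "cluster": "cluster_1",
     "able": 0.0, "adsl": 0.1, "adsl_faceplate": 0.0, "alarms": 0.0},
]

# Assumption: these two columns identify a row; every other column is a word.
ID_COLUMNS = ("Link", "cluster")


def de_pivot(rows: Iterable[Dict[str, object]],
             id_columns: Tuple[str, ...] = ID_COLUMNS) -> Iterator[tuple]:
    """Yield one (link, cluster, word, score) tuple per word column of each row."""
    for row in rows:
        ids = tuple(row[name] for name in id_columns)
        for column, value in row.items():
            if column not in id_columns:
                yield ids + (column, value)


long_rows = list(de_pivot(wide_rows))

# Print in the Link / Cluster / Word / Score layout requested in the thread.
for record in long_rows:
    print(*record, sep="\t")
```

With 3500 examples and 8500 word attributes, the long format has 3500 × 8500 = 29,750,000 rows, so for the real data it is probably better to consume the generator row by row (for example, writing each record straight to a file) than to build the full list as done here.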