Recent Discussions

The most recent content from our members.

TF-IDF calculation
I'm trying to understand how the "Generate TF-IDF" operator calculates TF-IDF values. Please let me know the formulae RapidMiner uses for its TF and IDF calculations. Unfortunately, the old RapidMiner community page no longer seems to be valid.
HELP: FFT CHART OF A FINANCIAL FUTURE MOVING AVERAGE
Dear all, I am a complete newbie to RapidMiner. I need to extract the dominant cycles (frequency peaks) from a time series of a financial instrument, in this example the S&P future, with candlesticks on a 1-minute timeframe. I have previously calculated the average price value per minute, (high+low)/2, and then the moving average on 10…
Regarding Semantic Analysis
Can someone tell me how to implement or add semantic analysis to my process? I mean that I need to compare not only similar words but also whether two words in a single document have the same meaning.
Select column with non-zero value
Hi everybody! I've calculated TF-IDF with "Process Documents from Data" and got a matrix that has a word in every column and a document body in every row, with each cell containing the TF-IDF value. Now I filter by cluster, created with k-means, and I want to see only the columns with non-zero values. I firstly thought…
interpreting the sum of TF-IDF scores of words across documents
Hi guys! After clustering a list of documents with k-means, I would like to analyze the words in each cluster (in order to correlate them with other attributes). To do this I added up the TF-IDF values for each word, but I think this solution may be wrong. Would it be more correct to use term…
Calculate Cosine Similarity based on SVD
Dear Community, is it technically or rather mathematically possible to calculate the Cosine Similarity measure based on results derived by SVD Feature Extraction? Or does the distance metric only operate on measures like TF-IDF? Thank you in advance for your help!
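On the mathematical side of this question, cosine similarity is defined for any pair of non-zero real vectors of equal length, so nothing in the formula ties it to TF-IDF weights. The following is a minimal sketch in plain Python, not RapidMiner's implementation; the function name and the two example vectors are hypothetical stand-ins for document coordinates produced by an SVD reduction.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length real vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional document coordinates after an SVD reduction
# (assumed values for illustration, not taken from any real process).
doc_a = [0.8, -0.1, 0.3]
doc_b = [0.7, 0.2, -0.4]
similarity = cosine_similarity(doc_a, doc_b)
```

One difference worth keeping in mind: SVD coordinates can be negative, so the result can fall anywhere in [-1, 1], whereas non-negative TF-IDF vectors always give a value in [0, 1].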
Classification SubProcess for Feature Selection
Hello everyone, do you have a recommendation as to which classification model should be used within Feature Selection (e.g. Optimize Selection or Backward Elimination) to efficiently select attributes, or rather dimensions, from a high-dimensional TF-IDF matrix? Thank you in advance for…
Count word attribute Negative and Positive
I have a word vector with the attributes Negative and Positive. When I run this process, the result is incorrect. In the picture you will see that the word attribute in column six (กรุณา) has a value of two. How do I count the word attributes in the confidence(P) and confidence(N) columns?