Analysis and normalization of instantaneous data

New Altair Community Member
Updated by Jocelyn
Hello friends,
I have a sensor that gives me a reading every 10 milliseconds, e.g. an x and a y value, and I have thousands of these x and y pairs. I already know clustering and classification in RapidMiner.
I would like to ask the experienced members here:
What do you suggest for this data?
How can I predict x and y?
And how should I analyze the data?
Please help me.
Thanks to the very good RapidMiner.
Hello,
Thank you so much for your help.
My data is arranged with time in one column, x in one column, and y in one column,
like this:
time    x    y
----    --   --
21      45   8
35      52   12
35      52   12
Now I do not know how to normalize this data.
Can I predict the values of x and y at a later time?
And how should I analyze the data?
If anyone has experience,
please help.
Thank you.
Take a look at the new Time Series operators; they are part of the standard Studio operator set.
There is an operator for Normalizing time series data. There are also operators for forecasting time series data such as ARIMA or Holt-Winters. I would probably start with ARIMA.
Take a look at the ARIMA examples, specifically the ARIMA model for Lake Huron. To better understand ARIMA, do a search on Rob Hyndman; he wrote the forecast package for R, and there are a lot of examples that you could duplicate in RapidMiner. You will also have to understand normalization and what it means for your time series to be stationary. Don't take this part lightly, as it can make or break your forecast.
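For orientation only, here is a minimal sketch of the same idea outside RapidMiner, in Python with pandas and statsmodels (my assumption, not part of any process posted here): fit an ARIMA model to a univariate series, with differencing handling the trend, and forecast a few steps ahead.

# Minimal ARIMA sketch (Python, statsmodels); illustrative only, not the
# RapidMiner ARIMA operator. The series values are made up.
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Suppose the sensor readings for one attribute, e.g. x, are loaded as a series.
series = pd.Series([45.0, 52, 52, 47, 49, 51, 50, 48, 53, 52], name="x")

# ARIMA(1, 1, 1): one autoregressive term, first-order differencing to
# remove a trend (part of making the series stationary), one moving-average term.
fitted = ARIMA(series, order=(1, 1, 1)).fit()

# Forecast the next 5 values of x.
print(fitted.forecast(steps=5))

The same fit could be made per attribute (x and y separately), or with a multivariate method if the two are related.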
Hello,
Thanks for your help.
I searched a lot about time series,
but it is still unclear to me.
My data is as below.


I do not know how to normalize the data in RapidMiner, or whether it needs normalization at all.
How do I stack the series?
How do I use ARIMA so that I can predict the x and y values at a later time?
Please help me.
Thank you.
Best regards

New Altair Community Member
Updated by hughesfleming68
Did you look at the operators to see what they do? @student_compute, I have read a lot of your posts and you seem quite lost. Unfortunately, there are no shortcuts. You have to put in the time to learn the material. Is this for school? There are already standard ARIMA examples. It would be helpful if you could be more specific about what exactly you are having difficulty with. Do you understand what normalization means? Do you understand why a time series might need to be de-trended? When your question is so broad, it is hard to figure out where to begin. Post a process. That is the best way to get help. It is much quicker to solve problems that way.
Hello student_compute,
Take 15 minutes to watch this training: How to normalize data in RapidMiner by Markus Hofmann.
I also enclose an example given some months ago in the RM Forum:
<?xml version="1.0" encoding="UTF-8"?><process version="9.1.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="9.1.000" expanded="true" name="Process">
<parameter key="logverbosity" value="init"/>
<parameter key="random_seed" value="2001"/>
<parameter key="send_mail" value="never"/>
<parameter key="notification_email" value=""/>
<parameter key="process_duration_for_mail" value="30"/>
<parameter key="encoding" value="SYSTEM"/>
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="9.1.000" expanded="true" height="68" name="Retrieve Sonar" width="90" x="45" y="85">
<parameter key="repository_entry" value="//Samples/data/Sonar"/>
</operator>
<operator activated="true" class="set_role" compatibility="9.1.000" expanded="true" height="82" name="Set Role" width="90" x="179" y="85">
<parameter key="attribute_name" value="class"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="concurrency:cross_validation" compatibility="9.1.000" expanded="true" height="145" name="Cross Validation" width="90" x="380" y="85">
<parameter key="split_on_batch_attribute" value="false"/>
<parameter key="leave_one_out" value="false"/>
<parameter key="number_of_folds" value="10"/>
<parameter key="sampling_type" value="automatic"/>
<parameter key="use_local_random_seed" value="false"/>
<parameter key="local_random_seed" value="1992"/>
<parameter key="enable_parallel_execution" value="true"/>
<process expanded="true">
<operator activated="true" class="normalize" compatibility="9.1.000" expanded="true" height="103" name="Normalize" width="90" x="112" y="136">
<parameter key="return_preprocessing_model" value="false"/>
<parameter key="create_view" value="false"/>
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="numeric"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="real"/>
<parameter key="block_type" value="value_series"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_series_end"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
<parameter key="method" value="Z-transformation"/>
<parameter key="min" value="0.0"/>
<parameter key="max" value="1.0"/>
<parameter key="allow_negative_values" value="false"/>
</operator>
<operator activated="true" class="h2o:logistic_regression" compatibility="9.0.000" expanded="true" height="124" name="Logistic Regression" width="90" x="246" y="34">
<parameter key="solver" value="AUTO"/>
<parameter key="reproducible" value="false"/>
<parameter key="maximum_number_of_threads" value="4"/>
<parameter key="use_regularization" value="false"/>
<parameter key="lambda_search" value="false"/>
<parameter key="number_of_lambdas" value="0"/>
<parameter key="lambda_min_ratio" value="0.0"/>
<parameter key="early_stopping" value="true"/>
<parameter key="stopping_rounds" value="3"/>
<parameter key="stopping_tolerance" value="0.001"/>
<parameter key="standardize" value="true"/>
<parameter key="non-negative_coefficients" value="false"/>
<parameter key="add_intercept" value="true"/>
<parameter key="compute_p-values" value="true"/>
<parameter key="remove_collinear_columns" value="true"/>
<parameter key="missing_values_handling" value="MeanImputation"/>
<parameter key="max_iterations" value="0"/>
<parameter key="max_runtime_seconds" value="0"/>
</operator>
<connect from_port="training set" to_op="Normalize" to_port="example set input"/>
<connect from_op="Normalize" from_port="example set output" to_op="Logistic Regression" to_port="training set"/>
<connect from_op="Normalize" from_port="preprocessing model" to_port="through 1"/>
<connect from_op="Logistic Regression" from_port="model" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
<portSpacing port="sink_through 2" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="9.1.000" expanded="true" height="82" name="Apply Model" width="90" x="112" y="85">
<list key="application_parameters"/>
<parameter key="create_view" value="false"/>
</operator>
<operator activated="true" class="apply_model" compatibility="9.1.000" expanded="true" height="82" name="Apply Model (2)" width="90" x="246" y="34">
<list key="application_parameters"/>
<parameter key="create_view" value="false"/>
</operator>
<operator activated="true" class="performance" compatibility="9.1.000" expanded="true" height="82" name="Performance" width="90" x="380" y="34">
<parameter key="use_example_weights" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_port="through 1" to_op="Apply Model" to_port="model"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Apply Model (2)" to_port="unlabelled data"/>
<connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="performance 1"/>
<connect from_op="Performance" from_port="example set" to_port="test set results"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="source_through 2" spacing="0"/>
<portSpacing port="sink_test set results" spacing="0"/>
<portSpacing port="sink_performance 1" spacing="0"/>
<portSpacing port="sink_performance 2" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve Sonar" from_port="output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Cross Validation" to_port="example set"/>
<connect from_op="Cross Validation" from_port="model" to_port="result 3"/>
<connect from_op="Cross Validation" from_port="test result set" to_port="result 1"/>
<connect from_op="Cross Validation" from_port="performance 1" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
Good luck,
Maerkli
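As a small aside, this is roughly what the Z-transformation chosen in the Normalize operator above does, shown as a Python/pandas sketch (illustrative only; the column names are hypothetical):

# Z-transformation (standardization): subtract the column mean and divide by
# the column standard deviation, so each numeric column ends up with mean 0
# and standard deviation 1. pandas uses the sample standard deviation here;
# the idea is the same.
import pandas as pd

df = pd.DataFrame({"x": [45, 52, 52, 47, 49], "y": [8, 12, 12, 9, 10]})
normalized = (df - df.mean()) / df.std()
print(normalized)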

New Altair Community Member
Updated by hughesfleming68
In addition to the video and process kindly posted by @Maerkli, with time series data you will need to know what first-order differencing is and why you might need to use a moving average to de-trend your data. You will have to understand your data first, so plot it out and take a look.
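To make that concrete, here is a small Python/pandas sketch (illustrative only, not the RapidMiner operators; the column name is hypothetical) of first-order differencing and a moving average:

# First-order differencing replaces each value with the change from the
# previous value, which removes a linear trend; a moving average smooths
# out short-term noise so the underlying pattern is easier to see.
import pandas as pd

quality = pd.Series([20.0, 22, 25, 24, 28, 30, 29, 33])

diffed = quality.diff().dropna()             # first-order differences
smoothed = quality.rolling(window=3).mean()  # 3-point moving average

print(diffed)
print(smoothed)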
Hello to all,
Thank you very much for helping, my dear friends.

I am a beginner in time series.
I have studied the basic concepts, but it is difficult for me to carry the theory over into practice.
My data comes from a sensor whose readings are received at different times.
I want to predict new values from this data for later points in time, but I do not know where to start.

So I asked the experienced friends on this forum for help.
I will certainly try to create a process so that you can guide me.
Thank you all.
Good day.
Are you trying to predict the quality score as a function of time? If so, try looking at the data with the Time Series operators. You can plot the series and examine it using the Classic Decomposition operator or the Moving Average operator to detect patterns in the data. Then you can choose an appropriate forecast method such as Holt-Winters or ARIMA.
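For illustration of the decomposition and Holt-Winters ideas outside RapidMiner, here is a Python/statsmodels sketch under made-up assumptions (not the Classic Decomposition or Holt-Winters operators themselves):

# Decompose a series into trend, seasonal and residual parts, then fit a
# Holt-Winters (exponential smoothing) model and forecast. The series, its
# timestamps and its seasonal period are all hypothetical.
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.holtwinters import ExponentialSmoothing

quality = pd.Series(
    [20.0, 24, 22, 26, 21, 25, 23, 27, 22, 26, 24, 28, 23, 27, 25, 29],
    index=pd.date_range("2019-01-01", periods=16, freq="H"),
)

# Additive decomposition with an assumed period of 4 observations.
parts = seasonal_decompose(quality, model="additive", period=4)
print(parts.trend)

# Holt-Winters forecast of the next 4 values.
hw = ExponentialSmoothing(
    quality, trend="add", seasonal="add", seasonal_periods=4
).fit()
print(hw.forecast(4))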
Hello,
Thank you so much for your reply.

Yes, I want to analyze my data first and describe how it behaves.
Then, for future periods, I want to predict the quality and report the accuracy of the forecast, but I do not know how, or which operators I should use.
I also do not know which operators and data mining algorithms to use to analyze this kind of data.

Please, experienced friends, help me with an example.
Thank you.
@student_compute sorry but we've gone over this many times. You MUST learn how to post your XML and your data sets on this forum: https://community.rapidminer.com/discussion/37047.
Others - you are all too kind. Please note.
Scott
Yes, you are right.
This is an example of my data.
But I am sorry to say that I really do not know how to use the time series operators for analysis and forecasting. I searched the forum, but I do not know how to do it for my data.

I know this is a big request and I am asking the community to help. I tried a lot so that I could do it myself, but I did not succeed.
I ask you, dear friends, if possible, to help me once more,
and to provide an example process that uses the time series operators to analyze and predict my data.
Can I also use clustering, classification, or association rule mining? How?
Thank you.
Sorry for taking up the forum's time.
Thanks to the good RapidMiner and good friends.

hello @student_compute - ok THANK YOU for your data. That helps. It looks to me like your data is very straightforward. Hence I would next strongly recommend going through these posts and following Dr. Temme's steps:
https://community.rapidminer.com/discussion/41717/time-series-extension-release-of-the-alpha-version-0-1-2
https://community.rapidminer.com/discussion/42585/time-series-extension-features-of-version-0-1-2
note that the Time Series operators are no longer an extension; they are part of the core.
Scott
cc @eackley29
Hello,
Thanks for your help.
I looked at the links,
but some questions came up for me:
With this data, can I predict the next value of quality at a later time using time series methods?
Is clustering also possible?
In the links you introduced, I did not see a sample XML file. Is there a sample XML process for me?
I really need your help.
Thank you.
Hello,
Dear friends and professors,
I hope you are well.
I read the link below:
https://community.rapidminer.com/discussion/52339/time-series-extension-features-of-version-0-1-2
and I tried hard to understand it.

But I could not get the result.
What exactly are binom, simple, and what is the purpose
That What are aic, bic, aicc values in the output of samples in the rapidminer program? Great values for them? Or small?
I know I have a lot of expectations.
But I do not know how to use the time series for their data and their future values?
Please guide me
Do you give me a useful link to know the concepts of time series and arima in rapidminer?
And that
Do you have the examples listed on this link?
https://community.rapidminer.com/discussion/52339/time-series-extension-features-of-version-0-1-2
Thank you so much.
I am waiting for your help.
Good day.

Hi @student_compute,
As the Time Series extension is now part of RapidMiner core, you can find the examples mentioned in https://community.rapidminer.com/discussion/52339/time-series-extension-features-of-version-0-1-2 directly in RapidMiner, in the Samples/Time Series folder in the repository panel (as well as some more templates showing the functionality added in later updates).
As for simple and binom, these are just the names of two different kinds of filter weights (simple = all weights the same; binom = expansion of a binomial expression; an example is given in the thread).
For AIC, BIC and AICc, please have a look at the operator help text or this Wikipedia link (https://en.wikipedia.org/wiki/Akaike_information_criterion).
For a better understanding of time series analysis in general, I would suggest this free online textbook: https://otexts.com/fpp2/ (the author is not using RapidMiner, but the concepts are explained very well).
Best regards,
Fabian
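As a rough illustration of both points, here is a Python sketch with made-up values and statsmodels assumed (not a RapidMiner process): "simple" versus binomial filter weights, and comparing AIC/BIC across candidate models, where lower is better and negative values are fine since only the comparison matters.

# 1) Filter weights: "simple" = all weights equal; "binom" = normalized
#    binomial coefficients, e.g. a 5-point binomial filter uses
#    [1, 4, 6, 4, 1] / 16.
from math import comb

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

simple_weights = [1 / 5] * 5
binom_weights = [comb(4, k) / 2 ** 4 for k in range(5)]
print(simple_weights)
print(binom_weights)

# 2) AIC/BIC: compare candidate ARIMA orders fitted to the same series and
#    prefer the lower value (a trade-off between goodness of fit and
#    model complexity).
quality = pd.Series([20.0, 22, 25, 24, 28, 30, 29, 33, 31, 35, 34, 38])
for order in [(1, 1, 0), (2, 1, 1)]:
    fit = ARIMA(quality, order=order).fit()
    print(order, "AIC:", round(fit.aic, 2), "BIC:", round(fit.bic, 2))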
Hello dear professor,
Thank you very much for your help and the links.

I can see the examples in this tutorial.
I am a beginner, and you are experienced and respected. If possible, based on the data I sent, could you send me a simple forecasting example using the time series or ARIMA operators, so I can see how you would build the process?
I am sorry for my request.
Thanks a lot.
With respect.

Hi @student_compute,
The templates (of which @hughesfleming68 posted this nice screenshot, thanks by the way) and the free textbook I linked should give you enough insight to learn how to analyse time series data and create forecasts, including for your own problem.
By the way, I am in no way a professor, but thanks ;-)
Best regards,
Fabian
Hello @student_compute
As I said I am not a professor.
Nice to hear that I could help you. If you have further problems, feel free to ask here again in the community.
Best regards,
Fabian

New Altair Community Member
OP · Updated by student_compute
Hello,
I tried hard to predict the future values of the quality variable in RapidMiner.
I created my own process based on the data I have already provided
and attached the results,
but I got confused:
I do not know which values are my prediction, and which ones are correct.
Why are some values "?" in the output?
How do I determine the best values for the ARIMA parameters?
Please guide me, friends.
I also do not understand the meaning of the charts.
Thank you.

You are making good progress @student_compute. Your forecast of quality is your prediction. You would expect it to be an extrapolation, and it is, so you are on the right track. "Quality and forecast" is a join of your input data and your forecast. The question marks just show you where your input ends and your forecast begins. This is normal.
Please read the otexts.org link. It will tell you everything that you need to know about setting values. There really is a mountain of info on the net on this subject.
Keep in mind that forecasting is as much an art as a science. It is not about having the correct forecast. It is about having the least wrong forecast.
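To illustrate what that join looks like, here is a small Python/pandas sketch with made-up values (not the actual RapidMiner output):

# The observed series only has values up to the last measured time, and the
# forecast only has values after it. Joining the two on the time index
# leaves missing values (shown as "?" in RapidMiner, NaN in pandas) on the
# side that has no data for that row.
import pandas as pd

observed = pd.Series([20.0, 22, 25, 24], index=[1, 2, 3, 4], name="quality")
forecast = pd.Series([26.1, 27.3], index=[5, 6], name="forecast of quality")

joined = pd.concat([observed, forecast], axis=1)
print(joined)  # quality is NaN for rows 5-6; the forecast is NaN for rows 1-4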

New Altair Community Member
OP · Updated by student_compute
Hello,
Thanks for your response.
Is my process right?
How do I find out which values are best for the ARIMA parameters?
Should the aic, bic, and aicc values be as low as possible? I am getting negative values; is that correct?
How can I use the optimization operator to find the optimal values for the ARIMA parameters?
And how can I use an SVM or a decision tree to predict future values of the quality variable, report the prediction accuracy, and compare the results with the ARIMA results?
Please guide me.
Thanks a lot.
Also, about the book link you mentioned: I looked at it, and it is very long and my time is short. If possible, could you give me a brief summary? I would be very grateful.

New Altair Community Member
Updated by hughesfleming68
With all due respect @student_compute, all your questions have been covered in previous posts. It is your job to study the material. It is not for us to summarize anything. If you don't have the time, I can guarantee you, no one here has the time either. I posted the link to the material on the 8th of January. You couldn't find one afternoon to read it?
What is it that you are trying to do with this data - predict some outcome?