Using mod output to isolate variables
iason
New Altair Community Member
I am not sure I am using the correct terms here, so I will try to be descriptive.
I want to isolate (or control for) selected variables. That is, from a dataset (x,y,z,result) I want to create a model and then plot (x,result) with y,z held fixed. The final plot will be extrapolated to a wider range of x values.
I am at the point of having the model created (I used linear regression for testing purposes). The remaining steps, which I can't find out how to perform, are to create the appropriate dataset and apply the model.
Is there a data generator suitable for that or should I manually create the tables like (1,0,0),(2,0,0),(3,0,0),(4,0,0)... ?
After data is entered, or generated, how do I use the "mod" output to predict the result value?
Finally, am I using a totally wrong approach for the task I am trying to achieve? Is there a better way to visualize that kind of dependency than this one?
Thank you all in advance.
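One way to picture the two missing steps, outside RapidMiner: build a table in which only x varies while y and z are held at chosen values, then evaluate the fitted model on each row. The following is a minimal Python sketch, assuming a linear model whose coefficients are already known. The function names and the coefficient values are hypothetical and only there for illustration.

```python
def make_grid(x_values, y_fixed, z_fixed):
    """Return rows (x, y, z) in which only x varies."""
    return [(x, y_fixed, z_fixed) for x in x_values]


def predict_linear(row, coefs, intercept):
    """Evaluate result = intercept + sum(coef_i * value_i) for one row."""
    return intercept + sum(c * v for c, v in zip(coefs, row))


# Hypothetical coefficients for (x, y, z); a trained model would supply these.
coefs = (2.0, -1.0, 0.5)
intercept = 10.0

# Rows (1,0,0), (2,0,0), (3,0,0), (4,0,0) as in the question.
grid = make_grid(range(1, 5), y_fixed=0, z_fixed=0)

# Pairs (x, predicted result), ready for an x-vs-result plot.
curve = [(row[0], predict_linear(row, coefs, intercept)) for row in grid]
```

Widening the range passed to `make_grid` gives the extrapolated plot. In RapidMiner, a table like `grid` would presumably be the unlabelled example set passed to Apply Model together with the "mod" output of the learner.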
Answers
Hey,
Can you post a few rows of example data?
Best regards,
Wessel
Sure. Here are a few lines.
I mostly need to visualize mes=f(res), for a given set (t,r).
Physical modeling of the problem says I should expect mes=a*res^2+b*res+c, but given the effect of t and the fact that the orders of magnitude are so different, this is quite difficult. The values of a and b are not independent of t and r.
I thought of getting a number of examples, large enough to have a lot of cases with the same t,r but that seems impossible.
t;r;res;mes
264;109,68;0,030;29441,9
298;95,07;0,198;31200,2
322;92,27;0,782;41563,1
476;101,09;0,152;51838,0
181;109,53;0,454;24379,3
496;108,89;0,497;67559,6
246;103,28;0,719;34732,9
247;101,86;0,946;37258,2
239;108,7;0,536;33074,8
33;97,8;0,883;4889,7
436;104,02;0,420;54985,0
370;97,12;0,901;52325,9
155;100,89;0,446;19224,7
367;94,5;0,914;50789,2
291;102,38;0,537;37936,9
147;99,8;0,321;17075,6
490;104,62;0,254;57837,4
230;107,42;0,197;27214,5
Hey,
I'm sure I'm missing something.
I generated an attribute res^2 and ran linear regression.
Then I made a scatter plot.
I used 0-1 normalization (range transformation) so that it all fits on one plot.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Process">
<process expanded="true" height="409" width="840">
<operator activated="true" class="retrieve" compatibility="5.1.008" expanded="true" height="60" name="Retrieve" width="90" x="153" y="146">
<parameter key="repository_entry" value="//RS/A"/>
</operator>
<operator activated="true" class="set_role" compatibility="5.1.008" expanded="true" height="76" name="Set Role" width="90" x="179" y="30">
<parameter key="name" value="mes"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.1.008" expanded="true" height="76" name="Generate Attributes" width="90" x="313" y="30">
<list key="function_descriptions">
<parameter key="res^2" value="res^2"/>
</list>
</operator>
<operator activated="true" class="linear_regression" compatibility="5.1.008" expanded="true" height="94" name="Linear Regression" width="90" x="447" y="30">
<parameter key="feature_selection" value="none"/>
</operator>
<operator activated="true" class="apply_model" compatibility="5.1.008" expanded="true" height="76" name="Apply Model" width="90" x="581" y="120">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="normalize" compatibility="5.1.008" expanded="true" height="94" name="Normalize" width="90" x="715" y="30">
<parameter key="include_special_attributes" value="true"/>
<parameter key="method" value="range transformation"/>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_op="Linear Regression" to_port="training set"/>
<connect from_op="Linear Regression" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Linear Regression" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Normalize" to_port="example set input"/>
<connect from_op="Apply Model" from_port="model" to_port="result 1"/>
<connect from_op="Normalize" from_port="example set output" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="162"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
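For readers outside RapidMiner, the same steps (add a res^2 column, fit a linear regression on t, r, res and res^2, then predict) can be sketched in Python with NumPy. This is a rough equivalent under stated assumptions, not a translation of the process XML: the function names are made up, ordinary least squares stands in for the Linear Regression operator, and the normalization step is left out. The rows are the ones posted earlier in the thread, with decimal commas converted to points.

```python
import numpy as np

# (t, r, res, mes) rows as posted in the thread.
rows = [
    (264, 109.68, 0.030, 29441.9), (298, 95.07, 0.198, 31200.2),
    (322, 92.27, 0.782, 41563.1), (476, 101.09, 0.152, 51838.0),
    (181, 109.53, 0.454, 24379.3), (496, 108.89, 0.497, 67559.6),
    (246, 103.28, 0.719, 34732.9), (247, 101.86, 0.946, 37258.2),
    (239, 108.7, 0.536, 33074.8), (33, 97.8, 0.883, 4889.7),
    (436, 104.02, 0.420, 54985.0), (370, 97.12, 0.901, 52325.9),
    (155, 100.89, 0.446, 19224.7), (367, 94.5, 0.914, 50789.2),
    (291, 102.38, 0.537, 37936.9), (147, 99.8, 0.321, 17075.6),
    (490, 104.62, 0.254, 57837.4), (230, 107.42, 0.197, 27214.5),
]


def design_matrix(t, r, res):
    """Columns: intercept, t, r, res, res^2."""
    t, r, res = (np.asarray(v, dtype=float) for v in (t, r, res))
    return np.column_stack([np.ones_like(res), t, r, res, res ** 2])


def fit(t, r, res, mes):
    """Ordinary least squares; returns the five coefficients."""
    X = design_matrix(t, r, res)
    coefs, *_ = np.linalg.lstsq(X, np.asarray(mes, dtype=float), rcond=None)
    return coefs


def predict(coefs, t, r, res):
    """Model output for the given inputs."""
    return design_matrix(t, r, res) @ coefs


t, r, res, mes = np.array(rows).T
coefs = fit(t, r, res, mes)
fitted = predict(coefs, t, r, res)
```

Because t and r enter only as additive terms here, this fit lets them shift the mes-vs-res curve up or down but not change its shape.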
Thank you again for replying. Your help is very much appreciated.
The problem is that with mes=f(res)=a*res^2 + b*res + c the values of a,b,c are not independent of t,r.
Doing that kind of regression would only account for c=g(t,r).
What I want to do is find a and b, given the values of t,r.
To put it in a proper form, the function is:
mes(res, t, r) = a(t,r)*res^2 + b(t,r)*res + c(t,r)
The quest is to find a(t,r), b(t,r) and c(t,r).
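One possible way to work with randomly scattered (t,r) values is to assume a simple form for the three coefficient functions and fit everything in one regression. The sketch below assumes, purely for illustration, that a, b and c are each linear in t and r (for example a(t,r) = a0 + a1*t + a2*r). Nothing in the thread confirms that form. With that assumption the model is still linear in its nine parameters, so ordinary least squares applies. All names are hypothetical and the demo data is synthetic, not the data posted above.

```python
import numpy as np


def design_matrix(t, r, res):
    """Nine columns: each of (res^2, res, 1) multiplied by each of (1, t, r)."""
    t, r, res = (np.asarray(v, dtype=float) for v in (t, r, res))
    base = [np.ones_like(res), t, r]
    powers = [res ** 2, res, np.ones_like(res)]
    return np.column_stack([g * p for p in powers for g in base])


def fit(t, r, res, mes):
    """Least-squares fit; rows of the result are a, b, c,
    columns are (constant, t-slope, r-slope)."""
    X = design_matrix(t, r, res)
    theta, *_ = np.linalg.lstsq(X, np.asarray(mes, dtype=float), rcond=None)
    return theta.reshape(3, 3)


def coefficients_at(theta, t, r):
    """Return (a, b, c) for one fixed pair (t, r)."""
    return theta @ np.array([1.0, t, r])


# Synthetic demo with made-up "true" coefficients and no noise.
rng = np.random.default_rng(0)
t = rng.uniform(0, 5, 200)
r = rng.uniform(0, 5, 200)
res = rng.uniform(0, 1, 200)
true = np.array([[1.0, 0.5, -0.2], [2.0, -0.3, 0.4], [3.0, 0.1, 0.6]])
a, b, c = true @ np.vstack([np.ones_like(t), t, r])
mes = a * res ** 2 + b * res + c

theta = fit(t, r, res, mes)
a_fix, b_fix, c_fix = coefficients_at(theta, 2.0, 1.0)
```

With `a_fix`, `b_fix` and `c_fix` in hand, the curve mes = a*res^2 + b*res + c can be drawn for that (t,r) pair. Nine parameters need far more than a handful of rows to be estimated reliably, and since t, r and res differ in magnitude, rescaling the inputs before fitting may help.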
And what is the form of a(t,r), b(t,r) and c(t,r)?
This does not seem like a problem suitable for RapidMiner.
You can use a fuzzy neural network, or a genetic algorithm to solve this problem.
But you will have to write your own Java code.
Best regards,
Wessel
Actually the exact form of a(r,t), b(r,t), c(r,t) is not known. But it is not needed.
A visual representation of mes vs res for 4-5 pairs of (r,t) would be enough.
Still, is it reasonable to ask for enough data values with the same t and r? Gathering 100 examples for each pair will take around 8 months.
Then I could train the model 5 times and get the required 5 sets of values for a,b,c.
I was hoping I could find a workaround and work with randomly collected values but it seems quite difficult.
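The "train the model 5 times" idea amounts to grouping the rows by (t,r) and fitting a separate quadratic in res for each group. A minimal sketch, assuming NumPy is available and that each group has at least three distinct res values; the function name and the two demo groups are made up for illustration.

```python
from collections import defaultdict

import numpy as np


def fit_per_pair(rows):
    """rows: iterable of (t, r, res, mes). Returns {(t, r): (a, b, c)}."""
    groups = defaultdict(list)
    for t, r, res, mes in rows:
        groups[(t, r)].append((res, mes))
    fits = {}
    for key, points in groups.items():
        res, mes = np.array(points, dtype=float).T
        # np.polyfit returns the highest power first: a, b, c.
        fits[key] = tuple(np.polyfit(res, mes, 2))
    return fits


# Synthetic demo: two (t, r) pairs with known, made-up coefficients.
xs = np.linspace(0.0, 1.0, 6)
demo_rows = [(100, 95.0, x, 2 * x ** 2 + 3 * x + 1) for x in xs]
demo_rows += [(200, 105.0, x, 4 * x ** 2 - x + 5) for x in xs]

fits = fit_per_pair(demo_rows)
```

Each entry of `fits` gives one curve to plot. Pairs are matched by exact equality here, so measured t and r values would need rounding or binning before they can be grouped.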