Hi everybody,
I have a question about whether it is feasible to obtain something like partial derivatives at certain data quantiles with RapidMiner. Before I explain my idea and my problem, here is some brief background on my data.
I am currently doing an analysis of aggregated data on structural change in German agriculture. Using aggregate data has some drawbacks, e.g. the cause-effect relations actually exist only at the individual level. Therefore, at the aggregate level, a closed theoretical model (especially regarding the functional form of the relation) between the dependent and independent variables is not available. Furthermore, some information at the aggregate level can at best be regarded as a rough proxy for the factor influencing the individual decision.
The linear and nested linear regressions show that for some variables the relation between the independent and the dependent variable is clearly non-linear and may even show some breaks.
My idea for the analysis is the following:
a) take the data set and remove the outliers, or at least the most dramatic ones, based on an indicator such as Cook's D.
b) conduct a non-parametric regression using either an SVM or a nearest-neighbor approach (which looks to me like the closest equivalent to what is generally referred to as kernel-based regression, in the sense of http://en.wikipedia.org/wiki/Kernel_regression).
c) get the information on the partial derivatives (first order would be sufficient) across the range of the variable.
d) investigate these derivatives for marked non-linearities
a) and b) are quite straightforward, but is it possible to do c) and d) in RapidMiner, and if so, how?
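To make c) and d) more concrete, here is a rough sketch in plain Python (not RapidMiner) of what I have in mind. It is only an illustration under assumptions: I use a Gaussian Nadaraya-Watson kernel regression as a stand-in for the SVM or nearest-neighbor model, approximate the first derivative by central finite differences on the fitted curve, and flag grid points where the slope jumps. The function names, the bandwidth, the step size, the threshold and the synthetic data with a break at x = 5 are all made up for this example.

```python
import math


def kernel_regression(x_train, y_train, x0, bandwidth):
    """Nadaraya-Watson estimate at x0 using a Gaussian kernel (assumed stand-in model)."""
    weights = [math.exp(-0.5 * ((x0 - xi) / bandwidth) ** 2) for xi in x_train]
    total = sum(weights)
    return sum(w * yi for w, yi in zip(weights, y_train)) / total


def derivative_curve(x_train, y_train, grid, bandwidth, step=1e-3):
    """Step c): central finite difference of the fitted curve at each grid point."""
    return [
        (kernel_regression(x_train, y_train, g + step, bandwidth)
         - kernel_regression(x_train, y_train, g - step, bandwidth)) / (2 * step)
        for g in grid
    ]


def flag_slope_changes(grid, slopes, threshold):
    """Step d): grid points where the slope differs from the previous one by more than threshold."""
    return [
        grid[i]
        for i in range(1, len(slopes))
        if abs(slopes[i] - slopes[i - 1]) > threshold
    ]


# Synthetic, noise-free example data (hypothetical): slope 1 below x = 5, slope 3 above.
x_train = [i * 0.1 for i in range(101)]
y_train = [x if x < 5 else 5 + 3 * (x - 5) for x in x_train]

grid = [1.0 + 0.5 * k for k in range(17)]  # evaluation points 1.0, 1.5, ..., 9.0
slopes = derivative_curve(x_train, y_train, grid, bandwidth=0.3)
flagged = flag_slope_changes(grid, slopes, threshold=0.5)

for g, s in zip(grid, slopes):
    print(f"x = {g:4.1f}  estimated slope = {s:6.3f}")
print("slope changes flagged near:", flagged)
```

With several explanatory variables, the same finite difference would be taken in one variable at a time while the others are held fixed (for example at their medians or at chosen quantiles), which is what I mean by partial derivatives for certain quantiles.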
Best, Norbert