"Calculated Performance Measurement"

Charles54
Charles54 New Altair Community Member
edited November 5 in Community Q&A
Hello all.

First off, let me say how much I am enjoying RapidMiner. It is an amazing application. When I first started using statistics (many decades ago), no one could have imagined such a thing.

I am having trouble figuring something out. What I want to do is generate a performance measurement combining confidence(true) and an attribute from my example set: confidence(true)*RecidivismImpact. This is being done from within an X-Validation.
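To make the target measure concrete, here is a minimal sketch in plain Python (not RapidMiner). It assumes the measure is the per-fold average of confidence(true) * RecidivismImpact; the function name and the sample values are made up for illustration.

```python
from statistics import mean


def weighted_confidence(confidences, impacts):
    """Average of confidence(true) * RecidivismImpact over one test fold."""
    if len(confidences) != len(impacts):
        raise ValueError("confidences and impacts must have the same length")
    return mean(c * i for c, i in zip(confidences, impacts))


# Made-up values for a single validation fold (illustration only).
fold_confidences = [0.5, 0.25, 1.0]
fold_impacts = [2.0, 4.0, 1.0]
fold_score = weighted_confidence(fold_confidences, fold_impacts)
```

In the cross-validation setting described here, this value would be computed once per fold and then averaged across folds.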

I tried using "generate_attributes" to define an attribute based on the formula and then getting the average from Extract Performance. This works fine on the first loop -- if I set a breakpoint I can see the results I want. However, on the second loop, this method causes a "duplicate attribute name" exception.

In hopes of side-stepping the error, I tried renaming the new attribute with Rename Generic (after the performance extraction). I still got the same error. I also tried filtering out the attribute with Select Attributes. That didn't help either. Any suggestions would be appreciated.

Regards, Charles

Answers

  • steffen
    steffen New Altair Community Member
    Hello and welcome to RapidMiner

    In short, RapidMiner stores the data using a model-view concept, so that the data is copied fewer times but can still be accessed from multiple positions. This is one of the reasons why RapidMiner is so fast. Hence your attribute construction operator adds a column which is filtered out in the view, but is of course still present in the model. That is why the second loop reports a duplicate attribute name.

    How to solve the problem: use the operator "MaterializeDataInMemory". It persists the current view in a separate model without touching the original data. I recommend using it before or after the modelapplier-operator.

    Hope this was helpful.

    regards,

    Steffen
  • Charles54
    Charles54 New Altair Community Member
    Thanks for the response, Steffen. I had Materialize Data occurring at the end of the subprocess -- after the Performance operators. Moving it to just before the Generate Attribute operation solved the problem. Not exactly sure why, but it worked. That was a big help, thanks again.

    Have a great day, Charles.
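
The model-view behavior Steffen describes, and why materializing before the attribute generation helps, can be sketched with a toy Python model. This is not RapidMiner's implementation or API: the classes `Table` and `View`, the column name `weighted`, and the sample values are all hypothetical, chosen only to mimic the reported behavior (a hidden column still lives in the shared underlying data, so adding it again fails).

```python
class Table:
    """Toy stand-in for the underlying data ("model"): column name -> values."""

    def __init__(self, columns):
        self.columns = dict(columns)


class View:
    """Toy stand-in for a view: visible column names over a shared Table."""

    def __init__(self, table, visible=None):
        self.table = table
        self.visible = set(table.columns if visible is None else visible)

    def add_column(self, name, values):
        # The name check runs against the shared table, not only the
        # columns visible in this view.
        if name in self.table.columns:
            raise ValueError(f"duplicate attribute name: {name}")
        self.table.columns[name] = list(values)
        self.visible.add(name)

    def materialize(self):
        # Copy only the visible columns into a fresh, independent table.
        fresh = Table({n: list(self.table.columns[n]) for n in self.visible})
        return View(fresh)


def run_fold(view):
    """Add the combined column and return its average (one loop iteration)."""
    conf = view.table.columns["confidence(true)"]
    impact = view.table.columns["RecidivismImpact"]
    view.add_column("weighted", [c * i for c, i in zip(conf, impact)])
    values = view.table.columns["weighted"]
    return sum(values) / len(values)


original = ["confidence(true)", "RecidivismImpact"]
data = {"confidence(true)": [0.5, 1.0], "RecidivismImpact": [2.0, 3.0]}

# Without materializing: each loop gets a new view over the SAME table,
# so the column added in the first loop is hidden but still present.
shared = Table(data)
run_fold(View(shared, visible=original))
failed_on_second_loop = False
try:
    run_fold(View(shared, visible=original))
except ValueError:
    failed_on_second_loop = True

# Materializing first: each loop works on its own copy of the visible columns.
untouched = Table(data)
scores = [run_fold(View(untouched, visible=original).materialize())
          for _ in range(2)]
```

In this toy model, renaming or filtering the new column afterwards would not help either, because those steps only change what the view shows; the copy made by `materialize()` is what keeps the shared table free of the extra column.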