Problem with XValidationParallel / ProcessLog

Roberto
edited November 5 in Community Q&A
Hi all,

So here's my issue. I am running the Rapid-I Enterprise Edition with the feature selection plugin, and I'm trying to speed the process up by using the XValidationParallel operator. I have a machine with dual quad-core i7 processors, so in theory there are 8 cores to which I should be able to assign one thread each (so I should be able to specify 8 threads, if I understand the operator correctly).

When I do all that, I get an "IllegalThreadStateException: null" error message after about 5 minutes. I get no error if I either (a) remove the ProcessLog entirely or (b) lower the number of threads to 4 and place the ProcessLog within the first operator chain, which slows the algorithm down more and more over time. If I put the ProcessLog anywhere else, the algorithm throws that error. Any suggestions on what I'm doing wrong?

<operator name="Root" class="Process" expanded="yes">
   <operator name="CSVExampleSource" class="CSVExampleSource">
       <parameter key="filename" value="\\MONOLITH\Public\Documents\OP Methylation Machine data\Complete Aggressive datasheet OP tumors 9-4-09.csv"/>
       <parameter key="label_column" value="2"/>
       <parameter key="id_column" value="1"/>
   </operator>
   <operator name="ExampleSetTranspose" class="ExampleSetTranspose">
   </operator>
   <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
       <parameter key="name" value="AGGRESSIVE AT 24M"/>
       <parameter key="target_role" value="label"/>
   </operator>
   <operator name="MissingValueReplenishment" class="MissingValueReplenishment">
       <parameter key="default" value="zero"/>
       <list key="columns">
       </list>
   </operator>
   <operator name="NominalNumbers2Numerical" class="NominalNumbers2Numerical">
   </operator>
   <operator name="WrapperXValidation" class="WrapperXValidation" expanded="yes">
       <parameter key="leave_one_out" value="true"/>
       <operator name="AdvancedForwardSelection" class="AdvancedForwardSelection" expanded="yes">
           <parameter key="maximal_number_of_attributes" value="500"/>
           <parameter key="speculative_rounds" value="10"/>
           <parameter key="stopping_behavior" value="without significant increase"/>
           <operator name="XValidationParallel" class="XValidationParallel" expanded="yes">
               <parameter key="number_of_threads" value="4"/>
               <parameter key="leave_one_out" value="true"/>
               <parameter key="sampling_type" value="shuffled sampling"/>
               <operator name="JMySVMLearner" class="JMySVMLearner">
                   <parameter key="calculate_weights" value="true"/>
               </operator>
               <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                   <operator name="ModelApplier" class="ModelApplier">
                       <list key="application_parameters">
                       </list>
                       <parameter key="create_view" value="true"/>
                   </operator>
                   <operator name="ClassificationPerformance" class="ClassificationPerformance">
                       <parameter key="accuracy" value="true"/>
                       <list key="class_weights">
                       </list>
                   </operator>
                   <operator name="MinMaxWrapper" class="MinMaxWrapper">
                   </operator>
               </operator>
           </operator>
       </operator>
       <operator name="LibSVMLearner" class="LibSVMLearner">
           <list key="class_weights">
           </list>
           <parameter key="confidence_for_multiclass" value="false"/>
       </operator>
       <operator name="OperatorChain (2)" class="OperatorChain" expanded="yes">
           <operator name="ModelWriter" class="ModelWriter">
               <parameter key="model_file" value="C:\Users\RLleras.HNSCC\Documents\Lab Projects\Rapid-I projects\Agressive OP model.mod"/>
           </operator>
           <operator name="ModelApplier (2)" class="ModelApplier">
               <list key="application_parameters">
               </list>
               <parameter key="create_view" value="true"/>
           </operator>
           <operator name="ClassificationPerformance (2)" class="ClassificationPerformance">
               <parameter key="accuracy" value="true"/>
               <parameter key="classification_error" value="true"/>
               <parameter key="kappa" value="true"/>
               <parameter key="weighted_mean_recall" value="true"/>
               <parameter key="weighted_mean_precision" value="true"/>
               <parameter key="spearman_rho" value="true"/>
               <parameter key="kendall_tau" value="true"/>
               <parameter key="absolute_error" value="true"/>
               <parameter key="relative_error" value="true"/>
               <parameter key="relative_error_lenient" value="true"/>
               <parameter key="relative_error_strict" value="true"/>
               <parameter key="correlation" value="true"/>
               <list key="class_weights">
               </list>
           </operator>
       </operator>
   </operator>
</operator>
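As a side note on the "one thread per core" idea in the question: the sketch below shows, in plain Java, how independent folds can be spread over a fixed pool sized from what the JVM reports as available processors. This is not RapidMiner's implementation. `ParallelFoldsSketch`, `evaluateFold` and `crossValidate` are hypothetical names, and the fold "score" is a placeholder rather than a real learner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelFoldsSketch {

    // Hypothetical stand-in for training and testing on one fold; returns a fake score.
    public static double evaluateFold(int fold) {
        return fold % 2 == 0 ? 1.0 : 0.0;
    }

    // Evaluates all folds on a fixed-size pool and returns the mean score.
    public static double crossValidate(int numberOfFolds, int numberOfThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numberOfThreads);
        try {
            List<Future<Double>> results = new ArrayList<>();
            for (int fold = 0; fold < numberOfFolds; fold++) {
                final int f = fold;
                Callable<Double> task = () -> evaluateFold(f);
                results.add(pool.submit(task));
            }
            double sum = 0.0;
            for (Future<Double> result : results) {
                sum += result.get(); // blocks until that fold has finished
            }
            return sum / numberOfFolds;
        } finally {
            pool.shutdown(); // lets the pool's worker threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        int processors = Runtime.getRuntime().availableProcessors();
        System.out.println("Logical processors visible to the JVM: " + processors);
        System.out.println("Mean score: " + crossValidate(10, processors));
    }
}
```

The pool reuses its worker threads across tasks, so no `Thread` object is ever started more than once, and the value printed for the processor count is what the JVM sees, which may differ from the number of physical cores.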

Answers

  • haddock
    Hi there Roberto,

    Sadly this issue has been around for a while....

    http://rapid-i.com/rapidforum/index.php/topic,563.0.html
  • land
    Hi Roberto,
    unfortunately you have removed the ProcessLog operator from your process, so I cannot look at its parameter settings.
    Could you please post the crashing process here?
    Normally I would try to reproduce the error, but unless someone sponsors me a dual quad-core i7 machine, this setup outperforms my laptop :) Since I believe this is a race condition, I will not be able to reproduce it at all. It would be a great help if you could paste the stack trace of the error here.
    To do so, you have to enable debug mode: choose Tools / Preferences and enable the "rapidminer.general.debugmode" property in the General tab. After applying and saving the settings, the error dialog should offer a way to show the error description.

    Greetings,
      Sebastian
  • Roberto
    Thanks for the quick reply!

    Here's the error I get from the debug mode...

    Exception: java.lang.IllegalThreadStateException
    Message: null
    Stack trace:

      java.lang.Thread.start(Thread.java:595)
      com.rapidminer.operator.validation.ParallelXValidation.estimatePerformance(ParallelXValidation.java:156)
      com.rapidminer.operator.validation.ValidationChain.apply(ValidationChain.java:218)
      com.rapidminer.operator.Operator.apply(Operator.java:671)
      com.rapidminer.operator.features.selection.ForwardAttributeSelectionOperator.applyInnerLearner(ForwardAttributeSelectionOperator.java:278)
      com.rapidminer.operator.features.selection.ForwardAttributeSelectionOperator.apply(ForwardAttributeSelectionOperator.java:181)
      com.rapidminer.operator.Operator.apply(Operator.java:671)
      com.rapidminer.operator.validation.WrapperValidationChain.useMethod(WrapperValidationChain.java:134)
      com.rapidminer.operator.validation.WrapperXValidation.apply(WrapperXValidation.java:114)
      com.rapidminer.operator.Operator.apply(Operator.java:671)
      com.rapidminer.operator.OperatorChain.apply(OperatorChain.java:424)
      com.rapidminer.operator.Operator.apply(Operator.java:671)
      com.rapidminer.Process.run(Process.java:735)
      com.rapidminer.Process.run(Process.java:704)
      com.rapidminer.Process.run(Process.java:694)
      com.rapidminer.gui.ProcessThread.run(ProcessThread.java:59)

    And here's the process that produced that message...

    <operator name="Root" class="Process" expanded="yes">
        <operator name="CSVExampleSource" class="CSVExampleSource">
            <parameter key="filename" value="\\MONOLITH\Public\Documents\OP Methylation Machine data\Complete Aggressive datasheet OP tumors 9-4-09.csv"/>
            <parameter key="label_column" value="2"/>
            <parameter key="id_column" value="1"/>
        </operator>
        <operator name="ExampleSetTranspose" class="ExampleSetTranspose">
        </operator>
        <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
            <parameter key="name" value="AGGRESSIVE AT 24M"/>
            <parameter key="target_role" value="label"/>
        </operator>
        <operator name="MissingValueReplenishment" class="MissingValueReplenishment">
            <parameter key="default" value="zero"/>
            <list key="columns">
            </list>
        </operator>
        <operator name="NominalNumbers2Numerical" class="NominalNumbers2Numerical">
        </operator>
        <operator name="WrapperXValidation" class="WrapperXValidation" expanded="yes">
            <parameter key="leave_one_out" value="true"/>
            <operator name="AdvancedForwardSelection" class="AdvancedForwardSelection" expanded="yes">
                <parameter key="maximal_number_of_attributes" value="500"/>
                <parameter key="speculative_rounds" value="5"/>
                <operator name="XValidationParallel" class="XValidationParallel" expanded="yes">
                    <parameter key="number_of_threads" value="8"/>
                    <operator name="JMySVMLearner" class="JMySVMLearner">
                        <parameter key="max_iterations" value="10000"/>
                    </operator>
                    <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                        <operator name="ModelApplier" class="ModelApplier">
                            <list key="application_parameters">
                            </list>
                        </operator>
                        <operator name="ClassificationPerformance" class="ClassificationPerformance">
                            <parameter key="accuracy" value="true"/>
                            <list key="class_weights">
                            </list>
                        </operator>
                    </operator>
                </operator>
                <operator name="ProcessLog" class="ProcessLog">
                    <list key="log">
                      <parameter key="number of attributes" value="operator.AdvancedForwardSelection.value.number of attributes"/>
                      <parameter key="performance" value="operator.AdvancedForwardSelection.value.performance"/>
                    </list>
                </operator>
            </operator>
            <operator name="JMySVMLearner (2)" class="JMySVMLearner">
            </operator>
            <operator name="OperatorChain (2)" class="OperatorChain" expanded="yes">
                <operator name="ModelWriter" class="ModelWriter">
                    <parameter key="model_file" value="C:\Users\RLleras.HNSCC\Documents\Lab Projects\Rapid-I projects\Agressive OP model.mod"/>
                </operator>
                <operator name="ModelApplier (2)" class="ModelApplier">
                    <list key="application_parameters">
                    </list>
                    <parameter key="create_view" value="true"/>
                </operator>
                <operator name="ClassificationPerformance (2)" class="ClassificationPerformance">
                    <parameter key="accuracy" value="true"/>
                    <parameter key="classification_error" value="true"/>
                    <parameter key="kappa" value="true"/>
                    <parameter key="weighted_mean_recall" value="true"/>
                    <parameter key="weighted_mean_precision" value="true"/>
                    <parameter key="spearman_rho" value="true"/>
                    <parameter key="kendall_tau" value="true"/>
                    <parameter key="absolute_error" value="true"/>
                    <parameter key="relative_error" value="true"/>
                    <parameter key="relative_error_lenient" value="true"/>
                    <parameter key="relative_error_strict" value="true"/>
                    <parameter key="correlation" value="true"/>
                    <list key="class_weights">
                    </list>
                </operator>
            </operator>
        </operator>
    </operator>


    I get this error about 30 s in, with the following entry in the log:

    G Sep 17, 2009 1:08:59 PM: [Fatal] IllegalThreadStateException occured in 9593rd application of ClassificationPerformance (ClassificationPerformance)
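For context on the exception itself: the top frame of the stack trace above is `java.lang.Thread.start`, and in standard Java that method throws `IllegalThreadStateException` when it is called on a `Thread` object that has already been started. The minimal sketch below reproduces that general behavior in isolation. It says nothing about what `ParallelXValidation` does internally, and `DoubleStartDemo` and `startTwice` are hypothetical names used only for illustration.

```java
public class DoubleStartDemo {

    // Starts the same Thread object twice and returns the exception
    // raised by the second call, or null if none was raised.
    public static RuntimeException startTwice() throws InterruptedException {
        Thread worker = new Thread(() -> { /* no work needed for the demo */ });
        worker.start();
        worker.join(); // wait until the first run has finished
        try {
            worker.start(); // a Thread cannot be restarted, even after it has terminated
            return null;
        } catch (IllegalThreadStateException e) {
            return e;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RuntimeException e = startTwice();
        System.out.println(e.getClass().getName() + ", message: " + e.getMessage());
    }
}
```

If several threads share worker objects without synchronization, two callers could both reach `start()` on the same object, which would be one way a race condition could surface as this exception.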
  • land
    Ok,
    I will look into this matter as soon as possible.

    Greetings,
      Sebastian
  • Roberto
    Sebastian,

    Here's some more info on my issue that I think might help you. I have found that I have the same issues even if I take out the ProcessLog. However, here's what I have found that changes things:

    1) Replacing XValidationParallel with XValidation: the algorithm runs slower, but no problems are seen at all.
    2) If I start from scratch and build the EXACT same process in a new project file, the first execution runs fine... UNTIL it reaches a point where memory usage suddenly goes bonkers and it uses all the RAM allocated to the JVM. That crashes Java to the point that I have to force-close it with the Task Manager, so I can't even give you the error message from debug mode. What's even more interesting: if I then restart Rapid-I and load the SAME project that got to that point the last time, I get the error message I already showed you (IllegalThreadStateException, message: null). Weird! I have reproduced this three times; the timing of the thread exception varies, as it always has.

    Hope that helps!
    Roberto
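Since the memory exhaustion described above leaves no stack trace, one generic way to gather evidence is to record heap figures from inside the JVM while a job runs. The sketch below uses only the standard `Runtime` API. `HeapWatch` and its methods are hypothetical names, and nothing here is specific to RapidMiner.

```java
public class HeapWatch {

    // Bytes of heap currently in use by this JVM.
    public static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Upper limit the JVM will try to use for the heap (for example, set via -Xmx).
    public static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // One-line summary in megabytes, suitable for periodic logging.
    public static String report() {
        long mb = 1024L * 1024L;
        return "heap used " + (usedBytes() / mb) + " MB of max " + (maxBytes() / mb) + " MB";
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Logging such a line at regular intervals would show whether usage climbs gradually or jumps at a particular step of the process.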
  • Roberto
    Sebastian,

    After a little more troubleshooting I have figured out a few more things.

    1) When I forced RapidMiner to shut down, it wasn't always terminating the Java threads that had been opened by the JVM. I found that if I force-close the JVM through the Task Manager and then re-open the same process, I no longer get the subsequent IllegalThreadStateException error when I execute the process.
    2) Every time I run that process, I now get the memory exhaustion problem at the exact same point: when the algorithm exits the AdvancedForwardSelection part, before it tests the model with the second JMySVMLearner operator. Or at least that's how it seems, as I still can't get the stack trace because I have to use the Task Manager to force-close the program, as well as Java, in order for it to release the memory. Hope all the extra info helps.

    Roberto
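On the observation that Java threads stayed alive after a forced shutdown: in general, a JVM keeps running as long as at least one non-daemon thread is alive. The sketch below lists the live threads of the current JVM with their daemon flag and state, using the standard `Thread.getAllStackTraces()` call. Whether lingering worker threads are what happened in this case is not established in this thread, and `LiveThreads` and `describe` are hypothetical names.

```java
import java.util.Map;

public class LiveThreads {

    // Returns one line per live thread: name, daemon flag and current state.
    public static String describe() {
        StringBuilder out = new StringBuilder();
        Map<Thread, StackTraceElement[]> all = Thread.getAllStackTraces();
        for (Thread t : all.keySet()) {
            out.append(t.getName())
               .append(" daemon=").append(t.isDaemon())
               .append(" state=").append(t.getState())
               .append(System.lineSeparator());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

A listing like this only covers the JVM it runs in. For an already hung process, an external thread dump is the usual alternative.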
  • land
    Hi Roberto,
    I'm currently attending a conference, so I cannot take a deeper look into this until I'm back in the office, but as I heard from my colleagues, they have found a thread synchronization problem and already solved it. If it turns out that this solves your problem, we will send enterprise customers an updated version.

    Greetings,
      Sebastian