Deep Learning Extension

Gottfried
Gottfried New Altair Community Member
edited November 5 in Community Q&A

I am checking out the new (?) deep learning operator that comes as an extension. I do not mean Keras; I mean the one with the following specs:

Version 0.8.0
Release date Aug 7, 2018
File size 783 MB
License AGPL
Dependencies  

This extension provides operators to create Deep Learning models using different types of layers. Networks can be executed both on CPU and on GPU. This extension uses the java library DeepLearning4J.

 

When trying to run it I get an error message about a data parsing problem with the example set. It tells me to make sure all attributes are numerical, which they are. Does anyone know how to fix this?

 

Thanks in advance!

 

Deep Learning dataset parsing error.PNG

Best Answer

  • jczogalla
    jczogalla New Altair Community Member
    Answer ✓

    When we tested it before and by accident used the 32-bit version, we had the exact same error. We will fix it for the next release so that it spits out a better error message. :)

Answers

  • Telcontar120
    Telcontar120 New Altair Community Member

    We can see the error message, but it is hard to troubleshoot without your process or data. Could you post examples of both? And perhaps also the log file that is referenced in the error message?

  • jczogalla
    jczogalla New Altair Community Member

    Hi @Gottfried!

     

    Yes, this is a new extension. :)

     

    Can you tell me if you are running RapidMiner as a 32-bit program/with a 32-bit JVM? Because DL4J sadly does not support that.

    Can you also please provide your RapidMiner Studio log? This might give more insight.

     

    Cheers

    Jan
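
A quick way to answer the 32-bit question is to ask the JVM itself. The following is a minimal sketch, not part of the extension: the class name `JvmBitness` is made up for illustration, `sun.arch.data.model` is only set on HotSpot-style JVMs, and the `os.arch` fallback is a heuristic that can misjudge some architectures. Run it with the same Java installation that starts RapidMiner Studio.

```java
import java.util.Locale;

// Hypothetical helper (not part of RapidMiner or DL4J) that guesses the JVM bitness.
final class JvmBitness {

    /**
     * Returns true if the given property values indicate a 64-bit JVM.
     * dataModel is the value of "sun.arch.data.model" (may be null on some JVMs),
     * osArch is the value of "os.arch".
     */
    static boolean is64Bit(String dataModel, String osArch) {
        // Prefer the explicit data model when the JVM reports one.
        if (dataModel != null && !dataModel.trim().isEmpty() && !"unknown".equals(dataModel.trim())) {
            return "64".equals(dataModel.trim());
        }
        // Fallback heuristic: most 64-bit architecture names contain "64" (amd64, x86_64, aarch64).
        String arch = osArch == null ? "" : osArch.toLowerCase(Locale.ROOT);
        return arch.contains("64");
    }

    public static void main(String[] args) {
        String dataModel = System.getProperty("sun.arch.data.model");
        String osArch = System.getProperty("os.arch");
        System.out.println("sun.arch.data.model = " + dataModel);
        System.out.println("os.arch             = " + osArch);
        System.out.println(is64Bit(dataModel, osArch)
                ? "This looks like a 64-bit JVM."
                : "This looks like a 32-bit JVM, which DL4J does not support.");
    }
}
```

If this reports a 32-bit JVM, installing a 64-bit RapidMiner Studio / Java on a 64-bit OS would be the thing to try first.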

  • Gottfried
    Gottfried New Altair Community Member

    This is the process:

    <?xml version="1.0" encoding="UTF-8"?><process version="9.0.000">
    <operator activated="true" class="split_data" compatibility="9.0.000" expanded="true" height="103" name="Test / Train" width="90" x="447" y="442">
    <enumeration key="partitions">
    <parameter key="ratio" value="0.5"/>
    <parameter key="ratio" value="0.5"/>
    </enumeration>
    <parameter key="sampling_type" value="automatic"/>
    <parameter key="use_local_random_seed" value="false"/>
    <parameter key="local_random_seed" value="1992"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="9.0.000">
    <operator activated="true" class="deeplearning:dl4j_sequential_neural_network" compatibility="0.8.000" expanded="true" height="103" name="Deep Learning" width="90" x="581" y="340">
    <parameter key="loss_function" value="Cross Entropy (Binary Classification)"/>
    <parameter key="epochs" value="100"/>
    <parameter key="use_miniBatch" value="false"/>
    <parameter key="batch_size" value="32"/>
    <parameter key="updater" value="Adam"/>
    <parameter key="learning_rate" value="0.01"/>
    <parameter key="momentum" value="0.9"/>
    <parameter key="rho" value="0.95"/>
    <parameter key="epsilon" value="1.0E-6"/>
    <parameter key="beta1" value="0.9"/>
    <parameter key="beta2" value="0.999"/>
    <parameter key="RMSdecay" value="0.95"/>
    <parameter key="weight_initialization" value="Normal"/>
    <parameter key="bias_initialization" value="0.0"/>
    <parameter key="use_regularization" value="false"/>
    <parameter key="l1_strength" value="0.1"/>
    <parameter key="l2_strength" value="0.1"/>
    <parameter key="optimization_method" value="Stochastic Gradient Descent"/>
    <parameter key="infer_input_shape" value="true"/>
    <parameter key="network_type" value="Simple Neural Network"/>
    <parameter key="log_each_epoch" value="true"/>
    <parameter key="epochs_per_log" value="10"/>
    <parameter key="use_local_random_seed" value="false"/>
    <parameter key="local_random_seed" value="1992"/>
    <process expanded="true">
    <operator activated="true" class="deeplearning:dl4j_dense_layer" compatibility="0.8.000" expanded="true" height="68" name="Add Fully-Connected Layer" width="90" x="112" y="34">
    <parameter key="number_of_neurons" value="10"/>
    <parameter key="activation_function" value="ReLU (Rectified Linear Unit)"/>
    <parameter key="use_dropout" value="false"/>
    <parameter key="dropout_rate" value="0.25"/>
    <parameter key="overwrite_networks_weight_initialization" value="false"/>
    <parameter key="weight_initialization" value="Normal"/>
    <parameter key="overwrite_networks_bias_initialization" value="false"/>
    <parameter key="bias_initialization" value="0.0"/>
    </operator>
    <operator activated="true" class="deeplearning:dl4j_dense_layer" compatibility="0.8.000" expanded="true" height="68" name="Add Fully-Connected Layer (2)" width="90" x="313" y="34">
    <parameter key="number_of_neurons" value="2"/>
    <parameter key="activation_function" value="ReLU (Rectified Linear Unit)"/>
    <parameter key="use_dropout" value="false"/>
    <parameter key="dropout_rate" value="0.25"/>
    <parameter key="overwrite_networks_weight_initialization" value="false"/>
    <parameter key="weight_initialization" value="Normal"/>
    <parameter key="overwrite_networks_bias_initialization" value="false"/>
    <parameter key="bias_initialization" value="0.0"/>
    </operator>
    <connect from_port="layerArchitecture" to_op="Add Fully-Connected Layer" to_port="layerArchitecture"/>
    <connect from_op="Add Fully-Connected Layer" from_port="layerArchitecture" to_op="Add Fully-Connected Layer (2)" to_port="layerArchitecture"/>
    <connect from_op="Add Fully-Connected Layer (2)" from_port="layerArchitecture" to_port="layerArchitecture"/>
    <portSpacing port="source_layerArchitecture" spacing="0"/>
    <portSpacing port="sink_layerArchitecture" spacing="0"/>
    </process>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="9.0.000">
    <operator activated="true" class="apply_model" compatibility="9.0.000" expanded="true" height="82" name="Apply Model" width="90" x="715" y="442">
    <list key="application_parameters"/>
    <parameter key="create_view" value="false"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="9.0.000">
    <operator activated="true" class="performance_binominal_classification" compatibility="9.0.000" expanded="true" height="82" name="Performance" width="90" x="849" y="442">
    <parameter key="main_criterion" value="first"/>
    <parameter key="accuracy" value="true"/>
    <parameter key="classification_error" value="false"/>
    <parameter key="kappa" value="false"/>
    <parameter key="AUC (optimistic)" value="false"/>
    <parameter key="AUC" value="true"/>
    <parameter key="AUC (pessimistic)" value="false"/>
    <parameter key="precision" value="false"/>
    <parameter key="recall" value="false"/>
    <parameter key="lift" value="false"/>
    <parameter key="fallout" value="false"/>
    <parameter key="f_measure" value="false"/>
    <parameter key="false_positive" value="false"/>
    <parameter key="false_negative" value="false"/>
    <parameter key="true_positive" value="false"/>
    <parameter key="true_negative" value="false"/>
    <parameter key="sensitivity" value="false"/>
    <parameter key="specificity" value="false"/>
    <parameter key="youden" value="false"/>
    <parameter key="positive_predictive_value" value="false"/>
    <parameter key="negative_predictive_value" value="false"/>
    <parameter key="psep" value="false"/>
    <parameter key="skip_undefined_labels" value="true"/>
    <parameter key="use_example_weights" value="true"/>
    </operator>
    </process>

     

     

    I will try to attach the data, but I think any dataset would do.

     

    The error log:

    Aug 8, 2018 2:44:24 PM WARNING: Problem converting DataSet to ExampleSet: null | null

    Aug 8, 2018 2:44:24 PM WARNING: Input Data Format: (5000, 3)

    Aug 8, 2018 2:44:24 PM SEVERE: Process failed: The provided training data couldnt be parsed correctly. Please ensure that all attributes (except the label) are numerical only.

     

    For your info, I just got a hint that running this extension on a 32-bit system could be the issue. I will try again on my 64-bit Mac back home. Let's see...

     

    Thanks for your support!!!

    Data.zip 436.5K
  • Gottfried
    Gottfried New Altair Community Member

    Thanks!

    Ah... Indeed, I am currently trying to run it on a 32-bit PC. I will test it again back home on my Mac, which runs a 64-bit OS. Hopefully that will solve the problem. Do you think that's it?

    Regards,

    G.

  • agibsonccc
    agibsonccc New Altair Community Member

    Hey folks, this extension just came to my attention. I'm from the deeplearning4j team (the underlying library). If I understand correctly, the version of the library being used is at least 1.5 years old?

     

    I'd like to encourage the rapidminer community to collaborate with us in the future. Dl4j has a wide variety of features and configurations for cpu and gpu that are likely not being properly exposed by this extension. Dl4j also supports various ETL utilities that could make things easier to use. I'd like to encourage folks to ping us directly on github issues: https://github.com/deeplearning4j/deeplearning4j/issues or on our gitter: https://gitter.im/deeplearning4j/deeplearning4j  if there are feature requests or problems with the underlying library. Thanks for using dl4j!

  • pschlunder
    pschlunder New Altair Community Member

    Hi @agibsonccc,

     

    thanks for stopping by =)

    I'm responsible for the development of this extension. It is based on version 1.0.0-beta. We're just not done adding features yet and will be in contact with you in the future. Just to make sure we're talking about the same extension: I'm referring to this one and not the older version from a partner that is also available on GitHub.

     

    Thanks for creating such a great library and continuing to improve it! Collaborating by providing bug fixes and features as these things pop up is already planned.

     

    Regards,

    Philipp

  • pschlunder
    pschlunder New Altair Community Member

    Sorry for the lack of information about this extension so far. There'll be a post about it very soon.

     

    Regards,

    Philipp

  • Telcontar120
    Telcontar120 New Altair Community Member

    @pschlunder I am having a related (?) problem with this extension: when I run RapidMiner Studio 9.0, a popup after startup says that this extension is not compatible with the current version of RapidMiner and asks whether I would like to uninstall it. Is that expected? I thought this new extension was designed specifically for RapidMiner Studio 9?

     Edit: here's a screenshot of the popup:

    deep learning ext.PNG

     

  • jczogalla
    jczogalla New Altair Community Member

    Hi @Telcontar120!
    Can you please provide the log file? There can be issues if the CUDA installation is not on the path or if an interfering mkl library is found. The log should give an indication as to why it could not be loaded.

     

    Cheers

    Jan

  • Telcontar120
    Telcontar120 New Altair Community Member

    Sure, here's the log file @jczogalla.  Let me know if there is anything else you need.

     

    P.S.

    @sgenzer you may want to look at modifying the community site settings to allow *.log files as well.  As I just found out, if you try to post a log file you get an error.  So the only way to provide a log file currently is to rename it or zip it, which seems like an unnecessary extra step. 

     

  • jczogalla
    jczogalla New Altair Community Member

    Thanks!

     

    So, here is your problem:

    Aug 09, 2018 8:41:24 AM com.rapidminer.extension.deeplearning.PluginInitDeepLearning testForCUDAInstallation
    INFO: Could not find cudartlibrary.
    Aug 09, 2018 8:41:24 AM com.rapidminer.extension.deeplearning.PluginInitDeepLearning testForMKLOnLibraryPath
    INFO: Error while testing mkl library. An incompatible version was found on the library path.
    Aug 09, 2018 8:41:24 AM com.rapidminer.extension.deeplearning.PluginInitDeepLearning testForMKLOnLibraryPath
    INFO: Library loaded from C:\Program Files\Anaconda3\Library\bin
    Aug 09, 2018 8:41:24 AM com.rapidminer.tools.plugin.Plugin callInitMethod
    WARNING: Plugin initializer com.rapidminer.extension.deeplearning.PluginInitDeepLearning.initPlugin of Plugin Deep Learning caused an error: null
    java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.rapidminer.tools.plugin.Plugin.callInitMethod(Plugin.java:1376)
    at com.rapidminer.tools.plugin.Plugin.callPluginInitMethods(Plugin.java:1345)
    at com.rapidminer.tools.plugin.Plugin.initPlugins(Plugin.java:1315)
    at com.rapidminer.tools.plugin.Plugin.initAll(Plugin.java:1551)
    at com.rapidminer.RapidMiner.init(RapidMiner.java:762)
    at com.rapidminer.RapidMiner.init(RapidMiner.java:688)
    at com.rapidminer.gui.RapidMinerGUI.run(RapidMinerGUI.java:354)
    at com.rapidminer.gui.RapidMinerGUI.launch(RapidMinerGUI.java:825)
    at com.rapidminer.gui.RapidMinerGUI.main(RapidMinerGUI.java:801)
    at com.rapidminer.launcher.GUILauncher.main(GUILauncher.java:332)
    Caused by: java.lang.RuntimeException: Cannot load any backend because CPU is blocked by interfering libraries
    and GPU is not supported. Check log for more information
    at com.rapidminer.extension.deeplearning.PluginInitDeepLearning.testAndSetBackendState(PluginInitDeepLearning.java:251)
    at com.rapidminer.extension.deeplearning.PluginInitDeepLearning.initPlugin(PluginInitDeepLearning.java:236)
    ... 14 more

    As you can see, there is a problem with both CPU and CUDA. The CPU backend is blocked by the mkl library provided by Anaconda. This is a known problem with nd4j/dl4j applications and we are looking into resolving that issue for the next versions. Another possible source of the mkl library can be OracleDB (as I found out during development). You can solve this for now by removing the interfering library from your PATH. That's the reason we logged this. :)

    Regarding CUDA, I'm not sure whether it is not installed at all or just not on the PATH. You would need CUDA 9.1 right now to make it work.

    Since neither backend is available, the extension initialization fails, and thus the extension is shown as incompatible at startup.

     

    I hope this helps. Thanks for the feedback so far!

    Jan
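
To see which directories might be contributing an interfering mkl library, one can list the entries of `java.library.path` and look for MKL-like files. The sketch below is an assumption-laden illustration, not the check the extension actually performs: the class `MklPathScan` and the file-name heuristic in `looksLikeMkl` are hypothetical, and a match only means "worth a closer look".

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper; the name matching is a heuristic, not the extension's own test.
final class MklPathScan {

    /** Heuristic: a native library file whose name mentions "mkl" (e.g. mkl_rt.dll, libmkl_rt.so). */
    static boolean looksLikeMkl(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        boolean nativeLib = name.endsWith(".dll") || name.endsWith(".dylib") || name.contains(".so");
        return nativeLib && name.contains("mkl");
    }

    /** Returns the entries of the given path string that contain at least one MKL-like file. */
    static List<String> findMklDirs(String libraryPath, String separator) {
        List<String> hits = new ArrayList<>();
        if (libraryPath == null) {
            return hits;
        }
        for (String entry : libraryPath.split(Pattern.quote(separator))) {
            if (entry.isEmpty()) {
                continue;
            }
            File[] files = new File(entry).listFiles();
            if (files == null) {
                continue; // not a directory, or not readable
            }
            for (File f : files) {
                if (f.isFile() && looksLikeMkl(f.getName())) {
                    hits.add(entry);
                    break; // one match per directory is enough
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path");
        for (String dir : findMklDirs(path, File.pathSeparator)) {
            System.out.println("MKL-like library found in: " + dir);
        }
    }
}
```

On a machine like the one in the log above, a directory such as the Anaconda `Library\bin` folder would be expected to show up in the output.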

  • Telcontar120
    Telcontar120 New Altair Community Member

    @jczogalla thanks for the quick diagnosis.  Indeed I do not have CUDA installed, but I assumed my CPU backend should work since I don't have any problems running the Python extension using the Anaconda distribution.  I guess I'll just wait for the next version if you expect to resolve the mkl library issues shortly.  Thanks. 

  • jczogalla
    jczogalla New Altair Community Member

    @Telcontar120 The problem actually arises from the fact that nd4j/dl4j comes with its own mkl library. So in theory it could work with other mkl libraries from Anaconda or somewhere else, but it's not guaranteed. Therefore we check if another mkl library exists and block the CPU backend if that is the case.

    Maybe @agibsonccc can tell us a bit more about that? As I said, I did have a problem with the mkl provided by OracleDB during the development. We are looking into ways to circumvent that in the future. :)

  • agibsonccc
    agibsonccc New Altair Community Member

    It might be your LD_LIBRARY_PATH when you start the process. Generally when nd4j (the underlying library that dl4j uses for matrix math also built by us) loads mkl we load it from the LD_LIBRARY_PATH. Dl4j outputs a few folders in there including a:
    .javacpp (this is where our native artifacts go)

    and there should be logs somewhere. Dl4j itself is based on slf4j logging. I'm imagining that if the plugin itself integrates this, the nd4j backend that gets loaded should be logged somewhere. That would give me more context on the problem you're running into.

  • jczogalla
    jczogalla New Altair Community Member

    For Windows, the LD_LIBRARY_PATH is simply PATH, and ends up being the "java.library.path". When I encountered the problem, I could solve it by removing the entry of that particular path both from the classloader with some reflection, and also from the system property.

    I will get back to you with the logging, since we use another logging system in RM and because of time constraints we did not integrate the dl4j/nd4j logging into the extension yet. As soon as there is progress, I will let you know!
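
For readers comparing platforms: the environment variable that feeds the native library search differs by OS. The sketch below is an illustration under that general assumption, with a made-up class name `LibraryPathVar`; it prints the variable for the current OS next to the JVM's `java.library.path` so the two can be compared. It does not reproduce the reflection-based workaround mentioned above.

```java
import java.util.Locale;

// Hypothetical helper: maps an os.name value to the usual native library search variable.
final class LibraryPathVar {

    /** Returns PATH on Windows, DYLD_LIBRARY_PATH on macOS, and LD_LIBRARY_PATH otherwise. */
    static String envVarFor(String osName) {
        String os = osName == null ? "" : osName.toLowerCase(Locale.ROOT);
        if (os.startsWith("windows")) {
            return "PATH";
        }
        if (os.contains("mac") || os.contains("darwin")) {
            return "DYLD_LIBRARY_PATH";
        }
        return "LD_LIBRARY_PATH";
    }

    public static void main(String[] args) {
        String var = envVarFor(System.getProperty("os.name"));
        System.out.println(var + " = " + System.getenv(var));
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
    }
}
```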

  • pschlunder
    pschlunder New Altair Community Member

    Hi @Telcontar120

    you could try removing Anaconda from your PATH (for testing); this should allow the extension to load properly. Search for "environment" in the Windows Start search, click on "Environment Variables", select "PATH", and click Edit. You should see a list of file paths, one of which contains conda. If you remove that one, it should work. Make sure to copy the entry first so that you can re-add it later. I'll provide screenshots if needed when I'm on a Windows machine.

     

    Hope this helps,

    Philipp
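
Before editing the environment variables by hand, it can help to preview what PATH would look like without the Anaconda entries. This is a minimal sketch with a hypothetical class `PathFilter`; it only prints a filtered copy and changes nothing on the system. Matching on the substring "conda" is an assumption about how the entries are named.

```java
import java.io.File;
import java.util.Locale;
import java.util.StringJoiner;
import java.util.regex.Pattern;

// Hypothetical preview helper: builds a copy of a path string without matching entries.
final class PathFilter {

    /** Returns the path with every entry containing the needle (case-insensitive) removed. */
    static String withoutEntriesContaining(String path, String separator, String needle) {
        StringJoiner kept = new StringJoiner(separator);
        String lowerNeedle = needle.toLowerCase(Locale.ROOT);
        for (String entry : path.split(Pattern.quote(separator))) {
            if (entry.isEmpty()) {
                continue; // skip empty segments such as a trailing separator
            }
            if (!entry.toLowerCase(Locale.ROOT).contains(lowerNeedle)) {
                kept.add(entry);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        String path = System.getenv("PATH");
        if (path == null) {
            System.out.println("PATH is not set.");
            return;
        }
        System.out.println("Current PATH:");
        System.out.println(path);
        System.out.println("PATH without entries containing \"conda\":");
        System.out.println(withoutEntriesContaining(path, File.pathSeparator, "conda"));
    }
}
```

Keeping a copy of the original output makes it easy to restore the removed entries later.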

  • Telcontar120
    Telcontar120 New Altair Community Member

    @pschlunder thanks for the idea.  No need for screenshots, I already know how to edit my environment variables--unfortunately :-)

    There were three references to Anaconda in my system PATH statement. I removed them all and restarted RapidMiner, and that did indeed resolve the issue with the new Deep Learning extension. Of course it's not really a long-term solution, since the Anaconda references are needed in the PATH statement for other reasons, including the Python RapidMiner extension. But at least we confirmed that it is the culprit. Hopefully you can use that to resolve the issue.

     

     

  • pschlunder
    pschlunder New Altair Community Member

    Thanks for reporting back @Telcontar120 =) Correct, this can't be a long term solution. It's prioritized for the next release.

     

    Regards,

    Philipp

  • SGolbert
    SGolbert New Altair Community Member

    Hi @Telcontar120,

     

    you don't need to have Anaconda in the PATH, it's just convenient for calling conda or pip. The Python extension only needs a path to the Python binary, which can be set up from RapidMiner.

     

    Best,

    Sebastian

  • sgenzer
    sgenzer
    Altair Employee

    @Telcontar120 .log file type noted. Working on it. :)

     

    Scott