deep learning

User: "[Deleted User]"
New Altair Community Member
Updated by Jocelyn
Hi,
I want to combine a deep learning model and a neural network on my data because each of them gives good results on its own, but when I try to combine them I run into some problems. Please help me solve them.
Thank you :)

    User: "[Deleted User]"
    New Altair Community Member
    OP
    Next try and next problem :(
    User: "hughesfleming68"
    New Altair Community Member
    In your first example, you need to review ensembles. Go through the examples, but combining a neural net with a deep learner is somewhat redundant. You might consider just adding extra layers to your deep learner.

    In your second example, you need to eliminate or interpolate the missing examples so that you don't have any gaps in your data.
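    As a rough illustration of "eliminate or interpolate", here is a minimal pandas sketch outside RapidMiner; the CSV file name is a hypothetical placeholder for an export of your example set.

    # Minimal sketch: removing or filling gaps in the data before modelling.
    # "training_data.csv" is a hypothetical export of the example set.
    import pandas as pd

    df = pd.read_csv("training_data.csv")

    # Option 1: eliminate rows that contain any missing value
    df_dropped = df.dropna()

    # Option 2: linearly interpolate missing numeric values, then fill any
    # remaining gaps at the start or end of the data
    df_filled = df.interpolate(method="linear").ffill().bfill()

    print(df_filled.isna().sum())  # should be zero for every column
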
    User: "[Deleted User]"
    New Altair Community Member
    OP
    hughesfleming68,
    Would you please explain more about the extra layers? I don't understand.
    In my second example I don't have any missing values, but the result in RapidMiner is not clear. I mean that "MISSING VALUE" doesn't have a meaning in this situation.
    Thanks :)
    User: "varunm1"
    New Altair Community Member
    Accepted Answer
    @mbs

    The simplest debugging technique would be to disable all operators after the Nominal to Numerical operator, connect its output to the results port, and run your process. Then check the statistics of your example set to see whether any attribute of your dataset contains missing values.

    Secondly, adding a neural net to a deep learning algorithm is redundant, as @hughesfleming68 mentioned. You can check the layer information in the Neural Net operator's parameters and add it as a new layer in the Deep Learning operator.
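    To make the "add it as a new layer" idea concrete, here is a rough sketch outside RapidMiner written with Keras; the layer sizes and input width are illustrative assumptions, not taken from your process.

    # Rough sketch of the "extra layer" idea: instead of training a separate
    # neural net and a deep learner, fold the neural net's hidden layer into
    # one deeper network. Sizes below are illustrative assumptions.
    from tensorflow import keras

    n_features = 20  # assumed width of the example set after Nominal to Numerical

    model = keras.Sequential([
        keras.layers.Input(shape=(n_features,)),
        keras.layers.Dense(50, activation="relu"),    # deep learner, hidden layer 1
        keras.layers.Dense(50, activation="relu"),    # deep learner, hidden layer 2
        keras.layers.Dense(50, activation="relu"),    # the former neural net's hidden layer
        keras.layers.Dense(1, activation="sigmoid"),  # binary label
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
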
    User: "[Deleted User]"
    New Altair Community Member
    OP
    Hi @varunm1,
    Thank you for your suggestion :)
    I think this is very complex and I have to try it more.
    I have to keep working on your suggestion because I don't understand it completely yet.
    User: "varunm1"
    New Altair Community Member
    Accepted Answer
    If you want to keep the same architecture, I think you can use the Group Models operator and try that.
    User: "[Deleted User]"
    New Altair Community Member
    OP
    @varunm1
    Thank you for your idea :)
    Would you please explain Group Models a bit more?
    User: "varunm1"
    New Altair Community Member
    Accepted Answer
    Updated by varunm1
    It can take multiple models and combine them into a single model. More detailed explanation below.
    https://docs.rapidminer.com/8.0/studio/operators/modeling/predictive/group_models.html

    When this combined model is applied, it is equivalent to applying the original models in their respective order.
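    As a conceptual sketch (a hypothetical Python class, not RapidMiner's implementation), the combined model simply applies its sub-models in their original order:

    # Hypothetical illustration of the Group Models idea: the combined model
    # applies its sub-models one after another, in their original order.
    class GroupedModel:
        def __init__(self, models):
            self.models = models  # e.g. [preprocessing_model, prediction_model]

        def apply(self, data):
            for model in self.models:
                data = model.apply(data)  # each sub-model transforms or labels the data
            return data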

    User: "[Deleted User]"
    New Altair Community Member
    OP
    Perfect :)

    User: "[Deleted User]"
    New Altair Community Member
    OP
    @varunm1
    Please look at the problem :/
    User: "varunm1"
    New Altair Community Member
    Accepted Answer
    Here is a working example; the XML is below. Analyze my inputs and outputs carefully, and note how I am connecting them.

    <?xml version="1.0" encoding="UTF-8"?><process version="9.2.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="9.2.001" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="9.2.001" expanded="true" height="68" name="Retrieve Titanic Training" width="90" x="45" y="136">
    <parameter key="repository_entry" value="//Samples/data/Titanic Training"/>
    </operator>
    <operator activated="true" class="nominal_to_numerical" compatibility="9.2.001" expanded="true" height="103" name="Nominal to Numerical" width="90" x="179" y="34">
    <parameter key="return_preprocessing_model" value="false"/>
    <parameter key="create_view" value="false"/>
    <parameter key="attribute_filter_type" value="all"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="nominal"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="file_path"/>
    <parameter key="block_type" value="single_value"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="single_value"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    <parameter key="coding_type" value="dummy coding"/>
    <parameter key="use_comparison_groups" value="false"/>
    <list key="comparison_groups"/>
    <parameter key="unexpected_value_handling" value="all 0 and warning"/>
    <parameter key="use_underscore_in_name" value="false"/>
    </operator>
    <operator activated="true" class="split_data" compatibility="9.2.001" expanded="true" height="103" name="Split Data" width="90" x="313" y="34">
    <enumeration key="partitions">
    <parameter key="ratio" value="0.7"/>
    <parameter key="ratio" value="0.3"/>
    </enumeration>
    <parameter key="sampling_type" value="automatic"/>
    <parameter key="use_local_random_seed" value="false"/>
    <parameter key="local_random_seed" value="1992"/>
    </operator>
    <operator activated="true" class="multiply" compatibility="9.2.001" expanded="true" height="103" name="Multiply" width="90" x="447" y="136"/>
    <operator activated="true" class="h2o:deep_learning" compatibility="9.2.000" expanded="true" height="82" name="Deep Learning" width="90" x="581" y="238">
    <parameter key="activation" value="Rectifier"/>
    <enumeration key="hidden_layer_sizes">
    <parameter key="hidden_layer_sizes" value="50"/>
    <parameter key="hidden_layer_sizes" value="50"/>
    </enumeration>
    <enumeration key="hidden_dropout_ratios"/>
    <parameter key="reproducible_(uses_1_thread)" value="false"/>
    <parameter key="use_local_random_seed" value="false"/>
    <parameter key="local_random_seed" value="1992"/>
    <parameter key="epochs" value="10.0"/>
    <parameter key="compute_variable_importances" value="false"/>
    <parameter key="train_samples_per_iteration" value="-2"/>
    <parameter key="adaptive_rate" value="true"/>
    <parameter key="epsilon" value="1.0E-8"/>
    <parameter key="rho" value="0.99"/>
    <parameter key="learning_rate" value="0.005"/>
    <parameter key="learning_rate_annealing" value="1.0E-6"/>
    <parameter key="learning_rate_decay" value="1.0"/>
    <parameter key="momentum_start" value="0.0"/>
    <parameter key="momentum_ramp" value="1000000.0"/>
    <parameter key="momentum_stable" value="0.0"/>
    <parameter key="nesterov_accelerated_gradient" value="true"/>
    <parameter key="standardize" value="true"/>
    <parameter key="L1" value="1.0E-5"/>
    <parameter key="L2" value="0.0"/>
    <parameter key="max_w2" value="10.0"/>
    <parameter key="loss_function" value="Automatic"/>
    <parameter key="distribution_function" value="AUTO"/>
    <parameter key="early_stopping" value="false"/>
    <parameter key="stopping_rounds" value="1"/>
    <parameter key="stopping_metric" value="AUTO"/>
    <parameter key="stopping_tolerance" value="0.001"/>
    <parameter key="missing_values_handling" value="MeanImputation"/>
    <parameter key="max_runtime_seconds" value="0"/>
    <list key="expert_parameters"/>
    <list key="expert_parameters_"/>
    </operator>
    <operator activated="true" class="neural_net" compatibility="9.2.001" expanded="true" height="82" name="Neural Net" width="90" x="581" y="136">
    <list key="hidden_layers"/>
    <parameter key="training_cycles" value="200"/>
    <parameter key="learning_rate" value="0.01"/>
    <parameter key="momentum" value="0.9"/>
    <parameter key="decay" value="false"/>
    <parameter key="shuffle" value="true"/>
    <parameter key="normalize" value="true"/>
    <parameter key="error_epsilon" value="1.0E-4"/>
    <parameter key="use_local_random_seed" value="false"/>
    <parameter key="local_random_seed" value="1992"/>
    </operator>
    <operator activated="true" class="group_models" compatibility="9.2.001" expanded="true" height="103" name="Group Models" width="90" x="715" y="340"/>
    <operator activated="true" class="apply_model" compatibility="9.2.001" expanded="true" height="82" name="Apply Model" width="90" x="715" y="34">
    <list key="application_parameters"/>
    <parameter key="create_view" value="false"/>
    </operator>
    <operator activated="true" class="performance_classification" compatibility="9.2.001" expanded="true" height="82" name="Performance" width="90" x="715" y="136">
    <parameter key="main_criterion" value="first"/>
    <parameter key="accuracy" value="true"/>
    <parameter key="classification_error" value="false"/>
    <parameter key="kappa" value="false"/>
    <parameter key="weighted_mean_recall" value="false"/>
    <parameter key="weighted_mean_precision" value="false"/>
    <parameter key="spearman_rho" value="false"/>
    <parameter key="kendall_tau" value="false"/>
    <parameter key="absolute_error" value="false"/>
    <parameter key="relative_error" value="false"/>
    <parameter key="relative_error_lenient" value="false"/>
    <parameter key="relative_error_strict" value="false"/>
    <parameter key="normalized_absolute_error" value="false"/>
    <parameter key="root_mean_squared_error" value="false"/>
    <parameter key="root_relative_squared_error" value="false"/>
    <parameter key="squared_error" value="false"/>
    <parameter key="correlation" value="false"/>
    <parameter key="squared_correlation" value="false"/>
    <parameter key="cross-entropy" value="false"/>
    <parameter key="margin" value="false"/>
    <parameter key="soft_margin_loss" value="false"/>
    <parameter key="logistic_loss" value="false"/>
    <parameter key="skip_undefined_labels" value="true"/>
    <parameter key="use_example_weights" value="true"/>
    <list key="class_weights"/>
    </operator>
    <connect from_op="Retrieve Titanic Training" from_port="output" to_op="Nominal to Numerical" to_port="example set input"/>
    <connect from_op="Nominal to Numerical" from_port="example set output" to_op="Split Data" to_port="example set"/>
    <connect from_op="Split Data" from_port="partition 1" to_op="Multiply" to_port="input"/>
    <connect from_op="Split Data" from_port="partition 2" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Multiply" from_port="output 1" to_op="Neural Net" to_port="training set"/>
    <connect from_op="Multiply" from_port="output 2" to_op="Deep Learning" to_port="training set"/>
    <connect from_op="Deep Learning" from_port="model" to_op="Group Models" to_port="models in 2"/>
    <connect from_op="Neural Net" from_port="model" to_op="Group Models" to_port="models in 1"/>
    <connect from_op="Group Models" from_port="model out" to_op="Apply Model" to_port="model"/>
    <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
    <connect from_op="Performance" from_port="performance" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>

    Sample screenshot


    Hope this helps.
    User: "[Deleted User]"
    New Altair Community Member
    OP
    Updated by [Deleted User]
    @varunm1
    This is XML code.
    How can I import it into my RapidMiner to see the result?
    User: "varunm1"
    New Altair Community Member
    Accepted Answer
    Open a new process, copy this code, and paste it into the XML panel (View --> Show Panel --> XML), then click the green check mark. You can then see the process and run it. I am also attaching the process for importing, but try the XML first so that you get familiar with the different ways of using RapidMiner.
    User: "[Deleted User]"
    New Altair Community Member
    OP
    I can see the screenshot now.
    Perfect, great :)
    User: "[Deleted User]"
    New Altair Community Member
    OP
    Updated by [Deleted User]
    @varunm1
    Mine doesn't work :'(
    It needs a lot of extra operators.
    How can I import your XML code?
    User: "varunm1"
    New Altair Community Member
    Accepted Answer
    @mbs

    Open a new process, copy this code, and paste it into the XML panel (View --> Show Panel --> XML), then click the green check mark. You can then see the process and run it. I am also attaching the .rmp process for importing (File --> Import Process), but try the XML first so that you get familiar with the different ways of using RapidMiner.
    User: "[Deleted User]"
    New Altair Community Member
    OP
    Yes, I will do that <3
    User: "[Deleted User]"
    New Altair Community Member
    OP
    @varunm1
    I copied your XML and changed the data, but I don't know why I am again losing the last column of my data, which is my label :'(
    Anyway @varunm1, you and your group at RapidMiner are brilliant.
    User: "varunm1"
    New Altair Community Member
    Accepted Answer
    Updated by varunm1
    Did you check whether the Read Excel operator is actually bringing your label column into RapidMiner? As I said earlier, you can check by disabling all other operators and connecting Read Excel to the output. If there is an issue, try importing the file into the RM repository and check there. It might be a simple mistake.

    @mbs, one more thing: you should use Set Role to mark your label attribute after Read Excel, since you are splitting the data into train and test sets.
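    If it helps to double-check outside RapidMiner, here is a small pandas sketch; the file name and the "label" column name are hypothetical placeholders for your own.

    # Quick sanity check that the Excel file really contains the label column
    # and has no unexpected missing values. Names below are placeholders.
    import pandas as pd

    df = pd.read_excel("my_data.xlsx")

    print(df.columns.tolist())    # is the label column listed?
    print(df["label"].nunique())  # how many distinct classes does it contain?
    print(df.isna().sum())        # missing values per column
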
    User: "[Deleted User]"
    New Altair Community Member
    OP
    Updated by [Deleted User]
    @varunm1
    I will check all the points that you mentioned again, but unfortunately this is a bug in RM that some of my friends and I have discussed before. If you look at this link you will understand the problem.
    https://community.rapidminer.com/discussion/54055/read-excel-via-the-import-configuration-wizard-wont-work#latest
     :) 
    User: "[Deleted User]"
    New Altair Community Member
    OP
    Updated by [Deleted User]
    @varunm1 
    Finally I solved the problem. Following the link I sent before, I copied the Excel file, put it in a new folder, renamed it, then used Read Excel to load the data from my desktop, and it works :)
    But I did it with your XML code and just changed the data.
    Thanks <3
    User: "[Deleted User]"
    New Altair Community Member
    OP
    The accuracy of the combination is 95%, but when I used the neural network or deep learning alone, the accuracy of each one was 99.65%.
    @varunm1, do you know why the result of the combination is not as good as each single model on its own?
    User: "varunm1"
    New Altair Community Member
    Accepted Answer
    There can be many reasons:

    1. How are you splitting the data? If it is a random split, are you setting the local random seed parameter so that you get the same train and test sets every time?
    2. Your algorithm might be overfitting due to a more complex network, which can happen with a neural net + deep learning combination.
    3. Is your dataset balanced (a similar number of samples for each output label)? If not, accuracy is not a good performance measure.

    For reason 1, I recommend you use cross-validation instead of randomly splitting into train and test datasets. This will give you much more reliable results.
    For reason 2, start with smaller networks and then build more complex networks based on the data and test performance. There is no point in building networks with more hidden layers when a simple neural network can achieve your task.
    For reason 3, use AUC and kappa values as performance metrics instead of accuracy (see the sketch after this list).
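    Here is a rough scikit-learn sketch of those recommendations combined: stratified cross-validation scored with kappa and AUC instead of a single random split and plain accuracy. It runs on synthetic, imbalanced data because I don't have your dataset, and the classifier settings are illustrative assumptions.

    # Sketch: stratified cross-validation scored with Cohen's kappa and AUC.
    # Synthetic imbalanced data stands in for the real example set.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import StratifiedKFold
    from sklearn.neural_network import MLPClassifier
    from sklearn.metrics import cohen_kappa_score, roc_auc_score

    X, y = make_classification(n_samples=500, n_features=20,
                               weights=[0.8, 0.2], random_state=0)

    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=1992)
    kappas, aucs = [], []
    for train_idx, test_idx in skf.split(X, y):
        clf = MLPClassifier(hidden_layer_sizes=(50,), max_iter=500, random_state=1992)
        clf.fit(X[train_idx], y[train_idx])
        pred = clf.predict(X[test_idx])
        proba = clf.predict_proba(X[test_idx])[:, 1]
        kappas.append(cohen_kappa_score(y[test_idx], pred))
        aucs.append(roc_auc_score(y[test_idx], proba))

    print("mean kappa:", np.mean(kappas), "mean AUC:", np.mean(aucs))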

    User: "[Deleted User]"
    New Altair Community Member
    OP
    Updated by [Deleted User]
    Question 1: 0.7 and 0.3 with Split Data.
    Question 2: yes, you're right.
    Question 3: no, it's not balanced.
    Reason 1: cross-validation doesn't work in my RM. One point: I have only one sample for some of my labels, so I think cross-validation is not a good fit.
    Reason 2: sorry, I can't understand this; please explain it more.
    Reason 3: I have heard of AUC and read some papers about it, but it was not clear, and I don't have any information about kappa, so please explain them.
    @varunm1, thank you for the time you spend on my questions <3
     
    User: "varunm1"
    New Altair Community Member
    Accepted Answer
    For question 1 and reason 1: even if you use split validation, a label that has only one sample can end up entirely in either the training or the testing dataset, so I don't understand why a class with a single sample is in the dataset at all.

    The reason I am saying this is that your single sample can be either in the training or the test dataset. If it is in training, it is never available in the test dataset to check performance. If it is in the test set, it was never seen during training, which means it will be predicted wrongly every time because its label did not exist while the algorithm trained. So I think labels with single samples might not be useful for measuring performance. You can use cross-validation with stratified sampling, which tries to keep samples from every class in all subsets.

    If you use split validation, you should check "use local random seed" in the parameters of Split Data. This will always create the same train and test subsets even if you use different models.

    Reason 2: Complex algorithms sometimes overfit (it depends on the data). A deep learning algorithm is one with more hidden layers. What I am suggesting is to first train and test a model with a single hidden layer and note performance metrics like accuracy, kappa, etc. Then build another model with more hidden layers and compare the performances. If your simple model gives the best performance, there is no need for a complex model with multiple hidden layers.

    Reason 3: A kappa value lies between -1 and 1. A positive kappa between 0 and 1 is better the higher it is. A negative kappa between -1 and 0 means your algorithm is predicting exactly the opposite classes. For example, suppose you have 20 samples, 10 labeled male and 10 labeled female. A kappa of zero means your algorithm is doing no better than chance, for instance predicting all 20 samples as either male or female. A negative kappa means your algorithm is predicting the opposite classes: male samples are predicted as female and female samples as male. A positive kappa means it is predicting the correct classes better than chance, and the higher the kappa, the better the predictions.
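    To make the kappa ranges concrete, here is a tiny scikit-learn check of the hypothetical 20-sample male/female example above:

    # Illustration of the kappa ranges described above, using the
    # hypothetical 20-sample male/female example.
    from sklearn.metrics import cohen_kappa_score

    y_true = ["male"] * 10 + ["female"] * 10

    perfect  = list(y_true)                      # every label correct
    all_male = ["male"] * 20                     # one class predicted for everything
    flipped  = ["female"] * 10 + ["male"] * 10   # every label swapped

    print(cohen_kappa_score(y_true, perfect))    #  1.0 -> perfect agreement
    print(cohen_kappa_score(y_true, all_male))   #  0.0 -> no better than chance
    print(cohen_kappa_score(y_true, flipped))    # -1.0 -> exactly opposite classes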

    Hope this helps.

    User: "[Deleted User]"
    New Altair Community Member
    OP
    Updated by [Deleted User]
    varunm1,
    Your explanation was perfect.
    This is my thesis. I downloaded around 2000 records for it; the data was unlabeled, so I added labels and made it a supervised problem. At first I was going to do data mining on it, but then I changed it to machine learning.
    The point is that I load all the labeled data into RM at once and just divide it with Split Data. Is that correct?
    Thank you
    Regards
    mbs

    User: "varunm1"
    New Altair Community Member
    Accepted Answer
    Your process seems correct, but instead of Split Data try using cross-validation. You can use either, but try these things as well so that you can defend your thesis well; you might get questions like "How reliable are your performance estimates?"
    User: "[Deleted User]"
    New Altair Community Member
    OP
    Updated by [Deleted User]
    With kappa the result is 0.995,
    and I also try different algorithms, or combinations of them, in order to compare the results.
    If you think having only one sample for some labels is wrong, I will delete those labels.
    User: "[Deleted User]"
    New Altair Community Member
    OP
    Please look at the process <3

    User: "varunm1"
    New Altair Community Member
    Accepted Answer
    From my understanding, you can remove the labels that have only a single sample. I am not completely sure about this, as it is the first time I have come across a label with only one sample.
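    If you want to try that outside RapidMiner first, here is a small pandas sketch; again, the file and column names are hypothetical placeholders.

    # Sketch: drop classes that have only a single sample before splitting.
    # "my_data.xlsx" and "label" are placeholder names.
    import pandas as pd

    df = pd.read_excel("my_data.xlsx")

    counts = df["label"].value_counts()
    keep = counts[counts >= 2].index              # classes with at least two samples
    df_filtered = df[df["label"].isin(keep)]

    print("removed classes:", sorted(set(counts.index) - set(keep)))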