How to select two attributes from ExampleSet?

gaoxiaolei
gaoxiaolei New Altair Community Member
edited November 5 in Community Q&A
Hi, everyone.
I am a newbie here. I have a question about how to select two attributes from an ExampleSet.
Suppose I have an ExampleSet that contains both regular attributes and a label attribute. I want to select the first and the third regular attributes from it, then convert the new ExampleSet to double[][]. How should I do it? Can the class AttributeSelectionExampleSet do it?
Thanks.
gaoxiaolei

Answers

  • wessel
    wessel New Altair Community Member
    Pressing F1 on the "Select Attribute"-operator gives:

    Select Attributes

    Synopsis
    This operator allows you to select which attributes should be part of the resulting

    Description
    This operator selects which attributes of an ExampleSet should be kept and which are removed. Therefore, different filter types may be selected in the parameter attribute filter type and only attributes fulfilling this condition type are selected. The rest will be removed from the ExampleSet. There's a global switch to invert the outcome, so that all attributes which would have been originally discarded will be kept and vice versa. To invert the decision, use the invert selection parameter.
    These types are available
    all: Will simply select each attribute
    single: This will allow you to select a single attribute name. It might be selected from the drop down box of parameter attribute if the meta data is known
    subset: Lets you choose a number of attributes from a list. This will not work if no meta data is present. Each known attribute is shown in the list and might be selected.
    regular_expression: This lets you specify a regular expression. Each attribute whose name matches this expression will be selected. Regular expressions are a very powerful tool but need a detailed explanation for beginners. Please refer to one of the several tutorials available on the internet for a more detailed description.
    value_type: Select only attributes of a certain type. Please note that the types are hierarchical: for example, binominal attributes are nominal, as are polynominal attributes.
    block_type: Similar to value_type, this lets you select the attributes depending on their block type.
    no_missing_values: Will select all attributes which don't contain a missing value in any example.
    numeric_value_filter: This will select the attributes whose values match this condition in every example, as well as all attributes that are not numerical. The numeric condition might be specified by typing a numerical condition. For example, the parameter string "> 6" will keep all nominal attributes and all numeric attributes having a value greater than 6 in every example. A combination of conditions is possible: "> 6 && < 11" or "<= 5 || < 0". But && and || must not be mixed.
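
The numeric_value_filter rule described above can be illustrated with a small standalone sketch. It is not RapidMiner's implementation: the class and method names are hypothetical, only the operators >, <, >= and <= are handled, and it covers numeric columns only (the help text says non-numeric attributes are kept regardless).

```java
import java.util.Arrays;
import java.util.function.DoublePredicate;

final class NumericConditionSketch {

    // Reads the number that follows an operator of the given length.
    private static double bound(String s, int operatorLength) {
        return Double.parseDouble(s.substring(operatorLength).trim());
    }

    // Turns one comparison such as "> 6" or "<= 5" into a predicate.
    static DoublePredicate parseComparison(String text) {
        String s = text.trim();
        // Two-character operators are checked before one-character ones.
        if (s.startsWith(">=")) { double b = bound(s, 2); return v -> v >= b; }
        if (s.startsWith("<=")) { double b = bound(s, 2); return v -> v <= b; }
        if (s.startsWith(">"))  { double b = bound(s, 1); return v -> v > b; }
        if (s.startsWith("<"))  { double b = bound(s, 1); return v -> v < b; }
        throw new IllegalArgumentException("Unsupported comparison: " + text);
    }

    // Parses comparisons joined by either "&&" or "||"; mixing both is rejected.
    static DoublePredicate parseCondition(String condition) {
        boolean hasAnd = condition.contains("&&");
        boolean hasOr = condition.contains("||");
        if (hasAnd && hasOr) {
            throw new IllegalArgumentException("&& and || must not be mixed");
        }
        String[] parts = condition.split(hasAnd ? "&&" : "\\|\\|");
        DoublePredicate result = parseComparison(parts[0]);
        for (int k = 1; k < parts.length; k++) {
            DoublePredicate next = parseComparison(parts[k]);
            result = hasAnd ? result.and(next) : result.or(next);
        }
        return result;
    }

    // A numeric attribute (one column of values) is kept only if every value matches.
    static boolean keepAttribute(double[] values, String condition) {
        return Arrays.stream(values).allMatch(parseCondition(condition));
    }
}
```

With this sketch, keepAttribute(new double[] {7, 8, 10}, "> 6 && < 11") keeps the column, while a column containing 12 would be dropped.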

    Input
    example set input: expects: ExampleSetMetaData: #examples: = 0; #attributes: 0

    Output
    example set output:
    original:

    Parameters
    attribute filter type: The condition specifies which attributes are selected or affected by this operator. Range: all, single, subset, regular_expression, value_type, block_type, no_missing_values, numeric_value_filter; default: all
    attribute: The attribute which should be chosen. Range: string
    attributes: The attributes which should be chosen. Range: string
    regular expression: A regular expression for the names of the attributes which should be kept. Range: string
    use except expression: If enabled, an exception to the specified regular expression might be specified. Attributes matching this will be filtered out, although matching the first expression. Range: boolean; default: false
    except regular expression: A regular expression for the names of the attributes which should be filtered out although matching the above regular expression. Range: string
    value type: The value type of the attributes. Range: attribute_value, nominal, numeric, integer, real, text, binominal, polynominal, file_path, date_time, date, time; default: attribute_value
    use value type exception: If enabled, an exception to the specified value type might be specified. Attributes of this type will be filtered out, although matching the first specified type. Range: boolean; default: false
    except value type: Except this value type. Range: attribute_value, nominal, numeric, integer, real, text, binominal, polynominal, file_path, date_time, date, time; default: time
    block type: The block type of the attributes. Range: attribute_block, single_value, value_series, value_series_start, value_series_end, value_matrix, value_matrix_start, value_matrix_end, value_matrix_row_start; default: attribute_block
    use block type exception: If enabled, an exception to the specified block type might be specified. Range: boolean; default: false
    except block type: Except this block type. Range: attribute_block, single_value, value_series, value_series_start, value_series_end, value_matrix, value_matrix_start, value_matrix_end, value_matrix_row_start; default: value_matrix_row_start
    numeric condition: Parameter string for the condition, e.g. '>= 5' Range: string
    invert selection: Indicates if only attributes should be accepted which would normally be filtered out. Range: boolean; default: false
    include special attributes: Indicate if this operator should also be applied on the special attributes. Otherwise they are always kept. Range: boolean; default: false
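
As a rough illustration of how the regular expression, except regular expression and invert selection parameters interact, here is a standalone sketch over plain attribute names. It is not the operator's actual code: the class and method names are hypothetical, and it assumes the expression must match the whole attribute name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class RegexAttributeFilterSketch {

    // Returns the names that survive the filter.
    // exceptRegex may be null, meaning "use except expression" is disabled.
    static List<String> select(List<String> names, String regex,
                               String exceptRegex, boolean invertSelection) {
        Pattern keep = Pattern.compile(regex);
        Pattern except = exceptRegex == null ? null : Pattern.compile(exceptRegex);
        List<String> selected = new ArrayList<>();
        for (String name : names) {
            // Assumption: the whole name must match, hence matches() rather than find().
            boolean matches = keep.matcher(name).matches()
                    && (except == null || !except.matcher(name).matches());
            // Inverting the selection swaps kept and discarded attributes.
            if (matches != invertSelection) {
                selected.add(name);
            }
        }
        return selected;
    }
}
```

For the names a1, a2, a3 and label_x, the expression a\d with the exception a2 would keep a1 and a3; with invert selection enabled it would keep a2 and label_x instead.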
  • gaoxiaolei
    gaoxiaolei New Altair Community Member
    Thank you for your reply, wessel.
    But I want to select two attributes inside my own operator. If there are many attributes in an ExampleSet, in the first loop iteration the first and the second attributes are selected, in the second iteration the first and the third attributes are selected, and so on.
    So could you show me some code? I do not know whether the "Select Attributes" operator can do this job.
    Thanks again!
  • wessel
    wessel New Altair Community Member
    What do you mean by "codes"?
    Do you mean the Script operator?
    Sure, you can code it yourself, but why?

    What exactly do you want to achieve?
    Do you want to find good attribute subsets?
    Like sets containing two attributes?

    The Forward Selection operator can do that.
    "This operator starts with an empty selection of attributes and, in each round, it adds each unused attribute of the given set of examples. For each added attribute, the performance is estimated using inner operators, e.g. a cross-validation. Only the attribute giving the highest increase of performance is added to the selection. Then a new round is started with the modified selection. "

    Or the Optimize Selection (Brute Force) operator.
    Selects the best features for an example set by trying all possible combinations of attribute selections.
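
The greedy loop described in the Forward Selection quote can be sketched in plain Java as follows. This is only an illustration, not the operator's code: attributes are represented by their indices, the performance function is a hypothetical stand-in for the inner operators (e.g. a cross-validation), and stopping as soon as no attribute increases the performance is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class ForwardSelectionSketch {

    // Greedily grows a selection of attribute indices, one attribute per round.
    static List<Integer> forwardSelection(int attributeCount,
                                          ToDoubleFunction<List<Integer>> performance) {
        List<Integer> selected = new ArrayList<>();
        double bestSoFar = Double.NEGATIVE_INFINITY;
        while (selected.size() < attributeCount) {
            int bestAttribute = -1;
            double bestScore = bestSoFar;
            // Try adding each unused attribute to the current selection.
            for (int a = 0; a < attributeCount; a++) {
                if (selected.contains(a)) {
                    continue;
                }
                List<Integer> candidate = new ArrayList<>(selected);
                candidate.add(a);
                double score = performance.applyAsDouble(candidate);
                if (score > bestScore) {
                    bestScore = score;
                    bestAttribute = a;
                }
            }
            if (bestAttribute < 0) {
                break; // assumption: stop when no attribute increases the performance
            }
            // Only the attribute with the highest increase is added this round.
            selected.add(bestAttribute);
            bestSoFar = bestScore;
        }
        return selected;
    }
}
```

The brute-force alternative would instead evaluate every subset, which grows exponentially with the number of attributes.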
  • gaoxiaolei
    gaoxiaolei New Altair Community Member
    By "codes" I mean the Java code I am writing while developing a new operator.
    Here is part of my code:

    int attributeCount = com.rapidminer.example.Tools.getRegularAttributeNames(exampleSet).length;
    // Visit every unordered pair (i, j) of regular attributes with i < j.
    for (int i = 0; i < attributeCount; i++) {
        for (int j = i + 1; j < attributeCount; j++) {
            GmdhNode firstLayerNode = new GmdhNode();
            // My question: how do I use the i-th and the j-th attributes to form another ExampleSet, or convert these two attributes to double[][]?
            firstLayerNode.setNodeID(i);
            firstLayer.addNode(firstLayerNode);
        }
    }
    Thank you, wessel!
  • wessel
    wessel New Altair Community Member
    Oh heavens, I am sorry, totally misunderstood your question.
    And now that I do understand your question, I don't know the answer.

    Maybe you can steal some code from here:
    http://www.opensourcejavaphp.net/java/rapidminer/com/rapidminer/operator/learner/PredictionModel.java.html
    http://www.opensourcejavaphp.net/java/rapidminer/com/rapidminer/operator/learner/functions/neuralnet/SimpleNeuralNetLearner.java.html

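    import com.rapidminer.operator.Operator;
    import com.rapidminer.Process;
    import com.rapidminer.MacroHandler;
    import com.rapidminer.tools.Ontology;
    import com.rapidminer.example.Attribute;
    import com.rapidminer.example.Example;
    import com.rapidminer.example.ExampleSet;
    import com.rapidminer.example.table.AttributeFactory;

    ExampleSet exampleSet = operator.getInput(ExampleSet.class);

    // Create a new real-valued attribute "sum" and register it as a regular attribute
    Attribute sum = AttributeFactory.createAttribute("sum", Ontology.REAL);
    exampleSet.getExampleTable().addAttribute(sum);
    exampleSet.getAttributes().addRegular(sum);

    // Same for "avg"
    Attribute avg = AttributeFactory.createAttribute("avg", Ontology.REAL);
    exampleSet.getExampleTable().addAttribute(avg);
    exampleSet.getAttributes().addRegular(avg);

    double last = 0; // running sum of the "abs" attribute
    int n = 0;       // number of examples seen so far

    for (Example e : exampleSet) {
        e["sum"] = e["abs"] + last; // access a value using the name of the attribute
        last = e["sum"];
        n++;
        e["avg"] = last / n; // running average
    }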


    Best regards,

    Wessel
  • gaoxiaolei
    gaoxiaolei New Altair Community Member
    Thanks, but I still cannot find any clues there.
  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    I hope I understood your problem correctly:
    You can get all regular attributes from an ExampleSet by calling

    exampleSet.getAttributes().regularAttributes();
    After selecting your desired attributes, you can use them to access the data via this code fragment:

    // i goes from 0 to exampleSet.size() - 1
    exampleSet.getExampleTable().getDataRow(i).get(attribute);
    This will return the double value for the given example and attribute (note that for nominal values this will return the internal mapping).

    Now you have double values for all examples and your desired attributes and you can use them to fill your own double[][] array.
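
The last step, filling a double[][] for each pair of attributes, might look like the sketch below. It uses plain Java only and assumes the values have already been read into a rows-by-attributes matrix (for example with the getDataRow(i).get(attribute) call above); the class and method names are hypothetical.

```java
import java.util.Arrays;

final class AttributePairSketch {

    // Copies the two chosen columns of every row into a new rows x 2 matrix.
    static double[][] selectPair(double[][] rows, int first, int second) {
        double[][] pair = new double[rows.length][2];
        for (int r = 0; r < rows.length; r++) {
            pair[r][0] = rows[r][first];
            pair[r][1] = rows[r][second];
        }
        return pair;
    }

    public static void main(String[] args) {
        // Two examples with three regular attributes each (made-up values).
        double[][] rows = {{1, 2, 3}, {4, 5, 6}};
        int attributeCount = rows[0].length;
        // Enumerate every pair (i, j) with i < j, as in the nested loop of the question.
        for (int i = 0; i < attributeCount; i++) {
            for (int j = i + 1; j < attributeCount; j++) {
                double[][] pair = selectPair(rows, i, j);
                System.out.println(i + "," + j + " -> " + Arrays.deepToString(pair));
            }
        }
    }
}
```

Copying the values once into a matrix and slicing it per pair avoids re-reading the ExampleSet inside the nested loop.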

    Regards,
    Marco
  • gaoxiaolei
    gaoxiaolei New Altair Community Member
    Hi, Marco Boeck.

    You have indeed given me some ideas!

    Thank you!

    gaoxiaolei