Creating new attributes / Custom Example Set

cherokee
edited November 5 in Community Q&A
Hi!

I'm currently writing my own set of operators. One task is to create a new attribute. I tried to copy this process from AttributeConstruction / ExpressionParser, but it doesn't work. When running my operator in RapidMiner I get the error "ArrayIndexOutOfBoundsException occured in 1st application of Add Point Mapping (Add Point Mapping)".

I'm using this code (minimal example):
public IOObject[] apply() throws OperatorException {
    PointExampleSet exampleSet = getInput(PointExampleSet.class);

    Attribute xAttr = AttributeFactory.createAttribute("x1", Ontology.REAL);
    exampleSet.getExampleTable().addAttribute(xAttr);
    exampleSet.getAttributes().addRegular(xAttr);

    for (Example example : exampleSet) {
        example.setValue(xAttr, 0);
    }
    return new IOObject[]{exampleSet};
}
For my project I had to extend ExampleSet, but I didn't override any method. What am I doing wrong?

[Edit:] As a workaround I gave AttributeConstruction a try and got the same error. So the problem seems to be in my custom ExampleSet. But I haven't overridden anything there. *confused*

I've written a reader that creates my PointExampleSet from one or more files. For that I'm using a MemoryExampleTable.

Best regards,
Michael

Answers

  • cherokee
    Here is the code for my Source Operator without logging:
    public IOObject[] apply() throws OperatorException {
        File sourceDir = getParameterAsFile("directory_to_load_from");
        File[] allFiles = sourceDir.listFiles();

        Pattern filePattern = Pattern.compile(getParameterAsString("only_files"));

        MemoryExampleTable data = new MemoryExampleTable(new ArrayList<Attribute>());
        data.addAttribute(AttributeFactory.createAttribute("id", Ontology.STRING));

        int maxPoints = 0;
        for (File source : allFiles) {
            if (filePattern.matcher(source.getName()).matches()) {
                maxPoints = Math.max(maxPoints, addPTSData(data, source));
            }
        }

        PointExampleSet exSet = new PointExampleSet(data);
        // add point mappings
        int pointMaximum = -1;
        String shapeName = "point";
        int shapeNameIndex = -1;
        int sequentialNumber = 0;
        String xSuffix = getParameterAsString("x_value_suffix");
        String ySuffix = getParameterAsString("y_value_suffix");
        List<String[]> shapeNames = getParameterList("shape_names");

        for (int pointNo = 0; pointNo < maxPoints; pointNo++) {
            // check if attribute is present; if not, create it

            if (sequentialNumber >= pointMaximum) {
                shapeNameIndex++;
                if (shapeNameIndex >= shapeNames.size()) {
                    shapeNameIndex = shapeNames.size() - 1;
                } else {
                    shapeName = shapeNames.get(shapeNameIndex)[0];
                    pointMaximum = Integer.parseInt(shapeNames.get(shapeNameIndex)[1]);
                    sequentialNumber = 0;
                }
            }
            String nextPointName = shapeName + sequentialNumber++;
            Attribute xAttr = data.findAttribute(nextPointName + xSuffix);
            Attribute yAttr = data.findAttribute(nextPointName + ySuffix);

            exSet.addPointMapping(nextPointName, xAttr, yAttr);
        }

        return new IOObject[]{exSet};
    }
  • TobiasMalbrecht
    Hi,

    Did you add the data via the addDataRow methods in MemoryExampleTable? If you just create an example set from a newly created MemoryExampleTable, there won't be any data. So the common way to create an example set is to create a MemoryExampleTable, add the data using the addDataRow methods, and only then create the example set.

    Kind regards,
    Tobias
  • cherokee
    Yes,
    I have done it this way. I've used the following code (excerpt):
    private int addPTSData(MemoryExampleTable table, File ptsFile)
            throws OperatorException {
        // Variables for naming attributes
        ...

        List<Point2D> points = loadPTS(ptsFile); // actual file handling; just returns a list of 2D points

        int arraySize = points.size() * 2 + 1;
        double[] pointData = new double[arraySize];

        Attribute idAtt = table.findAttribute("id");
        String id = ptsFile.getName();

        if (idAtt != null) {
            NominalMapping idMap = idAtt.getMapping();

            if (idMap instanceof PolynominalMapping) {
                pointData[0] = ((PolynominalMapping) idMap).mapString(id);
            }
        }

        int i = 0;
        for (; i < points.size(); i++) {
            // save point data in array
            Point2D sp = points.get(i);
            pointData[2 * i + 1] = sp.getX();
            pointData[2 * i + 2] = sp.getY();

            ... // some code for creating attributes according to the number of points given
            }
        }

        table.addDataRow(new DoubleArrayDataRow(pointData));
        return i;
    }
  • cherokee
    So, I think I came a bit closer to the error.

    I get the ArrayIndexOutOfBoundsException when I try to access xAttr or yAttr in an Example.

    As the basis for the ExampleTable I've used DoubleArrayDataRow with a simple double array as the constructor argument. Might that be the problem? If so, what else should I use?

    Greetings,
    Michael
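
One way such an exception can arise, sketched in plain Java without any RapidMiner classes: assume a data row is backed by a plain double[] and each attribute reads the slot at its own column index (an assumption suggested by the name DoubleArrayDataRow, not verified against the RapidMiner sources). A row sized for a file with few points is then too short for attributes created later for a file with more points. `RowLengthSketch` and all of its methods are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a table whose rows are plain double[] arrays.
final class RowLengthSketch {
    private final List<String> columns = new ArrayList<>();
    private final List<double[]> rows = new ArrayList<>();

    // Registers a column and returns its index. Existing rows are NOT widened.
    int addColumn(String name) {
        columns.add(name);
        return columns.size() - 1;
    }

    // Stores the array as given, whatever its length.
    void addRow(double[] row) {
        rows.add(row);
    }

    // Reads one cell; throws if the row is shorter than the column index requires.
    double get(int rowIndex, int columnIndex) {
        return rows.get(rowIndex)[columnIndex];
    }

    // Returns true if reading the cell throws ArrayIndexOutOfBoundsException.
    static boolean readFails(RowLengthSketch table, int rowIndex, int columnIndex) {
        try {
            table.get(rowIndex, columnIndex);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        RowLengthSketch table = new RowLengthSketch();
        table.addColumn("id");
        table.addColumn("point0_x");
        table.addColumn("point0_y");
        table.addRow(new double[] {0, 1.0, 2.0});           // file with one point

        int x1 = table.addColumn("point1_x");               // created for a later file
        table.addColumn("point1_y");
        table.addRow(new double[] {1, 3.0, 4.0, 5.0, 6.0}); // file with two points

        System.out.println("row 1 fails: " + readFails(table, 1, x1));
        System.out.println("row 0 fails: " + readFails(table, 0, x1));
    }
}
```

In this model the second row can be read at the new column, while the first row cannot, because its array was allocated before the column existed.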
  • land
    Hi Michael,
    I think you have somehow rotated your data. In our terminology each point will be one example and hence one data row. So I think your data row should have 2 dimensions (x and y attributes) plus a number for special attributes like label etc.

    An easy way to get the correct number of attributes is as follows:
    1. Create your Attributes using the AttributeFactory
    2. Add all these attributes into the ExampleTable
    3. Create data rows using a DataRowFactory with exampleTable.getAttributeCount as length
    4. Set the values in the datarow using dataRow.setValue(attribute, value) for each attribute you created

    This should work,

    Greetings,
    Sebastian
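
The ordering in the four steps above can be sketched in plain Java. This is only a model of the idea, not RapidMiner code: `ColumnsFirstTable` and its methods are hypothetical stand-ins for the example table, the attributes and the data rows, and NaN is used here as a placeholder for "not set yet".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: all columns are registered before the first row is created.
final class ColumnsFirstTable {
    private final Map<String, Integer> columnIndex = new LinkedHashMap<>();
    private final List<double[]> rows = new ArrayList<>();

    // Steps 1 and 2: create a column and add it to the table.
    void addColumn(String name) {
        if (!rows.isEmpty()) {
            throw new IllegalStateException("add all columns before the first row");
        }
        if (columnIndex.containsKey(name)) {
            throw new IllegalArgumentException("duplicate column: " + name);
        }
        columnIndex.put(name, columnIndex.size());
    }

    // Step 3: create a row whose length is the table's column count.
    int newRow() {
        double[] row = new double[columnIndex.size()];
        Arrays.fill(row, Double.NaN);
        rows.add(row);
        return rows.size() - 1;
    }

    // Step 4: set a value by column rather than by a hand-computed array index.
    void setValue(int row, String column, double value) {
        rows.get(row)[indexOf(column)] = value;
    }

    double getValue(int row, String column) {
        return rows.get(row)[indexOf(column)];
    }

    int rowLength(int row) {
        return rows.get(row).length;
    }

    private int indexOf(String column) {
        Integer index = columnIndex.get(column);
        if (index == null) {
            throw new IllegalArgumentException("unknown column: " + column);
        }
        return index;
    }

    public static void main(String[] args) {
        ColumnsFirstTable table = new ColumnsFirstTable();
        table.addColumn("x");
        table.addColumn("y");
        table.addColumn("label");
        int row = table.newRow();
        table.setValue(row, "x", 1.5);
        table.setValue(row, "y", -2.0);
        System.out.println(table.rowLength(row) + " columns, x = " + table.getValue(row, "x"));
    }
}
```

Because every row takes its length from the table, no row can end up shorter than the number of columns.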
  • cherokee
    land wrote:

    I think you have somehow rotated your data. In our terminology each point will be one example and hence one data row. So I think your data row should have 2 dimensions (x and y attributes) plus a number for special attributes like label etc.
    Well, now it gets complicated. I think I've done it the way you describe, but I'm dealing with a bunch of points as one example; they are reference points on a face. So one of your points (an example) consists of many of my points. The reason I extended ExampleSet was to add a point mapping, so I can say
    myExample.getPointValue(pointName)
    and get a Point2D. I'm going to use this for preprocessing.

    For example, I want to calculate the center ("Am") of two points ("A1" and "A2"). To do that I have to add two new attributes to an example (say "Am_x" and "Am_y") and add a mapping in my example set ("Am" --> "Am_x", "Am_y"). The first part is where I have trouble.

    An easy way to get the correct number of attributes is as follows:
    1. Create your Attributes using the AttributeFactory
    2. Add all these attributes into the ExampleTable
    3. Create data rows using a DataRowFactory with exampleTable.getAttributeCount as length
    4. Set the values in the datarow using dataRow.setValue(attribute, value) for each attribute you created
    This sounds good. But to keep my approach flexible, I don't know a priori how many attributes I will need, because I don't know how many points are stored in each file. Maybe I'll find a way to use this anyway.

    Best regards,
    Michael
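
If the number of points per file is not known in advance, one option is to read in two passes: first find the largest point count, then build every row at that full width. The sketch below is plain Java and not RapidMiner code. `TwoPassRows` and `buildRows` are hypothetical names, each point is a plain {x, y} array instead of a Point2D, the file index stands in for the mapped id, and NaN as a filler for missing points is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: size every row for the file with the most points.
final class TwoPassRows {

    // files: one entry per file, each holding that file's points as {x, y} arrays.
    static List<double[]> buildRows(List<List<double[]>> files) {
        // Pass 1: find the largest number of points in any file.
        int maxPoints = 0;
        for (List<double[]> points : files) {
            maxPoints = Math.max(maxPoints, points.size());
        }

        // Slot 0 holds the id, followed by an x and a y slot per point.
        int width = 1 + 2 * maxPoints;

        // Pass 2: build one full-width row per file, padding unused slots with NaN.
        List<double[]> rows = new ArrayList<>();
        for (int f = 0; f < files.size(); f++) {
            double[] row = new double[width];
            Arrays.fill(row, Double.NaN);
            row[0] = f; // stand-in for the mapped id value
            List<double[]> points = files.get(f);
            for (int i = 0; i < points.size(); i++) {
                row[2 * i + 1] = points.get(i)[0];
                row[2 * i + 2] = points.get(i)[1];
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<double[]>> files = new ArrayList<>();
        files.add(Arrays.asList(new double[] {1.0, 2.0}));
        files.add(Arrays.asList(new double[] {3.0, 4.0}, new double[] {5.0, 6.0}));
        for (double[] row : buildRows(files)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The slot layout (2*i+1 for x, 2*i+2 for y) follows the addPTSData excerpt earlier in the thread.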
  • cherokee
    Eureka!

    It works now. Thanks a lot, Sebastian. You were (nearly) right. I have to use the base table instead of the example set, not only for creating the example set but also for adding the attributes.

    Thank you all a lot for your help!
    Michi
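
For the midpoint case mentioned earlier in the thread ("Am" from "A1" and "A2"), here is a minimal plain-Java sketch of adding the new columns on the table itself and then recording the point mapping. It is only a model: `MidpointSketch` and its methods are hypothetical, it does not use the RapidMiner API or the thread's PointExampleSet, and points are plain {x, y} arrays instead of Point2D.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a table plus a name -> (x column, y column) mapping.
final class MidpointSketch {
    private final List<String> columns = new ArrayList<>();
    private final List<double[]> rows = new ArrayList<>();
    private final Map<String, int[]> pointMapping = new HashMap<>();

    // Adds "<name>_x" and "<name>_y" to the table, widens every existing row,
    // and records the mapping from the point name to its two columns.
    void addPoint(String name) {
        int xIndex = columns.size();
        columns.add(name + "_x");
        columns.add(name + "_y");
        for (int r = 0; r < rows.size(); r++) {
            double[] wider = Arrays.copyOf(rows.get(r), columns.size());
            wider[xIndex] = Double.NaN;
            wider[xIndex + 1] = Double.NaN;
            rows.set(r, wider);
        }
        pointMapping.put(name, new int[] {xIndex, xIndex + 1});
    }

    // Adds a row; its length must match the current column count.
    void addRow(double... values) {
        if (values.length != columns.size()) {
            throw new IllegalArgumentException("row length must equal column count");
        }
        rows.add(values.clone());
    }

    // Returns the named point of one row as {x, y}.
    double[] getPoint(int row, String name) {
        int[] index = pointMapping.get(name);
        if (index == null) {
            throw new IllegalArgumentException("unknown point: " + name);
        }
        return new double[] {rows.get(row)[index[0]], rows.get(row)[index[1]]};
    }

    // Creates a new point holding the midpoint of points a and b in every row.
    void addMidpoint(String name, String a, String b) {
        addPoint(name);
        int[] index = pointMapping.get(name);
        for (int r = 0; r < rows.size(); r++) {
            double[] pa = getPoint(r, a);
            double[] pb = getPoint(r, b);
            rows.get(r)[index[0]] = (pa[0] + pb[0]) / 2.0;
            rows.get(r)[index[1]] = (pa[1] + pb[1]) / 2.0;
        }
    }

    public static void main(String[] args) {
        MidpointSketch table = new MidpointSketch();
        table.addPoint("A1");
        table.addPoint("A2");
        table.addRow(0.0, 0.0, 2.0, 4.0);
        table.addMidpoint("Am", "A1", "A2");
        System.out.println(Arrays.toString(table.getPoint(0, "Am")));
    }
}
```

The point of the sketch is the order of operations: the columns are added to the underlying table first, so every row is wide enough before any value is written through the mapping.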