"[MOSTLY SOLVED] Trying to build the tutorial's extension"

Unknown
edited November 5 in Community Q&A
  I'm trying to follow the "How to Extend RapidMiner 5" white paper and having trouble.

  At section 7.1 "The Extension Bundle", it talks of "the tutorial extension that comes with this guide. As all RapidMiner extensions it comes as a single jar file."  No such jar came with the white paper or its accompanying zips (it's not in the Tutorial, Template or Unuk projects).  The contents of the manifest which are given make it look like the jar is produced by the Tutorial, but section 7.2 "The ant Build File" contradicts this by clearly using the Template as the source.

  The first confusion occurs in chapter 4 "Creating your own Extension", on p. 20, where it reads "If you are going to deploy your Extension to RapidMiner for testing purpose, you might execute the install target of the ant file build.xml."  This should specify which build.xml: Template, Tutorial, or either.

  Next, section 5.1 "Our First Operator" should indicate which (of Template or Tutorial) to use as the basis for the new class.  I've used the Template.

  Small correction: on p. 23 (section 5.2 "Adding Ports"), the 'exampleSetInput.getData()' call is deprecated, and should be replaced with 'exampleSetInput.getData(ExampleSet.class)'.

  On p. 25 (section 5.3 "Declaring operators to RapidMiner"), the OperatorsTemplate.xml shown doesn't match the one that came with the Template project. It is currently loaded with a 'generate_extract' operator named 'com.rapidminer.operator.features.construction.TextInformationExtractionOperator'; this should be replaced with what is shown in the white paper (i.e. <key>numerical_to_date</key> <class>com.rapidminer.operator.preprocessing.transformation</class> <replaces>Numerical2Date</replaces>).

  Once the OperatorsTemplate.xml is updated and the Numerical2DateOperator.java has been written in the Template's source, running the ant install fails.  All 74 errors are along the lines of "error: package com.sun.javadoc does not exist", and clearly are a consequence of the javadoc's tools.jar not being in the JRE (it's in the JDK instead).

  It's possible to add tools.jar manually to the Unuk project; in order for the .classpath addition to be a relative path, one could add tools.jar to /lib/.  However, this is pointless as it will not solve the ant install failure.

  It turns out one needs to set the JRE to the JDK's when running the ant install.  From the Template's build.xml, choose Run as: Ant Build...: Edit Configuration: JRE: Separate JRE: jdk, then Run.
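
To check which kind of runtime a given launch configuration really uses, one rough approach is to ask the JVM whether it ships a compiler. This is only a sketch (the class name `JdkCheck` is made up), and it tests for the compiler rather than for the `com.sun.javadoc` classes themselves, so treat the result as a proxy:

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class JdkCheck {

    /** True when the running JVM provides a system Java compiler, i.e. it is a JDK rather than a bare JRE. */
    static boolean hasCompiler() {
        // getSystemJavaCompiler() returns null when no compiler is available.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        return compiler != null;
    }

    public static void main(String[] args) {
        System.out.println("java.home = " + System.getProperty("java.home"));
        System.out.println("compiler available: " + hasCompiler());
    }
}
```

If this reports no compiler when launched with the same JRE setting as the ant run, the ant run is probably not using the JDK either.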

  But my ant install still fails.  The edited log concludes this post, below.  The last error message is in French (my OS is French); its English equivalent is "Content is not allowed in prolog."  The problem seems to be that the ant install run goes through every project in my Eclipse workspace directory, including the .metadata folder.  Note that the error apparently concerns the very first file it looked at.  How do I fix that?

Buildfile: C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\build.xml
Trying to override old definition of task get
Trying to override old definition of task rpm
Trying to override old definition of task post
clean:
    [echo] Cleaning...
  [delete] Deleting directory C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\build
  [delete] Deleting directory C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\javadoc
   [mkdir] Created dir: C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\build
   [mkdir] Created dir: C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\javadoc
version.get:
    [echo] Long version: ${extension.version}.${extension.revision}.${extension.update}; short version: ${extension.version}.${extension.revision}
init.setEncoding:
init:
Trying to override old definition of task post
Trying to override old definition of task post
init.setEncoding:
copy-resources:
    [echo] Copying resources...
    [copy] Copying 9 files to C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\build
build:
Trying to override old definition of task post
Trying to override old definition of task post
build.rm:
Trying to override old definition of task post
Trying to override old definition of task post
build.dependencies.prepare:
    [echo] Dependencies of Template:
    [echo] C:\Users\username\Documents\Eclipse\.metadata\.bak_0.log
...
(thousands of echo lines going through all projects in C:\Users\username\Documents\Eclipse)
...
    [echo] C:\Users\username\Documents\Eclipse\RapidMiner_Unuk\svn.project
build.dependencies:
    [echo] Building plugin dependencies of Template...

BUILD FAILED
C:\Users\username\Documents\Eclipse\RapidMiner_Unuk\build_extension.xml:139: The following error occurred while executing this line:
C:\Users\username\Documents\Eclipse\RapidMiner_Unuk\build_extension.xml:191: The following error occurred while executing this line:
C:\Users\username\Documents\Eclipse\RapidMiner_Unuk\build_extension.xml:196: The following error occurred while executing this line:
C:\Users\username\Documents\Eclipse\.metadata\.bak_0.log:1: Contenu non autorisé dans le prologue.

Total time: 8 seconds
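
The final error in the log is what a JAXP parser reports when a file does not begin with XML markup, which is what happens if a plain-text log such as .bak_0.log is parsed as a build file. A minimal sketch that reproduces the situation outside ant; the class name and the sample text are invented for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class PrologErrorDemo {

    /** Returns the line number of the first fatal parse error, or -1 if the text parses as XML. */
    static int firstErrorLine(String text) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors, so the exception reaches the catch block below.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
            return -1;
        } catch (SAXParseException e) {
            return e.getLineNumber();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up stand-in for the first line of a plain-text log file.
        System.out.println(firstErrorLine("!SESSION plain log text, not XML"));
        System.out.println(firstErrorLine("<project name=\"ok\"/>"));
    }
}
```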

Answers

  • I've found at http://rapid-i.com/rapidforum/index.php?topic=5062.10;wap2 that one should change the last five lines of the Template's build.xml from:

    <fileset id="build.dependentExtensions" dir="..">
    </fileset>

    <import file="${rm.dir}/build_extension.xml" />
    </project>
    to:

    <fileset id="build.dependentExtensions" dir="..">
      <exclude name="**/*" />
    </fileset>

    <import file="${rm.dir}/build_extension.xml" />
    </project>
      And this works!

      My ant install log concludes this post.  Now I'll try to get rid of the warnings.

    Buildfile: C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\build.xml
    Trying to override old definition of task get
    Trying to override old definition of task rpm
    Trying to override old definition of task post
    clean:
        [echo] Cleaning...
      [delete] Deleting directory C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\build
      [delete] Deleting directory C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\javadoc
        [mkdir] Created dir: C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\build
        [mkdir] Created dir: C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\javadoc
    version.get:
        [echo] Long version: ${extension.version}.${extension.revision}.${extension.update}; short version: ${extension.version}.${extension.revision}
    init.setEncoding:
    init:
    Trying to override old definition of task post
    Trying to override old definition of task post
    init.setEncoding:
    copy-resources:
        [echo] Copying resources...
        [copy] Copying 9 files to C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\build
    build:
    Trying to override old definition of task post
    Trying to override old definition of task post
    build.rm:
    Trying to override old definition of task post
    Trying to override old definition of task post
    build.dependencies.prepare:
        [echo] Dependencies of Template:
    build.dependencies:
        [echo] Building plugin dependencies of Template...
        [echo] ...Finished
        [echo] RapidMiner Extension Template: Compile with Java from dir: C:\Program Files\Java\jdk1.7.0_09\jre
        [echo] RapidMiner Extension Template: using Java version: 1.7.0_09
        [javac] C:\Users\username\Documents\Eclipse\RapidMiner_Unuk\build_extension.xml:144: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
        [javac] Compiling 9 source files to C:\Users\username\Documents\Eclipse\RapidMiner_Extension_Template\build
        [javac]          WARNING
        [javac] The -source switch defaults to 1.7 in JDK 1.7.
        [javac] If you specify -target 1.6 you now must also specify -source 1.6.
        [javac] Ant will implicitly add -source 1.6 for you.  Please change your build file.
        [javac] warning: [path] bad path element "C:\Users\username\Documents\Eclipse\RapidMiner_Unuk\lib\jdbc\jcifs.jar": no such file or directory
        [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6
        [javac] 2 warnings
    unzipLibs:
        [mkdir] Created dir: C:\Users\username\Documents\Eclipse\RapidMiner_Unuk\release\libfiles
    createJar:
        [echo] Creating jar...
        [echo] Manifest Classpath: C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/SassyReader-0.5.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/blas.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/collections-generic.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/colt.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/commons-codec-1.4.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/commons-collections.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/commons-httpclient-3.1.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/commons-lang-2.4.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/commons-logging-1.1.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/concurrent.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/dom4j-1.6.1.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/encog.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/groovy-all-1.7.7.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/itextpdf-5.3.3.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/ivy-2.2.0.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jama.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jcommon.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jep.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jfreechart.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jmathplot.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/joone-engine.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jugpreview.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jung-algorithms.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jung-api.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jung-graph-impl.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jung-visualization.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/junit.jar 
C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jxl.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/kdb.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/launcher.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/looks.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/mail.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/microba.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/poi-3.8-20120326.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/poi-excelant-3.8-20120326.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/poi-ooxml-3.8-20120326.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/poi-ooxml-schemas-3.8-20120326.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/poi-scratchpad-3.8-20120326.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/rapidminer.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/rm_doc.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/rsyntaxtextarea.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/slf4j-api-1.6.4.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/slf4j-simple-1.6.4.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/vldocking.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/ws-commons-util-1.0.2.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/xmlbeans-2.3.0.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/xmlpull.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/xmlrpc-client-3.1.3.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/xmlrpc-common-3.1.3.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/xpp3.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/xstream.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/freehep/freehep-export.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/freehep/freehep-graphics2d.jar 
C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/freehep/freehep-graphicsio-emf.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/freehep/freehep-graphicsio-pdf.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/freehep/freehep-graphicsio-ps.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/freehep/freehep-graphicsio-svg.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/freehep/freehep-graphicsio-swf.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/freehep/freehep-graphicsio.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/freehep/freehep-io.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/freehep/freehep-swing.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/freehep/freehep-util.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/freehep/freehep-xml.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jdbc/hsqldb.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jdbc/iijdbc.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jdbc/jtds-1.2.2.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jdbc/mysql-connector-java-5.1.17-bin.jar C:/Users/username/Documents/Eclipse/RapidMiner_Unuk/lib/jdbc/postgresql-9.1-901.jdbc4.jar
          [jar] Building jar: C:\Users\username\Documents\Eclipse\RapidMiner_Unuk\release\rapidminer-Template-${extension.version}.${extension.revision}.${extension.update}.jar
      [delete] Deleting directory C:\Users\username\Documents\Eclipse\RapidMiner_Unuk\release\libfiles
    install:
        [move] Moving 1 file to C:\Users\username\Documents\Eclipse\RapidMiner_Unuk\lib\plugins
    BUILD SUCCESSFUL
    Total time: 10 seconds
  • The remaining problems...

    version.get:
        [echo] Long version: ${extension.version}.${extension.revision}.${extension.update}; short version: ${extension.version}.${extension.revision}
      This also yields a plugin jar bearing the rather ugly name "rapidminer-Template-${extension.version}.${extension.revision}.${extension.update}.jar"

      The solution (found here: http://rapid-i.com/rapidforum/index.php/topic,2356.msg9362.html#msg9362) is to create a file named "build.properties" (at the project's root) with this content (modifying the values as required):

    extension.version=1
    extension.revision=0
    extension.update=0
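
To illustrate why the jar name keeps its literal `${...}` placeholders when the properties are undefined, here is a small sketch of that kind of substitution. It is not Ant's own code, and the class and method names are hypothetical; it only mimics the behaviour seen in the log, where undefined references are left as they are:

```java
import java.io.StringReader;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PropertyExpansionSketch {
    private static final Pattern REF = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} with its value; names without a value are left untouched. */
    static String expand(String template, Properties props) {
        Matcher m = REF.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.getProperty(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String jarName = "rapidminer-Template-${extension.version}.${extension.revision}.${extension.update}.jar";
        Properties props = new Properties();
        // Without build.properties the placeholders survive unchanged.
        System.out.println(expand(jarName, props));
        // With the three values from build.properties they are filled in.
        props.load(new StringReader("extension.version=1\nextension.revision=0\nextension.update=0\n"));
        System.out.println(expand(jarName, props));
    }
}
```
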
      Next!

    build.dependencies:
       [echo] RapidMiner Extension Template: using Java version: 1.7.0_09
       [javac] C:\Users\dthibault\Documents\Eclipse\RapidMiner_Unuk\build_extension.xml:144: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
      This is a harmless warning that is expected since ant 1.8. To fix it, add to the top-level build.xml (in this case the Template's build.xml) the following lines, right after the <description> element:

     <presetdef name="javac">
       <javac includeantruntime="false" />
     </presetdef>
      You will henceforth receive the "Trying to override old definition of task javac" message instead (just before the "Trying to override old definition of task get" message).

      Next!

      [javac]           WARNING
      [javac] The -source switch defaults to 1.7 in JDK 1.7.
      [javac] If you specify -target 1.6 you now must also specify -source 1.6.
      [javac] Ant will implicitly add -source 1.6 for you.  Please change your build file.
      The solution is to fix this in the Unuk build_extension.xml, because it redefines the javac task at line 144. Simply change the 'target="1.6"' part into 'source="1.6" target="1.6"'. Like so:

    <javac encoding="${build.encoding}" debug="${compiler.debug}" destdir="${build.build}" deprecation="${compiler.deprecation}" compiler="${compiler.version}" fork="true" memorymaximumsize="400m" source="1.6" target="1.6">
      Next!

      [javac] warning: [path] bad path element "C:\Users\dthibault\Documents\Eclipse\RapidMiner_Unuk\lib\jdbc\jcifs.jar": no such file or directory
      This problem occurs because the META-INF/MANIFEST.MF of lib/jdbc/jtds-1.2.2.jar contains a "Class-Path: jcifs.jar" statement.

      One solution is to rebuild the offending jar with a MANIFEST.MF that does not have a Class-Path entry.  Another is allegedly to pass the -Xlint:-path argument to javac.  So, in Unuk's build_extension.xml, in the block we just modified, right after <compilerarg value="${compiler.arguments}" />, we add this:

    <compilerarg value="-Xlint:-path" />
      Well, it's supposed to work but does not. Maybe this is an ant 1.8 bug?  I've tried every occurrence of javac in both build.xml files and the build_extension.xml, with no luck.

      Next!

      [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6
      One solution is to provide the path to the rt.jar of the correct source/target version, e.g. <javac ... bootclasspath="/opt/sun-jdk-1.5.0.22/jre/lib/rt.jar" source="1.5" target="1.5" />.

      Failing this, we go back to the build.dependencies bit of build_extension.xml and add, right after <compilerarg value="${compiler.arguments}" />, this:

    <compilerarg value="-Xlint:-options" />
      Unlike <compilerarg value="-Xlint:-path" />, this works.
  • Two problems, one minor, one major.

    Minor: As part of its startup, RapidMiner emits this troubling message:

    ...
    INFO: JDBC driver oracle.jdbc.driver.OracleDriver not found. Probably the driver is not installed.
    [Fatal Error] :1:1: Premature end of file.
    Apparently this is a "normal" SAXException occurring when processing an input XML file, either because the file is missing or because one is trying to read it a second time (see e.g. http://www.danielschneller.com/2008/01/saxparseexception-1-1-premature-end-of.html).  Does anyone know where in RapidMiner this exception is being thrown and caught?  It'd be nice to get rid of it cleanly.

    Major:

    The Template manages to correctly run its ant build and produces an apparently well-formed jar in RapidMiner's plugins directory.  But when I run it (from within Eclipse), the plug-in is not being loaded.  My own plugin is derived from the Template, so of course it doesn't load either.  What is missing?
  • MariusHelf
    Hi, thanks for posting the solutions to your problems. I created an internal todo item requesting to update the extension template.

    About the issue with the xml file: does it occur only when your extension is activated (so is it really related to your extension)? Do your operators.xml and operatorsDocumentation.xml etc. have correct  syntax?

    Regards,
    Marius
  • Marius wrote:

    About the issue with the xml file: does it occur only when your extension is activated (so is it really related to your extension)? Do your operators.xml and operatorsDocumentation.xml etc. have correct  syntax?
    I've postponed looking into the XML warning because right now I can't get a skeleton plugin to work.  I've modelled it after the existing com.rapidminer.gui.tools.dialogs.wizards.dataimport.csv.CSVFileReader (I'm trying to implement a new file format reader).  But the RapidMiner launch fails to create an instance of my operator: something fails in the constructor.  Here is the class, which I think is correct by itself; I strongly suspect the problem is in the remaining support files (.xml, .properties, etc.):

    package com.rapidminer.operator.io;

    import java.io.IOException;
    import java.util.*;

    import com.rapidminer.operator.*;
    import com.rapidminer.parameter.*;

    public class LTFDataReader extends AbstractDataReader {
      // The trace directory path parameter
      public static final String PARAMETER_LTF_DIR = "trace_path";

      /**
       * Constructor.
       * @param description An OperatorDescription
       */
      public LTFDataReader(OperatorDescription description) {
         super(description);
         getParameters().addObserver(new CacheResetParameterObserver(PARAMETER_LTF_DIR), false);
      }

      /**
       * Returns the DataSet representation of the data
       * @see com.rapidminer.operator.io.AbstractDataReader#getDataSet()
       */
      @Override
      protected DataSet getDataSet() throws OperatorException, IOException {
         //TODO
         return null;
      }

      @Override
      public List<ParameterType> getParameterTypes() {
          List<ParameterType> types = super.getParameterTypes();
    //     types.add(new ParameterTypeDirectory(PARAMETER_LTF_DIR, "Path to the directory to read the trace from.", false));
          //TODO Using the default dir path constructor for debug purposes only
          types.add(new ParameterTypeDirectory(PARAMETER_LTF_DIR, "Path to the directory to read the trace from.", "D:\\Traces\\FirefoxData-SBALab\\traces-jsengine\\e4x-decompilation-regress-349814.js"));
          return types;
      }
    }
    The error is:

    com.rapidminer.operator.OperatorCreationException: Operator cannot be constructed: 'read_ltf(com.rapidminer.operator.io.LTFDataReader)': tried to access method com.rapidminer.operator.io.AbstractDataReader$CacheResetParameterObserver.<init>(Lcom/rapidminer/operator/io/AbstractDataReader;Ljava/lang/String;)V from class com.rapidminer.operator.io.LTFDataReader

    Which occurs at the "getParameters().addObserver(new CacheResetParameterObserver(PARAMETER_LTF_DIR), false);" line in the constructor.

    build.properties simply specifies the project version:

    extension.version=1
    extension.revision=0
    extension.update=0
    The RapidMiner_Extension_LTF project was prepared by copying the RapidMiner_Extension_Template project and first changing the *Template* file names into *LTFReader*.
    # com.rapidminer.PluginInitLTFReader doesn't do anything
    # com.rapidminer.operator.io.LTFDataReader is as given above
    # resources com.rapidminer.resources.i18n.ErrorsLTFReader.properties is empty
    # resources com.rapidminer.resources.i18n.GUILTFReader.properties is empty
    # resources com.rapidminer.resources.i18n.UserErrorMessagesLTFReader.properties is empty
    # resources com.rapidminer.resources.GroupsLTFReader.properties is empty
    # resources com.rapidminer.resources.i18n.OperatorsDocLTFReader.xml is as follows:

    <?xml version="1.0" encoding="windows-1252" standalone="no"?>
    <operatorHelp lang="en_EN">
      <operator>
         <name>LTFDataReader</name>
         <synopsis>This operator can read an LTTng trace written in LTF 2.6.</synopsis>
         <help>
         &lt;p&gt;This operator can read LTTng traces written in so-called
         LTF 2.6 (LTTng Trace Format 2.6). LTF 2.6 was introduced by LTTng 0.191
         and used until the last release, LTTng 0.249.&lt;/p&gt;
         </help>
      </operator>
    </operatorHelp>
    # resources com.rapidminer.resources.ioobjectsLTFReader.xml is essentially empty:

    <ioobjects>
    <!--  
       This is an example how to specify IOObjects and theirs various renderers.
    <ioobject
    name="Document"
    class="com.rapidminer.operator.text.Document"
    reportable="true">
     <renderer>com.rapidminer.gui.renderer.text.DocumentRenderer</renderer>
    </ioobject>

       <ioobject
           name="Word List"
           class="com.rapidminer.operator.text.WordList"
           reportable="true">
         <renderer>com.rapidminer.gui.renderer.DefaultTextRenderer</renderer>
       </ioobject>
    -->
    </ioobjects>
    # resources com.rapidminer.resources.parserulesLTFReader.xml is also essentially empty:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <parserules>

    </parserules>
    # resources com.rapidminer.resources.OperatorsLTFReader.xml is as follows:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <operators name="LTFDataReader" version="5.0" docbundle="com/rapidminer/resources/i18n/OperatorsDocLTFReader">
      <group key="">
         <group key="import">
            <group key="data">
            <operator>
            <key>read_ltf</key>
            <class>com.rapidminer.operator.io.LTFDataReader</class>
            </operator>
            </group>
         </group>
      </group>
    </operators>
    # build.xml, finally, is as follows:

    <project name="RapidMiner_Extension_LTF">
    <description>Build file for the LTF RapidMiner extension</description>

      <presetdef name="javac">
        <javac includeantruntime="false" />
      </presetdef>

      <property name="rm.dir" location="../RapidMiner_Unuk" />

    <property name="build.build" location="build" />
    <property name="build.resources" location="resources" />
    <property name="build.lib" location="lib" />

    <property name="check.sources" location = "src" />

    <property name="javadoc.targetDir" location="javadoc" />

    <property name="extension.name" value="LTFReader" />
    <property name="extension.name.long" value="RapidMiner LTF Extension" />
    <property name="extension.namespace" value="LTFDataReader" />
      <property name="extension.vendor" value="DRDC Valcartier" />
      <property name="extension.admin" value="Daniel U. Thibault" />
      <property name="extension.url" value="www.drdc-rddc.gc.ca/drdc/en/centres/drdc-valcartier-rddc-valcartier/" />


    <property name="extension.needsVersion" value="5.0" />
    <property name="extension.dependencies" value="" />

    <property name="extension.initClass" value="com.rapidminer.PluginInitLTFReader" />
    <property name="extension.objectDefinition" value="/com/rapidminer/resources/ioobjectsLTFReader.xml" />
    <property name="extension.operatorDefinition" value="/com/rapidminer/resources/OperatorsLTFReader.xml" />
    <property name="extension.parseRuleDefinition" value="/com/rapidminer/resources/parserulesLTFReader.xml" />
    <property name="extension.groupProperties" value="/com/rapidminer/resources/groupsLTFReader.properties" />
    <property name="extension.errorDescription" value="/com/rapidminer/resources/i18n/ErrorsLTFReader.properties" />
    <property name="extension.userErrors" value="/com/rapidminer/resources/i18n/UserErrorMessagesLTFReader.properties" />
    <property name="extension.guiDescription" value="/com/rapidminer/resources/i18n/GUILTFReader.properties" />


    <!-- Src files -->
    <path id="build.sources.path">
    <dirset dir="src">
    <include name="**" />
    </dirset>
    </path>
    <fileset dir="src" id="build.sources">
    <include name="**/*.java" />
    </fileset>
    <fileset id="build.dependentExtensions" dir="..">
      <exclude name="**/*" />
    </fileset>

    <import file="${rm.dir}/build_extension.xml" />
    </project>
    I suspect the error is in build.xml, or lies in some missing file, maybe even a missing Java class.
  • Meanwhile, I've debugged the "[Fatal Error] :1:1: Premature end of file." message.  This comes from:

    com.rapidminer.RapidMiner.init(), line 534: "DatabaseConnectionService.init"
    com.rapidminer.tools.jdbc.connection.DatabaseConnectionService.init, line 94: "Tools.readTextFile(xmlConnectionsFile)"
    com.rapidminer.tools.Tools.readTextFile, line 776: "Document processXmlDocument = documentBuilder.parse(inStream)"

      Because the file "/.RapidMiner5/connections.xml" exists but is empty (length zero).  javax.xml.parsers.DocumentBuilder throws SAXException AND writes "[Fatal Error] :1:1: Premature end of file" to stderr.  There seems to be no way around this behaviour short of changing the Java source code.

      The simplest workaround is to add a line at the beginning of Tools.readTextFile (just before line 763):

    if (file.length() <= 0) return "";  // "file" stands for the method's File argument; its actual name may differ
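
A self-contained sketch of that kind of guard, written outside RapidMiner: the class `EmptyXmlGuard` and the method `parseOrNull` are hypothetical and this is not the actual `Tools.readTextFile` code. It also installs its own `ErrorHandler`; as far as I can tell, it is the parser's default handler that prints "[Fatal Error]" to stderr, so replacing it should silence that line as well.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class EmptyXmlGuard {

    /** Parses an XML file, or returns null when the file is missing or has length zero. */
    static Document parseOrNull(File file)
            throws IOException, SAXException, ParserConfigurationException {
        // File.length() is 0 for a file that does not exist, so one test covers both cases.
        if (file.length() <= 0) {
            return null;
        }
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors instead of reporting them itself.
        builder.setErrorHandler(new DefaultHandler());
        return builder.parse(file);
    }

    public static void main(String[] args) throws Exception {
        File empty = File.createTempFile("connections", ".xml");
        empty.deleteOnExit();
        System.out.println("skipped empty file: " + (parseOrNull(empty) == null));
    }
}
```

Returning null rather than an empty string is a choice made for this sketch only; the caller has to handle the "nothing to read" case either way.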


    I also get an annoying "No file exists" message at the end of the RapidMiner startup.  This comes from:

    com.rapid_i.deployment.update.client.UpdateManager.checkForPurchasedNotInstalled, line 817: "UserCredential authentication = Wallet.getInstance().getEntry(updateServerURI);"
    com.rapidminer.gui.security.Wallet.readCache(), line 80: "System.err.println("No file exists");"

    Clearly this System.err call should be rewritten to use LogService instead (a String will need to be added to resources/com/rapidminer/resources/i18n/LogMessages.properties).  The message should be made more meaningful as well, something like "No stored credentials file exists".
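
As a rough illustration of the suggested change, here is the same check routed through a logger. It uses `java.util.logging` as a stand-in, since RapidMiner's `LogService` and its `LogMessages.properties` keys are not reproduced here; the class name and logger name are hypothetical:

```java
import java.io.File;
import java.util.logging.Level;
import java.util.logging.Logger;

final class WalletCacheSketch {
    static final Logger LOGGER = Logger.getLogger("wallet.sketch");

    /** Returns true if the cache file exists; otherwise logs an informational message. */
    static boolean cacheExists(File cacheFile) {
        if (cacheFile.exists()) {
            return true;
        }
        // INFO rather than an error: a missing cache only means nothing has been stored yet.
        LOGGER.log(Level.INFO, "No stored credentials file exists: {0}", cacheFile.getPath());
        return false;
    }

    public static void main(String[] args) {
        System.out.println(cacheExists(new File("no-such-wallet-cache.xml")));
    }
}
```

Where the message ends up then depends on the handlers configured for the logger, instead of being hard-wired to stderr.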
  • Frankly, I'm mostly bewildered at this point.  Let me ask for advice.

    I have this type of data set, stored in a peculiar binary format in the form of a set of files in a directory.  I know how to read the format, which actually starts with its own metadata description.  So, for each record I can readily tell what the fields will be, and what their data types will be.  The records come in a variety of types, each with its own (typically fairly small) set of fields.  About the only field they will all have in common is a timestamp.

    The data sets can be very, very large (gigabytes and more), which means they can be sampled or scanned but certainly can't be all read into memory.  Filtering by record type (or by timestamp) is a possibility.

    My problem is how do I get this into RapidMiner?  We first thought to have an operator similar to the "Read CSV" operator, with maybe some parameters to achieve filtering.  But the family of CSV-related classes leaves me confused.  There is CSVDataReader, used by CSVImportWizard, on the one hand, and the CSVFileReader on the other hand.  The two don't seem to have much to do with each other at all.  CSVFileReader is used only by SimpleExampleSource, which is deprecated, so probably not a good example to follow.  But CSVImportWizard seems to try to do one thing: import data into a repository.  Is the size of the data sets going to be a problem in this paradigm?

    I'd appreciate guidance.
  • Marco_Boeck
    Hi,

    1) Why is your connections.xml empty? It's created non empty when it does not yet exist as I just verified.
    2) Useless error message as it is no error. I removed it.
    3) See classes in com.rapidminer.operator.nio - com.rapidminer.operator.io is outdated. To see csv examples, look at CSVExampleSource and CSVResultSetConfiguration.
    4) Well data in a repository is stored in a file on a harddisk - so if your IOObject is huge, the file will be huge, and currently will be read into memory as a whole. So if that's out of the question, you are better off to get your data into a database or split your data up into smaller chunks.

    Regards,
    Marco
  • Marco Boeck wrote:

    1) Why is your connections.xml empty? It's created non empty when it does not yet exist as I just verified.
    2) Useless error message as it is no error. I removed it.
    3) See classes in com.rapidminer.operator.nio - com.rapidminer.operator.io is outdated. To see csv examples, look at CSVExampleSource and CSVResultSetConfiguration.
    4) Well data in a repository is stored in a file on a harddisk - so if your IOObject is huge, the file will be huge, and currently will be read into memory as a whole. So if that's out of the question, you are better off to get your data into a database or split your data up into smaller chunks.
    1) Dunno, it just was.  I deleted it and re-ran RapidMiner, and this time connections.xml got created as a stub with just a "jdbc-entries" element with a "key" attribute.  Maybe the if (File.length() <= 0) should still be inserted into Tools.readTextFile but, instead of just returning an empty string, it would also actively delete the useless file so it gets recreated on the next run.

    4) I guess this means I should set up my source component with built-in filtering.  This could be exposed as input ports so a process could then chunk its way through.  Or manage the ExampleTable so that it becomes a window into the data set (with older rows scrolling off as new rows come in), or have a fully virtual ExampleTable.  I think I'll first try to see if I can correctly create a new source operator that works just like the csv one.
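
The "window into the data set" idea could look roughly like this. It is a plain-Java sketch with hypothetical names (`SlidingWindow`, `Record`), not RapidMiner's ExampleTable API: rows are filtered by timestamp as they arrive, and the oldest row scrolls off once the window is full.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class SlidingWindow {

    /** Hypothetical trace record: a timestamp plus a record type. */
    static final class Record {
        final long timestamp;
        final String type;

        Record(long timestamp, String type) {
            this.timestamp = timestamp;
            this.type = type;
        }
    }

    private final int capacity;
    private final long minTimestamp;
    private final Deque<Record> rows = new ArrayDeque<>();

    SlidingWindow(int capacity, long minTimestamp) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.minTimestamp = minTimestamp;
    }

    /** Offers a record; returns false if the timestamp filter rejected it. */
    boolean offer(Record r) {
        if (r.timestamp < minTimestamp) {
            return false;
        }
        if (rows.size() == capacity) {
            rows.removeFirst(); // the oldest row scrolls off
        }
        rows.addLast(r);
        return true;
    }

    /** Snapshot of the rows currently held, oldest first. */
    List<Record> snapshot() {
        return new ArrayList<>(rows);
    }

    public static void main(String[] args) {
        SlidingWindow window = new SlidingWindow(2, 100L);
        for (long t = 98; t <= 103; t++) {
            window.offer(new Record(t, "event"));
        }
        for (Record r : window.snapshot()) {
            System.out.println(r.timestamp);
        }
    }
}
```

Memory use is bounded by the window capacity regardless of how large the trace is, which is the point of the exercise for multi-gigabyte data sets.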

    Do I understand correctly that the Data:Import:Read CSV operator seen in RapidMiner is implemented by the CSVExampleSource class?
  • Marco_Boeck
    Urhixidur wrote:

    Do I understand correctly that the Data:Import:Read CSV operator seen in RapidMiner is implemented by the CSVExampleSource class?

    Hi,

    com.rapidminer.operator.nio.CSVExampleSource, yes.

    Regards,
    Marco
  • The RapidMiner_Extension_Template's OperatorsDocTemplate.xml should not give TextObjectWriter as an example, since it is one of the few operators whose <operator> element does not have a <key>.  The <key> is essential to the proper function of any extension within the GUI; without it the help does not appear.  k-Medoids is a better example, as it also shows how to use <shortName>.