Extracting the data from a file. [SOLVED]

JEdward
JEdward New Altair Community Member
edited November 2024 in Community Q&A
Hello,

I want to extract the data from a file so that I can then use it as an example set.
Basically, the flow would be:
Open File -> {extract file data as blob} -> use blob data as example set.

So far, the closest I have found is the script operator, but how would I refer to the output of the 'Open File' operator within a Groovy script?

Thanks,
John.

Answers

  • Marco_Boeck
    Marco_Boeck New Altair Community Member
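    Hi,

    you can see how it could be done in the following process. The output of Open File is connected to the first input port of Execute Script, where the script reads it as input[0]: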

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.013">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.3.013" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="open_file" compatibility="5.3.013" expanded="true" height="60" name="Open File" width="90" x="45" y="30">
            <parameter key="filename" value="C:\Users\xyz\Test.txt"/>
          </operator>
          <operator activated="true" class="execute_script" compatibility="5.3.013" expanded="true" height="76" name="Execute Script" width="90" x="179" y="30">
            <parameter key="script" value="import com.rapidminer.operator.nio.file.SimpleFileObject;&#10;import com.rapidminer.operator.nio.file.RepositoryBlobObject;&#10;&#10;// input[0] is the object delivered to the first input port (here: the output of Open File)&#10;SimpleFileObject fileObject = input[0];&#10;// fileObject.getFile() returns the File which can be used to read it&#10;System.out.println(fileObject.getFile());&#10;&#10;// for Blobs use this:&#10;//RepositoryBlobObject blob = input[0];&#10;&#10;// hand the file object on to the first output port&#10;return fileObject;"/>
          </operator>
          <connect from_op="Open File" from_port="file" to_op="Execute Script" to_port="input 1"/>
          <connect from_op="Execute Script" from_port="output 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Regards,
    Marco
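
The script in the process above only prints the path of the file. As a rough sketch of the next step, assuming plain Java (which the Groovy script operator also accepts), the content of the File returned by fileObject.getFile() could be read along these lines. The class FileContentSketch and its methods are hypothetical helpers for illustration and not part of RapidMiner; the temporary file in main merely stands in for the real input, and UTF-8 is an assumed encoding.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.List;

// Hypothetical helper, not part of RapidMiner: reads the content of a java.io.File
class FileContentSketch {

    // Raw bytes, suitable for binary files
    static byte[] readBytes(File file) throws IOException {
        return Files.readAllBytes(file.toPath());
    }

    // Text lines, assuming the file is UTF-8 encoded
    static List<String> readLines(File file) throws IOException {
        return Files.readAllLines(file.toPath(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the File returned by fileObject.getFile() in the script
        File file = File.createTempFile("example", ".txt");
        file.deleteOnExit();
        Files.write(file.toPath(), "id;value\n1;foo\n2;bar\n".getBytes(StandardCharsets.UTF_8));

        System.out.println(readBytes(file).length + " bytes");
        for (String line : readLines(file)) {
            System.out.println(line);
        }
    }
}
```

Inside Execute Script, only the two read methods would be needed; turning the lines into an example set is a separate step that this sketch does not cover.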
  • MariusHelf
    MariusHelf New Altair Community Member
    You may get the same result with Read Document -> Document to Data without any scripting.
    I'm not sure, though, how the encoding handling copes with binary files.

    Best regards,
    Marius
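
To illustrate why the encoding question matters for binary files, here is a small sketch in plain Java. It is independent of RapidMiner and makes no claim about how Read Document actually behaves; the class BinaryAsTextSketch and the sample bytes are made up for the demonstration. It decodes bytes as text and encodes them again, once with UTF-8 and once with ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo, independent of RapidMiner
class BinaryAsTextSketch {

    // Decode the bytes as text with the given charset, then encode the text back to bytes
    static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // Made-up binary data; 0x89, 0xFF and 0xFE are not valid UTF-8 on their own
        byte[] binary = {(byte) 0x89, 'P', 'N', 'G', (byte) 0xFF, (byte) 0xFE};

        boolean utf8Preserved = Arrays.equals(binary, roundTrip(binary, StandardCharsets.UTF_8));
        boolean latin1Preserved = Arrays.equals(binary, roundTrip(binary, StandardCharsets.ISO_8859_1));

        System.out.println("UTF-8 preserved the bytes: " + utf8Preserved);
        System.out.println("ISO-8859-1 preserved the bytes: " + latin1Preserved);
    }
}
```

Bytes that are invalid in UTF-8 are replaced during decoding, so the original data cannot be recovered, whereas ISO-8859-1 maps every byte to exactly one character. If binary content has to survive unchanged, reading raw bytes is the safer route.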
  • JEdward
    JEdward New Altair Community Member
    Thanks for this, Marco, that's perfect!

    I actually worked around it by using a macro and reading the file within the Groovy script with the line:
    f = new File("%{file_path}")

    It works, but I much prefer your version, as it should be more flexible overall and might offer a speed improvement since it's executing Java directly.
    I'll test both ways and compare.

    Best,
    John.