How can I implement Python code in a text mining process?

ksnugroho
New Altair Community Member
Updated by Jocelyn
Hello, 

    Hi @ksnugroho, you can use the Execute Python operator (part of the Python Scripting extension) anywhere in your process.

    Scott
    kayman
    New Altair Community Member
    Some background on using the Python operator:

    - You can use it as a standalone 'script container' wherever you want, so there is no need to connect any input or output data.
    - If you want to use data (incoming or outgoing), remember that the operator treats your data as a pandas DataFrame by default. Anything you connect to the input ports arrives in the script as a DataFrame. If you manipulate data in other functions or load external data, you just need to return it from the rm_main function as a DataFrame again.
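Since the original question was about text mining, the DataFrame-in, DataFrame-out pattern described above could look roughly like the sketch below. It is only an illustration: the column name `text` and the simple regex tokenizer are assumptions made for this example, not something the operator requires.

```python
import re
from collections import Counter

import pandas as pd


def rm_main(data):
    # `data` is the DataFrame connected to the first input port.
    # The column name "text" is an assumption for this sketch.
    counts = Counter()
    for doc in data["text"].fillna("").astype(str):
        # Lowercase the document and split it into simple word tokens
        counts.update(re.findall(r"[a-z0-9']+", doc.lower()))

    # Sort by descending frequency, then alphabetically for stable output
    rows = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    # Return a DataFrame so the output port delivers it as a data set again
    return pd.DataFrame(rows, columns=["token", "count"])
```

Outside the operator you can try the same function by calling `rm_main(pd.DataFrame({"text": ["some document", "another document"]}))`, which makes it easy to debug the script before pasting it into the process.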

    Below is a simple example that uses two inputs and xlsxwriter: the Python script generates a multi-tab Excel file, writing each input to its own tab, and that's it.

    <?xml version="1.0" encoding="UTF-8"?><process version="9.0.003">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.0.003" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="python_scripting:execute_python" compatibility="8.2.000" expanded="true" height="124" name="Execute Python (2)" width="90" x="246" y="85">
            <parameter key="script" value="import pandas as pd&#10;import xlsxwriter&#10;&#10;def rm_main(data1, data2):&#10;&#10;    writer = pd.ExcelWriter('my_file.xlsx', engine='xlsxwriter')&#10;&#10;    # Write each input DataFrame to its own sheet&#10;    data1.to_excel(writer, sheet_name='Page 1')&#10;    data2.to_excel(writer, sheet_name='Page 2')&#10;&#10;    # Save the result (close() writes the file to disk)&#10;    writer.close()&#10;&#10;    return"/>
          </operator>
          <connect from_port="input 1" to_op="Execute Python (2)" to_port="input 1"/>
          <connect from_port="input 2" to_op="Execute Python (2)" to_port="input 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="source_input 3" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
    </process>