How to implement Python code for the text mining process?
ksnugroho
Answers
Hi @ksnugroho, you can use the Execute Python operator (from the Python Scripting extension) anywhere you want.
Scott0 -
Some background on using the Python operator:
- You can use it as a standalone 'script container' wherever you want, so there isn't even a need to use input or output data.
- If you want to use data (either incoming or outgoing), remember that the operator treats your data as a pandas DataFrame by default. Anything you connect to the inputs is available in the script as a DataFrame, and if you manipulate data in other functions or load external data, you just need to return it from rm_main as a DataFrame again.
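As a rough illustration of the second point, here is a minimal sketch of a script that takes one input as a DataFrame and returns a DataFrame again. The column name `text`, the helper `tokenize`, and the new column `token_count` are assumptions made up for this example, and the whitespace split is only a stand-in for whatever text mining step you actually need.

```python
import pandas as pd


def tokenize(text):
    # Hypothetical helper: lowercase and split on whitespace.
    # Replace with a real text mining step as needed.
    return str(text).lower().split()


def rm_main(data):
    # `data` arrives as a pandas DataFrame.
    # Assumption: it has a column named "text".
    result = data.copy()
    result["token_count"] = result["text"].apply(lambda t: len(tokenize(t)))
    # Return a DataFrame so the operator can deliver it on its output port.
    return result
```

Outside RapidMiner you can try it by calling `rm_main(pd.DataFrame({"text": ["Hello world"]}))` and inspecting the returned DataFrame.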
Below is a simple example that uses two inputs and xlsxwriter. The Python script generates a multi-tab Excel file with each input on its own tab, and that's it.
<?xml version="1.0" encoding="UTF-8"?>
<process version="9.0.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.0.003" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="python_scripting:execute_python" compatibility="8.2.000" expanded="true" height="124" name="Execute Python (2)" width="90" x="246" y="85">
        <parameter key="script" value="import pandas as pd&#10;import xlsxwriter&#10;&#10;def rm_main(data1, data2):&#10;    # Write each input DataFrame to its own sheet; the file is saved when the with block exits&#10;    with pd.ExcelWriter('my_file.xlsx', engine='xlsxwriter') as writer:&#10;        data1.to_excel(writer, sheet_name='Page 1')&#10;        data2.to_excel(writer, sheet_name='Page 2')&#10;    return"/>
      </operator>
      <connect from_port="input 1" to_op="Execute Python (2)" to_port="input 1"/>
      <connect from_port="input 2" to_op="Execute Python (2)" to_port="input 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="source_input 2" spacing="0"/>
      <portSpacing port="source_input 3" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
    </process>
  </operator>
</process>