Cannot load facttable (Oracle)
Vinnie
New Altair Community Member
Hello
I have RapidMiner Studio connected to an Oracle data warehouse. RapidMiner can see the tables, and I can open every table in the repository except my fact table (which has over 9 million records). When I click it, nothing happens.
When I look in the rapidminer-studio log I get the following:
Mar 31, 2015 8:51:49 AM com.rapidminer.tools.jdbc.DatabaseHandler executeStatement
INFO: Executing query: 'SELECT * FROM "DWH"."FACTTABLE"'
with no error following...
Has anyone been able to read this many records?
Thanks.
Answers
Hi,
browsing such big tables in the repository view is not recommended. Use the "Read Database" operator instead and, where possible, restrict the result with filters in the WHERE clause. Loading 9 million rows from your database is possible, but the data has to be held somewhere, so it will take a lot of memory and time. A better approach is probably to use the "Loop" operator with some manual paging, reading the table one range of ids at a time.
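The manual paging idea can be sketched outside RapidMiner as follows. This is only an illustration in plain Python, with the standard-library sqlite3 module standing in for the Oracle connection. The table name `big` and its numeric `id` column follow the demonstration process in this answer; the `value` column, the helper name `read_in_pages`, and the assumption that ids are positive integers are hypothetical.

```python
import sqlite3


def read_in_pages(conn, step_size):
    """Yield the rows of table `big` one id window at a time.

    Assumes `id` is a positive integer column (hypothetical schema).
    """
    max_id = conn.execute("SELECT MAX(id) FROM big").fetchone()[0]
    if max_id is None:  # empty table: nothing to read
        return
    start = 0
    while start < max_id:
        end = start + step_size
        # Window is (start, end]: exclusive lower bound, inclusive upper bound.
        rows = conn.execute(
            "SELECT id, value FROM big WHERE id > ? AND id <= ? ORDER BY id",
            (start, end),
        ).fetchall()
        yield rows
        start = end


# Small in-memory stand-in for the real fact table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO big (id, value) VALUES (?, ?)",
    ((i, i * 0.5) for i in range(1, 25001)),
)

page_sizes = [len(page) for page in read_in_pages(conn, step_size=10000)]
print(page_sizes)  # [10000, 10000, 5000]
```

Only one window of rows is held in memory at a time, which is the point of paging. If the ids have large gaps, some windows will come back nearly empty, so the step size may need tuning.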
Basic demonstration process:
Regards,
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="6.4.000-SNAPSHOT">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.4.000-SNAPSHOT" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="set_macros" compatibility="6.4.000-SNAPSHOT" expanded="true" height="76" name="Set Macros" width="90" x="45" y="30">
        <list key="macros">
          <parameter key="step_size" value="10000"/>
        </list>
      </operator>
      <operator activated="true" class="loop" compatibility="6.4.000-SNAPSHOT" expanded="true" height="76" name="Loop" width="90" x="179" y="30">
        <parameter key="set_iteration_macro" value="true"/>
        <parameter key="macro_name" value="i"/>
        <parameter key="iterations" value="5"/>
        <process expanded="true">
          <operator activated="true" class="generate_macro" compatibility="6.4.000-SNAPSHOT" expanded="true" height="76" name="Generate Macro" width="90" x="45" y="30">
            <list key="function_descriptions">
              <parameter key="macro_start" value="%{step_size} * (%{i}-1)"/>
              <parameter key="macro_end" value="%{step_size} * %{i}"/>
            </list>
          </operator>
          <!-- Inclusive upper bound, so ids that are exact multiples of step_size are not skipped. -->
          <operator activated="true" class="read_database" compatibility="6.4.000-SNAPSHOT" expanded="true" height="60" name="Read Database" width="90" x="179" y="30">
            <parameter key="connection" value="Local"/>
            <parameter key="query" value="SELECT * FROM `big` WHERE id &gt; %{macro_start} AND id &lt;= %{macro_end}"/>
            <enumeration key="parameters"/>
          </operator>
          <connect from_port="input 1" to_op="Generate Macro" to_port="through 1"/>
          <connect from_op="Read Database" from_port="output" to_port="output 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Set Macros" from_port="through 1" to_op="Loop" to_port="input 1"/>
      <connect from_op="Loop" from_port="output 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Marco
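To make the window arithmetic in the demonstration process concrete, here is a small Python sketch of what the `macro_start` and `macro_end` expressions evaluate to per iteration, and how many iterations a given id range needs. The function names are hypothetical, and the sketch assumes ids are dense positive integers starting at 1, which the thread does not confirm. It also compares a strict upper bound (`id < macro_end`) with an inclusive one (`id <= macro_end`), using small numbers so the effect is easy to see.

```python
import math


def page_bounds(i, step_size):
    """Mirror the two macros for the 1-based iteration number i."""
    macro_start = step_size * (i - 1)
    macro_end = step_size * i
    return macro_start, macro_end


def iterations_needed(max_id, step_size):
    """Number of windows required to reach max_id."""
    return math.ceil(max_id / step_size)


def skipped_ids(max_id, step_size, inclusive_end):
    """Return the ids in 1..max_id that no window selects."""
    windows = [
        page_bounds(i, step_size)
        for i in range(1, iterations_needed(max_id, step_size) + 1)
    ]
    skipped = []
    for row_id in range(1, max_id + 1):
        selected = any(
            start < row_id and (row_id <= end if inclusive_end else row_id < end)
            for start, end in windows
        )
        if not selected:
            skipped.append(row_id)
    return skipped


print(page_bounds(1, 10000))                      # (0, 10000)
print(page_bounds(5, 10000))                      # (40000, 50000)
print(skipped_ids(50, 10, inclusive_end=False))   # [10, 20, 30, 40, 50]
print(skipped_ids(50, 10, inclusive_end=True))    # []
print(iterations_needed(9_000_000, 10_000))       # 900
```

With the step size of 10000 and the 5 iterations set in the process, only ids up to 50000 are read. Under the dense-id assumption, a table of a little over 9 million rows would need at least 900 iterations at that step size, or a larger step.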