importing from Teradata

Mark
Mark New Altair Community Member
edited November 5 in Community Q&A
Would you please help me with this issue?  I read that RapidMiner works with Teradata, but I am getting a connection error.  I downloaded a couple of jar files from Teradata and placed them in C:\Program Files\Rapid-I\RapidMiner5\lib\jdbc, and I edited the .classpath file by adding these lines:

<classpathentry kind="lib" path="lib/jdbc/terajdbc4.jar"/>
<classpathentry kind="lib" path="lib/jdbc/tdgssconfig.jar"/>

When I try to test the connection, I get the following error:

[Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified

I am also getting the following URL:

jdbc:odbc:hostname:1025DatabaseSchemeName

Is that correct? I thought that it should be this instead:

jdbc:odbc:hostname:1025/DatabaseSchemeName

I am running RapidMiner on Windows XP.

Thanks,

Mark

Answers

  • fischer
    fischer New Altair Community Member
    Hi,

    Please don't add JDBC drivers to the .classpath file (I assume you are running it from Eclipse, right?). Instead, use Tools -> Manage Database Drivers to add your driver (if you have multiple JAR files, you can separate them with commas).

    Best,
    Simon
  • Mark
    Mark New Altair Community Member
    Indeed, using Manage Database Drivers is a better way; however, I still had problems, and here is my workaround.

    It turns out that Teradata uses a type 4 JDBC URL, but RapidMiner uses type 3. This means that RapidMiner inserts a colon into the URL when there shouldn't be one. Many applications allow the user to edit the resulting URL manually, but RapidMiner does not. I then tried to rig the URL by leaving the port number out of the Manage Database Connections window and placing this string

    DBS_PORT=1025/DATABASE=databaseName

    into the Database scheme field, but RapidMiner would not let me leave the port number field blank. 

    I think that the best solution would be to allow the user to manually edit the database URL, but I instead made a quick fix to the code to allow me to go forward.  In the class DatabaseConnectionDialog at line 492 I now have:

    // if (host == null || "".equals(port)) { // quick fix for type 4 JDBC
    if (host == null) {

    This prevents the dialog box complaining about an empty port field from coming up, and I can connect to Teradata.

    Mark
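
The URL problem Mark describes above can be illustrated with a minimal sketch. It assumes a connection dialog that joins a prefix, host, port, separator and scheme into one string. The class `JdbcUrlSketch` and the method `buildUrl` are hypothetical names made up for this illustration; this is not RapidMiner's actual code.

```java
class JdbcUrlSketch {

    /**
     * Hypothetical helper: joins the fields of a connection dialog into one URL.
     * Illustration only, not RapidMiner's implementation.
     */
    static String buildUrl(String prefix, String host, String port,
                           String schemaSeparator, String schema) {
        StringBuilder url = new StringBuilder(prefix).append(host);
        // The colon is only added when a port was actually entered.
        if (port != null && !port.isEmpty()) {
            url.append(':').append(port);
        }
        if (schema != null && !schema.isEmpty()) {
            url.append(schemaSeparator).append(schema);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Port filled in: host and port are joined by a colon.
        System.out.println(
                buildUrl("jdbc:teradata://", "hostname", "1025", "/", "databaseName"));
        // Port left empty, port and database carried in the scheme field instead.
        System.out.println(
                buildUrl("jdbc:teradata://", "hostname", "", "/",
                        "DBS_PORT=1025/DATABASE=databaseName"));
    }
}
```

With an empty port allowed, the second call yields a URL of the same shape as the one Mark ends up using. The parameter syntax after the host simply follows the strings posted in this thread; the Teradata driver documentation is the place to confirm it.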


  • fischer
    fischer New Altair Community Member
    Hi,

    Both of your proposed solutions are already there:

    1. You can specify the colon/slash in the Manage Database Driver dialog under "Schema separator"
    2. In the ReadDatabase operator, switch "define connection" from "predefined" to "url".

    Best,
    Simon
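
To check a complete URL independently of RapidMiner, a plain JDBC test along the following lines may help. It is only a sketch: the class name `FullUrlConnectSketch` and the method `tryConnect` are made up here, the URL and credentials are the placeholders used in this thread, and the driver jars are assumed to be on the class path when it is run.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class FullUrlConnectSketch {

    /** Tries to open a connection with a complete URL and returns a short status line. */
    static String tryConnect(String url, String user, String password) {
        try (Connection connection = DriverManager.getConnection(url, user, password)) {
            return "connected: " + connection.getMetaData().getDatabaseProductName();
        } catch (SQLException e) {
            // If no registered driver accepts the URL prefix, DriverManager
            // raises an SQLException saying that no suitable driver was found.
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Placeholder values, as in the thread.
        String url = "jdbc:teradata://hostname/DBS_PORT=1025/DATABASE=databasename";
        System.out.println(tryConnect(url, "username", "password"));
    }
}
```

Only SQLException is caught here, so a runtime exception thrown inside the driver would still surface with its full stack trace.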
  • Mark
    Mark New Altair Community Member
    Thank you again Simon. 

    1. I have slash specified in the Manage Database Driver dialog, but I still get the colon between the hostname and the port number.
    2. I tried using "url" for "define connection," but I got a null pointer error with RapidMiner 5.0.010:

    Aug 16, 2010 9:49:54 AM SEVERE: Process failed: operator cannot be executed. Check the log messages...
    Aug 16, 2010 9:49:54 AM SEVERE: Here:          Process[1] (Process)
              subprocess 'Main Process'
          ==>  +- Read Database[1] (Read Database)
    Aug 16, 2010 9:49:54 AM SEVERE: java.lang.NullPointerException

    I got this error whether I used a compatibility level of 5.0.08 or 5.0.10 in the Read Database operator.  Of interest is that I don't get this error with my Eclipse version of 5.0.010, which differs only in the code change that I described on August 12.

    Here's the process:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.8" expanded="true" name="Process">
        <parameter key="logverbosity" value="3"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="1"/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <parameter key="parallelize_main_process" value="false"/>
        <process expanded="true" height="235" width="413">
          <operator activated="true" class="read_database" compatibility="5.0.8" expanded="true" height="60" name="Read Database" width="90" x="45" y="75">
            <parameter key="read_not_matching_values_as_missings" value="true"/>
            <list key="data_set_meta_data_information"/>
            <parameter key="attribute_names_already_defined" value="true"/>
            <parameter key="define_connection" value="url"/>
            <parameter key="connection" value="teradata"/>
            <parameter key="database_system" value="Teradata"/>
            <parameter key="database_url" value="jdbc:teradata://hostname/DBS_PORT=1025/DATABASE=databasename"/>
            <parameter key="username" value="username"/>
            <parameter key="password" value="****"/>
            <parameter key="define_query" value="query file"/>
            <parameter key="query" value="Select * from schemaname.datatablename where columname = 'columnvalue'"/>
            <parameter key="query_file" value="pathname\teradata-test-query.txt"/>
          </operator>
          <connect from_op="Read Database" from_port="output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>

  • Mark
    Mark New Altair Community Member
    Here are some more clues:
    1. I reverted my code change so that I have the original code again, but my Eclipse version still works if I run from within Eclipse.  This implies that the code change did not fix my null pointer error; the code change only allowed me to run with the proper URL.  Once I had the proper URL, I could run from within Eclipse without getting the null pointer error, but if I run outside of Eclipse, I get the null pointer error even if I have the proper URL.
    2. I copied over the entire installed RapidMiner directory to my development directory.  If I try to run the development RapidMiner executable outside of Eclipse, the process fails just as it does with the installed executable.  Could the classpath be different when running within Eclipse? 
    3. As I mentioned in the beginning of the thread, there are two jar files; one is a driver, while the other contains config information.  If I remove the config jar from the path within the Eclipse project, the process fails with the previously mentioned null pointer error when I try to run RapidMiner from within Eclipse.  This implies that Eclipse helps RapidMiner to know about the second jar file somehow.  How does a user get the installed version of RapidMiner to work with more than one database jar file?
    4. If I edit RapidMiner's .classpath file so that it contains a reference to the config jar file, RapidMiner still has a null pointer error for the process.

    Now even more mystified,

    Mark
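
One way to narrow down a class path difference like the one Mark suspects is to test whether given classes can be loaded from a given set of jar files. The following is a rough diagnostic sketch, not part of RapidMiner: `DriverVisibilityCheck` and `isLoadable` are hypothetical names, the jar paths are the ones mentioned in this thread, and the two class names are taken from the stack trace posted further down.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;

class DriverVisibilityCheck {

    /** Returns true if className can be loaded when the given jars are added. */
    static boolean isLoadable(String className, String... jarPaths) {
        try {
            URL[] urls = new URL[jarPaths.length];
            for (int i = 0; i < jarPaths.length; i++) {
                urls[i] = new File(jarPaths[i]).toURI().toURL();
            }
            // The parent is this class's own loader, so JDK classes stay visible.
            try (URLClassLoader loader =
                         new URLClassLoader(urls, DriverVisibilityCheck.class.getClassLoader())) {
                // Load without initializing, so no driver code runs yet.
                Class.forName(className, false, loader);
                return true;
            }
        } catch (ClassNotFoundException | NoClassDefFoundError e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] jars = {"lib/jdbc/terajdbc4.jar", "lib/jdbc/tdgssconfig.jar"};
        System.out.println("TeraDriver loadable: "
                + isLoadable("com.teradata.jdbc.TeraDriver", jars));
        System.out.println("TdgssManager loadable: "
                + isLoadable("com.teradata.tdgss.jtdgss.TdgssManager", jars));
    }
}
```

Since Mark describes the second jar as holding configuration rather than the driver itself, a class lookup alone may not reveal that it is missing; the check mainly shows which jars a given loader actually sees.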
  • fischer
    fischer New Altair Community Member
    Hi,

    1. I made your changes to the dialog. Empty ports are now allowed.
    2. Can you post a stack trace of the NPE?
    3. Regarding your question "How does a user get the installed version of RapidMiner to work with more than one database jar file?": As posted on August 9, you can separate multiple jar files by comma in the Manage Database Drivers dialog. Didn't that help?

    Best,
    Simon
  • Mark
    Mark New Altair Community Member
    Thank you Simon for your continuing assistance.

    2. Here is the log message and  NPE:

    log message:

    Aug 17, 2010 9:50:33 AM CONFIG: Connecting to jdbc:teradata://servername/DBS_PORT=1025/DATABASE=databasename as username.
    Aug 17, 2010 9:50:53 AM SEVERE: Process failed: null
    Aug 17, 2010 9:50:54 AM SEVERE: Here:          Process[1] (Process)
              subprocess 'Main Process'
          ==>  +- Read Database[1] (Read Database)
    Aug 17, 2010 9:50:54 AM SEVERE: null

    Here is the NPE:

    Exception: java.lang.NullPointerException
    Message: null
    Stack trace:

      com.teradata.tdgss.jtdgss.TdgssConfigApi.GetMechanisms(DashoA1*..)
      com.teradata.tdgss.jtdgss.TdgssManager.<init>(DashoA1*..)
      com.teradata.tdgss.jtdgss.TdgssManager.getInstance(DashoA1*..)
      com.teradata.jdbc.jdbc.GenericTeraEncrypt.getGSSM(GenericTeraEncrypt.java:583)
      com.teradata.jdbc.jdbc.GenericTeraEncrypt.getConfig(GenericTeraEncrypt.java:601)
      com.teradata.jdbc.jdbc.GenericTeraEncrypt.getUserNameForOid(GenericTeraEncrypt.java:694)
      com.teradata.jdbc.AuthMechanism.<init>(AuthMechanism.java:50)
      com.teradata.jdbc.jdbc.GenericInitDBConfigState.action(GenericInitDBConfigState.java:105)
      com.teradata.jdbc.jdbc.GenericLogonController.run(GenericLogonController.java:49)
      com.teradata.jdbc.jdbc_4.TDSession.<init>(TDSession.java:195)
      com.teradata.jdbc.jdbc_3.ifjdbc_4.TeraLocalConnection.<init>(TeraLocalConnection.java:94)
      com.teradata.jdbc.jdbc.ConnectionFactory.createConnection(ConnectionFactory.java:55)
      com.teradata.jdbc.TeraDriver.doConnect(TeraDriver.java:216)
      com.teradata.jdbc.TeraDriver.connect(TeraDriver.java:149)
      com.rapidminer.tools.jdbc.DriverAdapter.connect(DriverAdapter.java:53)
      java.sql.DriverManager.getConnection(DriverManager.java:582)
      java.sql.DriverManager.getConnection(DriverManager.java:154)
      com.rapidminer.tools.jdbc.DatabaseHandler.connect(DatabaseHandler.java:299)
      com.rapidminer.tools.jdbc.DatabaseHandler.getConnectedDatabaseHandler(DatabaseHandler.java:263)
      com.rapidminer.tools.jdbc.DatabaseHandler.getConnectedDatabaseHandler(DatabaseHandler.java:249)
      com.rapidminer.tools.jdbc.DatabaseHandler.getConnectedDatabaseHandler(DatabaseHandler.java:827)
      com.rapidminer.operator.io.DatabaseDataReader.getResultSet(DatabaseDataReader.java:164)
      com.rapidminer.operator.io.DatabaseDataReader$1.<init>(DatabaseDataReader.java:210)
      com.rapidminer.operator.io.DatabaseDataReader.getDataSet(DatabaseDataReader.java:209)
      com.rapidminer.operator.io.AbstractDataReader.createExampleSet(AbstractDataReader.java:1127)
      com.rapidminer.operator.io.AbstractDataReader.createExampleSet(AbstractDataReader.java:1098)
      com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:52)
      com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:36)
      com.rapidminer.operator.io.AbstractReader.doWork(AbstractReader.java:123)
      com.rapidminer.operator.Operator.execute(Operator.java:771)
      com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
      com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:709)
      com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:368)
      com.rapidminer.operator.Operator.execute(Operator.java:771)
      com.rapidminer.Process.run(Process.java:899)
      com.rapidminer.Process.run(Process.java:795)
      com.rapidminer.Process.run(Process.java:790)
      com.rapidminer.Process.run(Process.java:780)
      com.rapidminer.gui.ProcessThread.run(ProcessThread.java:62)

    3. It does not work when I try to enter two files separated by a comma.  When I specify the jar file as

    lib/jdbc/tdgssconfig.jar,lib/jdbc/terajdbc4.jar

    I get the following log error and exception:

    log error:

    Aug 17, 2010 9:58:38 AM SEVERE: Process failed: Database error occurred: No suitable driver found for jdbc:teradata://servername/DBS_PORT=1025/DATABASE=databasename
    Aug 17, 2010 9:58:38 AM SEVERE: Here:          Process[1] (Process)
              subprocess 'Main Process'
          ==>  +- Read Database[1] (Read Database)
    Aug 17, 2010 9:58:38 AM SEVERE: Database error occurred: No suitable driver found for jdbc:teradata://servername/DBS_PORT=1025/DATABASE=databasename

    Exception:

    Exception: com.rapidminer.operator.UserError
    Message: Database error occurred: No suitable driver found for jdbc:teradata://servername/DBS_PORT=1025/DATABASE=databasename
    Stack trace:

      com.rapidminer.operator.io.DatabaseDataReader.getResultSet(DatabaseDataReader.java:174)
      com.rapidminer.operator.io.DatabaseDataReader$1.<init>(DatabaseDataReader.java:210)
      com.rapidminer.operator.io.DatabaseDataReader.getDataSet(DatabaseDataReader.java:209)
      com.rapidminer.operator.io.AbstractDataReader.createExampleSet(AbstractDataReader.java:1127)
      com.rapidminer.operator.io.AbstractDataReader.createExampleSet(AbstractDataReader.java:1098)
      com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:52)
      com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:36)
      com.rapidminer.operator.io.AbstractReader.doWork(AbstractReader.java:123)
      com.rapidminer.operator.Operator.execute(Operator.java:771)
      com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
      com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:709)
      com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:368)
      com.rapidminer.operator.Operator.execute(Operator.java:771)
      com.rapidminer.Process.run(Process.java:899)
      com.rapidminer.Process.run(Process.java:795)
      com.rapidminer.Process.run(Process.java:790)
      com.rapidminer.Process.run(Process.java:780)
      com.rapidminer.gui.ProcessThread.run(ProcessThread.java:62)

    Cause
    Exception: java.sql.SQLException
    Message: No suitable driver found for jdbc:teradata://servername/DBS_PORT=1025/DATABASE=databasename
    Stack trace:

      java.sql.DriverManager.getConnection(DriverManager.java:602)
      java.sql.DriverManager.getConnection(DriverManager.java:154)
      com.rapidminer.tools.jdbc.DatabaseHandler.connect(DatabaseHandler.java:299)
      com.rapidminer.tools.jdbc.DatabaseHandler.getConnectedDatabaseHandler(DatabaseHandler.java:263)
      com.rapidminer.tools.jdbc.DatabaseHandler.getConnectedDatabaseHandler(DatabaseHandler.java:249)
      com.rapidminer.tools.jdbc.DatabaseHandler.getConnectedDatabaseHandler(DatabaseHandler.java:827)
      com.rapidminer.operator.io.DatabaseDataReader.getResultSet(DatabaseDataReader.java:164)
      com.rapidminer.operator.io.DatabaseDataReader$1.<init>(DatabaseDataReader.java:210)
      com.rapidminer.operator.io.DatabaseDataReader.getDataSet(DatabaseDataReader.java:209)
      com.rapidminer.operator.io.AbstractDataReader.createExampleSet(AbstractDataReader.java:1127)
      com.rapidminer.operator.io.AbstractDataReader.createExampleSet(AbstractDataReader.java:1098)
      com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:52)
      com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:36)
      com.rapidminer.operator.io.AbstractReader.doWork(AbstractReader.java:123)
      com.rapidminer.operator.Operator.execute(Operator.java:771)
      com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
      com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:709)
      com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:368)
      com.rapidminer.operator.Operator.execute(Operator.java:771)
      com.rapidminer.Process.run(Process.java:899)
      com.rapidminer.Process.run(Process.java:795)
      com.rapidminer.Process.run(Process.java:790)
      com.rapidminer.Process.run(Process.java:780)
      com.rapidminer.gui.ProcessThread.run(ProcessThread.java:62)
  • fischer
    fischer New Altair Community Member
    Hi,

    I think I can confirm there was a problem with the "," in the Jar file list. Do you have a chance to try with the latest version on SVN tomorrow when it's synced?

    Best,
    Simon
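
For reference, this is one way a comma-separated jar list such as the one Mark entered could be split into separate class path entries. The thread does not say what the actual problem in RapidMiner was, so this is only a sketch of the general idea; `JarListSketch` and `toUrls` are hypothetical names.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

class JarListSketch {

    /** Splits a comma-separated jar list into one URL per non-empty entry. */
    static List<URL> toUrls(String jarList) {
        List<URL> urls = new ArrayList<>();
        for (String entry : jarList.split(",")) {
            String path = entry.trim();
            if (path.isEmpty()) {
                continue; // skip empty entries, e.g. from a trailing comma
            }
            try {
                urls.add(new File(path).toURI().toURL());
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Not a usable path: " + path, e);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // The jar list from the thread; relative paths resolve against the working directory.
        for (URL url : toUrls("lib/jdbc/tdgssconfig.jar,lib/jdbc/terajdbc4.jar")) {
            System.out.println(url);
        }
    }
}
```

If the whole string were treated as a single path instead, it would point at one file that does not exist, and no driver class would be found on it.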