Connecting Hadoop in Azure with Radoop
Hello,
I have a Hortonworks Sandbox in Azure and I am trying to connect to it with Radoop. I am stuck at the 4th (HDFS) test: there is a timeout exception, and the log says I am unable to reach the NameNode and DataNode. Do I have to make some additional changes?
The first tests, for Hive etc., worked fine. Is there some firewall on the Azure Sandbox that is refusing the connection from my machine?
Here is the console output:
[Nov 10, 2016 12:25:46 PM] SEVERE: java.util.concurrent.TimeoutException
[Nov 10, 2016 12:25:46 PM] SEVERE: Distributed file system test timed out. Please check that the NameNode and DataNodes are accessible on the address and port you specified.
[Nov 10, 2016 12:25:46 PM] SEVERE: Test failed: HDFS
[Nov 10, 2016 12:25:46 PM] SEVERE: Connection test for 'Test' failed.
Best regards,
Jan
Answers
Hi Jan,
could you please provide some additional information about your setup by answering the following questions?
- Where are you running the client? On your desktop, or on another Azure instance?
- Are you sure that all the required ports are open on the firewall between the client and the Hortonworks VM? (http://docs.rapidminer.com/radoop/installation/networking-setup.html) See the sketch below this list for a quick way to check.
- Are you sure that the services on the VM are running? Could you please check them in the Ambari interface?
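For reference on the ports question, here is a minimal sketch in Java for checking TCP reachability from the client machine. The hostname and the port list are assumptions (standard HDP 2.x defaults: NameNode RPC 8020, DataNode 50010, NameNode web UI 50070, HiveServer2 10000, ResourceManager 8032); adjust them to your setup:

import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    public static void main(String[] args) {
        // Assumption: the sandbox is reachable under this public hostname.
        String host = "sandbox.hortonworks.com";
        // Standard HDP 2.x default ports; adjust to your cluster.
        int[] ports = {8020, 50010, 50070, 10000, 8032};
        for (int port : ports) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 3000); // 3-second timeout
                System.out.println(host + ":" + port + " is reachable");
            } catch (Exception e) {
                System.out.println(host + ":" + port + " is NOT reachable: " + e.getMessage());
            }
        }
    }
}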
Zoltán
- The client is running on my desktop.
- All the services are running; I checked them in Ambari.
- All the ports are open (default settings).
Now the issue is that every test in the Quick Test works except the 10th, the UDF jar upload. I have attached the Log panel output here.
Looking at the directory in HDFS (/tmp/radoop/_shared/db_default/), the file exists but it is empty.
If I run the Full Test, it gets stuck while trying to connect to the DataNode (5/26). The log says:
[Nov 14, 2016 1:07:55 PM] WARNING: DataNode port 50010 on the ip/hostname 10.0.0.4 cannot be reached. Please check that you can access the DataNodes of your cluster.
[Nov 14, 2016 1:07:55 PM] WARNING: Test finished with warnings: DataNode networking (25.147s)
Regards,
Jan
Best Answer
Hi Jan,
thank you for the details.
The DataNode test reports that the client looks for the DataNode using the internal IP address 10.0.0.4, and it fails. (This also explains the empty UDF jar: creating the file only requires the NameNode, but writing its contents requires reaching a DataNode.)
Please try the test again after adding the advanced Hadoop property dfs.client.use.datanode.hostname to the connection with the value true. (In this case, the DataNode is expected to be accessed via sandbox.hortonworks.com.)
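For reference, a minimal sketch of what this property changes for a plain HDFS client; the hostname, the default NameNode RPC port 8020, and the test path are assumptions (in Radoop itself you only add the property as a key/value pair on the connection):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DatanodeHostnameSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // With this set, the client connects to DataNodes by hostname instead
        // of the cluster-internal IP the NameNode reports (e.g. 10.0.0.4).
        conf.setBoolean("dfs.client.use.datanode.hostname", true);
        // Assumption: sandbox hostname and default NameNode RPC port.
        FileSystem fs = FileSystem.get(new URI("hdfs://sandbox.hortonworks.com:8020"), conf);
        // Hypothetical test path: writing file contents requires reaching a
        // DataNode, so this fails (or leaves an empty file) if only the
        // NameNode is accessible.
        Path p = new Path("/tmp/radoop_connection_check.txt");
        try (FSDataOutputStream out = fs.create(p, true)) {
            out.writeBytes("hello");
        }
        System.out.println("wrote " + fs.getFileStatus(p).getLen() + " bytes");
        fs.delete(p, false);
        fs.close();
    }
}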
If that does not help, looking at the Log panel (use View -> Show Panel -> Log to enable it) or checking the output of the Extract Logs action (on the connection dialog) after the connection test may reveal more.
By the way, Radoop 7.3.0 has been released; it also slightly improves the error logging on this panel.
Best,
Peter
Works like a charm! Thank you.
Regards
Jan