Problem connecting Radoop to my Cloudera cluster on Azure

rehman
rehman New Altair Community Member
edited November 5 in Community Q&A
I am trying to connect Radoop to my Cloudera cluster on MS Azure, which runs Spark 2.4. All tests went well except the Spark test, which failed with a Spark staging directory error.

[Nov 24, 2019 10:55:31 PM]: Test succeeded: Fetch dynamic settings (9.605s)
[Nov 24, 2019 10:55:31 PM]: Running test 2/5: Spark staging directory
[Nov 24, 2019 10:55:32 PM] SEVERE: Test failed: Spark staging directory
[Nov 24, 2019 10:55:32 PM]: Cleaning after test: Spark staging directory
[Nov 24, 2019 10:55:32 PM]: Cleaning after test: Fetch dynamic settings
[Nov 24, 2019 10:55:32 PM]: Total time: 10.100s
[Nov 24, 2019 10:55:32 PM]: java.lang.IllegalArgumentException: java.net.UnknownHostException: sibaluster-50e44e52.siba
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:130)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:343)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:287)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:156)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2811)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2848)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2830)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:181)
at eu.radoop.datahandler.mapreducehdfs.MRHDFSHandlerLowLevel.testDirPermission(MRHDFSHandlerLowLevel.java:786)
at eu.radoop.datahandler.mapreducehdfs.MRHDFSHandlerLowLevel.testSparkStagingPermission_invoke(MRHDFSHandlerLowLevel.java:766)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at eu.radoop.datahandler.mapreducehdfs.MRHDFSHandlerLowLevel$2.run(MRHDFSHandlerLowLevel.java:641)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at eu.radoop.security.UgiWrapper.doAs(UgiWrapper.java:49)
at eu.radoop.datahandler.mapreducehdfs.MRHDFSHandlerLowLevel.invokeAs(MRHDFSHandlerLowLevel.java:637)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.invokeAs(MapReduceHDFSHandler.java:1805)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.invokeAs(MapReduceHDFSHandler.java:1769)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.invokeAs(MapReduceHDFSHandler.java:1746)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.invoke(MapReduceHDFSHandler.java:1733)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.lambda$testStagingDirPermission$32(MapReduceHDFSHandler.java:1522)
at eu.radoop.tools.ExceptionTools.checkOnly(ExceptionTools.java:277)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.testStagingDirPermission(MapReduceHDFSHandler.java:1521)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.testSparkStagingPermission(MapReduceHDFSHandler.java:1513)
at eu.radoop.connections.service.test.connection.TestSparkStaging.call(TestSparkStaging.java:45)
at eu.radoop.connections.service.test.connection.TestSparkStaging.call(TestSparkStaging.java:24)
at eu.radoop.connections.service.test.RadoopTestContext.lambda$runTest$1(RadoopTestContext.java:282)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: sibaluster-50e44e52.siba
... 42 more

[Nov 24, 2019 10:55:32 PM] SEVERE: java.lang.IllegalArgumentException: java.net.UnknownHostException: sibaluster-50e44e52.siba
[Nov 24, 2019 10:55:32 PM] SEVERE: Spark staging directory (on HDFS) test failed. The Radoop client tried to write the HDFS user home directory.
[Nov 24, 2019 10:55:32 PM] SEVERE: Test failed: Spark staging directory
[Nov 24, 2019 10:55:32 PM] SEVERE: Integration test for 'sibaucluster (10.1.7.8)' failed.

Answers

  • sgenzer
    sgenzer
    Altair Employee
    cc @asimon
  • rehman
    rehman New Altair Community Member
No, I asked for help.
  • sgenzer
    sgenzer
    Altair Employee
hello @rehman, I am the Community Manager here at RapidMiner. Generally, if I do not know the answer to a question and no one has voluntarily chimed in, I cc someone at RapidMiner who specializes in the area in question. Yours is a very specific Radoop question that falls into this category, so I have cc'ed a colleague who may have time to help here.

    Note that you may need to be patient here - this is community support, so people help when they have time and in a spirit of generosity. :smile:

    Scott
  • asimon
    asimon New Altair Community Member
    Hi,
    It would be helpful if you could share your connection.xml, or better yet, the zip package generated by the "Extract Logs" button.
    Otherwise, the stack trace and the exception point to a networking issue (java.net.UnknownHostException: sibaluster-50e44e52.siba), so please check your network connectivity: can you ping that host from the machine running RapidMiner Studio? DNS and reverse DNS have to work both inside the Hadoop cluster and on the host running RapidMiner Studio. The easiest way to achieve that is usually to add the cluster hostnames and their IP addresses to your /etc/hosts file.
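    To see whether the Studio machine can resolve the host the same way the Hadoop client does, a minimal sketch (the default hostname is taken from the exception in the log above; the IP in the suggested hosts entry is the one shown in the failed integration-test line and must be verified against your actual NameNode address - both are assumptions here, so substitute your own values):

    ```java
    import java.net.InetAddress;
    import java.net.UnknownHostException;

    // Checks whether this machine can resolve a cluster hostname.
    // Hadoop's DFSClient performs the same kind of lookup before it can
    // build the NameNode proxy, which is where the test above failed.
    public class ResolveCheck {
        public static void main(String[] args) {
            // Hostname from the UnknownHostException; pass yours as arg 0.
            String host = args.length > 0 ? args[0] : "sibaluster-50e44e52.siba";
            try {
                InetAddress addr = InetAddress.getByName(host);
                System.out.println(host + " resolves to " + addr.getHostAddress());
            } catch (UnknownHostException e) {
                System.out.println("Cannot resolve " + host
                        + " - fix DNS or add an /etc/hosts entry such as:");
                // 10.1.7.8 is the address from the integration-test summary;
                // confirm it is really the NameNode before using it.
                System.out.println("10.1.7.8  " + host);
            }
        }
    }
    ```

    If this lookup fails, the Radoop Spark staging test will keep failing with the same UnknownHostException no matter how the Radoop connection itself is configured.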