
Spark Radoop connection

User: "amori"
New Altair Community Member
Updated by Jocelyn

Hi everyone,

I am using Cloudera and recently upgraded to Spark 2.2. I am having trouble when performing a Full Test. In the connection configuration, what should go in the "Spark Archive (or libs) path" field?

I tried getting the jar from http://spark.apache.org/downloads.html, but I wasn't able to find the "..assembly.jar" file. So I tried setting the path to (local:///opt/cloudera/parcels/CDR/lib/spark/lib/spark-2.2.0-bin-hadoop2.6/jars/*), but that didn't work. I have also tried the jars from https://www.cloudera.com/documentation/spark2/latest/topics/spark2_packaging.html#packaging, with no luck.
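For reference, Spark 2.x no longer ships a single assembly jar; the distribution keeps its libraries in a jars/ directory instead, so the field has to point at a directory (or archive) of those jars. A quick listing through the Hadoop FileSystem API can confirm what is actually at a local:// location (a minimal sketch, assuming the Hadoop client jars are on the classpath; the SPARK2 parcel path below is only an illustration and must be adjusted to the actual install):

// Minimal sketch: list the contents of a local Spark libs directory through
// the Hadoop FileSystem API, i.e. what a local:// path points the cluster at.
// The parcel path below is an assumption; adjust it to the actual install.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListSparkLibs {
    public static void main(String[] args) throws Exception {
        FileSystem local = FileSystem.getLocal(new Configuration());
        Path libs = new Path("/opt/cloudera/parcels/SPARK2/lib/spark2/jars");
        for (FileStatus status : local.listStatus(libs)) {
            System.out.println(status.getPath().getName());
        }
    }
}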

[Dec 18, 2017 10:01:31 PM]: --------------------------------------------------
[Dec 18, 2017 10:01:31 PM]: Integration test for 'cluster (master)' started.
[Dec 18, 2017 10:01:31 PM]: Using Radoop version 8.0.0.
[Dec 18, 2017 10:01:31 PM]: Running tests: [Hive connection, Fetch dynamic settings, Java version, HDFS, MapReduce, Radoop temporary directory, MapReduce staging directory, Spark staging directory, Spark assembly jar existence, UDF jar upload, Create permanent UDFs, HDFS upload, Spark job]
[Dec 18, 2017 10:01:31 PM]: Running test 1/13: Hive connection
[Dec 18, 2017 10:01:31 PM]: Hive server 2 connection (master.c.strange-mason-188717.internal:10000) test started.
[Dec 18, 2017 10:01:31 PM]: Test succeeded: Hive connection (0.042s)
[Dec 18, 2017 10:01:31 PM]: Running test 2/13: Fetch dynamic settings
[Dec 18, 2017 10:01:31 PM]: Retrieving required configuration properties...
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: hive.execution.engine
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: mapreduce.jobhistory.done-dir
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: mapreduce.jobhistory.intermediate-done-dir
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: dfs.user.home.dir.prefix
[Dec 18, 2017 10:01:31 PM]: Could not fetch property dfs.encryption.key.provider.uri
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.executor.memory
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.executor.cores
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.driver.memory
[Dec 18, 2017 10:01:31 PM]: Could not fetch property spark.driver.cores
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.yarn.executor.memoryOverhead
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.yarn.driver.memoryOverhead
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.dynamicAllocation.enabled
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.dynamicAllocation.initialExecutors
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.dynamicAllocation.minExecutors
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.dynamicAllocation.maxExecutors
[Dec 18, 2017 10:01:31 PM]: Could not fetch property spark.executor.instances
[Dec 18, 2017 10:01:31 PM]: The specified local value of mapreduce.job.reduces (1) differs from remote value (-1).
[Dec 18, 2017 10:01:31 PM]: The specified local value of mapreduce.reduce.speculative (false) differs from remote value (true).
[Dec 18, 2017 10:01:31 PM]: The specified local value of mapreduce.job.redacted-properties (fs.s3a.access.key,fs.s3a.secret.key) differs from remote value (fs.s3a.access.key,fs.s3a.secret.key,yarn.app.mapreduce.am.admin.user.env,mapreduce.admin.user.env,hadoop.security.credential.provider.path).
[Dec 18, 2017 10:01:31 PM]: Test succeeded: Fetch dynamic settings (0.024s)
[Dec 18, 2017 10:01:31 PM]: Running test 3/13: Java version
[Dec 18, 2017 10:01:31 PM]: Cluster Java version: 1.8.0_151-b12
[Dec 18, 2017 10:01:31 PM]: Test succeeded: Java version (0.000s)
[Dec 18, 2017 10:01:31 PM]: Running test 4/13: HDFS
[Dec 18, 2017 10:01:31 PM]: Test succeeded: HDFS (0.125s)
[Dec 18, 2017 10:01:31 PM]: Running test 5/13: MapReduce
[Dec 18, 2017 10:01:31 PM]: Test succeeded: MapReduce (0.022s)
[Dec 18, 2017 10:01:31 PM]: Running test 6/13: Radoop temporary directory
[Dec 18, 2017 10:01:31 PM]: Test succeeded: Radoop temporary directory (0.007s)
[Dec 18, 2017 10:01:31 PM]: Running test 7/13: MapReduce staging directory
[Dec 18, 2017 10:01:31 PM]: Test succeeded: MapReduce staging directory (0.040s)
[Dec 18, 2017 10:01:31 PM]: Running test 8/13: Spark staging directory
[Dec 18, 2017 10:01:31 PM]: Test succeeded: Spark staging directory (0.020s)
[Dec 18, 2017 10:01:31 PM]: Running test 9/13: Spark assembly jar existence
[Dec 18, 2017 10:01:31 PM]: Spark assembly jar existence in the local:// file system cannot be checked. Test skipped.
[Dec 18, 2017 10:01:31 PM]: Test succeeded: Spark assembly jar existence (0.000s)
[Dec 18, 2017 10:01:31 PM]: Running test 10/13: UDF jar upload
[Dec 18, 2017 10:01:32 PM]: Remote radoop_hive-v4.jar is up to date.
[Dec 18, 2017 10:01:32 PM]: Test succeeded: UDF jar upload (0.007s)
[Dec 18, 2017 10:01:32 PM]: Running test 11/13: Create permanent UDFs
[Dec 18, 2017 10:01:32 PM]: Remote radoop_hive-v4.jar is up to date.
[Dec 18, 2017 10:01:32 PM]: Test succeeded: Create permanent UDFs (0.025s)
[Dec 18, 2017 10:01:32 PM]: Running test 12/13: HDFS upload
[Dec 18, 2017 10:01:32 PM]: Uploaded test data file size: 5642
[Dec 18, 2017 10:01:32 PM]: Test succeeded: HDFS upload (0.047s)
[Dec 18, 2017 10:01:32 PM]: Running test 13/13: Spark job
[Dec 18, 2017 10:01:32 PM]: Assuming Spark version Spark 2.2.
[Dec 18, 2017 10:01:32 PM] SEVERE: Test failed: Spark job
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Spark job
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: HDFS upload
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Create permanent UDFs
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: UDF jar upload
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Spark assembly jar existence
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Spark staging directory
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: MapReduce staging directory
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Radoop temporary directory
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: MapReduce
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: HDFS
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Java version
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Fetch dynamic settings
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Hive connection
[Dec 18, 2017 10:01:32 PM]: Total time: 0.732s
[Dec 18, 2017 10:01:32 PM]: java.lang.IllegalArgumentException: Required AM memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please increase the value of 'yarn.scheduler.maximum-allocation-mb'.
at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:311)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:164)
at eu.radoop.datahandler.mapreducehdfs.YarnHandlerLowLevel.runSpark_invoke(YarnHandlerLowLevel.java:813)
at eu.radoop.datahandler.mapreducehdfs.YarnHandlerLowLevel.runSpark_invoke(YarnHandlerLowLevel.java:510)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at eu.radoop.datahandler.mapreducehdfs.MRHDFSHandlerLowLevel$2.run(MRHDFSHandlerLowLevel.java:650)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at eu.radoop.datahandler.mapreducehdfs.MRHDFSHandlerLowLevel.invokeAs(MRHDFSHandlerLowLevel.java:646)
at sun.reflect.GeneratedMethodAccessor123.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.invokeAs(MapReduceHDFSHandler.java:1801)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.invokeAs(MapReduceHDFSHandler.java:1759)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.lambda$runSpark$26(MapReduceHDFSHandler.java:1021)
at eu.radoop.tools.ExceptionTools.checkOnly(ExceptionTools.java:474)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.runSpark(MapReduceHDFSHandler.java:1016)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.runSpark(MapReduceHDFSHandler.java:913)
at eu.radoop.connections.service.test.integration.TestSpark.runTestSparkJob(TestSpark.java:331)
at eu.radoop.connections.service.test.integration.TestSpark.runJobWithVersion(TestSpark.java:218)
at eu.radoop.connections.service.test.integration.TestSpark.call(TestSpark.java:109)
at eu.radoop.connections.service.test.integration.TestSpark.call(TestSpark.java:52)
at eu.radoop.connections.service.test.RadoopTestContext.lambda$runTest$0(RadoopTestContext.java:255)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

[Dec 18, 2017 10:01:32 PM] SEVERE: java.lang.IllegalArgumentException: Required AM memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please increase the value of 'yarn.scheduler.maximum-allocation-mb'.
[Dec 18, 2017 10:01:32 PM] SEVERE: The Spark test failed. Please verify your Hadoop and Spark version and check if your assembly jar location is correct. If the job failed, check the logs on the ResourceManager web interface at http://master.c.strange-mason-188717.internal:8088.
[Dec 18, 2017 10:01:32 PM] SEVERE: Test failed: Spark job
[Dec 18, 2017 10:01:32 PM] SEVERE: Integration test for 'cluster (master)' failed.
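The exception at the end points at YARN's resource check rather than the jar path itself: the Spark driver runs as a YARN Application Master, and Spark 2.x requests the AM memory plus an off-heap overhead of max(384 MB, 10% of it), so a 1024 MB AM needs a 1408 MB container, which is above the cluster's 1024 MB yarn.scheduler.maximum-allocation-mb. A tiny sketch of that arithmetic (the 384 MB floor and the 10% factor are the Spark 2.x defaults):

// Sketch of the request in the log: Spark 2.x pads the AM/driver memory with
// max(384 MB, 10% of it) for off-heap overhead, and YARN rejects any container
// larger than yarn.scheduler.maximum-allocation-mb.
public class AmMemoryCheck {
    static final int OVERHEAD_MIN_MB = 384;     // Spark 2.x default floor
    static final double OVERHEAD_FACTOR = 0.10; // Spark 2.x default factor

    public static void main(String[] args) {
        int amMemoryMb = 1024;      // AM memory from the log above
        int maxAllocationMb = 1024; // yarn.scheduler.maximum-allocation-mb
        int overheadMb = Math.max(OVERHEAD_MIN_MB, (int) (amMemoryMb * OVERHEAD_FACTOR));
        int requestedMb = amMemoryMb + overheadMb; // 1024 + 384 = 1408 MB
        System.out.println("requested " + requestedMb + " MB against a "
                + maxAllocationMb + " MB cap: "
                + (requestedMb <= maxAllocationMb ? "fits" : "rejected"));
    }
}

So either the cap has to go up, or the AM request has to come down.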

User: "amori"
New Altair Community Member
OP
Accepted Answer

Hi all,
I got it working! I had one slave node with 2 vCPUs and 7.5 GB memory. In Cloudera Manager -> YARN -> Configuration, I set Container Memory (yarn.nodemanager.resource.memory-mb) to 7 GiB and Container Virtual CPU Cores (yarn.nodemanager.resource.cpu-vcores) to 2.

Also, I had to copy the jar files to the slave node, which I had missed doing before.
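For anyone who wants to verify what the cluster actually enforces after a change like this, the effective limits can be read from the client-side configuration (a minimal sketch, assuming the Hadoop/YARN client jars and the cluster's yarn-site.xml are on the classpath; it only reports what that configuration says):

// Sketch: print the YARN memory limits visible to a client. The scheduler
// cap is what the failed Spark test was compared against; the NodeManager
// value is the per-node total raised in Cloudera Manager.
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class YarnLimits {
    public static void main(String[] args) {
        YarnConfiguration conf = new YarnConfiguration();
        System.out.println(YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB + " = "
                + conf.getInt(YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB,
                        YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_MB));
        System.out.println(YarnConfiguration.NM_PMEM_MB + " = "
                + conf.getInt(YarnConfiguration.NM_PMEM_MB,
                        YarnConfiguration.DEFAULT_NM_PMEM_MB));
    }
}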

The result:

[Dec 20, 2017 9:39:31 PM]: Integration test for 'cluster3' completed successfully.

Thank you, Peter, for helping out; your replies on the other posts were a tremendous guide.