Running Spark Application on YARN Cluster

Spark can run on several cluster managers: its own standalone manager, Apache Mesos, or Hadoop YARN.
In this instructional blog post, we will run Spark on YARN. We will develop a Spark application and run it using the YARN cluster manager.
  • Refer to the following Spark-Java word count program.
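The original listing did not survive extraction of this post, so here is a minimal sketch of a Spark 1.x Java word count (class name `WordCount` as used below; the input and output paths in `args` are placeholders). It needs the Spark assembly JAR on the classpath to compile and run:

```java
// WordCount.java — minimal Spark 1.x word count sketch.
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class WordCount {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("WordCount");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // args[0] is the input path, args[1] the output path (placeholders)
        JavaRDD<String> lines = sc.textFile(args[0]);

        // Split each line into words (Spark 1.x flatMap returns an Iterable)
        JavaRDD<String> words =
            lines.flatMap(line -> Arrays.asList(line.split(" ")));

        // Pair each word with 1, then sum the counts per word
        JavaPairRDD<String, Integer> counts =
            words.mapToPair(w -> new Tuple2<>(w, 1))
                 .reduceByKey((a, b) -> a + b);

        counts.saveAsTextFile(args[1]);
        sc.stop();
    }
}
```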
  • Next, we will build a JAR file for this program by following the steps below.
    • Create a Java project in Eclipse, then create a class named WordCount and paste the whole program into it.
    • Right-click the project —> Build Path —> Configure Build Path.
    • Under Libraries, choose Add External JARs, then open the Spark folder —> lib —> and add spark-assembly-1.5.1.jar.
    • After adding the JAR file, the compile errors will be cleared.
  • Now we need to package the project as a JAR file to run it on the cluster.
    • Building a JAR for Spark is a little different from Hadoop: you need to install Maven and build the JAR file with it.
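The post does not include its build file, so here is a minimal sketch of a `pom.xml` for the project (the `com.example` coordinates are hypothetical; the Spark version matches the spark-assembly-1.5.1.jar used above):

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <!-- Hypothetical coordinates for this word count project -->
  <groupId>com.example</groupId>
  <artifactId>spark-wordcount</artifactId>
  <version>1.0</version>
  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.5.1</version>
      <!-- Provided by the cluster at runtime, so it stays out of the JAR -->
      <scope>provided</scope>
    </dependency>
  </dependencies>
</project>
```

With this in place, `mvn package` produces the JAR under `target/`.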
  • Follow the steps below to install Maven on your system.
    • Open a terminal and run the following command:
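The command itself did not survive extraction of this post; on a Debian/Ubuntu system it would typically be (an assumption — use your distribution's package manager):

```shell
# Refresh the package index, then install Maven from the repositories
sudo apt-get update
sudo apt-get install -y maven
```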

      Downloading and installing the packages will take some time to complete.
    • We can verify that Maven is installed, and check its version, with the following commands:
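The exact commands were lost from the post; the standard checks would be:

```shell
# Confirm the mvn binary is on the PATH
which mvn

# Print the installed Maven version (also shows the Java version and Maven home)
mvn -version
```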