Spark-Submit in Yarn-Client Mode Hangs: A Common PySpark 3.4.1 Issue and Its Solution
Understanding the Problem
You've submitted a PySpark job using spark-submit in yarn-client mode, and while the job completes successfully, the process hangs indefinitely, preventing you from interacting with the terminal. This behavior can be frustrating, especially when you need to quickly move on to other tasks. This article delves into the root cause of this issue and provides a simple solution.
The Scenario and Original Code
Let's imagine you have a simple PySpark script named my_script.py:
from pyspark.sql import SparkSession

# In yarn-client mode, this driver code runs on your local machine.
spark = SparkSession.builder.appName("MySparkJob").getOrCreate()

# Your Spark logic here
# ...

# Gracefully shut down the SparkSession and release cluster resources.
spark.stop()
You submit this script to your Yarn cluster using:
spark-submit --master yarn --deploy-mode client my_script.py
The job runs and completes successfully, yet spark-submit never returns, and your shell stays stuck waiting on it.
The Root Cause: Client Mode and Driver Execution
In yarn-client mode, Spark runs the driver (your main program) on the client node, the machine from which you ran spark-submit, while the executors are launched inside the Yarn cluster. Calling spark.stop() gracefully shuts down the SparkSession and the executors, but the driver process on the client node can keep running afterwards. A JVM exits only once all of its non-daemon threads have finished, so a lingering non-daemon thread left behind by the application or one of its libraries can keep the driver, and with it your terminal, hung even though the Spark job itself is done.
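If you want to see what is keeping the driver alive, one quick diagnostic is to list the threads that are still running after spark.stop() returns. The sketch below uses only Python's standard threading module; the thread names you will see depend on your environment, so treat it as an illustrative probe rather than a definitive test.

import threading
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ThreadProbe").getOrCreate()
spark.stop()

# List the Python threads that survived spark.stop(). Non-daemon
# threads block a normal interpreter exit, and their names often hint
# at which component is keeping the process alive.
for t in threading.enumerate():
    print(f"name={t.name!r} daemon={t.daemon} alive={t.is_alive()}")

Note that this shows only the Python half of the driver; the JVM half has its own threads, which you can dump with the standard jstack tool if you need the complete picture.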
The Solution: Force Termination
The quickest way to resolve the hang is to forcibly terminate the driver process by pressing Ctrl+C in your terminal. If the process ignores the interrupt, you can kill it from another terminal with kill <pid> (or kill -9 <pid> as a last resort). Either way, the job itself has already completed; you are only cleaning up the leftover driver.
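Ctrl+C works at an interactive shell, but not when spark-submit is launched from automation. In that case, one workaround, sketched below under the assumption that you can put a generous upper bound on the job's runtime, is to impose a hard deadline so a hung driver cannot block the pipeline forever. The one-hour timeout is a placeholder; size it to your job.

import subprocess

# Run spark-submit with a hard deadline. If the driver hangs past the
# timeout, subprocess.run() kills the child process and raises
# TimeoutExpired, so control always returns to the caller.
try:
    subprocess.run(
        ["spark-submit", "--master", "yarn",
         "--deploy-mode", "client", "my_script.py"],
        check=True,
        timeout=3600,  # placeholder deadline; make it generous
    )
except subprocess.TimeoutExpired:
    print("spark-submit exceeded the deadline and was killed")

Be generous with the deadline: the timeout cannot tell a hung driver from a job that is legitimately running long.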
Additional Considerations:
- Log Analysis: Inspect the application logs to rule out errors or warnings that could be contributing to the hang. On Yarn, executor and application-master logs are retrieved with yarn logs -applicationId <app_id> rather than from a local directory; in client mode, the driver's own output appears directly in your terminal.
- Alternative Deploy Mode: yarn-cluster: Consider switching to cluster mode (--deploy-mode cluster) if you prefer the driver to run inside the Yarn cluster. The client then only submits the application, so a lingering driver can no longer tie up your terminal; setting spark.yarn.submit.waitAppCompletion to false even lets spark-submit return as soon as the application is accepted.
- Forced Exit After spark.stop(): If you hit this hang frequently, a common workaround is to end the driver process explicitly once spark.stop() returns, as shown in the sketch after this list. Note that Spark has no built-in "terminate driver on success" setting, so the fix lives in your script rather than in spark-defaults.conf.
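Here is a minimal sketch of that forced-exit workaround. It assumes the hang comes from lingering non-daemon threads; sys.exit() raises SystemExit and can still be held up by them, so os._exit() is used as the blunt instrument. Because os._exit() skips the interpreter's normal cleanup, flush anything you care about first.

import os
import sys
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("MySparkJob").getOrCreate()

# Your Spark logic here
# ...

spark.stop()

# os._exit() bypasses atexit handlers and buffer flushing, so flush
# explicitly before the hard exit.
sys.stdout.flush()
sys.stderr.flush()

# Hard-exit the driver. Unlike sys.exit(), os._exit() does not wait
# for non-daemon threads, so nothing can keep the process alive.
os._exit(0)

Reserve this for scripts where the hang is a known nuisance; in a healthy job, spark.stop() followed by a normal interpreter exit is all you need.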
Conclusion
While seemingly perplexing, the spark-submit hang in yarn-client mode is a common quirk stemming from how Spark runs the driver process on the client node. By understanding the root cause and forcibly terminating the leftover driver when needed, you can efficiently manage your Spark jobs and avoid unnecessary delays.
Remember to explore the yarn-cluster deployment mode for potentially smoother execution, and consider fine-tuning your Spark configuration for optimal performance.