Understanding the differences between native memory and executor memory in Spark with YARN

Charlie Mueller
May 15, 2020

Recently, I submitted some PySpark ETL jobs on our data science EMR cluster, and not long after submission I encountered a strange error:

OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000005b7027000, 1234763776, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 1234763776 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/hadoop/ce_studies/hs_err_pid28194.log

I was puzzled, since I had reserved plenty of memory (15G) on the driver as well as on the executors:

nohup spark-submit \
  --master yarn \
  --deploy-mode client \
  --num-executors 10 \
  --executor-cores 6 \
  --executor-memory 15G \
  --driver-memory 15G \
  --conf "spark.dynamicAllocation.enabled=false" \
  --conf "spark.yarn.queue=root.test.B" \
  --conf "spark.executor.memoryOverhead=2048" \
  --conf "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2" \
  cev2_processor.py --start_date 2019-01-01 --end_date 2019-01-02 \
  > cev2_processor.log 2>&1
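
For reference, here is a quick back-of-the-envelope tally of what those flags actually ask for (my own arithmetic, not output from Spark or YARN):

# Back-of-the-envelope tally of the memory requested by the spark-submit above.
driver_heap_gb = 15                 # --driver-memory 15G (driver JVM heap)
executor_heap_gb = 15               # --executor-memory 15G (JVM heap, per executor)
executor_overhead_gb = 2048 / 1024  # spark.executor.memoryOverhead=2048 (MB, per executor)
num_executors = 10                  # --num-executors 10

per_executor_container_gb = executor_heap_gb + executor_overhead_gb
total_executor_gb = per_executor_container_gb * num_executors

print(f"driver JVM heap requested:       {driver_heap_gb} GB")
print(f"per-executor container request:  {per_executor_container_gb:.0f} GB")   # 17 GB
print(f"across {num_executors} executors:              {total_executor_gb:.0f} GB")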

I couldn’t understand why the JRE was unable to reserve just ~1.2GB of memory (the amount from the initial error message) on the driver when I had reserved 15GB in my spark-submit! Additionally, I knew I had 16GB of physical memory on the driver as well as on each task node. What I had failed to understand was how memory is allocated on the driver and executors, and which pieces of that memory my submit arguments actually control.

The error message did say something about ‘Native Memory’. But what is it? Native memory, sometimes referred to as off-heap memory, is memory within a process’s address space but outside the heap; this includes the memoryOverhead region in the YARN container diagram below. Heap memory is the memory inside the JVM process, and it is what the --driver-memory 15G argument in my submit command above controls.

To get a better picture of what’s happening with YARN + Spark, we need to think of the system in terms of YARN containers, running on each executor as well as the driver, like below:

Apache Spark on EMR with YARN containers/resource management [1]
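
To make the heap/native distinction a bit more concrete, here is a rough, illustrative picture of where a Spark process’s memory goes under YARN. The exact contents of each region vary by workload and Spark version, so treat this as a sketch rather than an exhaustive accounting:

# Rough map of where a Spark process's memory goes under YARN (illustrative sketch only).
process_memory = {
    "heap (inside the JVM, sized by --driver-memory / --executor-memory)": [
        "Spark execution and storage memory",
        "objects created by your job",
    ],
    "native / off-heap (outside the JVM heap, budgeted by memoryOverhead)": [
        "JVM internals such as thread stacks, metaspace, and GC structures",
        "direct (NIO) byte buffers",
        "PySpark worker processes running alongside the JVM",
    ],
}
for region, contents in process_memory.items():
    print(region)
    for item in contents:
        print(f"  - {item}")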

Going deeper still, we can break down the YARN container:

YARN container memory allocation with Apache Spark
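
One detail worth calling out from this breakdown: even if you never set memoryOverhead explicitly, Spark still budgets a native slice on top of the heap when it sizes the container. My understanding (treat the exact formula as an assumption about your Spark version) is that the default is roughly 10% of the heap with a 384 MB floor:

# Approximate default for the overhead (native) slice of a container when
# spark.{driver,executor}.memoryOverhead is not set explicitly.
# Assumption: ~10% of the heap with a 384 MB floor.
def default_overhead_mb(heap_mb: int) -> int:
    return max(int(0.10 * heap_mb), 384)

heap_mb = 15 * 1024                      # a 15 GB heap, as in my submit command
print(default_overhead_mb(heap_mb))      # ~1536 MB of native memory on top of the heap
print(default_overhead_mb(1024))         # small heaps still get the 384 MB floor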

As you can see above, I was reserving 15GB of space for the JVM heap alone when there was only 16GB of RAM in total, which left almost no overhead (native) memory, especially since the OS and other system processes also take some of the memory outside the JVM. Shrinking my driver memory allocation down to 6GB left plenty of native memory for what the JRE needed, and the problem was solved!
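
The arithmetic from the error message backs this up. Here is a rough sanity check using only the numbers quoted above; it ignores whatever the OS and other processes were already using, so reality is even tighter than this:

# Sanity-check the failed allocation against the driver host's memory (rough sketch).
physical_gb = 16                          # RAM on the driver / each task node
failed_bytes = 1234763776                 # from the os::commit_memory error above
failed_gib = failed_bytes / 2**30         # ~1.15 GiB the JRE tried to mmap

for heap_gb in (15, 6):
    native_left_gb = physical_gb - heap_gb    # what's left outside the JVM heap
    ok = native_left_gb > failed_gib          # optimistic: ignores OS/system usage
    print(f"heap={heap_gb} GB -> ~{native_left_gb} GB native left, "
          f"mmap of {failed_gib:.2f} GiB {'fits' if ok else 'does not reliably fit'}")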
