本文介紹了紗線容器內存不足的處理方法,對大家解決問題具有一定的參考價值,需要的朋友們下面隨著小編來一起學習吧!
問題描述
我的紗線容器內存不足:
此特定容器運行一個Apache-Spark驅動程序節點。
我不理解的部分:我將驅動程序的堆大小限制為512MB(您可以在下面的錯誤消息中看到這一點)。但是紗線容器抱怨內存>1 GB(也請參見下面的消息)。您可以驗證YAIN正在啟動Java是否與Xmx512M一起運行。我的容器設置為1 GB內存,增量為0.5 GB。此外,我托管紗線容器的物理機器每臺都有32 GB。我通過SSH連接到其中一臺物理機,發現它有很多可用內存…
另一件奇怪的事情是,Java沒有拋出OutOfMemory異常。當我查看驅動程序日志時,我發現它最終從紗線中獲得了SIGTERM,并很好地關閉了。如果Yarn內部的Java進程超過了512MB,我難道不應該在Java嘗試從Yarn分配1 GB之前得到一個OutOfMemory異常嗎?
我還嘗試了使用1024M的堆運行。那一次,容器崩潰了,使用量為1.5 GB。這是始終如一的。因此,很明顯,容器有能力在1 GB限制之外再分配0.5 GB。(非常符合邏輯,因為物理機有30 GB的可用內存)
除Java外,紗線容器內是否還有其他東西可能會占用額外的512MB?
我在Yarn上運行CDH 5.4.1和ApacheSpark。集群上的Java版本也升級到了oracleJava 8。我看到一些人聲稱Java 8中的默認MaxPermSize已經被更改,但我幾乎不相信它可能會占用512MB…
紗線錯誤信息:
Diagnostics: Container [pid=23335,containerID=container_1453125563779_0160_02_000001] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 2.6 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1453125563779_0160_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 23335 23333 23335 23335 (bash) 1 0 11767808 432 /bin/bash -c LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native::/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native /usr/lib/jvm/java-8-oracle/bin/java -server -Xmx512m -Djava.io.tmpdir=/var/yarn/nm/usercache/hdfs/appcache/application_1453125563779_0160/container_1453125563779_0160_02_000001/tmp '-Dspark.eventLog.enabled=true' '-Dspark.executor.memory=512m' '-Dspark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar' '-Dspark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native' '-Dspark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native' '-Dspark.shuffle.service.enabled=true' '-Dspark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/assembly/lib/spark-assembly-1.3.0-cdh5.4.1-hadoop2.6.0-cdh5.4.1.jar' '-Dspark.app.name=not_telling-1453479057517' '-Dspark.shuffle.service.port=7337' '-Dspark.driver.extraClassPath=/etc/hbase/conf:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar' '-Dspark.serializer=org.apache.spark.serializer.KryoSerializer' '-Dspark.yarn.historyServer.address=http://XXXX-cdh-dev-cdh-node2:18088' '-Dspark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native' '-Dspark.eventLog.dir=hdfs://XXXX-cdh-dev-cdh-node1:8020/user/spark/applicationHistory' '-Dspark.master=yarn-cluster' -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1453125563779_0160/container_1453125563779_0160_02_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class 'not_telling' --jar file:/home/cloud-user/temp/not_telling.jar --arg '--conf' --arg 'spark.executor.extraClasspath=/opt/cloudera/parcels/CDH/jars/htrace-core-3.0.4.jar' --executor-memory 512m --executor-cores 4 --num-executors 10 1> /var/log/hadoop-yarn/container/application_1453125563779_0160/container_1453125563779_0160_02_000001/stdout 2> /var/log/hadoop-yarn/container/application_1453125563779_0160/container_1453125563779_0160_02_000001/stderr
|- 23338 23335 23335 23335 (java) 95290 10928 2786668544 261830 /usr/lib/jvm/java-8-oracle/bin/java -server -Xmx512m -Djava.io.tmpdir=/var/yarn/nm/usercache/hdfs/appcache/application_1453125563779_0160/container_1453125563779_0160_02_000001/tmp -Dspark.eventLog.enabled=true -Dspark.executor.memory=512m -Dspark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar -Dspark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native -Dspark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native -Dspark.shuffle.service.enabled=true -Dspark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/assembly/lib/spark-assembly-1.3.0-cdh5.4.1-hadoop2.6.0-cdh5.4.1.jar -Dspark.app.name=not_tellin-1453479057517 -Dspark.shuffle.service.port=7337 -Dspark.driver.extraClassPath=/etc/hbase/conf:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar -Dspark.serializer=org.apache.spark.serializer.KryoSerializer -Dspark.yarn.historyServer.address=http://XXXX-cdh-dev-cdh-node2:18088 -Dspark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native -Dspark.eventLog.dir=hdfs://XXXX-cdh-dev-cdh-node1:8020/user/spark/applicationHistory -Dspark.master=yarn-cluster -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1453125563779_0160/container_1453125563779_0160_02_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class not_telling --jar file:not_telling.jar --arg --conf --arg spark.executor.extraClasspath=/opt/cloudera/parcels/CDH/jars/htrace-core-3.0.4.jar --executor-memory 512m --executor-cores 4 --num-executors 10
推薦答案
查看this article,它有一個很好的描述。您可能想要注意他們在哪里說”在計算執行器的內存時,要注意最大(7%,384M)堆外內存開銷。”
編輯(Eshalev):我接受這個答案,并詳細說明發現了什么。Java8使用了不同的內存方案。具體地說,CompressedClass在”Metspace”中保留了1024MB。這比以前版本的Java在”perm-gen”內存中分配的內存大得多。您可以使用”jmap-heap[id]”來檢查這一點。我們目前通過過度分配超過堆需求的1024MB來防止應用程序崩潰。這很浪費,但它可以防止應用程序崩潰。
這篇關于紗線容器內存不足的文章就介紹到這了,希望我們推薦的答案對大家有所幫助,