Improvement for memory config.

jinxing
1. Executor memory is configured through two settings: spark.executor.memory for the heap size and spark.memory.offHeap.size for the off-heap size. Together, these two make up the total memory consumption of each executor process.
What users care about is the total memory consumption, regardless of whether it is on-heap or off-heap, so a single memory config would be friendlier for them.
Can we merge the two configs into one and hide the complexity inside the system?
2. spark.memory.offHeap.size was originally designed for MemoryManager, to manage the off-heap memory that Spark itself allocates explicitly when creating its own buffers / pages or caching blocks. It does not account for off-heap memory used by lower-level code or third-party libraries such as Netty. As a result, spark.memory.offHeap.size and spark.memory.offHeap.enabled are rather confusing. Users sometimes ask: "I've already set spark.memory.offHeap.enabled to false, so why is Netty reading remote blocks into off-heap memory?" I also think we need to document spark.memory.offHeap.size and spark.memory.offHeap.enabled in more detail on http://spark.apache.org/docs/latest/configuration.html
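To make point 1 concrete, here is a rough Python sketch (not Spark code) of how the two existing settings add up today, and how a single merged value might be split back into them. The helper names parse_size, total_executor_memory and split_total, and the off_heap_fraction parameter, are hypothetical and exist only for illustration. The size parser is simplified and treats a bare number as bytes, which may not match Spark's own defaulting for every key. The sum covers only memory managed through these settings, so it leaves out Netty's direct buffers and other overhead mentioned in point 2. The key is spelled spark.memory.offHeap.enabled, as in Spark's configuration page.

```python
import re

MIB = 1024 ** 2
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3, "t": 1024 ** 4}


def parse_size(text: str) -> int:
    """Parse a size such as '4g' or '512m' into bytes (simplified parser)."""
    match = re.fullmatch(r"(\d+)([kmgt])?b?", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    # Assumption of this sketch: a bare number means bytes.
    return int(number) * _UNITS.get(unit, 1)


def total_executor_memory(conf: dict) -> int:
    """Heap plus Spark-managed off-heap memory for one executor, in bytes."""
    heap = parse_size(conf.get("spark.executor.memory", "1g"))
    enabled = conf.get("spark.memory.offHeap.enabled", "false").lower() == "true"
    # The off-heap size only counts when off-heap use is switched on.
    off_heap = parse_size(conf.get("spark.memory.offHeap.size", "0")) if enabled else 0
    return heap + off_heap


def split_total(total: str, off_heap_fraction: float) -> dict:
    """Hypothetical: derive the two existing settings from one total value."""
    if not 0.0 <= off_heap_fraction < 1.0:
        raise ValueError("off_heap_fraction must be in [0, 1)")
    total_mb = parse_size(total) // MIB
    off_heap_mb = int(total_mb * off_heap_fraction)
    heap_mb = total_mb - off_heap_mb
    return {
        "spark.executor.memory": f"{heap_mb}m",
        "spark.memory.offHeap.enabled": "true" if off_heap_mb > 0 else "false",
        "spark.memory.offHeap.size": f"{off_heap_mb}m",
    }
```

For example, split_total("6g", 0.25) yields a 4608m heap and a 1536m off-heap region, and feeding that result back into total_executor_memory gives the original 6 GiB. How the fraction should be chosen is exactly the complexity the proposal would hide from the user.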


 
