I am running Spark in Standalone mode, and I am finding that when I configure ports (e.g. spark.blockManager.port) in both the Spark Master's spark-defaults.conf and the Spark Worker's, the Master's value is the one used by all workers. Judging by the code, this appears to be by design. If executors are small (so many of them run per machine), the 16 port-bind attempts (the default spark.port.maxRetries) are quickly exhausted and executors fail to start. This is further exacerbated in my particular setup, where multiple Spark Workers run on the same machine.
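To make the conflict concrete, here is a minimal sketch of the configuration involved; the port values are illustrative only:

```
# On the Master host: conf/spark-defaults.conf
spark.blockManager.port   31000

# On a Worker host: conf/spark-defaults.conf
spark.blockManager.port   32000

# Observed behavior: executors launched by this Worker still bind in the
# 31000 range inherited from the Master, retrying successive ports up to
# spark.port.maxRetries (default 16), rather than using the Worker's 32000.
```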
What are the community's thoughts on changing this behavior so that:

1. The port push-down only happens if the Spark Worker's port configuration is not set. This won't fully solve the problem, but it would mitigate it and seems to make sense from a user-experience point of view.
2. Similarly, I'd like to prevent environment-variable push-down as well. Alternatively, instead of 1., a configurable switch to turn off push-down of port configuration, and a separate one to turn off environment-variable push-down, would also work.
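For illustration, the switches in option 2 could look something like the following. These property names are hypothetical, invented for this sketch; they are not existing Spark configs:

```
# Hypothetical switches (names invented for illustration only):
spark.worker.inheritMasterPortConf=false   # don't push the Master's port configs to Workers
spark.worker.inheritMasterEnvVars=false    # don't push the Master's environment variables to Workers
```

With both defaulting to true, existing deployments would keep the current behavior unless they opt out.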