My team and I are attempting to run Spark Standalone on IPv6-first
infrastructure. This requires that all RPC listeners bind IPv6 sockets, e.g.
`:::7077` instead of `127.0.0.1:7077`. Initial experimentation has found
that Spark 2.4.4 doesn't currently handle this scenario. Various host/bind
addresses can be configured, e.g. `spark.driver.host`, `SPARK_LOCAL_IP`, etc.;
however, these must be set to host names, and Spark will fail to start
otherwise. To get *something* running, I have implemented some 'hacks'
(link to diff) to force the desired IPv6 bindings. In combination with using
FQDNs for advertised addresses, this does work.
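
To make the address notation above concrete: `:::7077` is the IPv6 wildcard
address `::` followed by port 7077, as tools such as netstat print it. The
following is a minimal Python sketch, not Spark code, that distinguishes IP
literals from host names and renders an endpoint in the bracketed
`[host]:port` form commonly used for IPv6 literals. The function names and the
example host name `spark-master.example.internal` are hypothetical.

```python
import ipaddress


def classify_host(host: str) -> str:
    """Return 'ipv4', 'ipv6', or 'hostname' for a configured host value."""
    try:
        return f"ipv{ipaddress.ip_address(host).version}"
    except ValueError:
        # Not parseable as an IP literal, so treat it as a host name / FQDN.
        return "hostname"


def format_endpoint(host: str, port: int) -> str:
    """Render host and port unambiguously, bracketing IPv6 literals."""
    if classify_host(host) == "ipv6":
        return f"[{host}]:{port}"
    return f"{host}:{port}"


if __name__ == "__main__":
    # Hypothetical values: wildcard IPv6, IPv4 loopback, and an FQDN.
    for host in ("::", "127.0.0.1", "spark-master.example.internal"):
        print(classify_host(host), format_endpoint(host, 7077))
```

The bracketed form avoids the ambiguity of `:::7077`, where the colons of the
address and the port separator run together.
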
However, patching Spark like this is clearly not a viable long-term solution,
and I'm looking for some guidance from those more experienced with the
codebase on how to proceed. Is there a better approach, one that does not
require modifying Spark itself, that I have simply missed? If not, is this
support something that would be desired, and how might it best be
implemented? I am prepared to contribute the work and upstream any changes
that we need to make for this.
Software Engineer at Unipart Digital