When we release a new version with a big API change, things start to break seemingly at random for some users. For example, in version 0.44 we had a class DateUtils (used by class Utils) that was dropped in version 0.45. After version 0.45 was released (Spark shows it is correctly downloading it from Maven), some users running jobs that use the class Utils got:
NoClassDefFoundError for class DateUtils
To me this looks like a caching problem. Probably the ClassLoader on some node (the master or an executor) is still pointing to the v0.44 jar, so when it loads Utils it tries to find the DateUtils class, which no longer exists in the newer jar. I'm not sure how this can happen; it is only an intuition.
Does anyone have any idea how to solve this? It is also very hard to debug, since I couldn't find a pattern to reproduce it. It happens on every release that changes a class name, but not for everyone running the job (that's why caching looked like a good hint to me).
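One check that might help confirm or rule out the stale-jar idea is to ask the JVM which jar a class was actually loaded from, and whether a given class name still resolves. Below is a minimal sketch using only standard JDK calls. `JarLocator`, `locationOf`, `canLoad`, and the class name `com.example.DateUtils` are hypothetical names for illustration, not part of our library or of Spark.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical diagnostic helper; the names here are illustrative only.
final class JarLocator {
    private JarLocator() {}

    // Returns the jar or directory a class was loaded from,
    // or a placeholder when the JVM does not expose a code source
    // (for example, classes loaded by the bootstrap loader).
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null) {
            return "<bootstrap or unknown>";
        }
        URL loc = src.getLocation();
        return loc == null ? "<unknown>" : loc.toString();
    }

    // Returns true if the class name resolves through the given loader,
    // without running the class's static initializers.
    public static boolean canLoad(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        // In a real job this would be Utils.class instead of JarLocator.class.
        System.out.println("Loaded from: " + locationOf(JarLocator.class));
        // "com.example.DateUtils" is a placeholder for the real package name.
        System.out.println("DateUtils loadable: " + canLoad("com.example.DateUtils", loader));
    }
}
```

If the guess about caching is right, calling something like `locationOf(Utils.class)` on the driver and inside a task on each executor should show a 0.44 jar path on the affected nodes and a 0.45 path elsewhere. That is an assumption about how the problem would show up, not something I have verified.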