Fwd: Handling Skewness and Heterogeneity

Anis Nasir
Dear all,

Could you please comment on the use case described below?

Thank you in advance.


Dear All,

I have a few use cases for Spark Streaming where the Spark cluster consists of heterogeneous machines.

Additionally, there is skew in both the input distribution (e.g., each tuple is drawn from a Zipf distribution) and the service time (e.g., the service time required for each tuple is drawn from a Zipf distribution).
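
To make the workload more concrete, here is a rough sketch of the kind of imbalance I have in mind. It is only an illustration under stated assumptions: the function name simulate_load, the key-modulo partitioning, the Zipf exponent of 1.0, and the relative worker speeds are all hypothetical choices of mine, not a description of how Spark actually assigns work.

```python
import random
from collections import Counter


def simulate_load(num_tuples=100_000, num_keys=1_000, num_workers=4,
                  exponent=1.0, seed=42):
    """Count how many tuples each worker receives under key % num_workers."""
    rng = random.Random(seed)
    ranks = list(range(1, num_keys + 1))
    # Zipf-like weights: the key with rank r has weight 1 / r**exponent.
    weights = [1.0 / r ** exponent for r in ranks]
    keys = rng.choices(ranks, weights=weights, k=num_tuples)
    # Hash-style partitioning (assumption): worker index = key modulo workers.
    counts = Counter(key % num_workers for key in keys)
    return [counts[w] for w in range(num_workers)]


loads = simulate_load()

# Hypothetical relative speeds for a heterogeneous cluster (1.0 = fastest).
speeds = [1.0, 1.0, 0.5, 0.5]

# Relative busy time per worker: more tuples or a slower machine means more time.
busy = [load / speed for load, speed in zip(loads, speeds)]

print("tuples per worker:", loads)
print("relative busy time:", busy)
```

Even with identical machines, the worker that owns the most frequent key ends up with noticeably more than its fair share of tuples, and slower machines make the gap in busy time larger. Skewed service times would add a further source of imbalance that this sketch does not model.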

I would like to know how Spark handles such use cases.

Any help will be highly appreciated!