GraphFrames 0.5.0 - critical bug fix + other improvements
Hi Spark community,
I'd like to announce a new release of GraphFrames, a Spark Package for DataFrame-based graphs!
We strongly encourage all users to upgrade to this release because of the bug fix described below.
Critical bug fix
This release fixes a bug in indexing vertices. This may have affected your results if:
* your graph uses non-Integer IDs and
* you use ConnectedComponents or another algorithm that is a wrapper around GraphX.
The bug occurs when the input DataFrame is non-deterministic. For example, in previous releases, running an algorithm on a DataFrame just loaded from disk should be fine, but running the same algorithm on a DataFrame produced by shuffles, unions, or other operators can give incorrect results. This issue is fixed in this release.
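To make the failure mode concrete, here is a toy model in plain Python. It is not GraphFrames code. It assumes that vertex indexing assigns integer IDs by row position and that the input is evaluated twice, with the rows arriving in a different order the second time. The function names and the sample data are hypothetical.

```python
def index_rows(rows):
    """Assign each vertex ID an integer index by its position in `rows`."""
    return {vid: i for i, vid in enumerate(rows)}


def translate_edges(edges, edge_index, vertex_index):
    """Encode edges with `edge_index`, then decode them with `vertex_index`.

    When both indexes come from the same evaluation, the edges round-trip
    unchanged. When they come from two evaluations that disagree on row
    order, the edges end up attached to the wrong vertices.
    """
    back = {i: vid for vid, i in vertex_index.items()}
    return [(back[edge_index[src]], back[edge_index[dst]]) for src, dst in edges]


vertices = ["a", "b", "c", "d"]      # hypothetical non-integer vertex IDs
edges = [("a", "b"), ("c", "d")]

# Deterministic input: both evaluations see the same row order.
stable = index_rows(vertices)
consistent = translate_edges(edges, stable, stable)

# Non-deterministic input: a second evaluation sees the rows reordered.
reordered = index_rows(["b", "c", "d", "a"])
inconsistent = translate_edges(edges, reordered, stable)

print(consistent)
print(inconsistent)
```

In the second case the algorithm would run on a graph whose edges no longer match the original vertices, which is how a component assignment could come out wrong without any error being raised.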
Other improvements
* Python API for aggregateMessages for building custom graph algorithms
* Scala API for parallel personalized PageRank, wrapping the GraphX implementation. This is only available when using GraphFrames with Spark 2.1+.
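As a rough sketch of the Python aggregateMessages API mentioned above: the helper below sums, for each vertex, a numeric attribute of its neighbors. The helper name and the "age" vertex column are assumptions made for illustration; `g` is expected to be an existing GraphFrame. The imports are placed inside the function only so the sketch can be defined on a machine without Spark installed.

```python
def sum_neighbor_ages(g):
    """Return a DataFrame with, per vertex, the summed "age" of its neighbors.

    Assumes `g` is a GraphFrame whose vertices have a numeric "age" column
    (a hypothetical column chosen for this example).
    """
    # Deferred imports: only needed when the function is actually called.
    from pyspark.sql.functions import sum as sql_sum
    from graphframes.lib import AggregateMessages as AM

    return g.aggregateMessages(
        sql_sum(AM.msg).alias("summedAges"),  # how incoming messages are combined
        sendToSrc=AM.dst["age"],              # each edge sends dst's age to its src
        sendToDst=AM.src["age"],              # and src's age to its dst
    )
```

Calling `sum_neighbor_ages(g).show()` on a suitable graph would print one row per vertex that received at least one message.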
This release supports Spark 1.6, 2.0, and 2.1.
Special thanks to Felix Cheung for his work as a new committer for GraphFrames!