Parse xmlrdd with pyspark

Anshul Sachdeva
Hello Team,

I am trying to parse XML with the spark-xml library. I read the XML from a web service into a variable using the Python requests module, and I need to parse it before storing it in a target table.

I would like to do this without first saving the XML to a file and then loading it.

In Java, I have used the XmlReader class to parse an XML RDD.

But I am new to Python and am not sure how to do the same in PySpark.

Any pointers would be appreciated.
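
For illustration, here is a minimal sketch of the in-memory requirement described above: the XML is held in a Python string and parsed without being written to disk. It uses Python's standard xml.etree.ElementTree on the driver, not spark-xml, so it is a different technique from the Java XmlReader route and only suits payloads that fit in driver memory. The sample payload, the `book` row tag, and the `xml_to_rows` helper are all hypothetical. In practice the string would come from something like `requests.get(url).text`.

```python
import xml.etree.ElementTree as ET


def xml_to_rows(xml_text, row_tag):
    """Turn each <row_tag> element into a dict of its direct child tags and text.

    Hypothetical helper: assumes flat rows (no nested elements or attributes).
    """
    root = ET.fromstring(xml_text)  # parses the string directly, no file involved
    return [
        {child.tag: child.text for child in elem}
        for elem in root.iter(row_tag)
    ]


# Hypothetical payload standing in for the web service response body.
sample_xml = (
    "<books>"
    "<book><id>1</id><title>Spark</title></book>"
    "<book><id>2</id><title>XML</title></book>"
    "</books>"
)

rows = xml_to_rows(sample_xml, "book")
print(rows)

# With a running SparkSession (assumed, not created here), the rows could then
# be turned into a DataFrame and written to the target table:
#   from pyspark.sql import Row
#   df = spark.createDataFrame([Row(**r) for r in rows])
```

All values come back as strings, so any casting to numeric or date types would still need to happen before the write.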