When you’re trying out Spark with its Python REPL, it’s really easy to write things using simple functions or lambdas. However, it becomes a pain in the ass once you start trying more complex stuff, because it’s easy to slip up on things like indentation.
Try running pyspark with this command:
It will start an IPython Notebook in your browser, with the SparkContext available as the sc variable. You can start using it like this:
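As a rough illustration only, here is a minimal sketch of the kind of multi-line logic that is awkward in the plain REPL but comfortable in a notebook cell. The names `classify` and `label_numbers` are made up for this example, and it assumes `sc` is the SparkContext that the notebook session already provides.

```python
# Hypothetical helper names; only parallelize, map and collect are Spark API calls.

def classify(n):
    """Multi-line branching logic that is easy to mis-indent in a plain REPL."""
    if n % 15 == 0:
        return "fizzbuzz"
    if n % 3 == 0:
        return "fizz"
    if n % 5 == 0:
        return "buzz"
    return str(n)


def label_numbers(sc, upper):
    """Distribute 1..upper as an RDD, apply classify to each element, and collect the results."""
    # `sc` is assumed to be the SparkContext created for the notebook session.
    return sc.parallelize(range(1, upper + 1)).map(classify).collect()
```

In a notebook cell you would then call something like `label_numbers(sc, 15)`. Defining `classify` as a named function in its own cell means you can edit and re-run it without retyping the whole thing, which is most of the appeal over the bare REPL.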