Spark English-Chinese Parallel Translation (PySpark Quick Start for Beginners, Chinese Edition): Chinese Guide and Tutorial (Python Version), 20161115
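
The program that the following paragraph describes could look roughly like the sketch below. It is an illustration under stated assumptions, not necessarily the exact listing from the original guide: the names `has_a`, `has_b`, and `main` are hypothetical, and the input path is taken from the command line rather than hard-coded, so that the `YOUR_SPARK_HOME` placeholder only has to be substituted when the job is launched.

```python
"""SimpleApp.py: a sketch of a line-counting Spark application."""
import sys


def has_a(line):
    """Return True if the line contains the letter 'a'."""
    return "a" in line


def has_b(line):
    """Return True if the line contains the letter 'b'."""
    return "b" in line


def main(log_file):
    # Imported lazily so the predicates above stay usable without Spark.
    from pyspark import SparkContext

    sc = SparkContext("local", "Simple App")
    try:
        # cache() keeps the RDD in memory because it is scanned twice.
        log_data = sc.textFile(log_file).cache()

        # Plain Python functions are passed to Spark, which serializes them.
        num_as = log_data.filter(has_a).count()
        num_bs = log_data.filter(has_b).count()

        print("Lines with a: %i, lines with b: %i" % (num_as, num_bs))
    finally:
        sc.stop()


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("usage: spark-submit SimpleApp.py <path-to-text-file>")
```

Under these assumptions the job could be launched with something like `spark-submit SimpleApp.py YOUR_SPARK_HOME/README.md`, with `YOUR_SPARK_HOME` replaced by the actual install location. Keeping the predicates as named module-level functions, rather than lambdas, makes them easy to check on ordinary strings before involving Spark at all.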

This program just counts the number of lines containing ‘a’ and the number containing ‘b’ in a text file. Note that you’ll need to replace YOUR_SPARK_HOME with the location where Spark is installed.

As with the Scala and Java examples, we use a SparkContext to create RDDs. We can pass Python functions to Spark, which are automatically serialized along with any variables that they reference. For applications that use custom classes or third-party libraries, we can also add code dependencies to