Configuring a Spark (Java API) Runtime Environment in IntelliJ IDEA
By 阿新 · Published 2018-05-07
1. Create a New Maven Project

After the Maven project is created, its initial configuration (pom.xml) is as follows:
2. Configure Maven

Add the Spark Core dependency to the project:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>net.libaoquan</groupId>
    <artifactId>TestSpark</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <!-- Spark dependency -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.2.1</version>
        </dependency>
    </dependencies>
</project>
3. Create a Java Class

Create a new Java class and write the Spark (Java API) code:
import org.apache.spark.api.java.*;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;

public class TestSparkJava {
    public static void main(String[] args) {
        String logFile = "D:\\ab.txt";

        // Run Spark locally (single JVM), no cluster required
        SparkConf conf = new SparkConf().setMaster("local").setAppName("TestSpark");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Load the text file as an RDD of lines and cache it,
        // since it is scanned twice below
        JavaRDD<String> logData = sc.textFile(logFile).cache();

        // Count lines containing "0"
        long numAs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("0"); }
        }).count();

        // Count lines containing "1"
        long numBs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("1"); }
        }).count();

        System.out.println("Lines with 0: " + numAs + ", lines with 1: " + numBs);
        sc.stop();
    }
}
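The filter-then-count pattern above can be tried without a Spark runtime: plain Java streams express the same logic. The sketch below uses a hypothetical in-memory list of lines standing in for the contents of D:\ab.txt (the sample data is an assumption, not from the original article):

```java
import java.util.List;

public class FilterCountDemo {
    public static void main(String[] args) {
        // Hypothetical sample lines standing in for the file D:\ab.txt
        List<String> lines = List.of("012", "abc", "101", "000");

        // Same predicate logic as the Spark filters above
        long numAs = lines.stream().filter(s -> s.contains("0")).count();
        long numBs = lines.stream().filter(s -> s.contains("1")).count();

        System.out.println("Lines with 0: " + numAs + ", lines with 1: " + numBs);
    }
}
```

Since Spark 2.x works with Java 8, the anonymous `Function` classes in the Spark program can likewise be replaced with lambdas, e.g. `logData.filter(s -> s.contains("0")).count()`.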
Run the project; the result is as follows: