
Spark Streaming Real-Time Stream Processing Notes (7): Environment Setup

1 Configuring Hadoop

  1. hadoop-env.sh
export JAVA_HOME=/usr/apps/jdk1.8.0_181-amd64
  2. core-site.xml
<configuration>

<property>
<name>fs.defaultFS</name>
<value>hdfs://node1:8020</value>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/appsData/hdpData/tmp</value>
</property>

</configuration>
  3. hdfs-site.xml
<configuration>

<property>
<name>dfs.replication</name>
<value>1</value>
</property>

<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/appsData/hdpData/namedir</value>
</property>

<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/appsData/hdpData/datadir</value>
</property>

<property>
	<name>dfs.permissions</name>
	<value>false</value>
</property>

</configuration>
  4. slaves
node1

1.1 Formatting the NameNode

hdfs namenode -format


1.2 Starting HDFS

[hadoop@node1 hadoop]$ start-dfs.sh
18/12/06 10:37:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [node1]
node1: starting namenode, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-node1.out
node1: starting datanode, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-node1.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-node1.out
18/12/06 10:38:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@node1 hadoop]$ jps
1777 DataNode
1953 SecondaryNameNode
2071 Jps
1647 NameNode
[hadoop@node1 hadoop]$

http://node1:50070
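With HDFS running, a quick programmatic check can confirm that the fs.defaultFS address configured above is reachable. Below is a minimal Scala sketch (the HdfsSmokeTest object and file paths are my own, for illustration); it assumes the hadoop-client dependency from the Maven POM in section 4.

import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Minimal HDFS smoke test: write a file to hdfs://node1:8020 and list /tmp.
// Assumes the hadoop-client dependency from the POM in section 4.
object HdfsSmokeTest {
  def main(args: Array[String]): Unit = {
    val fs = FileSystem.get(new URI("hdfs://node1:8020"), new Configuration(), "hadoop")

    // Write a small test file (overwrite if it already exists)
    val out = fs.create(new Path("/tmp/hdfs-smoke-test.txt"), true)
    out.writeBytes("hello hdfs\n")
    out.close()

    // List /tmp to confirm the write succeeded
    fs.listStatus(new Path("/tmp")).foreach(status => println(status.getPath))
    fs.close()
  }
}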

2 Configuring YARN

  1. mapred-site.xml
<configuration>
	<property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>  
</configuration>
  2. yarn-site.xml
<configuration>

<!-- Site specific YARN configuration properties -->
	<property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>node1</value>
    </property>
    
</configuration>

2.1 Starting YARN

[hadoop@node1 ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-resourcemanager-node1.out
node1: starting nodemanager, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-node1.out
[hadoop@node1 ~]$ jps
2144 ResourceManager
1777 DataNode
1953 SecondaryNameNode
2567 Jps
2251 NodeManager
1647 NameNode
[hadoop@node1 ~]$

http://node1:8088
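To confirm that YARN actually accepts applications, a small Spark job can be submitted to it. The sketch below is illustrative (the YarnSmokeTest class name and jar path are my own); it relies on the spark-streaming_2.11 dependency from section 4 pulling in spark-core, and on HADOOP_CONF_DIR pointing at the Hadoop configuration directory when submitting with --master yarn.

import org.apache.spark.{SparkConf, SparkContext}

// Tiny Spark batch job used only to verify that YARN can run applications.
// Package it with Maven and submit it, for example, with:
//   spark-submit --master yarn --deploy-mode client \
//     --class YarnSmokeTest sparktrain-1.0.jar
// (HADOOP_CONF_DIR must point at the Hadoop conf directory.)
object YarnSmokeTest {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("YarnSmokeTest"))

    // Trivial distributed computation: sum the numbers 1..100
    val sum = sc.parallelize(1 to 100).reduce(_ + _)
    println(s"sum = $sum")

    sc.stop()
  }
}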

3 Configuring HBase

Download HBase 1.2.0-cdh5.7.0 from http://archive-primary.cloudera.com/cdh5/cdh/5/, unpack it under /home/hadoop/apps, and add it to the environment variables:

export HBASE_HOME=/home/hadoop/apps/hbase-1.2.0-cdh5.7.0
export PATH=$PATH:$HBASE_HOME/bin

3.1 HBase Configuration Files

/home/hadoop/apps/hbase-1.2.0-cdh5.7.0/conf

  1. hbase-env.sh
export JAVA_HOME=/usr/apps/jdk1.8.0_181-amd64
export HBASE_MANAGES_ZK=false
  2. hbase-site.xml
<configuration>

<property>
	<name>hbase.rootdir</name>
	<value>hdfs://node1:8020/hbase</value>
</property>
<property>
	<name>hbase.cluster.distributed</name>
	<value>true</value>
</property>
<property>
	<name>hbase.zookeeper.quorum</name>
	<value>node1:2181</value>
</property>

</configuration>
  3. regionservers
node1

3.2 Starting HBase

  • Start ZooKeeper first
[hadoop@node1 ~]$ zkServer.sh start
JMX enabled by default
Using config: /home/hadoop/apps/zookeeper-3.4.5-cdh5.7.0/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[hadoop@node1 ~]$ jps
2144 ResourceManager
2768 Jps
1777 DataNode
1953 SecondaryNameNode
2744 QuorumPeerMain
2251 NodeManager
1647 NameNode
[hadoop@node1 ~]$
[hadoop@node1 ~]$ start-hbase.sh
starting master, logging to /home/hadoop/apps/hbase-1.2.0-cdh5.7.0/logs/hbase-hadoop-master-node1.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
node1: starting regionserver, logging to /home/hadoop/apps/hbase/bin/../logs/hbase-hadoop-regionserver-node1.out
node1: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
node1: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0

http://node1:60010

  • hbase shell
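Besides the hbase shell, the installation can also be verified from client code through the ZooKeeper quorum configured above. The sketch below is illustrative (the HBaseSmokeTest object and the test table are my own); it assumes the hbase-client dependency from section 4 and a table created beforehand in the shell with create 'test', 'cf'.

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

// Write and read one cell through the ZooKeeper quorum configured in hbase-site.xml.
// Assumes the table was created first in the hbase shell: create 'test', 'cf'
object HBaseSmokeTest {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.quorum", "node1:2181")

    val connection = ConnectionFactory.createConnection(conf)
    val table = connection.getTable(TableName.valueOf("test"))

    // Put one cell, then read it back
    val put = new Put(Bytes.toBytes("row1"))
    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col1"), Bytes.toBytes("hello hbase"))
    table.put(put)

    val result = table.get(new Get(Bytes.toBytes("row1")))
    println(Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col1"))))

    table.close()
    connection.close()
  }
}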

4 IDEA + Maven Environment Setup

Refer to the Kafka section earlier in this series for the IDEA + Maven setup. The project POM:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.myspark.com</groupId>
  <artifactId>sparktrain</artifactId>
  <version>1.0</version>
  <inceptionYear>2008</inceptionYear>
  <properties>
    <scala.version>2.11.12</scala.version>
    <kafka.version>0.9.0.0</kafka.version>
    <spark.version>2.2.0</spark.version>
    <hadoop.version>2.6.0-cdh5.7.0</hadoop.version>
    <hbase.version>1.2.0-cdh5.7.0</hbase.version>
  </properties>

  <repositories>
    <repository>
      <id>scala-tools.org</id>
      <name>Scala-Tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </repository>

<!-- Add the Cloudera repository -->
    <repository>
      <id>cloudera</id>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
    </repository>

  </repositories>

  <pluginRepositories>
    <pluginRepository>
      <id>scala-tools.org</id>
      <name>Scala-Tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </pluginRepository>
  </pluginRepositories>

  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>

    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.11</artifactId>
      <version>${kafka.version}</version>
    </dependency>

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>${hadoop.version}</version>
    </dependency>

    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-client</artifactId>
      <version>${hbase.version}</version>
    </dependency>

    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-server</artifactId>
      <version>${hbase.version}</version>
    </dependency>

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.11</artifactId>
      <version>${spark.version}</version>
    </dependency>


  </dependencies>

  <build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <scalaVersion>${scala.version}</scalaVersion>
          <args>
            <arg>-target:jvm-1.5</arg>
          </args>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-eclipse-plugin</artifactId>
        <configuration>
          <downloadSources>true</downloadSources>
          <buildcommands>
            <buildcommand>ch.epfl.lamp.sdt.core.scalabuilder</buildcommand>
          </buildcommands>
          <additionalProjectnatures>
            <projectnature>ch.epfl.lamp.sdt.core.scalanature</projectnature>
          </additionalProjectnatures>
          <classpathContainers>
            <classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER</classpathContainer>
            <classpathContainer>ch.epfl.lamp.sdt.launching.SCALA_CONTAINER</classpathContainer>
          </classpathContainers>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <reporting>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <configuration>
          <scalaVersion>${scala.version}</scalaVersion>
        </configuration>
      </plugin>
    </plugins>
  </reporting>
</project>
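To check that the dependencies above resolve and the project runs end to end, a minimal Spark Streaming word count is enough. The sketch below is illustrative (the object name and port are my own); it reads from a socket opened on node1 with nc -lk 9999.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Minimal Spark Streaming word count to verify the spark-streaming_2.11 dependency.
// Feed it input on node1 with: nc -lk 9999
object NetworkWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
    val ssc = new StreamingContext(conf, Seconds(5))

    val lines = ssc.socketTextStream("node1", 9999)
    val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}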