Download Software

Passwordless SSH Login

  • ssh-keygen -t rsa -P ""
  • ssh-copy-id root@master
  • ssh-copy-id root@slave01
  • ssh-copy-id root@slave02

Management Interfaces

Starting Services

  • Master
    • start-all.sh
    • zkServer.sh start
    • start-hbase.sh
    • hadoop-daemon.sh start datanode
    • yarn-daemon.sh start nodemanager
  • Slave
    • zkServer.sh start
    • hadoop-daemon.sh start datanode
    • yarn-daemon.sh start nodemanager
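Note that start-all.sh on the Master launches worker daemons on every host listed in Hadoop 2.x's slaves file, so that file must exist on the Master for the commands above to reach the slave nodes. A minimal sketch, assuming the hostnames used throughout this guide:

```
# /opt/hadoop-2.9.0/etc/hadoop/slaves — one worker hostname per line
slave01
slave02
```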

Test Files

Open Data

Word Count

  • Linux
    • cat TestWord.txt | tr -sc 'a-zA-Z' '\n' | grep -v '^$' | sort | uniq -c | awk '{t=$1;$1=$2;$2=t;print;}' | tr ' ' ',' > ~/TTT
  • Hadoop
    • hadoop jar /opt/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar wordcount /TestWord.txt /Output
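As a quick sanity check that needs no cluster, the counting can be reproduced locally. This sketch uses a tiny inline sample (a stand-in for the real TestWord.txt) and prints word/count pairs in the tab-separated format the MapReduce wordcount example writes to its output part files:

```shell
# Build a tiny sample file (stand-in for the real TestWord.txt).
printf 'hello world\nhello hadoop\n' > TestWord.txt
# Split on spaces, then count occurrences of each word,
# printing "word<TAB>count" like the MapReduce example does.
tr -s ' ' '\n' < TestWord.txt | sort | uniq -c | awk '{print $2"\t"$1}'
# → hadoop 1, hello 2, world 1 (tab-separated)
```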

Adding Cluster Nodes

  • Install Hadoop, ZooKeeper, and HBase on the new node
  • hadoop-daemon.sh start datanode
  • yarn-daemon.sh start nodemanager
  • hdfs dfsadmin -refreshNodes
  • start-balancer.sh

Changing the Secondary NameNode

  • hdfs-site.xml

<property>
<name>dfs.namenode.http-address</name>
<value>master:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>slave01:50090</value>
</property>

Hadoop Ecosystem Download List

  • wget http://apache.stu.edu.tw/hadoop/common/stable/hadoop-2.9.0.tar.gz
  • wget http://apache.stu.edu.tw/zookeeper/stable/zookeeper-3.4.10.tar.gz
  • wget http://apache.stu.edu.tw/hbase/stable/hbase-1.2.6-bin.tar.gz
  • wget http://apache.stu.edu.tw/hive/stable-2/apache-hive-2.3.3-bin.tar.gz
  • wget http://apache.stu.edu.tw/pig/latest/pig-0.17.0.tar.gz
  • wget http://apache.stu.edu.tw/mahout/0.13.0/apache-mahout-distribution-0.13.0.tar.gz
  • wget http://downloads.lightbend.com/scala/2.12.5/scala-2.12.5.tgz
  • wget http://apache.stu.edu.tw/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz
  • wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.6.1.5/ambari.repo
  • wget https://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin

Hadoop Ecosystem Configuration Files

  • hosts

192.168.56.100   master  master.hadoop
192.168.56.101   slave01 slave01.hadoop
192.168.56.102   slave02 slave02.hadoop

  • profile

export JAVA_HOME=/usr/java/jdk1.8.0_161
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/opt/hadoop-2.9.0
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export ZOOKEEPER_HOME=/opt/zookeeper-3.4.10
export PATH=$PATH:$ZOOKEEPER_HOME/bin
export HBASE_HOME=/opt/hbase-1.2.6
export PATH=$PATH:$HBASE_HOME/bin
export HIVE_HOME=/opt/hive-2.3.3
export PATH=$PATH:$HIVE_HOME/bin
export PIG_HOME=/opt/pig-0.17.0
export PATH=$PATH:$PIG_HOME/bin
export MAHOUT_HOME=/opt/mahout-0.13.0
export PATH=$PATH:$MAHOUT_HOME/bin
export HADOOP_CONF_DIR=/opt/hadoop-2.9.0/etc/hadoop
export SCALA_HOME=/opt/scala-2.12.5
export PATH=$PATH:$SCALA_HOME/bin
export SPARK_HOME=/opt/spark-2.3.0
export PATH=$PATH:$SPARK_HOME/bin
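Every `*_HOME` / `PATH` pair above follows the same append-to-PATH pattern; this sketch shows it for HADOOP_HOME only. After editing /etc/profile on a real node, apply it to the current shell with `source /etc/profile`.

```shell
# The same pattern used for every *_HOME variable in /etc/profile above.
export HADOOP_HOME=/opt/hadoop-2.9.0
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
# The last two PATH entries are now the Hadoop sbin and bin directories.
echo "$PATH" | tr ':' '\n' | tail -n 2
```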

  • start-dfs.sh

HDFS_DATANODE_USER=root
# HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

  • start-yarn.sh

YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root

  • core-site.xml

<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop-2.9.0/tmp</value>
</property>

  • hdfs-site.xml

<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///opt/hadoop-2.9.0/NameNode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///opt/hadoop-2.9.0/DataNode</value>
</property>

  • yarn-site.xml

<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property> 
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

  • mapred-site.xml

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

  • hbase-site.xml

<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/opt/hbase-1.2.6/tmp</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,slave01,slave02</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/opt/zookeeper-3.4.10/data</value>
</property>

  • hive-site.xml

<property>
<name>system:java.io.tmpdir</name>
<value>/opt/hive-2.3.3/tmp</value>
</property>
<property>
<name>system:user.name</name>
<value>${user.name}</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>a12345</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.mariadb.jdbc.Driver</value>
</property>
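The configuration list above covers Hadoop, HBase, and Hive but omits ZooKeeper's own config file. A minimal zoo.cfg sketch for the three-node quorum used in this guide (ports and timing values are the common defaults, an assumption; the dataDir matches hbase-site.xml above):

```
# /opt/zookeeper-3.4.10/conf/zoo.cfg — sketch, adjust to your install
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper-3.4.10/data
clientPort=2181
server.1=master:2888:3888
server.2=slave01:2888:3888
server.3=slave02:2888:3888
```

Each node must also have a myid file in dataDir containing its own server number, e.g. `echo 1 > /opt/zookeeper-3.4.10/data/myid` on master.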

Hadoop Ecosystem Tests

  • Hadoop
    • hadoop fs -put ~/TestWord.txt /
    • hadoop jar /opt/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar wordcount /TestWord.txt /Out
  • HBase
    • hbase shell
      • create 'scores','grad','course'
      • put 'scores','kath','grad:','1'
      • put 'scores','kath','course:math','87'
      • get 'scores','kath'
      • get 'scores','kath',{COLUMN=>'course:math'}
  • Hive
    • hive
      • CREATE DATABASE myhive;
      • SHOW DATABASES;
      • USE myhive;
      • SHOW TABLES;
      • CREATE TABLE student(name STRING, scores INT);
      • echo "kath 92" > student.txt
      • echo "john 87" >> student.txt
      • LOAD DATA LOCAL INPATH "student.txt" INTO TABLE student;
      • SELECT * FROM student;
  • Spark
    • echo "My Name is Tony, I am a Teacher, I am Fine, Nice to Meet You." > TestTony.txt
    • hadoop fs -put TestTony.txt /
    • spark-shell
      • val txtFile=sc.textFile("hdfs://master:9000/TestTony.txt")
      • val stringRDD=txtFile.flatMap(line => line.split(" "))
      • val countsRDD=stringRDD.map(word => (word,1)).reduceByKey(_ + _)
      • countsRDD.sortByKey().collect.foreach(println)
  • Hadoop Streaming
    • hadoop jar /opt/hadoop-2.9.0/share/hadoop/tools/lib/hadoop-streaming-2.9.0.jar -input /Input -output /Output1 -mapper /bin/cat -reducer /usr/bin/wc
    • hadoop jar /opt/hadoop-2.9.0/share/hadoop/tools/lib/hadoop-streaming-2.9.0.jar -input /Input -output /Output2 -mapper org.apache.hadoop.mapred.lib.IdentityMapper -reducer /bin/wc


Posted by 林聖祐 (Teacher Tony) on 痞客邦 (Pixnet)