Installing and Testing Hive on Hadoop 2
Source: cnblogs | Author: Hongten | Date: 2018/11/1 9:27:02

Before installing and testing Hive, make sure all of the Hadoop services are running.
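If they are not already running, the standard start scripts bring them up (a minimal sketch, assuming Hadoop is installed at /home/hadoop-2.5, as the job logs later in this post suggest):

    /home/hadoop-2.5/sbin/start-dfs.sh
    /home/hadoop-2.5/sbin/start-yarn.sh
    --verify with jps: NameNode, DataNode, ResourceManager and NodeManager should be listed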

Hive keeps its metadata in a relational database, so before installing Hive we need to install MySQL.

    --MySQL installation reference: https://segmentfault.com/a/1190000003049498
    --Check whether the system ships with a preinstalled MySQL
    yum list installed | grep mysql
    --Remove the bundled MySQL and its dependencies
    yum -y remove mysql-libs.x86_64
    --Add the MySQL rpm repository on CentOS and enable a newer release
    wget dev.mysql.com/get/mysql-community-release-el6-5.noarch.rpm
    yum localinstall mysql-community-release-el6-5.noarch.rpm
    yum repolist all | grep mysql
    yum-config-manager --disable mysql55-community
    yum-config-manager --disable mysql56-community
    yum-config-manager --enable mysql57-community-dmr
    yum repolist enabled | grep mysql
    --Install the MySQL server
    yum install mysql-community-server
    --Start MySQL
    service mysqld start
    --Check whether MySQL starts on boot, and enable autostart
    chkconfig --list | grep mysqld
    chkconfig mysqld on

    --Find the initial root password
    grep 'temporary password' /var/log/mysqld.log

    --Secure the MySQL installation
    mysql_secure_installation
    --Start MySQL
    service mysqld start
    --Log in
    mysql -u root -p
    --The password set above
    !QAZ2wsx3edc
    --Grant remote access
    grant all on *.* to root@'%' identified by '!QAZ2wsx3edc';
    select * from mysql.user;
    --Allow access from node1 as well
    grant all on *.* to root@'node1' identified by '!QAZ2wsx3edc';
    --Create the hive database; it is needed later and Hive will not create it automatically
    create database hive;
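
Before moving on, it is worth confirming that the metastore database exists. A minimal check from the shell (assuming the root password set above):

    mysql -u root -p'!QAZ2wsx3edc' -e "show databases like 'hive';"
    --the output should contain a single row: hive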

 

Installing and Configuring Hive

    --Install Hive
    cd ~
    tar -zxvf apache-hive-0.13.1-bin.tar.gz
    --Create a symlink
    ln -sf /root/apache-hive-0.13.1-bin /home/hive
    --Create the configuration file from the template
    cd /home/hive/conf/
    cp -a hive-default.xml.template hive-site.xml
    --Start Hive once to verify it runs
    cd /home/hive/bin/
    ./hive
    --Exit hive
    quit;
    --Edit the configuration file
    cd /home/hive/conf/
    vi hive-site.xml
    --The properties that need to be changed:
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://node1/hive</value>
      <description>JDBC connect string for a JDBC metastore</description>
    </property>

    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
      <description>Driver class name for a JDBC metastore</description>
    </property>

    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>root</value>
      <description>username to use against metastore database</description>
    </property>

    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>!QAZ2wsx3edc</value>
      <description>password to use against metastore database</description>
    </property>
    :wq
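
The ConnectionURL points at the hive database created earlier on node1. To double-check the edits without reopening the file, a simple grep over the standard metastore connection keys works:

    grep -A 1 'javax.jdo.option.Connection' /home/hive/conf/hive-site.xml
    --each matched <name> line is printed together with the <value> line below it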

 

Adding the MySQL Driver

    --Copy the MySQL JDBC driver into /home/hive/lib/
    cp -a mysql-connector-java-5.1.23-bin.jar /home/hive/lib/
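
Without this jar, Hive fails on startup with a ClassNotFoundException for com.mysql.jdbc.Driver. Once it is in place and ./hive has been started again, Hive initializes its metastore schema in MySQL; a quick sanity check (the table names below are those of the standard metastore schema):

    mysql -u root -p'!QAZ2wsx3edc' -e "use hive; show tables;"
    --expect metastore tables such as DBS, TBLS, SDS and COLUMNS_V2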

 

Here I wrote a small Java program that generates the test data file.

GenerateTestFile.java

    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileWriter;
    import java.util.Random;

    /**
     * @author Hongwei
     * @created 31 Oct 2018
     */
    public class GenerateTestFile {
        public static void main(String[] args) throws Exception {
            int num = 20000000;
            File writename = new File("/root/output1.txt");
            System.out.println("begin");
            writename.createNewFile();
            BufferedWriter out = new BufferedWriter(new FileWriter(writename));
            Random random = new Random();
            StringBuilder sBuilder = new StringBuilder();
            // Each row has the form: id,name<id>,age,dept with age uniform in [0, 50).
            // The loop runs from 1 to num-1, so 19,999,999 rows are written.
            for (int i = 1; i < num; i++) {
                sBuilder.append(i).append(",").append("name").append(i).append(",")
                        .append(random.nextInt(50)).append(",").append("Sales").append("\n");
            }
            System.out.println("done........");
            // The whole file (~600 MB) is buffered in memory before being written out.
            out.write(sBuilder.toString());
            out.flush();
            out.close();
        }
    }

 

Compile and run it:

    cd
    javac GenerateTestFile.java
    java GenerateTestFile

 

This produces the /root/output1.txt file that we will load into Hive as test data.
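
A quick look at the generated file confirms the format (the ages will differ from run to run since they are random):

    head -3 /root/output1.txt
    --e.g. 1,name1,37,Sales
    wc -l /root/output1.txt
    --expect 19999999 /root/output1.txt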

 

Starting Hive

    --Start hive
    cd /home/hive/bin/
    ./hive

 

Creating the t_emp2 Table

    create table t_emp2(
        id int,
        name string,
        age int,
        dept_name string
    )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ',';

Output:

    hive> create table t_emp2(
        > id int,
        > name string,
        > age int,
        > dept_name string
        > )
        > ROW FORMAT DELIMITED
        > FIELDS TERMINATED BY ',';
    OK
    Time taken: 0.083 seconds

 

Loading the File

    load data local inpath '/root/output1.txt' into table t_emp2;

Output:

    hive> load data local inpath '/root/output1.txt' into table t_emp2;
    Copying data from file:/root/output1.txt
    Copying file: file:/root/output1.txt
    Loading data to table default.t_emp2
    Table default.t_emp2 stats: [numFiles=1, numRows=0, totalSize=593776998, rawDataSize=0]
    OK
    Time taken: 148.455 seconds
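
LOAD DATA LOCAL copies the file from the local filesystem into the table's directory under the Hive warehouse on HDFS. Assuming the default warehouse location (/user/hive/warehouse), the loaded file can be inspected with:

    /home/hadoop-2.5/bin/hadoop fs -ls /user/hive/warehouse/t_emp2
    --expect a single file of about 593 MB, matching totalSize=593776998 above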

 

 

Test: count all of the records in the t_emp2 table:

    hive> select count(*) from t_emp2;
    Total jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks determined at compile time: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=<number>
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=<number>
    In order to set a constant number of reducers:
      set mapreduce.job.reduces=<number>
    Starting Job = job_1541003514112_0002, Tracking URL = http://node1:8088/proxy/application_1541003514112_0002/
    Kill Command = /home/hadoop-2.5/bin/hadoop job -kill job_1541003514112_0002
    Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 1
    2018-10-31 09:41:49,863 Stage-1 map = 0%, reduce = 0%
    2018-10-31 09:42:26,846 Stage-1 map = 33%, reduce = 0%, Cumulative CPU 33.56 sec
    2018-10-31 09:42:47,028 Stage-1 map = 44%, reduce = 0%, Cumulative CPU 53.03 sec
    2018-10-31 09:42:48,287 Stage-1 map = 56%, reduce = 0%, Cumulative CPU 53.79 sec
    2018-10-31 09:42:54,173 Stage-1 map = 67%, reduce = 0%, Cumulative CPU 56.99 sec
    2018-10-31 09:42:56,867 Stage-1 map = 78%, reduce = 0%, Cumulative CPU 57.52 sec
    2018-10-31 09:42:58,201 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 58.44 sec
    2018-10-31 09:43:16,966 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 60.62 sec
    MapReduce Total cumulative CPU time: 1 minutes 0 seconds 620 msec
    Ended Job = job_1541003514112_0002
    MapReduce Jobs Launched:
    Job 0: Map: 3 Reduce: 1 Cumulative CPU: 60.62 sec HDFS Read: 593794153 HDFS Write: 9 SUCCESS
    Total MapReduce CPU Time Spent: 1 minutes 0 seconds 620 msec
    OK
    19999999
    Time taken: 105.013 seconds, Fetched: 1 row(s)
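
The result, 19,999,999, matches the generator loop, which writes rows for i = 1 through 19,999,999. For a quick look at individual rows, a bounded select works and is typically served by a fetch task without launching a MapReduce job:

    select * from t_emp2 limit 5;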

 

Count the records with age=20. Since the generator draws ages uniformly from [0, 50), we expect roughly 20,000,000 / 50 = 400,000 matches:

    hive> select count(*) from t_emp2 where age=20;
    Total jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks determined at compile time: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=<number>
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=<number>
    In order to set a constant number of reducers:
      set mapreduce.job.reduces=<number>
    Starting Job = job_1541003514112_0003, Tracking URL = http://node1:8088/proxy/application_1541003514112_0003/
    Kill Command = /home/hadoop-2.5/bin/hadoop job -kill job_1541003514112_0003
    Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 1
    2018-10-31 09:44:28,452 Stage-1 map = 0%, reduce = 0%
    2018-10-31 09:44:45,102 Stage-1 map = 11%, reduce = 0%, Cumulative CPU 5.54 sec
    2018-10-31 09:44:49,318 Stage-1 map = 33%, reduce = 0%, Cumulative CPU 7.63 sec
    2018-10-31 09:45:14,247 Stage-1 map = 44%, reduce = 0%, Cumulative CPU 13.97 sec
    2018-10-31 09:45:15,274 Stage-1 map = 67%, reduce = 0%, Cumulative CPU 14.99 sec
    2018-10-31 09:45:41,594 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 18.7 sec
    2018-10-31 09:45:50,973 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 26.08 sec
    MapReduce Total cumulative CPU time: 26 seconds 80 msec
    Ended Job = job_1541003514112_0003
    MapReduce Jobs Launched:
    Job 0: Map: 3 Reduce: 1 Cumulative CPU: 33.19 sec HDFS Read: 593794153 HDFS Write: 7 SUCCESS
    Total MapReduce CPU Time Spent: 33 seconds 190 msec
    OK
    399841
    Time taken: 98.693 seconds, Fetched: 1 row(s)

 

========================================================

More reading, and English is important.

I'm Hongten

 


E | hongtenzone@foxmail.com  B | http://www.cnblogs.com/hongten

======================================================== 
