Basic environment and resources
Hadoop: 2.7.x
Hive: 2.1.x binary release (bin.tar.gz)
Hive: 1.x source release (src.tar.gz)
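Step 1: Install Hadoop 2.7.x on Windows (covered in a separate guide).
Step 2: Download Hive.tar.gz from the official archive (http://archive.apache.org/dist/hive), extract it to a directory of your choice (C:\hive), and configure the global Hive environment variables.
Hive global environment variables: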
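The post does not spell the variables out in text. As a rough check, the Python sketch below assumes the conventional setup: a HIVE_HOME variable pointing at the install directory, with its bin folder on PATH. Both of those, and the helper name check_hive_env, are assumptions for illustration rather than something the post states.

```python
import os


def _norm(path):
    # Normalize case and separators so equivalent paths compare equal.
    return os.path.normcase(os.path.normpath(path))


def check_hive_env(env=None):
    """Return a list of problems with the assumed HIVE_HOME / PATH setup."""
    env = os.environ if env is None else env
    hive_home = env.get("HIVE_HOME")  # assumed variable name
    if not hive_home:
        return ["HIVE_HOME is not set"]
    problems = []
    bin_dir = _norm(os.path.join(hive_home, "bin"))
    entries = [_norm(p) for p in env.get("PATH", "").split(os.pathsep) if p]
    if bin_dir not in entries:
        problems.append("PATH does not contain " + bin_dir)
    return problems


if __name__ == "__main__":
    for problem in check_hive_env():
        print(problem)
```

An empty result means the assumed layout is in place; anything printed is worth fixing before moving on.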
Step 3: Hive configuration files (C:\hive\apache-hive-2.1.1-bin\conf)
The configuration directory C:\hive\apache-hive-2.1.1-bin\conf contains four default configuration templates. Copy each one to a new file name:
hive-default.xml.template --> hive-site.xml
hive-env.sh.template --> hive-env.sh
hive-exec-log4j2.properties.template --> hive-exec-log4j2.properties
hive-log4j2.properties.template --> hive-log4j2.properties
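The four copies can also be scripted. The sketch below is one way to do it in Python, assuming the conf path used in this post; copy_templates is a hypothetical helper. Because the log4j template name may be spelled with either log4j or log4j2 depending on the release, it tries both and never overwrites an existing target.

```python
import shutil
from pathlib import Path

# Target file -> candidate template names (first one found wins).
TARGETS = {
    "hive-site.xml": ["hive-default.xml.template"],
    "hive-env.sh": ["hive-env.sh.template"],
    "hive-exec-log4j2.properties": [
        "hive-exec-log4j2.properties.template",
        "hive-exec-log4j.properties.template",
    ],
    "hive-log4j2.properties": [
        "hive-log4j2.properties.template",
        "hive-log4j.properties.template",
    ],
}


def copy_templates(conf_dir):
    """Copy each template to its target name; skip existing targets."""
    conf = Path(conf_dir)
    copied = []
    for target, candidates in TARGETS.items():
        dst = conf / target
        if dst.exists():
            continue
        for template in candidates:
            src = conf / template
            if src.is_file():
                shutil.copyfile(src, dst)
                copied.append(target)
                break
    return copied


if __name__ == "__main__":
    print(copy_templates(r"C:\hive\apache-hive-2.1.1-bin\conf"))
```

Running it a second time copies nothing, so it is safe to repeat after editing the generated files.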
Step 4: Create a local directory that the configuration files below will reference
C:\hive\apache-hive-2.1.1-bin\my_hive
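The hive-site.xml settings in Step 5 point at four subdirectories of my_hive (scratch_dir, resources_dir, querylog_dir, operation_logs_dir). Whether Hive creates them on demand is not covered here, so the following Python sketch simply creates them up front; make_local_dirs is a hypothetical helper, and pre-creating the folders is an assumption rather than a documented requirement.

```python
from pathlib import Path

# Subdirectory names taken from the hive-site.xml values in Step 5.
SUBDIRS = ["scratch_dir", "resources_dir", "querylog_dir", "operation_logs_dir"]


def make_local_dirs(base):
    """Create base and its four subdirectories; existing ones are left alone."""
    base = Path(base)
    created = []
    for name in SUBDIRS:
        path = base / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Example call for the layout used in this post:
# make_local_dirs(r"C:\hive\apache-hive-2.1.1-bin\my_hive")
```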
Step 5: Adjust the Hive configuration files (hive-site.xml and hive-env.sh)
Edit C:\hive\apache-hive-2.1.1-bin\conf\hive-site.xml
<!-- Hive warehouse directory (location of the default database); this path is on HDFS -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<!-- Hive scratch (temporary data) directory; this path is on HDFS -->
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<!-- scratchdir: local directory -->
<property>
<name>hive.exec.local.scratchdir</name>
<value>C:/hive/apache-hive-2.1.1-bin/my_hive/scratch_dir</value>
<description>Local scratch space for Hive jobs</description>
</property>
<!-- resources_dir: local directory -->
<property>
<name>hive.downloaded.resources.dir</name>
<value>C:/hive/apache-hive-2.1.1-bin/my_hive/resources_dir/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<!-- querylog: local directory -->
<property>
<name>hive.querylog.location</name>
<value>C:/hive/apache-hive-2.1.1-bin/my_hive/querylog_dir</value>
<description>Location of Hive run time structured log file</description>
</property>
<!-- operation_logs: local directory -->
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>C:/hive/apache-hive-2.1.1-bin/my_hive/operation_logs_dir</value>
<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
<!-- Database connection URL -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.60.178:3306/hive?serverTimezone=UTC&amp;useSSL=false&amp;allowPublicKeyRetrieval=true</value>
<description>
JDBC connect string for a JDBC metastore.
</description>
</property>
<!-- Database driver class -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<!-- Database user name -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>admini</value>
<description>Username to use against metastore database</description>
</property>
<!-- Database password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
<!-- Fixes: Caused by: MetaException(message:Version information not found in metastore. ) -->
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
Enforce metastore schema version consistency.
True: Verify that version information stored in is compatible with one from Hive jars. Also disable automatic
schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
proper metastore schema migration. (Default)
False: Warn if the version information stored in metastore doesn’t match with one from in Hive jars.
</description>
</property>
<!-- Automatically create the full metastore schema -->
<!-- Fixes the Hive error: Required table missing : "DBS" in Catalog "" Schema "" -->
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
<description>Auto creates necessary schema on a startup if one doesn’t exist. Set this to false, after creating it once.To enable auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use cases, run schematool command instead.</description>
</property>
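Because hive-site.xml must be well-formed XML, a character such as & inside a value has to be written as &amp;, which is easy to miss in a JDBC URL. A quick check before starting Hive can catch this. The Python sketch below parses Hadoop-style property XML with the standard library; read_properties and SAMPLE are illustrative names, and the sample assumes the usual configuration root element.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Parse Hadoop-style configuration XML and return {name: value}.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed.
    """
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


SAMPLE = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.60.178:3306/hive?serverTimezone=UTC&amp;useSSL=false</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for key, value in read_properties(SAMPLE).items():
        print(key, "=", value)
```

To check a real file, read it as text and pass the contents to read_properties; a ParseError reports the line and column of the first problem.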
Edit C:\hive\apache-hive-2.1.1-bin\conf\hive-env.sh
# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=C:\hadoop\hadoop-2.7.6
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=C:\hive\apache-hive-2.1.1-bin\conf
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=C:\hive\apache-hive-2.1.1-bin\lib
Step 6: Create the HDFS directories on Hadoop
hadoop fs -mkdir /tmp
hadoop fs -mkdir /user/
hadoop fs -mkdir /user/hive/
hadoop fs -mkdir /user/hive/warehouse
hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse
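If you would rather script these commands, the sketch below builds the same mkdir and chmod calls in Python. It uses the -p option of hadoop fs -mkdir so that the parent directories are created in one call. build_commands and run_commands are hypothetical helpers, and run_commands assumes HDFS is already running with the Hadoop launcher on PATH.

```python
import os
import shutil
import subprocess

# Leaf directories from the commands above; -p creates the parents.
HDFS_DIRS = ["/tmp", "/user/hive/warehouse"]


def build_commands(dirs=HDFS_DIRS, hadoop="hadoop"):
    """Return the mkdir/chmod command lines as argument lists."""
    commands = []
    for d in dirs:
        commands.append([hadoop, "fs", "-mkdir", "-p", d])
        commands.append([hadoop, "fs", "-chmod", "g+w", d])
    return commands


def run_commands():
    """Run the commands against a live cluster; stops at the first failure."""
    launcher = "hadoop.cmd" if os.name == "nt" else "hadoop"
    hadoop = shutil.which(launcher)
    if hadoop is None:
        raise RuntimeError(launcher + " not found on PATH")
    for command in build_commands(hadoop=hadoop):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    for command in build_commands():
        print(" ".join(command))
```

Run as a script, it only prints the commands; call run_commands() once you are happy with them.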
Step 7: Create the database that Hive initialization depends on, named hive. Note the character set: latin1
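The post does not show the statement itself, so here is a minimal Python sketch of one way to do it. It builds a CREATE DATABASE statement with the latin1 character set and, optionally, passes it to the mysql command-line client. The helper names are hypothetical, and having the mysql client on PATH is an assumption.

```python
import subprocess


def create_database_sql(name="hive", charset="latin1"):
    """Build the CREATE DATABASE statement for the metastore database."""
    return f"CREATE DATABASE IF NOT EXISTS `{name}` DEFAULT CHARACTER SET {charset};"


def create_database(host, user):
    """Run the statement through the mysql client, which prompts for the password."""
    subprocess.run(
        ["mysql", "-h", host, "-u", user, "-p", "-e", create_database_sql()],
        check=True,
    )


if __name__ == "__main__":
    print(create_database_sql())
```

With the connection settings from hive-site.xml, the call would be create_database("192.168.60.178", "admini"). You can equally paste the printed statement into any MySQL client.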
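Step 8: Start the Hive service
(1) First start Hadoop by running: start-all.cmd
(2) Initialize the Hive data by running: hive --service metastore
If everything is working, the cmd window prints the metastore startup output (screenshot in the original post).
If Hive initialized correctly, the hive database in MySQL now contains the metastore tables (screenshot in the original post).
(3) Start the Hive service by running: hive
That completes the Hive setup on Windows 10.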
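Problem (1): Hive data initialization (hive --service metastore) keeps failing with errors.
Workaround: initialize the Hive database with the scripts that ship with Hive.
The scripts are located in C:\hive\apache-hive-2.1.1-bin\scripts\metastore\upgrade. Pick the SQL version to run (screenshot in the original post).
Choose the SQL script (hive-schema-x.x.x.mysql.sql) that corresponds to the Hive version you are running (Hive_x.x.x).
Note: I am using Hive 2.1.1, so the matching script is hive-schema-2.1.0.mysql.sql.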
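Choosing the script can be expressed as a small rule. The Python sketch below assumes the rule is "take the highest hive-schema-x.x.x.mysql.sql whose version does not exceed the Hive version", which matches the 2.1.1 to 2.1.0 choice above but is my reading, not something the Hive documentation is quoted on here. pick_schema_script is a hypothetical helper.

```python
import re

PATTERN = re.compile(r"^hive-schema-(\d+)\.(\d+)\.(\d+)\.mysql\.sql$")


def pick_schema_script(file_names, hive_version):
    """Return the newest schema script not newer than hive_version, or None."""
    parts = [int(p) for p in hive_version.split(".")]
    target = tuple((parts + [0, 0, 0])[:3])  # pad short versions such as "2.1"
    best = None
    for name in file_names:
        match = PATTERN.match(name)
        if not match:
            continue  # ignore upgrade scripts and other files
        version = tuple(int(g) for g in match.groups())
        if version <= target and (best is None or version > best[0]):
            best = (version, name)
    return best[1] if best else None


if __name__ == "__main__":
    names = [
        "hive-schema-1.2.0.mysql.sql",
        "hive-schema-2.0.0.mysql.sql",
        "hive-schema-2.1.0.mysql.sql",
        "upgrade-2.0.0-to-2.1.0.mysql.sql",
    ]
    print(pick_schema_script(names, "2.1.1"))
```

In practice, file_names would come from listing the folder that holds the MySQL scripts, and the chosen file is then executed against the hive database.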
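Problem (2): The Hive_x.x.x_bin.tar.gz release lacks the Hive executables and launcher scripts needed on Windows.
Solution: download an older Hive release (apache-hive-1.0.0-src) and replace the existing bin directory of the target installation (C:\hive\apache-hive-2.1.1-bin) with the bin directory from that release.
The structure of the apache-hive-1.0.0-src\bin directory is shown in a screenshot in the original post.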
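The directory swap can be done by hand in Explorer or scripted. The sketch below is one Python approach. Keeping the old bin as bin.bak is my own precaution rather than part of the original procedure, and replace_bin is a hypothetical helper.

```python
import shutil
from pathlib import Path


def replace_bin(src_bin, hive_home):
    """Replace <hive_home>/bin with a copy of src_bin, keeping the old one as bin.bak."""
    src = Path(src_bin)
    target = Path(hive_home) / "bin"
    backup = Path(hive_home) / "bin.bak"
    if not src.is_dir():
        raise FileNotFoundError(str(src))
    if target.exists():
        if backup.exists():
            raise FileExistsError(str(backup))  # never clobber an earlier backup
        target.rename(backup)
    shutil.copytree(src, target)
    return target


# Example call for the layout used in this post (source path is illustrative):
# replace_bin(r"C:\hive\apache-hive-1.0.0-src\bin", r"C:\hive\apache-hive-2.1.1-bin")
```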
————————————————
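Copyright notice: This is an original article by the CSDN blogger "在奋斗的大道", licensed under CC 4.0 BY-SA. When reposting, please include the original source link and this notice.
Original link: https://blog.csdn.net/zhouzhiwengang/article/details/88191251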