Installing Hive 2.1.1 on Ubuntu 16.04
Create the HDFS directories:
hadoop fs -mkdir /tmp
hadoop fs -mkdir /user
hadoop fs -mkdir /user/hive
hadoop fs -mkdir /user/hive/warehouse
hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse
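To confirm the directories and group-write permissions took effect, a quick sanity check (not in the original post) is:
hadoop fs -ls -R /user
hadoop fs -ls -d /tmp /user/hive/warehouse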
1. Download the Hive 2.1.1 release archive and extract it.
Download apache-hive-2.1.1-bin.tar.gz
$ tar xvfz apache-hive-2.1.1-bin.tar.gz
export HIVE_HOME=~/dev/Apps/apache-hive-2.1.1-bin
export PATH=$HIVE_HOME/bin:$PATH
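To keep HIVE_HOME and PATH set in new shells, one option is to append the two export lines to ~/.bashrc and re-source it (a sketch; adjust the path to wherever you extracted the archive):
$ echo 'export HIVE_HOME=~/dev/Apps/apache-hive-2.1.1-bin' >> ~/.bashrc
$ echo 'export PATH=$HIVE_HOME/bin:$PATH' >> ~/.bashrc
$ source ~/.bashrc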
2. Prepare the hive-env.sh file.
$ cd apache-hive-2.1.1-bin
$ cp conf/hive-env.sh.template conf/hive-env.sh
3. Set the Hadoop installation path in hive-env.sh.
$ vi conf/hive-env.sh
HADOOP_HOME=/home/hadoop/hadoop-2.7.3
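hive-env.sh.template also documents a couple of optional settings; a minimal sketch of the finished file, assuming the paths above, might look like:
HADOOP_HOME=/home/hadoop/hadoop-2.7.3
# Optional: directory Hive reads its configuration files from
export HIVE_CONF_DIR=$HIVE_HOME/conf
# Optional: heap size in MB for Hive client processes
# export HADOOP_HEAPSIZE=1024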
Creating the conf files
Copy each of the conf/*.template files to create the corresponding configuration file.
cp hive-env.sh.template hive-env.sh
cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
cp hive-log4j2.properties.template hive-log4j2.properties
cp hive-default.xml.template hive-site.xml
cp beeline-log4j2.properties.template beeline-log4j2.properties
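Note that a hive-site.xml copied from hive-default.xml.template still contains ${system:java.io.tmpdir} and ${system:user.name} placeholders, which Hive 2.x fails to resolve at startup (the well-known "Relative path in absolute URI" error). The per-property edits below cover the important ones; alternatively, a one-pass substitution sketch (example values chosen to match the settings used below) is:
$ sed -i 's|\${system:java.io.tmpdir}|/tmp/whitexozu|g' conf/hive-site.xml
$ sed -i 's|\${system:user.name}|whitexozu|g' conf/hive-site.xml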
Edit conf/hive-site.xml as follows:
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
    <description>Use false if a production metastore server is used</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive_metastore_db?createDatabaseIfNotExist=true</value>
    <description>
      JDBC connect string for a JDBC metastore.
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
    </description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>User-defined driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>User-defined username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hivedev</value>
    <description>User-defined password to use against metastore database</description>
  </property>
  <property>
    <name>hive.server2.enable.doAs</name>
    <!-- <value>true</value> -->
    <value>false</value>
    <description>
      Setting this property to true will have HiveServer2 execute
      Hive operations as the user making the calls to it.
    </description>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive-whitexozu</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/tmp/whitexozu</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/tmp/whitexozu_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <property>
    <name>hive.scratch.dir.permission</name>
    <value>733</value>
    <description>The permission for the user specific scratch directories that get created.</description>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/tmp/hive-whitexozu</value>
    <description>Location of Hive run time structured log file</description>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/tmp/hive-whitexozu/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
  </property>
</configuration>
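The JDBC settings above assume a MySQL server on localhost, a metastore database named hive_metastore_db (created on first connect via createDatabaseIfNotExist=true), a MySQL account hive/hivedev, and the MySQL JDBC driver on Hive's classpath. A preparation sketch under those assumptions (the connector jar version is only an example):
$ sudo apt-get install mysql-server
$ mysql -u root -p -e "CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hivedev'; GRANT ALL PRIVILEGES ON hive_metastore_db.* TO 'hive'@'localhost'; FLUSH PRIVILEGES;"
$ cp mysql-connector-java-5.1.40-bin.jar $HIVE_HOME/lib/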
4. Initialize the Hive metastore. This step was introduced in Hive 2.0.0 and must be completed before running Hive for the first time. The -dbType parameter specifies which database to use for the metastore, and the step is required even when a database other than Derby is used (with the MySQL settings above, you would pass -dbType mysql instead; the Derby output below is from a default setup). Besides initialization, schematool also handles the previously cumbersome schema upgrade process conveniently.
$ ./bin/schematool -initSchema -dbType derby
SLF4J: Class path contains multiple SLF4J bindings.
(output truncated)
Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User: APP
Starting metastore schema initialization to 2.0.0
Initialization script hive-schema-2.0.0.derby.sql
Initialization script completed
schemaTool completed
If schematool fails here (for example because a previous attempt left a partially initialized metastore_db directory behind), move the old directory aside and re-run the initialization:
mv metastore_db metastore_db.tmp
schematool -initSchema -dbType derby
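If you are pointing Hive at the MySQL metastore configured earlier, the corresponding initialization (my inference from the config above, not part of the original run) is:
schematool -initSchema -dbType mysql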
5. Run the Hive shell.
$ ./bin/hive
(output truncated)
A deprecation warning like the following appears; it is informational and the shell still starts:
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
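As a quick smoke test that the shell and metastore work end to end (my own example, not from the original post):
$ hive -e "CREATE TABLE smoke_test (id INT); SHOW TABLES; DROP TABLE smoke_test;"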