
Installing Hive 2.1.1 on Ubuntu 16.04

hiandroid 2017. 9. 6. 15:50

Create the HDFS directories:

 $ hadoop fs -mkdir /tmp
 $ hadoop fs -mkdir /user
 $ hadoop fs -mkdir /user/hive
 $ hadoop fs -mkdir /user/hive/warehouse
 $ hadoop fs -chmod g+w /tmp
 $ hadoop fs -chmod g+w /user/hive/warehouse
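On Hadoop 2.x, the -p flag creates parent directories in a single step, and -ls confirms the result; a minimal sketch equivalent to the commands above:

 $ hadoop fs -mkdir -p /tmp /user/hive/warehouse
 $ hadoop fs -chmod g+w /tmp
 $ hadoop fs -chmod g+w /user/hive/warehouse
 $ hadoop fs -ls /user/hive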


1. Download the Hive 2.1.1 release archive, then extract it.

Download apache-hive-2.1.1-bin.tar.gz.

 $ tar xvfz apache-hive-2.1.1-bin.tar.gz


Add the HIVE environment variables to .bash_profile:

 export HIVE_HOME=~/dev/Apps/apache-hive-2.1.1-bin
 export PATH=$HIVE_HOME/bin:$PATH
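After editing .bash_profile, reload it and check that the hive command resolves (assuming the 2.1.1 install path above):

 $ source ~/.bash_profile
 $ hive --version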


2. Prepare the hive-env.sh file.

 $ cd apache-hive-2.1.1-bin

 $ cp conf/hive-env.sh.template conf/hive-env.sh


3. Set the Hadoop installation path in hive-env.sh.

 $ vi conf/hive-env.sh

 HADOOP_HOME=/home/hadoop/hadoop-2.7.3
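For reference, hive-env.sh also exposes a couple of optional knobs; the values below are illustrative assumptions, not required settings:

 # Required: Hadoop installation path
 HADOOP_HOME=/home/hadoop/hadoop-2.7.3
 # Optional (illustrative): Hive config directory and Hadoop heap size
 export HIVE_CONF_DIR=$HIVE_HOME/conf
 export HADOOP_HEAPSIZE=1024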


Creating the conf files

Copy each conf/*.template file to create the corresponding configuration file (a one-line alternative follows this list):

 $ cp hive-env.sh.template hive-env.sh
 $ cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
 $ cp hive-log4j2.properties.template hive-log4j2.properties
 $ cp hive-default.xml.template hive-site.xml
 $ cp beeline-log4j2.properties.template beeline-log4j2.properties
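The one-line alternative mentioned above, run from inside conf/ (a sketch; note hive-default.xml.template must still end up named hive-site.xml):

 $ for f in *.template; do cp "$f" "${f%.template}"; done
 $ mv hive-default.xml hive-site.xml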

 

Edit conf/hive-site.xml as follows:

<configuration>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
        <description>location of default database for the warehouse</description>
    </property>
    <property>
        <name>hive.metastore.local</name>
        <value>true</value>
        <description>Use false if a production metastore server is used</description>
    </property>
    <property>
        <name>hive.exec.scratchdir</name>
        <value>/tmp/hive</value>
        <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/hive_metastore_db?createDatabaseIfNotExist=true</value>
        <description>
            JDBC connect string for a JDBC metastore.
            To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
            For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
        </description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>User-defined driver class name for a JDBC metastore</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
        <description>User-defined username to use against metastore database</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hivedev</value>
        <description>User-defined password to use against metastore database</description>
    </property>
</configuration>
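The ConnectionURL above points at a MySQL metastore, so the database user and password referenced in hive-site.xml must exist before schematool runs. A minimal sketch, assuming MySQL is already installed locally and using the hive/hivedev credentials from the config above:

 $ mysql -u root -p
 mysql> CREATE DATABASE hive_metastore_db;
 mysql> CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hivedev';
 mysql> GRANT ALL PRIVILEGES ON hive_metastore_db.* TO 'hive'@'localhost';
 mysql> FLUSH PRIVILEGES;

The MySQL JDBC driver jar (mysql-connector-java) also needs to be copied into $HIVE_HOME/lib so that com.mysql.jdbc.Driver can be loaded.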

 

Additional hive-site.xml settings:

<property>
    <name>hive.server2.enable.doAs</name>
    <!-- <value>true</value> -->
    <value>false</value>
    <description>
      Setting this property to true will have HiveServer2 execute
      Hive operations as the user making the calls to it.
    </description>
</property>

 

 

<property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive-whitexozu</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
    <name>hive.exec.local.scratchdir</name>
    <value>/tmp/whitexozu</value>
    <description>Local scratch space for Hive jobs</description>
</property>
<property>
    <name>hive.downloaded.resources.dir</name>
    <value>/tmp/whitexozu_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
    <name>hive.scratch.dir.permission</name>
    <value>733</value>
    <description>The permission for the user specific scratch directories that get created.</description>
</property>
<property>
    <name>hive.querylog.location</name>
    <value>/tmp/hive-whitexozu</value>
    <description>Location of Hive run time structured log file</description>
</property>
<property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/tmp/hive-whitexozu/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
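Since the scratch directory is expected to carry write-all (733) permissions, it can be pre-created explicitly; a sketch using the paths above (whitexozu is the original author's username, so substitute your own):

 $ hadoop fs -mkdir -p /tmp/hive-whitexozu
 $ hadoop fs -chmod 733 /tmp/hive-whitexozu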


4. Initialize the Hive metastore. This step was introduced in Hive 2.0.0 and must be performed before running Hive for the first time. The -dbType parameter specifies which database to use as the metastore, and the step is required even when using a database other than Derby. Besides initializing the metastore, schematool also makes previously cumbersome schema upgrades straightforward.

 $ ./bin/schematool -initSchema -dbType derby

 SLF4J: Class path contains multiple SLF4J bindings.

 (output omitted)

 Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true

 Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver

 Metastore connection User: APP

 Starting metastore schema initialization to 2.0.0

 Initialization script hive-schema-2.0.0.derby.sql

 Initialization script completed

 schemaTool completed


If an error occurs, move the old metastore directory aside and re-run the initialization:

 $ mv metastore_db metastore_db.tmp
 $ schematool -initSchema -dbType derby
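If hive-site.xml keeps the MySQL ConnectionURL shown earlier instead of the embedded Derby store, initialize the schema against MySQL instead (same tool, different -dbType):

 $ schematool -initSchema -dbType mysql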


5. Start the Hive shell.

 $ ./bin/hive

 (output omitted)

 The following warning is printed: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.


6. Run a HiveQL statement. Listing the databases shows the built-in default database.

 $ ./bin/hive
 (output omitted)
 hive> show databases;
 OK
 default
 Time taken: 0.021 seconds, Fetched: 1 row(s)
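As a quick smoke test, the sketch below creates a throwaway table in the default database and lists it (the table name is an arbitrary assumption):

 hive> CREATE TABLE test (id INT, name STRING);
 hive> SHOW TABLES;
 hive> DROP TABLE test;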

Reference: http://blrunner.com/100

