Sqoop2: Kerberize Installation

This tutorial shows how to kerberize a Sqoop2 installation. Before you begin, ensure Sqoop is installed.

This assumes your hostname is “hadoop” and your Kerberos realm is REALM.CA.

Create Kerberos Principals

  cd /etc/security/keytabs
  sudo kadmin.local
  addprinc -randkey sqoop/hadoop@REALM.CA
  xst -kt sqoop.service.keytab sqoop/hadoop@REALM.CA
  addprinc -randkey sqoopHTTP/hadoop@REALM.CA
  xst -kt sqoopHTTP.service.keytab sqoopHTTP/hadoop@REALM.CA
  q
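
The interactive session above can also be scripted. The following is a minimal sketch, assuming MIT Kerberos, where kadmin.local accepts a single query through -q. The helper name kadmin_queries is hypothetical, and it only prints the queries so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Hypothetical helper: print the kadmin queries for one service principal.
# Arguments: service name, host, realm (e.g. sqoop hadoop REALM.CA).
kadmin_queries() {
    service=$1
    host=$2
    realm=$3
    principal="${service}/${host}@${realm}"
    # Same two commands as the interactive session: create, then export.
    printf 'addprinc -randkey %s\n' "$principal"
    printf 'xst -kt %s.service.keytab %s\n' "$service" "$principal"
}

# Review the queries for both principals used in this tutorial.
kadmin_queries sqoop hadoop REALM.CA
kadmin_queries sqoopHTTP hadoop REALM.CA
```

To apply them, change into /etc/security/keytabs and pass each printed line to sudo kadmin.local -q "<line>".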

Set Keytab Permissions/Ownership

  sudo chown root:hadoopuser /etc/security/keytabs/*
  sudo chmod 750 /etc/security/keytabs/*
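
After changing the permissions, it can help to confirm that no keytab is still accessible to users outside the owner and group. This is a rough sketch, assuming GNU stat as shipped on Ubuntu. The function name world_accessible_keytabs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list keytabs in a directory that are accessible by "other".
# Relies on GNU stat (-c), as found on Ubuntu.
world_accessible_keytabs() {
    dir=$1
    for f in "$dir"/*.keytab; do
        [ -e "$f" ] || continue            # glob matched nothing
        mode=$(stat -c '%a' "$f")          # octal mode, e.g. 750
        other=${mode#"${mode%?}"}          # last digit = permissions for "other"
        if [ "$other" -ne 0 ]; then
            printf '%s %s\n' "$mode" "$f"
        fi
    done
}

# Prints nothing when every keytab is closed to "other".
world_accessible_keytabs /etc/security/keytabs
```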

Configuration

Configure Kerberos with Sqoop

  cd /usr/local/sqoop/conf/
  nano sqoop.properties

  #uncomment the following
  org.apache.sqoop.security.authentication.type=KERBEROS
  org.apache.sqoop.security.authentication.handler=org.apache.sqoop.security.authentication.KerberosAuthenticationHandler

  #update to the following (the realm must match the principal created above)
  org.apache.sqoop.security.authentication.kerberos.principal=sqoop/hadoop@REALM.CA
  org.apache.sqoop.security.authentication.kerberos.keytab=/etc/security/keytabs/sqoop.service.keytab
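
If you prefer not to edit sqoop.properties by hand, the same changes can be applied from a script. The following is a minimal sketch, assuming GNU sed and property values that contain no "|", "&" or backslash characters. The helper set_prop is hypothetical and not part of Sqoop.

```shell
#!/bin/sh
# Hypothetical helper: set key=value in a Java properties file.
# Uncomments a "#key=..." line if present, otherwise appends the property.
# Assumes GNU sed (-i) and values without "|", "&" or backslashes.
set_prop() {
    file=$1
    key=$2
    value=$3
    # Escape dots so the key is matched literally in the regex.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^#?[[:space:]]*${key_re}=" "$file"; then
        sed -i -E "s|^#?[[:space:]]*${key_re}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Apply the settings from this section if the file exists on this machine.
conf=/usr/local/sqoop/conf/sqoop.properties
if [ -f "$conf" ]; then
    set_prop "$conf" org.apache.sqoop.security.authentication.type KERBEROS
    set_prop "$conf" org.apache.sqoop.security.authentication.kerberos.principal sqoop/hadoop@REALM.CA
fi
```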

Sqoop2: Installation

We are going to install Sqoop 1.99.7 (Sqoop2). Ensure Hadoop is already installed.

This assumes your hostname is “hadoop”

Install Java JDK

  sudo apt-get update
  sudo apt-get upgrade
  sudo apt-get install default-jdk

Download Sqoop:

  wget https://archive.apache.org/dist/sqoop/1.99.7/sqoop-1.99.7-bin-hadoop200.tar.gz
  tar -zxvf sqoop-1.99.7-bin-hadoop200.tar.gz
  sudo mv sqoop-1.99.7-bin-hadoop200 /usr/local/sqoop/
  sudo chown -R root:hadoopuser /usr/local/sqoop/

Set up .bashrc:

  nano ~/.bashrc

Add the following to the end of the file.

#SQOOP VARIABLES START
export SQOOP_HOME=/usr/local/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
export SQOOP_CONF_DIR=$SQOOP_HOME/conf
export SQOOP_CLASS_PATH=$SQOOP_CONF_DIR
#SQOOP VARIABLES STOP

  source ~/.bashrc
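
The manual edit above can be made repeatable, so that running it twice does not duplicate the block. This is a rough sketch only. The function name add_sqoop_env is hypothetical, and it relies on the start marker comment shown above to detect an existing entry.

```shell
#!/bin/sh
# Hypothetical helper: append the Sqoop variables to a shell rc file,
# but only if the start marker is not already present.
add_sqoop_env() {
    rc_file=$1
    if grep -qF '#SQOOP VARIABLES START' "$rc_file" 2>/dev/null; then
        return 0   # already configured, nothing to do
    fi
    # Quoted heredoc delimiter keeps $PATH and $SQOOP_HOME unexpanded.
    cat >> "$rc_file" <<'EOF'
#SQOOP VARIABLES START
export SQOOP_HOME=/usr/local/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
export SQOOP_CONF_DIR=$SQOOP_HOME/conf
export SQOOP_CLASS_PATH=$SQOOP_CONF_DIR
#SQOOP VARIABLES STOP
EOF
}
```

Call it as add_sqoop_env ~/.bashrc and then source the file as before.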

Initialise Repository

  cd /usr/local/sqoop
  ./bin/sqoop2-tool upgrade

Modify sqoop.sh

If Hadoop runs on the same server as the Sqoop server, you will need to modify sqoop.sh so that Sqoop points to the Hadoop lib directories for common, hdfs, mapreduce and yarn. The original lines only fall back to these paths when the variables are unset, so any value already exported in your environment takes precedence.

  nano /usr/local/sqoop/bin/sqoop.sh

  #Modify these lines
  HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-${HADOOP_HOME}/share/hadoop/common}
  HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-${HADOOP_HOME}/share/hadoop/hdfs}
  HADOOP_MAPRED_HOME=${HADOOP_MAPRED_HOME:-${HADOOP_HOME}/share/hadoop/mapreduce}
  HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-${HADOOP_HOME}/share/hadoop/yarn}

  #TO

  HADOOP_COMMON_HOME=${HADOOP_HOME}/share/hadoop/common
  HADOOP_HDFS_HOME=${HADOOP_HOME}/share/hadoop/hdfs
  HADOOP_MAPRED_HOME=${HADOOP_HOME}/share/hadoop/mapreduce
  HADOOP_YARN_HOME=${HADOOP_HOME}/share/hadoop/yarn
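
Because the modified lines depend entirely on HADOOP_HOME, it is worth checking that the four directories actually exist before starting the server. Here is a minimal sketch. The function name missing_hadoop_dirs is hypothetical, and the fallback of /usr/local/hadoop is an assumption based on the configuration directory used later in this post.

```shell
#!/bin/sh
# Hypothetical helper: report which of the four Hadoop library directories
# expected by sqoop.sh are missing under a given HADOOP_HOME.
missing_hadoop_dirs() {
    hadoop_home=$1
    for sub in common hdfs mapreduce yarn; do
        dir="$hadoop_home/share/hadoop/$sub"
        [ -d "$dir" ] || printf '%s\n' "$dir"
    done
}

# Prints nothing when all four directories exist.
# /usr/local/hadoop is an assumed default, used only if HADOOP_HOME is unset.
missing_hadoop_dirs "${HADOOP_HOME:-/usr/local/hadoop}"
```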

Configuration

  nano /usr/local/sqoop/conf/sqoop.properties
  #Update the following line
  org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop/etc/hadoop/

Start Sqoop Server

  ./bin/sqoop2-server start

References

https://linoxide.com/tools/install-apache-sqoop-ubuntu-16-04/