Apache Sqoop Tutorial – Basic Sqoop Import and Export Operations
What is Sqoop?
Sqoop is a tool used to transfer data between relational databases (RDBMS) and HDFS. It imports data from external datastores into HDFS and exports data from HDFS back to them. It uses MapReduce to move the data in parallel, which lets it handle large volumes. Sqoop works with relational databases, and it is an open source tool originally developed at Cloudera.
Main Functions of Sqoop:
- Import a single table or a selected set of tables.
- Import all tables of a database into Hadoop.
- Import only selected columns and rows of a table.
Workflow of Sqoop:
Sqoop Import – It imports individual tables from an RDBMS into HDFS. Each row of the table is treated as one record, and the records are stored as text files or SequenceFiles.
Sqoop Export – It exports files from HDFS to an RDBMS. The files are parsed into records, and each record becomes a row in the target table.
Some Sqoop Import Operations:
1. General Syntax:
$ sqoop import (generic args) (import args)
$ sqoop-import (generic args) (import args)
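2. How to import a table into HDFS
$ sqoop import --connect <jdbc-uri> --table <table-name> --username <user> --password <password> --target-dir <hdfs-dir>
Connect – Give the JDBC connection string
Table – Give the name of the source table
Username / Password – Give the database login credentials
Target Dir – Give the name of the HDFS directory to import into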
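As a rough sketch of how these arguments fit together, the script below builds a full import command and prints it. The JDBC URL, user, table and directory are hypothetical placeholders, and the `RUN=1` switch is just a convention of this example, not a Sqoop feature. It uses `-P`, which makes Sqoop prompt for the password instead of taking it on the command line.

```shell
#!/bin/sh
# Hypothetical connection details; replace them with your own.
JDBC_URL="jdbc:mysql://dbhost:3306/salesdb"
DB_USER="sqoop_user"
SRC_TABLE="customers"
TARGET_DIR="/user/hadoop/customers"

# Build the import command, print it, and run it only when RUN=1 is set.
import_table() {
    set -- sqoop import \
        --connect "$JDBC_URL" \
        --username "$DB_USER" \
        -P \
        --table "$SRC_TABLE" \
        --target-dir "$TARGET_DIR"
    printf '%s\n' "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

import_table
```

Running the script as is only prints the command, so it can be checked before anything touches the database. Setting `RUN=1` in the environment runs the actual import.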
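3. Importing Selected Data
$ sqoop import --connect <jdbc-uri> --table <table-name> --username <user> --password <password> --columns <col1,col2,...> --where "<condition>"
columns – select a subset of the table's columns
where – retrieve only the rows that match the given condition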
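To make the column and row filters concrete, here is a small sketch that wraps them in a helper function. The function name `import_subset`, the connection details and the target directory layout are assumptions made for illustration. The printed line is for inspection only, because the quotes around the condition are not shown in it.

```shell
#!/bin/sh
# Hypothetical connection details; replace them with your own.
JDBC_URL="jdbc:mysql://dbhost:3306/salesdb"
DB_USER="sqoop_user"

# import_subset TABLE COLUMNS CONDITION
# Prints the import command; runs it only when RUN=1 is set.
import_subset() {
    set -- sqoop import \
        --connect "$JDBC_URL" \
        --username "$DB_USER" \
        -P \
        --table "$1" \
        --columns "$2" \
        --where "$3" \
        --target-dir "/user/hadoop/${1}_subset"
    printf '%s\n' "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Import only the id and name columns of rows whose id is above 100.
import_subset customers "id,name" "id > 100"
```

Passing the condition as one quoted argument keeps spaces and comparison operators away from the shell, so Sqoop receives it unchanged.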
Sqoop Export Operations:
1. General:
$ sqoop export (generic args) (export args)
$ sqoop-export (generic args) (export args)
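The general form does not show which export arguments are usually needed, so the following is one possible sketch. It assumes a hypothetical table `customers_copy` that already exists in the database and a hypothetical HDFS directory holding the files to export. `--export-dir` names the HDFS source directory and `--table` names the destination table.

```shell
#!/bin/sh
# Hypothetical connection details; replace them with your own.
JDBC_URL="jdbc:mysql://dbhost:3306/salesdb"
DB_USER="sqoop_user"
DEST_TABLE="customers_copy"           # must already exist in the database
EXPORT_DIR="/user/hadoop/customers"   # HDFS directory with the files to export

# Build the export command, print it, and run it only when RUN=1 is set.
export_table() {
    set -- sqoop export \
        --connect "$JDBC_URL" \
        --username "$DB_USER" \
        -P \
        --table "$DEST_TABLE" \
        --export-dir "$EXPORT_DIR"
    printf '%s\n' "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

export_table
```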
2. Sqoop Eval – Runs a simple SQL query against the database and prints the result to the console
$ sqoop eval --connect <jdbc-uri> --query "SQL query"
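A quick way to use eval is to check a table before importing it. The sketch below assumes a hypothetical `customers` table and the same placeholder connection details. The helper name `run_eval` is invented for this example.

```shell
#!/bin/sh
# Hypothetical connection details; replace them with your own.
JDBC_URL="jdbc:mysql://dbhost:3306/salesdb"
DB_USER="sqoop_user"

# run_eval "SQL query"
# Prints the eval command; runs it only when RUN=1 is set.
run_eval() {
    set -- sqoop eval \
        --connect "$JDBC_URL" \
        --username "$DB_USER" \
        -P \
        --query "$1"
    printf '%s\n' "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Count the rows of the table that is about to be imported.
run_eval "SELECT COUNT(*) FROM customers"
```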
3. Sqoop List Databases – Lists all databases available on the server
$ sqoop list-databases --connect <jdbc-uri>
Reference - Apache Sqoop Tutorial