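Keywords:
Sqoop is a tool for transferring data between Hadoop and relational databases. It can import data from a relational database into Hadoop's HDFS, and it can export data from HDFS back into a relational database.
Kafka is an open-source distributed publish-subscribe messaging system.
I. Installing Sqoop
1. Download sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz from http://www-eu.apache.org/dist/sqoop/1.4.7/ and extract it under /home/jun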
[[email protected] sqoop-1.4.7.bin__hadoop-2.6.0]$ ls -l
total 2020
drwxr-xr-x. 2 jun jun    4096 Dec 19  2017 bin
-rw-rw-r--. 1 jun jun   55089 Dec 19  2017 build.xml
-rw-rw-r--. 1 jun jun   47426 Dec 19  2017 CHANGELOG.txt
-rw-rw-r--. 1 jun jun    9880 Dec 19  2017 COMPILING.txt
drwxr-xr-x. 2 jun jun     150 Dec 19  2017 conf
drwxr-xr-x. 5 jun jun     169 Dec 19  2017 docs
drwxr-xr-x. 2 jun jun      96 Dec 19  2017 ivy
-rw-rw-r--. 1 jun jun   11163 Dec 19  2017 ivy.xml
drwxr-xr-x. 2 jun jun    4096 Dec 19  2017 lib
-rw-rw-r--. 1 jun jun   15419 Dec 19  2017 LICENSE.txt
-rw-rw-r--. 1 jun jun     505 Dec 19  2017 NOTICE.txt
-rw-rw-r--. 1 jun jun   18772 Dec 19  2017 pom-old.xml
-rw-rw-r--. 1 jun jun    1096 Dec 19  2017 README.txt
-rw-rw-r--. 1 jun jun 1108073 Dec 19  2017 sqoop-1.4.7.jar
-rw-rw-r--. 1 jun jun    6554 Dec 19  2017 sqoop-patch-review.py
-rw-rw-r--. 1 jun jun  765184 Dec 19  2017 sqoop-test-1.4.7.jar
drwxr-xr-x. 7 jun jun      73 Dec 19  2017 src
drwxr-xr-x. 4 jun jun     114 Dec 19  2017 testdata
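The download-and-extract step above can be sketched as a small shell helper. This is only a minimal sketch, not the exact commands the author ran: the function name `extract_release` is hypothetical, and the commented `wget` line assumes the mirror URL from step 1 still serves the file.

```shell
# Hypothetical helper: unpack a release tarball into a destination directory.
# Usage: extract_release TARBALL DEST_DIR
extract_release() {
    tarball=$1
    dest=$2
    mkdir -p "$dest" || return 1
    tar -xzf "$tarball" -C "$dest"
}

# Example with the paths used in this article (commented out because it needs
# network access; the mirror URL is assumed to still be reachable):
# wget http://www-eu.apache.org/dist/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
# extract_release sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz /home/jun
```

Extracting with `-C` keeps the tarball's top-level directory, which is why the installation ends up in /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0.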
2. Configure the MySQL connector
[[email protected] sqoop-1.4.7.bin__hadoop-2.6.0]$ cp /home/jun/Resources/mysql-connector-java-5.1.46/mysql-connector-java-5.1.46.jar /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/lib/
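After copying the driver, it can be worth confirming that a connector jar really is in Sqoop's lib directory before testing the connection. The check below is a minimal sketch; the function name `has_mysql_connector` is hypothetical, and the jar name pattern is assumed from the file name used in the command above.

```shell
# Hypothetical check: succeed if a MySQL Connector/J jar is present in LIB_DIR.
# Usage: has_mysql_connector LIB_DIR
has_mysql_connector() {
    for jar in "$1"/mysql-connector-java-*.jar; do
        # If the glob matched nothing, $jar is the literal pattern and -f fails.
        [ -f "$jar" ] && return 0
    done
    return 1
}

# Example with the directory from this article:
# has_mysql_connector /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/lib && echo "connector found"
```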
3. Configure the Sqoop environment variables
Edit the configuration file
[[email protected] lib]$ cd /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/conf/
[[email protected] conf]$ ls
oraoop-site-template.xml  sqoop-env-template.cmd  sqoop-env-template.sh  sqoop-site-template.xml  sqoop-site.xml
[[email protected] conf]$ cp sqoop-env-template.sh sqoop-env.sh
[[email protected] conf]$ gedit sqoop-env.sh
Add the following configuration
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/jun/hadoop

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/jun/hadoop

#set the path to where bin/hbase is available
export HBASE_HOME=/home/jun/hbase-1.2.6.1

#Set the path to where bin/hive is available
export HIVE_HOME=/home/jun/apache-hive-2.3.3-bin

#Set the path for where zookeeper config dir is
export ZOOCFGDIR=/usr/local/zk
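A mistyped path in sqoop-env.sh tends to surface only later as a confusing runtime error, so a quick sanity check can help. The sketch below sources the file in a subshell and reports any of the five variables above whose directory does not exist. The function name `missing_env_dirs` is hypothetical and is not part of Sqoop.

```shell
# Hypothetical check: source ENV_FILE in a subshell and print NAME=PATH for
# every configured directory that does not exist. Prints nothing if all exist.
# Usage: missing_env_dirs ENV_FILE
missing_env_dirs() {
    (
        # Ignore values inherited from the caller; only the file should count.
        unset HADOOP_COMMON_HOME HADOOP_MAPRED_HOME HBASE_HOME HIVE_HOME ZOOCFGDIR
        . "$1" || exit 1
        for name in HADOOP_COMMON_HOME HADOOP_MAPRED_HOME HBASE_HOME HIVE_HOME ZOOCFGDIR; do
            eval "dir=\${$name:-}"
            [ -n "$dir" ] && [ ! -d "$dir" ] && echo "$name=$dir"
        done
        exit 0
    )
}

# Example with the file edited in this step:
# missing_env_dirs /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/conf/sqoop-env.sh
```

Running the check in a subshell keeps the sourced variables out of the current shell session.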
4. Configure the Linux environment variables
#sqoop
export SQOOP_HOME=/home/jun/sqoop-1.4.7.bin__hadoop-2.6.0
export PATH=$PATH:$SQOOP_HOME/bin
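The article does not say which profile file receives these lines. As a minimal sketch, assuming `~/.bashrc` is the target, the hypothetical helper `append_once` below adds each line only if it is not already present, so repeating the step does not keep lengthening `PATH`.

```shell
# Hypothetical helper: append LINE to FILE unless an identical line is already there.
# Usage: append_once FILE LINE
append_once() {
    touch "$1" || return 1
    grep -qxF -e "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example (assumes ~/.bashrc is the profile file; the article does not specify one):
# append_once ~/.bashrc 'export SQOOP_HOME=/home/jun/sqoop-1.4.7.bin__hadoop-2.6.0'
# append_once ~/.bashrc 'export PATH=$PATH:$SQOOP_HOME/bin'
# . ~/.bashrc
```

The single quotes in the example matter: they store `$PATH` and `$SQOOP_HOME` literally so that they are expanded when the profile is loaded, not when the line is written.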
5. Start Sqoop. If output like the following appears, the installation succeeded
[[email protected] ~]$ sqoop-help
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/23 15:56:36 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/jun/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jun/hbase-1.2.6.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
usage: sqoop COMMAND [ARGS]

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

See 'sqoop help COMMAND' for information on a specific command.
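The HCatalog, Accumulo and ZooKeeper warnings and the SLF4J notices in the output above repeat on every Sqoop command and can bury the lines that matter. A minimal sketch of a filter that hides them follows. The function name `strip_sqoop_noise` is hypothetical, and the patterns are taken only from the messages shown in this article.

```shell
# Hypothetical filter: drop the optional-component warnings and SLF4J notices,
# passing every other line through unchanged.
strip_sqoop_noise() {
    grep -v -e '^Warning: ' -e '^Please set \$' -e '^SLF4J: '
}

# Example: merge stderr into stdout so both streams pass through the filter.
# sqoop-help 2>&1 | strip_sqoop_noise
```

This only hides the messages. If HCatalog or Accumulo jobs are actually needed, the variables named in the warnings still have to be set.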
6. Test the connection to MySQL
(1) List all MySQL databases
[[email protected] ~]$ sqoop-list-databases --connect jdbc:mysql://localhost:3306 --username root -P
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/23 16:03:01 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/jun/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jun/hbase-1.2.6.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Enter password:
18/07/23 16:03:05 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Mon Jul 23 16:03:05 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
information_schema
hive_db
mysql
performance_schema
sys
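The SSL warning in the output above names its own remedy: either set `useSSL=false` explicitly, or set `useSSL=true` and provide a truststore. As a minimal sketch for a local test machine only, the hypothetical helper `mysql_jdbc_url` below builds a connection URL with `useSSL=false` appended. Disabling SSL is an assumption that suits a localhost setup like this one, not a general recommendation.

```shell
# Hypothetical helper: build a MySQL JDBC URL with SSL explicitly disabled.
# Usage: mysql_jdbc_url HOST PORT [DATABASE]
mysql_jdbc_url() {
    printf 'jdbc:mysql://%s:%s/%s?useSSL=false\n' "$1" "$2" "${3:-}"
}

# Example (the quotes keep the shell from treating "?" as a glob character):
# sqoop-list-databases --connect "$(mysql_jdbc_url localhost 3306)" --username root -P
```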
(2) List all tables in a database
[[email protected] ~]$ sqoop-list-tables --connect jdbc:mysql://localhost:3306/mysql --username root -P
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/23 16:06:06 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/jun/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jun/hbase-1.2.6.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Enter password:
18/07/23 16:06:09 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Mon Jul 23 16:06:09 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
columns_priv
db
engine_cost
event
func
general_log
gtid_executed
help_category
help_keyword
help_relation
help_topic
innodb_index_stats
innodb_table_stats
ndb_binlog_index
plugin
proc
procs_priv
proxies_priv
server_cost
servers
slave_master_info
slave_relay_log_info
slave_worker_info
slow_log
tables_priv
time_zone
time_zone_leap_second
time_zone_name
time_zone_transition
time_zone_transition_type
user
(3) Run a MySQL query
[[email protected] ~]$ sqoop-eval --connect jdbc:mysql://localhost:3306/mysql --username root -P --query "select * from plugin"
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/23 16:09:33 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/jun/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jun/hbase-1.2.6.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Enter password:
18/07/23 16:09:36 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Mon Jul 23 16:09:37 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
-----------------------------------------------
| name                 | dl                   |
-----------------------------------------------
| validate_password    | validate_password.so |
-----------------------------------------------
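All three tests above use `-P`, which prompts for the password interactively. For scripted runs, Sqoop also accepts `--password-file`. The sketch below prepares such a file with no trailing newline and owner-only permissions. The function name `write_password_file` and the file path in the example are hypothetical. Depending on the Hadoop configuration, the path may be resolved against HDFS rather than the local disk, so the `file://` prefix in the example is an assumption to verify on your cluster.

```shell
# Hypothetical helper: store PASSWORD in FILE without a trailing newline,
# readable only by the owner.
# Usage: write_password_file FILE PASSWORD
write_password_file() {
    # The restrictive umask ensures the file is never created world-readable.
    ( umask 077 && printf '%s' "$2" > "$1" ) || return 1
    chmod 400 "$1"
}

# Example (path and password are placeholders):
# write_password_file /home/jun/.mysql.pw 'your-password'
# sqoop-eval --connect jdbc:mysql://localhost:3306/mysql --username root \
#     --password-file file:///home/jun/.mysql.pw --query "select * from plugin"
```

`printf '%s'` is used instead of `echo` so that no newline is written into the file, since a stray newline could otherwise be read as part of the password.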