MySQL student table info
mysql> desc student;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     | NO   | PRI | NULL    |       |
| name  | varchar(10) | YES  |     | NULL    |       |
| phone | varchar(20) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
3 rows in set (0.00 sec)
mysql> select * from student;
+----+-------+---------------+
| id | name  | phone         |
+----+-------+---------------+
|  1 | smith | 010-1111-1111 |
|  2 | kelly | 010-2222-2222 |
+----+-------+---------------+
2 rows in set (0.00 sec)
[hadoop@master ~]$ cd /usr/local/sqoop/
[hadoop@master sqoop]$ vi student_import.sh
I created the student_import.sh file as shown above and wrote it as follows.
import
--username
hadoop
--password
'<password>'
--connect
jdbc:mysql://xxx.xxx.xxx.35:3309/member?serverTimezone=Asia/Seoul
--table
student
--split-by
ID
--m
1
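An options file gives Sqoop the tool name first and then one option or value per line. For reference, the same import could also be run as a single command; the sketch below uses the same placeholders for the host and password:

sqoop import \
    --connect 'jdbc:mysql://xxx.xxx.xxx.35:3309/member?serverTimezone=Asia/Seoul' \
    --username hadoop \
    --password '<password>' \
    --table student \
    --split-by ID \
    -m 1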
For reference, the options are written as follows.
--connect jdbc:mysql://<master-address>:<mysql-port>/<database>?serverTimezone=Asia/Seoul  = JDBC connection string specifying the MySQL host, database, and time zone
--username <mysql-username>  = MySQL user to authenticate as
--password <mysql-password>  = MySQL password to authenticate with
--query 'select * from sample_table WHERE $CONDITIONS'  = free-form MySQL query to import (used in place of --table)
--split-by <column>  = table column used to split the work across map tasks
--target-dir <directory> (e.g. hdfs://localhost/user/hadoop/)  = destination HDFS directory (if omitted, the table is written to the default location in HDFS, /user/<username>/<table-name>)
-m 1  = number of map tasks to use
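As a hedged example of the --query and --target-dir options above, a free-form query import of the same student table might look like the sketch below; the target directory name is only illustrative. Note that --query replaces --table, must contain the literal $CONDITIONS token, and requires an explicit --target-dir:

sqoop import \
    --connect 'jdbc:mysql://xxx.xxx.xxx.35:3309/member?serverTimezone=Asia/Seoul' \
    --username hadoop -P \
    --query 'SELECT id, name, phone FROM student WHERE $CONDITIONS' \
    --split-by id \
    --target-dir /user/hadoop/student_query \
    -m 1

By contrast, the next options file imports every table in a database at once: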
import-all-tables
--username
<mysql-username>
--password
<password>
--connect
jdbc:mysql://192.168.0.35:3309/<database>?serverTimezone=Asia/Seoul
--warehouse-dir
/user/hadoop/EdgeDB
-m
1
As noted, the options file above imports the entire database rather than a single table; --warehouse-dir sets the parent HDFS directory that each imported table is written under.
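Assuming the file above is saved as db_import.sh (the name is only illustrative), it runs the same way as the table import, and each table lands in its own subdirectory under the --warehouse-dir path:

[hadoop@master sqoop]$ sqoop --options-file db_import.sh
[hadoop@master sqoop]$ hdfs dfs -ls /user/hadoop/EdgeDB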
1. To check that Sqoop and MySQL are connected properly, list the MySQL databases with the following command.
[hadoop@master ~]$ sqoop list-databases --connect jdbc:mysql://xxx.xxx.xxx.35:3309 --username hadoop --password '<password>'
Warning: /usr/local/sqoop/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /usr/local/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/05/09 17:33:26 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
19/05/09 17:33:26 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/05/09 17:33:26 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
member
mysql
performance_schema
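As the WARN line above suggests, putting the password on the command line is insecure; the same check can be run with -P so that Sqoop prompts for the password instead:

[hadoop@master ~]$ sqoop list-databases --connect jdbc:mysql://xxx.xxx.xxx.35:3309 --username hadoop -P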
The HBase, HCatalog, and Accumulo warnings can be ignored, since none of those components are needed here.
Next, use the student_import.sh file created earlier to import the student table from the member database.
[hadoop@master sqoop]$ sqoop --options-file student_import.sh
Warning: /usr/local/sqoop/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /usr/local/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/05/09 17:44:16 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
19/05/09 17:44:16 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/05/09 17:44:16 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
19/05/09 17:44:16 INFO tool.CodeGenTool: Beginning code generation
19/05/09 17:44:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
19/05/09 17:44:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
19/05/09 17:44:16 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hadoop/compile/3ca02518569f69f0effb4492d2b8319c/student.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
19/05/09 17:44:17 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/3ca02518569f69f0effb4492d2b8319c/student.jar
19/05/09 17:44:17 WARN manager.MySQLManager: It looks like you are importing from mysql.
19/05/09 17:44:17 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
19/05/09 17:44:17 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
19/05/09 17:44:17 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
19/05/09 17:44:17 INFO mapreduce.ImportJobBase: Beginning import of student
19/05/09 17:44:17 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
19/05/09 17:44:17 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
19/05/09 17:44:17 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
19/05/09 17:44:18 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.0.35:8032
19/05/09 17:44:20 INFO db.DBInputFormat: Using read commited transaction isolation
19/05/09 17:44:20 INFO mapreduce.JobSubmitter: number of splits:1
19/05/09 17:44:20 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1557362273260_0003
19/05/09 17:44:20 INFO impl.YarnClientImpl: Submitted application application_1557362273260_0003
19/05/09 17:44:20 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1557362273260_0003/
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=142093
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=87
        HDFS: Number of bytes written=44
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=5970
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=2985
        Total vcore-milliseconds taken by all map tasks=2985
        Total megabyte-milliseconds taken by all map tasks=4584960
    Map-Reduce Framework
        Map input records=2
        Map output records=2
        Input split bytes=87
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=75
        CPU time spent (ms)=950
        Physical memory (bytes) snapshot=248098816
        Virtual memory (bytes) snapshot=3030355968
        Total committed heap usage (bytes)=351272960
    File Input Format Counters
        Bytes Read=0
    File Output Format Counters
        Bytes Written=44
19/05/09 17:44:34 INFO mapreduce.ImportJobBase: Transferred 44 bytes in 16.2221 seconds (2.7123 bytes/sec)
19/05/09 17:44:34 INFO mapreduce.ImportJobBase: Retrieved 2 records.
Check the HDFS web UI:
http://<ip-address>:50070
Check the YARN web UI:
http://<ip-address>:8088
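If the web UIs are not convenient, the finished job can also be inspected from the command line; this is a sketch using the application ID shown in the log above:

[hadoop@master sqoop]$ yarn application -status application_1557362273260_0003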
Finally, check the imported student file in HDFS with the following command.
[hadoop@master sqoop]$ hdfs dfs -cat /user/hadoop/student/*
1,smith,010-1111-1111
2,kelly,010-2222-2222
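One caveat before re-running the same import: Sqoop fails if the target directory already exists in HDFS, so either remove it first or add --delete-target-dir to the options file. For example:

[hadoop@master sqoop]$ hdfs dfs -rm -r /user/hadoop/student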
Import reference:
http://hochul.net/blog/datacollector_apache_sqoop_from_rdbms2/
That concludes the Hadoop-Sqoop-MySQL integration.