Installing and Running Apache HBase

Install HBase 1.4, and make sure Hadoop is already running normally before you do. Set the following environment variables:

    export HADOOP_HOME=/u01/hadoop
    export HBASE_HOME=/u01/hbase
    export PATH=$PATH:$HADOOP_HOME/bin:$HBASE_HOME/bin

Start HBase:

    ./start-hbase.sh

Make sure both Hadoop and HBase start normally. If anything fails, consult the documentation for the component concerned.

    [oracle@ol66 bin]$ jps
    11685 NodeManager
    11157 SecondaryNameNode
    10844 NameNode
    11405 ResourceManager
    13135 HMaster
    13455 Jps
    10959 DataNode

Confirm that HBase is running on top of Hadoop; its /hbase directory should be present in HDFS:

    [oracle@ol66 u01]$ hdfs dfs -ls /
    Found 3 items
    drwxr-xr-x   - oracle supergroup          0 2018-02-28 00:52 /hbase
    drwxr-xr-x   - oracle supergroup          0 2018-02-27 23:14 /ogg
    drwxr-xr-x   - oracle supergroup          0 2018-02-28 00:33 /tmp

    [oracle@ol66 bin]$ ./hbase shell
    HBase Shell
    Use "help" to get list of supported commands.
    Use "exit" to quit this interactive shell.
    Version 1.4.0, r10b9b9fae6b557157644fb9a0dc641bb8cb26e39, Fri Dec 8 16:09:13 PST 2017

    hbase(main):001:0> list
    TABLE
    0 row(s) in 0.3800 seconds

    => []
    hbase(main):002:0>

As you can see, there are no tables in the system yet.
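The jps check in this section can also be scripted. The following is a minimal sketch, not part of the original setup: the function name missing_daemons and the set of required daemons are assumptions chosen for illustration, based only on the process names that appear in the jps listing.

```python
# Assumption: these three names are the ones we care about; they are taken
# from the jps listing in this article, not from any official checklist.
REQUIRED_DAEMONS = {"NameNode", "DataNode", "HMaster"}


def missing_daemons(jps_output, required=REQUIRED_DAEMONS):
    """Return the required daemon names that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Each jps line looks like "<pid> <name>"; skip anything else.
        if len(parts) >= 2 and parts[0].isdigit():
            running.add(parts[1])
    return sorted(set(required) - running)


# The jps output shown in this article.
sample = """\
11685 NodeManager
11157 SecondaryNameNode
10844 NameNode
11405 ResourceManager
13135 HMaster
13455 Jps
10959 DataNode
"""

missing = missing_daemons(sample)
print("missing daemons:", missing)
```

In practice the text would come from running jps (for example through subprocess.run), and a non-empty result would be a reason to stop before configuring OGG.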
Installing and Testing OGG

Set the environment variable required by OGG for Big Data:

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$JAVA_HOME/jre/lib/amd64/server

Install the OGG for Big Data 12.3 software, then enter GGSCI and create the working subdirectories:

    ggsci> create subdirs

Return to the operating system command line. In the OGG installation directory, run the following command to copy the sample HBase delivery parameter files so that they can be modified:

    cp AdapterExamples/big-data/hbase/* dirprm/

In the dirprm directory, edit hbase.props: change the HBase lib path in gg.classpath to match the location where HBase is installed, then save and exit. The complete hbase.props is as follows:

    gg.handlerlist=hbase
    gg.handler.hbase.type=hbase
    gg.handler.hbase.hBaseColumnFamilyName=cf
    gg.handler.hbase.keyValueDelimiter=CDATA[=]
    gg.handler.hbase.keyValuePairDelimiter=CDATA[,]
    gg.handler.hbase.encoding=UTF-8
    gg.handler.hbase.pkUpdateHandling=abend
    gg.handler.hbase.nullValueRepresentation=CDATA[NULL]
    gg.handler.hbase.authType=none
    gg.handler.hbase.includeTokens=false
    gg.handler.hbase.mode=tx

    goldengate.userexit.timestamp=utc
    goldengate.userexit.writers=javawriter
    javawriter.stats.display=TRUE
    javawriter.stats.full=TRUE

    gg.log=log4j
    gg.log.level=INFO

    gg.report.time=30sec

    gg.classpath=/u01/hbase/lib/*:/u01/hbase/conf/:

    javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar
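Because the only edit to hbase.props is the gg.classpath value, a wrong path there is the most likely mistake. The sketch below shows one way to check that every gg.classpath entry resolves to something on disk. The helper names parse_props and classpath_problems are hypothetical, and using glob to expand a trailing /* is only an approximation of how the Java classpath wildcard behaves.

```python
import glob


def parse_props(text):
    """Parse simple key=value lines, ignoring blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as CDATA[=] survive.
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def classpath_problems(props):
    """Return the gg.classpath entries that match nothing on this machine."""
    problems = []
    for entry in props.get("gg.classpath", "").split(":"):
        # Empty entries come from a trailing ":" and are skipped.
        if entry and not glob.glob(entry):
            problems.append(entry)
    return problems


# Two lines taken from the hbase.props listing in this article.
sample = """\
gg.handler.hbase.keyValueDelimiter=CDATA[=]
gg.classpath=/u01/hbase/lib/*:/u01/hbase/conf/:
"""

props = parse_props(sample)
print("unresolved classpath entries:", classpath_problems(props))
```

On the host used in this article the list should be empty; on a machine where HBase lives elsewhere, the unresolved entries show which part of gg.classpath still needs editing.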
Re-enter GGSCI and create the delivery (Replicat) process, using the sample parameter file and the sample trail:

    GGSCI> add replicat rhbase, exttrail AdapterExamples/trail/tr

The contents of rhbase.prm are as follows:

    REPLICAT rhbase
    -- Trail file for this example is located in "AdapterExamples/trail" directory
    -- Command to add REPLICAT
    -- add replicat rhbase, exttrail AdapterExamples/trail/tr
    TARGETDB LIBFILE libggjava.so SET property=dirprm/hbase.props
    REPORTCOUNT EVERY 1 MINUTES, RATE
    GROUPTRANSOPS 10000
    MAP QASOURCE.*, TARGET QASOURCE.*;
Testing

Start the delivery process and check its status until it is running:

    GGSCI (ol66) 19> start rhbase

    Sending START request to MANAGER ...
    REPLICAT RHBASE starting

    GGSCI (ol66) 20> info rhbase

    REPLICAT   RHBASE    Initialized   2018-02-28 00:53   Status STARTING
    Checkpoint Lag       00:00:00 (updated 00:02:16 ago)
    Process ID           15424
    Log Read Checkpoint  File AdapterExamples/trail/tr000000000
                         First Record  RBA 0

    GGSCI (ol66) 22> info rhbase

    REPLICAT   RHBASE    Last Started 2018-02-28 00:55   Status RUNNING
    Checkpoint Lag       00:00:00 (updated 00:02:20 ago)
    Process ID           15424
    Log Read Checkpoint  File /u01/ogg4bd/AdapterExamples/trail/tr000000000
                         First Record  RBA 0

    GGSCI (ol66) 27> info rhbase

    REPLICAT   RHBASE    Last Started 2018-02-28 00:55   Status RUNNING
    Checkpoint Lag       00:00:00 (updated 00:00:01 ago)
    Process ID           15424
    Log Read Checkpoint  File /u01/ogg4bd/AdapterExamples/trail/tr000000000
                         2015-11-06 02:45:39.000000  RBA 5660

The read checkpoint has advanced from RBA 0 to RBA 5660, so delivery is complete. Check the delivery results in OGG:
    GGSCI (ol66) 28> stats rhbase, total

    Sending STATS request to REPLICAT RHBASE ...

    Start of Statistics at 2018-02-28 00:56:23.

    Replicating from QASOURCE.TCUSTMER to QASOURCE.TCUSTMER:

    *** Total statistics since 2018-02-28 00:55:43 ***
            Total inserts                        5.00
            Total updates                        1.00
            Total deletes                        0.00
            Total discards                       0.00
            Total operations                     6.00

    Replicating from QASOURCE.TCUSTORD to QASOURCE.TCUSTORD:

    *** Total statistics since 2018-02-28 00:55:43 ***
            Total inserts                        5.00
            Total updates                        3.00
            Total deletes                        2.00
            Total discards                       0.00
            Total operations                    10.00

    End of Statistics.
Looking at HBase, there are now two more tables than at the start:

    hbase(main):002:0> list
    TABLE
    QASOURCE:TCUSTMER
    QASOURCE:TCUSTORD
    2 row(s) in 0.0190 seconds

    => ["QASOURCE:TCUSTMER", "QASOURCE:TCUSTORD"]

View the data:

    hbase(main):005:0> scan 'QASOURCE:TCUSTMER'
    ROW     COLUMN+CELL
     ANN    column=cf:CITY, timestamp=1519750550592, value=NEW YORK
     ANN    column=cf:CUST_CODE, timestamp=1519750550592, value=ANN
     ANN    column=cf:NAME, timestamp=1519750550592, value=ANN'S BOATS
     ANN    column=cf:STATE, timestamp=1519750550592, value=NY
     BILL   column=cf:CITY, timestamp=1519750550592, value=DENVER
     BILL   column=cf:CUST_CODE, timestamp=1519750550592, value=BILL
     BILL   column=cf:NAME, timestamp=1519750550592, value=BILL'S USED CARS
     BILL   column=cf:STATE, timestamp=1519750550592, value=CO
     DAVE   column=cf:CITY, timestamp=1519750550592, value=TALLAHASSEE
     DAVE   column=cf:CUST_CODE, timestamp=1519750550592, value=DAVE
     DAVE   column=cf:NAME, timestamp=1519750550592, value=DAVE'S PLANES INC.
     DAVE   column=cf:STATE, timestamp=1519750550592, value=FL
     JANE   column=cf:CITY, timestamp=1519750550421, value=DENVER
     JANE   column=cf:CUST_CODE, timestamp=1519750550421, value=JANE
     JANE   column=cf:NAME, timestamp=1519750550421, value=ROCKY FLYER INC.
     JANE   column=cf:STATE, timestamp=1519750550421, value=CO
     WILL   column=cf:CITY, timestamp=1519750550421, value=SEATTLE
     WILL   column=cf:CUST_CODE, timestamp=1519750550421, value=WILL
     WILL   column=cf:NAME, timestamp=1519750550421, value=BG SOFTWARE CO.
     WILL   column=cf:STATE, timestamp=1519750550421, value=WA
    5 row(s) in 0.3150 seconds

    hbase(main):001:0> scan 'QASOURCE:TCUSTORD'
    ROW                                  COLUMN+CELL
     BILL|1995-12-31 15:00:00|CAR|765    column=cf:CUST_CODE, timestamp=1519750550614, value=BILL
     BILL|1995-12-31 15:00:00|CAR|765    column=cf:ORDER_DATE, timestamp=1519750550614, value=1995-12-31 15:00:00
     BILL|1995-12-31 15:00:00|CAR|765    column=cf:ORDER_ID, timestamp=1519750550614, value=765
     BILL|1995-12-31 15:00:00|CAR|765    column=cf:PRODUCT_AMOUNT, timestamp=1519750550614, value=3
     BILL|1995-12-31 15:00:00|CAR|765    column=cf:PRODUCT_CODE, timestamp=1519750550614, value=CAR
     BILL|1995-12-31 15:00:00|CAR|765    column=cf:PRODUCT_PRICE, timestamp=1519750550614, value=14000.00
     BILL|1995-12-31 15:00:00|CAR|765    column=cf:TRANSACTION_ID, timestamp=1519750550614, value=100
     BILL|1996-01-01 00:00:00|TRUCK|333  column=cf:CUST_CODE, timestamp=1519750550614, value=BILL
     BILL|1996-01-01 00:00:00|TRUCK|333  column=cf:ORDER_DATE, timestamp=1519750550614, value=1996-01-01 00:00:00
     BILL|1996-01-01 00:00:00|TRUCK|333  column=cf:ORDER_ID, timestamp=1519750550614, value=333
     BILL|1996-01-01 00:00:00|TRUCK|333  column=cf:PRODUCT_AMOUNT, timestamp=1519750550614, value=15
     BILL|1996-01-01 00:00:00|TRUCK|333  column=cf:PRODUCT_CODE, timestamp=1519750550614, value=TRUCK
     BILL|1996-01-01 00:00:00|TRUCK|333  column=cf:PRODUCT_PRICE, timestamp=1519750550614, value=25000.00
     BILL|1996-01-01 00:00:00|TRUCK|333  column=cf:TRANSACTION_ID, timestamp=1519750550614, value=100
     WILL|1994-09-30 15:33:00|CAR|144    column=cf:CUST_CODE, timestamp=1519750550614, value=WILL
     WILL|1994-09-30 15:33:00|CAR|144    column=cf:ORDER_DATE, timestamp=1519750550614, value=1994-09-30 15:33:00
     WILL|1994-09-30 15:33:00|CAR|144    column=cf:ORDER_ID, timestamp=1519750550614, value=144
     WILL|1994-09-30 15:33:00|CAR|144    column=cf:PRODUCT_AMOUNT, timestamp=1519750550453, value=3
     WILL|1994-09-30 15:33:00|CAR|144    column=cf:PRODUCT_CODE, timestamp=1519750550614, value=CAR
     WILL|1994-09-30 15:33:00|CAR|144    column=cf:PRODUCT_PRICE, timestamp=1519750550614, value=16520.00
     WILL|1994-09-30 15:33:00|CAR|144    column=cf:TRANSACTION_ID, timestamp=1519750550453, value=100
    3 row(s) in 0.6770 seconds

As you can see, configuring OGG delivery to HBase is very simple. The HBase row key is built from the primary key columns of the source table (in the TCUSTORD scan above, the key column values are joined with "|"). If a table has no primary key column, delivery fails with an error. The following are the HBase versions supported by the current OGG release: