Saturday 20 August 2016

Free Big Data Project with End to End Flow @ Kalyan







Pre-Requisites of Big Data Project:
hadoop-2.6.0
hbase-1.1.2
phoenix-4.7.0
flume-1.6.0
tomcat-7
java-1.7

NOTE: Make sure that all of the above components are installed.

You can follow this link to install the above components.
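
Before going further, it can help to confirm that each component is actually where you expect it. The following is a minimal sketch, not part of the original project: the helper `check_home` is hypothetical, and the variable names (`HADOOP_HOME`, `HBASE_HOME`, `PHOENIX_HOME`, `FLUME_HOME`, `TOMCAT_HOME`) are assumptions about how your environment is set up.

```shell
# Report whether an environment variable names an existing directory.
# check_home is a hypothetical helper; the variable names are assumptions.
check_home() {
  name="$1"
  eval "value=\${$name:-}"
  if [ -n "$value" ] && [ -d "$value" ]; then
    echo "OK      $name=$value"
  else
    echo "MISSING $name"
    return 1
  fi
}

for var in HADOOP_HOME HBASE_HOME PHOENIX_HOME FLUME_HOME TOMCAT_HOME; do
  check_home "$var" || true
done

# java-1.7 is listed as a prerequisite; print the version found on PATH, if any.
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 | head -n 1
else
  echo "MISSING java"
fi
```

Any line that starts with `MISSING` points at a component that still needs to be installed or exported in your shell profile.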

---------------------------------------------------------------------------------
   Follow the instructions below to work with the Big Data Project
---------------------------------------------------------------------------------

Project Download Links:
`hadoop-2.6.0.tar.gz` ==> link
`hbase-1.1.2-bin.tar.gz` ==> link
`phoenix-4.7.0-HBase-1.1-bin.tar.gz` ==> link
`apache-flume-1.6.0-bin.tar.gz` ==> link
`apache-tomcat-7.0.70.tar.gz` ==> link

`kalyan.war` ==> link
`flume-phoenix.conf` ==> link
`phoenix-flume-4.7.0-HBase-1.1.jar` ==> link
`json-path-2.2.0.jar` ==> link


---------------------------------------------------------------------------------


Start Hadoop

Start HBase

Start Phoenix
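
The three start steps above can be sketched as shell functions. This assumes a single-node setup where `HADOOP_HOME`, `HBASE_HOME` and `PHOENIX_HOME` point at the unpacked tarballs, and that ZooKeeper is reachable on `localhost`; the function names `start_stack` and `open_phoenix` are hypothetical.

```shell
# Start HDFS, YARN and HBase in order, stopping at the first failure.
# HADOOP_HOME and HBASE_HOME are assumed to be set in your environment.
start_stack() {
  # HDFS daemons
  "$HADOOP_HOME/sbin/start-dfs.sh" || return 1
  # YARN daemons
  "$HADOOP_HOME/sbin/start-yarn.sh" || return 1
  # HBase master and region server
  "$HBASE_HOME/bin/start-hbase.sh"
}

# Open the Phoenix SQL shell against ZooKeeper on localhost (assumed quorum).
open_phoenix() {
  "$PHOENIX_HOME/bin/sqlline.py" localhost
}
```

Run `start_stack`, check with `jps` that the Hadoop and HBase processes are listed, and then run `open_phoenix` to get the prompt in which the queries are typed.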


---------------------------------------------------------------------------------

Create the `users` and `productlog` tables in `phoenix` with the queries below

CREATE TABLE users(userid bigint PRIMARY KEY, username varchar, password varchar, email varchar, country varchar, state varchar, city varchar, date varchar);

CREATE TABLE IF NOT EXISTS productlog(userid bigint not null, username varchar, email varchar, date varchar  not null, product varchar  not null, transaction varchar, country varchar, state varchar, city varchar CONSTRAINT pk PRIMARY KEY (userid, date, product));



---------------------------------------------------------------------------------

Update `~/.bashrc` with the changes below

export TOMCAT_HOME=/home/hadoop/work/apache-tomcat-7.0.70
export PATH=$TOMCAT_HOME/bin:$PATH
export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote"
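
If you would rather script this change than edit the file by hand, something like the following could work. It is only a sketch: the function name `add_tomcat_env` is hypothetical, and it appends the three lines above only when no `TOMCAT_HOME` export is already present in the file.

```shell
# Append the three export lines to an rc file unless TOMCAT_HOME is already exported there.
# add_tomcat_env is a hypothetical helper; the default target is ~/.bashrc.
add_tomcat_env() {
  rc_file="${1:-$HOME/.bashrc}"
  if grep -q '^export TOMCAT_HOME=' "$rc_file" 2>/dev/null; then
    echo "TOMCAT_HOME already present in $rc_file"
    return 0
  fi
  cat >> "$rc_file" <<'EOF'
export TOMCAT_HOME=/home/hadoop/work/apache-tomcat-7.0.70
export PATH=$TOMCAT_HOME/bin:$PATH
export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote"
EOF
}
```

After calling `add_tomcat_env`, open a new terminal (or run `. ~/.bashrc`) so that the variables take effect.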

---------------------------------------------------------------------------------

Copy the `kalyan.war` file to `$TOMCAT_HOME/webapps`


---------------------------------------------------------------------------------

Start the `TOMCAT` server with the command
$TOMCAT_HOME/bin/startup.sh
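
`startup.sh` returns before the web application has finished deploying, so a small polling loop can tell you when the server is ready. This is a sketch that assumes `curl` is installed; `wait_for_url` is a hypothetical helper, and it treats any 2xx or 3xx status as "up".

```shell
# Poll a URL until it answers with a 2xx or 3xx status, or give up.
# Arguments: URL, number of attempts (default 30), seconds between attempts (default 2).
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # curl prints 000 as the status code when it cannot connect at all
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url" 2>/dev/null)
    case "$code" in
      2*|3*)
        echo "up: $url"
        return 0
        ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not reachable: $url"
  return 1
}
```

For this project the call would be `wait_for_url http://localhost:8080/kalyan/home`, assuming Tomcat listens on its default port 8080.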


---------------------------------------------------------------------------------

Generate sample users for the `Big Data Project` using the commands below (the general form first, then an example that generates 100 users)

java -cp $TOMCAT_HOME/webapps/kalyan/kalyan.jar:$TOMCAT_HOME/webapps/kalyan/WEB-INF/lib/*  com.orienit.kalyan.utils.GenerateUsers <no.of.users>

java -cp $TOMCAT_HOME/webapps/kalyan/kalyan.jar:$TOMCAT_HOME/webapps/kalyan/WEB-INF/lib/*  com.orienit.kalyan.utils.GenerateUsers 100


---------------------------------------------------------------------------------

Generate sample product logs for the `Big Data Project` using the commands below (the general form first, then an example that writes 10000 logs to `/tmp/product.log`)

java -cp $TOMCAT_HOME/webapps/kalyan/kalyan.jar:$TOMCAT_HOME/webapps/kalyan/WEB-INF/lib/*  com.orienit.kalyan.utils.GenerateProductLog <path of the log file> <no.of.logs>

java -cp $TOMCAT_HOME/webapps/kalyan/kalyan.jar:$TOMCAT_HOME/webapps/kalyan/WEB-INF/lib/*  com.orienit.kalyan.utils.GenerateProductLog /tmp/product.log 10000
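
A quick look at the generated file before wiring up Flume can save debugging time later. The helper below is a hypothetical sketch; it assumes the generator writes one log record per line, which the original post does not state.

```shell
# Print the line count and the first record of a log file.
# summarize_log is a hypothetical helper; the default path matches the example above.
summarize_log() {
  log_file="${1:-/tmp/product.log}"
  if [ ! -f "$log_file" ]; then
    echo "no such file: $log_file"
    return 1
  fi
  echo "lines: $(grep -c '' "$log_file")"
  echo "first record:"
  head -n 1 "$log_file"
}
```

If the one-record-per-line assumption holds, the line count reported by `summarize_log` should match the `<no.of.logs>` value passed to the generator.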



---------------------------------------------------------------------------------


Send the log changes to Phoenix using `Flume-Phoenix Integration`
(This is one of my contributions to Apache)


---------------------------------------------------------------------------------


Create a `flume-phoenix.conf` file with the content below

flume-phoenix.sources = execsource
flume-phoenix.sinks = phoenixsink
flume-phoenix.channels = memorychannel

flume-phoenix.sources.execsource.type = exec

flume-phoenix.sources.execsource.command = tail -F /tmp/product.log
flume-phoenix.sources.execsource.channels = memorychannel

flume-phoenix.sinks.phoenixsink.type = org.apache.phoenix.flume.sink.PhoenixSink

flume-phoenix.sinks.phoenixsink.channel = memorychannel
flume-phoenix.sinks.phoenixsink.batchSize = 10
flume-phoenix.sinks.phoenixsink.zookeeperQuorum = localhost
flume-phoenix.sinks.phoenixsink.table = productlog
flume-phoenix.sinks.phoenixsink.ddl = CREATE TABLE IF NOT EXISTS productlog(userid bigint not null, username varchar, email varchar, date varchar not null, product varchar not null, transaction varchar, country varchar, state varchar, city varchar CONSTRAINT pk PRIMARY KEY (userid, date, product))
flume-phoenix.sinks.phoenixsink.serializer = json
flume-phoenix.sinks.phoenixsink.serializer.columnsMapping = {"userid":"userid", "username":"username", "email":"email", "date":"date", "product":"product", "transaction":"transaction", "country":"country", "state":"state", "city":"city"}
flume-phoenix.sinks.phoenixsink.serializer.partialSchema = true
flume-phoenix.sinks.phoenixsink.serializer.columns=userid,username,email,date,product,transaction,country,state,city

flume-phoenix.channels.memorychannel.type = memory

flume-phoenix.channels.memorychannel.capacity = 1000
flume-phoenix.channels.memorychannel.transactionCapacity = 100
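
Every property in this file starts with the agent name `flume-phoenix`, and Flume only picks up properties whose prefix matches the name given to `flume-ng` with `-n`. A small check like the one below, with the hypothetical helper `agent_names`, lists the prefixes found in a properties file so that a typo stands out.

```shell
# List the distinct agent names (the text before the first dot) in a Flume properties file.
# Comment lines and lines without '=' are ignored. agent_names is a hypothetical helper.
agent_names() {
  grep -v '^[[:space:]]*#' "$1" | grep '=' | cut -d. -f1 | sort -u
}
```

For the configuration above, `agent_names flume-phoenix.conf` should print the single name `flume-phoenix`; more than one line of output means that some property uses a different prefix.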



---------------------------------------------------------------------------------

Copy the `flume-phoenix.conf` file to the `$FLUME_HOME/conf` folder

---------------------------------------------------------------------------------


Copy the `$FLUME_HOME/conf/flume-env.sh.template` file to `$FLUME_HOME/conf/flume-env.sh` (same folder, new name)

---------------------------------------------------------------------------------

Update the `flume-env.sh` file with the change below

export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote"

---------------------------------------------------------------------------------

Copy the `phoenix-flume-4.7.0-HBase-1.1.jar` file to the `$FLUME_HOME/lib` folder

---------------------------------------------------------------------------------

Copy the `json-path-2.2.0.jar` file to the `$FLUME_HOME/lib` folder
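
The Flume setup steps above (the config file, `flume-env.sh`, and the two jars) can be collected into one function. This is a sketch under assumptions: `install_flume_files` is a hypothetical name, `FLUME_HOME` must already be set, and the argument is the folder where you saved the downloaded files.

```shell
# Copy the project's Flume files into place and create flume-env.sh from its template.
# Argument: the directory holding flume-phoenix.conf and the two jars (an assumption).
install_flume_files() {
  src_dir="$1"
  cp "$src_dir/flume-phoenix.conf" "$FLUME_HOME/conf/" || return 1
  # Recreate flume-env.sh from the template, then add the JAVA_OPTS line
  cp "$FLUME_HOME/conf/flume-env.sh.template" "$FLUME_HOME/conf/flume-env.sh" || return 1
  echo 'export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote"' >> "$FLUME_HOME/conf/flume-env.sh"
  cp "$src_dir/phoenix-flume-4.7.0-HBase-1.1.jar" "$src_dir/json-path-2.2.0.jar" "$FLUME_HOME/lib/"
}
```

Because `flume-env.sh` is rebuilt from the template on every call, running the function twice does not duplicate the `JAVA_OPTS` line; any manual edits to `flume-env.sh` would be lost, though.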

---------------------------------------------------------------------------------

Execute the command below to send the log changes to Phoenix

$FLUME_HOME/bin/flume-ng agent -n flume-phoenix --conf $FLUME_HOME/conf -f $FLUME_HOME/conf/flume-phoenix.conf -Dflume.root.logger=DEBUG,console
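
While the agent runs in the foreground, a second terminal can be used to check that rows are arriving in Phoenix. The sketch below writes a one-line SQL file and passes it to `sqlline.py`; `count_productlog` is a hypothetical helper, and `PHOENIX_HOME`, the `localhost` quorum and the temporary file path are assumptions.

```shell
# Write a single COUNT query to a file and run it through the Phoenix SQL shell.
# Argument: where to put the SQL file (defaults to an assumed path under /tmp).
count_productlog() {
  sql_file="${1:-/tmp/count_productlog.sql}"
  printf '%s\n' 'SELECT COUNT(*) FROM productlog;' > "$sql_file"
  "$PHOENIX_HOME/bin/sqlline.py" localhost "$sql_file"
}
```

Running `count_productlog` a few times should show the count growing as Flume drains `/tmp/product.log`.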

---------------------------------------------------------------------------------

Verify the project output through the web UI
http://localhost:8080/kalyan/home

---------------------------------------------------------------------------------

Stop the `TOMCAT` server with the command
$TOMCAT_HOME/bin/shutdown.sh
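
The original steps end with Tomcat. If you also want to shut down the rest of the stack, one possible order is the reverse of startup, sketched below. `stop_stack` is a hypothetical name, and `HADOOP_HOME` and `HBASE_HOME` are assumed to be set; the Flume agent runs in the foreground and is stopped with Ctrl+C in its own terminal.

```shell
# Stop Tomcat, then HBase, then YARN and HDFS (reverse of the startup order).
# TOMCAT_HOME, HBASE_HOME and HADOOP_HOME are assumed to be set in your environment.
stop_stack() {
  "$TOMCAT_HOME/bin/shutdown.sh"
  "$HBASE_HOME/bin/stop-hbase.sh"
  "$HADOOP_HOME/sbin/stop-yarn.sh"
  "$HADOOP_HOME/sbin/stop-dfs.sh"
}
```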


---------------------------------------------------------------------------------




