Apache ZooKeeper

 


Apache ZooKeeper is a centralized, high-performance coordination system for distributed applications.

It gives distributed systems a shared, reliable place to handle configuration, naming, synchronization, and group membership.

 

Applications using Apache ZooKeeper

  • Apache Hadoop
  • Apache HBase
  • Apache Kafka
  • Apache Accumulo
  • Apache Mesos
  • Apache Solr
  • Neo4j

 

ZooKeeper Primitives and Recipes

ZooKeeper does not expose coordination primitives such as locks or leader election directly to client applications. Instead, it exposes a small file-system-like API, and applications build the primitives they need on top of it.

Recipes are these implementations of primitives. A recipe is a sequence of operations on ZooKeeper data nodes (called znodes).

The znodes are organized in a hierarchical tree, similar to a file system.

ZNodes

[Figure: zookeeper_tree]

In this diagram:

  • The /employees znode is the parent of all znodes representing employees. For example, employee-1 is the znode for Matt.
  • The /dept znode is the parent of all znodes representing departments. For example, dept-1 is the znode for HR.
  • The /offices znode is the parent of all znodes representing offices. For example, office-1 is the znode for Boston.

A znode may or may not contain data. When it does, the data is stored as a byte array.
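Because the data is an opaque byte array, the client decides how to encode it. A minimal sketch, assuming UTF-8 text payloads; the class and method names are illustrative and not part of any ZooKeeper API:

```java
import java.nio.charset.StandardCharsets;

// Illustrative helper (not part of ZooKeeper): converts between text and
// the raw bytes a znode would store, assuming UTF-8 encoding.
class ZnodePayload {
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = encode("test_data");
        System.out.println(data.length + " bytes: " + decode(data)); // 9 bytes: test_data
    }
}
```

The byte count is what ZooKeeper reports as dataLength when you inspect a znode.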

In this example, the leaf znodes in the tree represent the data. Every time an employee, department, or office is added, a znode is added. The znode is removed when that data is deleted.

A znode is created in one of four modes:

  1. Persistent
  2. Ephemeral
  3. Persistent_Sequential
  4. Ephemeral_Sequential

Persistent znodes are deleted only by an explicit delete request. They survive service restarts because they are persisted to disk.

Ephemeral znodes exist only as long as the session that created them is active. When the session ends, the znode is deleted automatically (it can also be deleted explicitly before that). Because of this behavior, ephemeral znodes are not allowed to have children.
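The two lifetimes can be illustrated with a toy in-memory model. This is only a sketch of the rules described above, not how ZooKeeper is implemented. The class, its methods, and the session id 42 are made up for illustration, and the model does not check that a parent exists.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model (NOT ZooKeeper's implementation) of znode lifetimes.
// Each path maps to the id of the session that owns it. 0 means persistent,
// mirroring the ephemeralOwner = 0x0 field that zkCli.sh prints for a
// persistent znode. Paths must start with '/'.
class ZnodeLifetimeModel {
    private final Map<String, Long> ownerByPath = new TreeMap<>();

    void createPersistent(String path) {
        create(path, 0L);
    }

    void createEphemeral(String path, long sessionId) {
        create(path, sessionId);
    }

    private void create(String path, long owner) {
        String parent = path.substring(0, path.lastIndexOf('/'));
        // Ephemeral znodes are not allowed to have children.
        if (ownerByPath.getOrDefault(parent, 0L) != 0L) {
            throw new IllegalStateException("parent is ephemeral: " + parent);
        }
        ownerByPath.put(path, owner);
    }

    // Ending a session removes every ephemeral znode that session created.
    void closeSession(long sessionId) {
        ownerByPath.values().removeIf(owner -> owner == sessionId);
    }

    boolean exists(String path) {
        return ownerByPath.containsKey(path);
    }

    public static void main(String[] args) {
        ZnodeLifetimeModel tree = new ZnodeLifetimeModel();
        tree.createPersistent("/workers");
        tree.createEphemeral("/workers/w1", 42L);
        tree.closeSession(42L);
        System.out.println(tree.exists("/workers"));    // true
        System.out.println(tree.exists("/workers/w1")); // false
    }
}
```

This is the behavior that makes ephemeral znodes useful for presence detection: a crashed worker's znode disappears once its session ends.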

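Sequential: When creating a znode, you can also request that ZooKeeper append a monotonically increasing counter to the end of the path. This counter is unique to the parent znode. It is formatted as %010d, that is, 10 digits with zero padding (for example, 0000000001). Combining this flag with the persistent or ephemeral mode gives the Persistent_Sequential and Ephemeral_Sequential modes.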
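The naming scheme can be reproduced in a few lines. This is a sketch of the format only: in ZooKeeper the counter is maintained by the server, and the class name and the /locks/lock- prefix here are made up for illustration.

```java
import java.util.Locale;

// Illustrative only: reproduces the name format of sequential znodes.
// In ZooKeeper the server owns the counter, one per parent znode.
class SequentialName {
    static String withSequence(String pathPrefix, int counter) {
        // %010d pads the counter with zeros to a width of 10 digits.
        return pathPrefix + String.format(Locale.ROOT, "%010d", counter);
    }

    public static void main(String[] args) {
        System.out.println(withSequence("/locks/lock-", 0));  // /locks/lock-0000000000
        System.out.println(withSequence("/locks/lock-", 12)); // /locks/lock-0000000012
    }
}
```

Because of the zero padding, sorting the child names as strings gives the same order as sorting by counter, which is convenient when clients need to find the lowest or next-lowest sibling.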

The Curator framework also defines the following recipe: a persistent ephemeral node is an ephemeral node that attempts to stay present in ZooKeeper, even through connection and session interruptions.

ZooKeeper API

The API exposes six primary operations:

  • create /path data – Creates a znode at /path containing data
  • delete /path – Deletes the znode /path
  • exists /path – Checks whether the znode /path exists
  • setData /path data – Sets the data of the znode /path to data
  • getData /path – Returns the data stored in /path
  • getChildren /path – Returns the list of children of /path
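To make the semantics of the six operations concrete, here is a toy in-memory stand-in. It is not the ZooKeeper client and ignores sessions, versions, ACLs, and watches. The class name and the exceptions it throws are assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy in-memory stand-in (NOT the ZooKeeper client) for the six operations.
class ToyZnodeStore {
    private final TreeMap<String, byte[]> nodes = new TreeMap<>();

    ToyZnodeStore() {
        nodes.put("/", new byte[0]); // the root znode always exists
    }

    private static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash == 0 ? "/" : path.substring(0, slash);
    }

    private void requireNode(String path) {
        if (!nodes.containsKey(path)) throw new IllegalStateException("no node: " + path);
    }

    void create(String path, byte[] data) {
        if (nodes.containsKey(path)) throw new IllegalStateException("node exists: " + path);
        requireNode(parentOf(path)); // the parent must already exist
        nodes.put(path, data);
    }

    void delete(String path) {
        // A znode with children cannot be deleted.
        if (!getChildren(path).isEmpty()) throw new IllegalStateException("not empty: " + path);
        nodes.remove(path);
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }

    void setData(String path, byte[] data) {
        requireNode(path);
        nodes.put(path, data);
    }

    byte[] getData(String path) {
        requireNode(path);
        return nodes.get(path);
    }

    // Returns the names (not full paths) of the direct children of path.
    List<String> getChildren(String path) {
        requireNode(path);
        List<String> children = new ArrayList<>();
        for (String candidate : nodes.keySet()) {
            if (!candidate.equals("/") && parentOf(candidate).equals(path)) {
                children.add(candidate.substring(candidate.lastIndexOf('/') + 1));
            }
        }
        return children;
    }

    public static void main(String[] args) {
        ToyZnodeStore store = new ToyZnodeStore();
        store.create("/blog_testing", "test_data".getBytes(StandardCharsets.UTF_8));
        System.out.println(store.exists("/blog_testing")); // true
        store.setData("/blog_testing", "updated_text".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(store.getData("/blog_testing"), StandardCharsets.UTF_8));
        System.out.println(store.getChildren("/")); // [blog_testing]
        store.delete("/blog_testing");
        System.out.println(store.getChildren("/")); // []
    }
}
```

Note that getChildren returns child names relative to the parent, which is also how the ls command of zkCli.sh prints them.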

Installing ZooKeeper

Download a stable version of ZooKeeper from https://zookeeper.apache.org/releases.html

$> tar xvzf zookeeper-3.4.6.tar.gz
$> cd zookeeper-3.4.6/conf

 

Create a zoo.cfg file with the following settings:

tickTime=2000
dataDir=/home/xyz/zookeeper/data
clientPort=2181

 

Remember to change the dataDir value to a directory that is writable by the ZooKeeper process.
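The file uses plain key=value lines, so it can be read with java.util.Properties. The sketch below is not ZooKeeper's own config loader, and the class name is made up. It also relies on two facts from the ZooKeeper administrator's guide rather than from this post: tickTime is in milliseconds, and by default the server accepts session timeouts between 2 and 20 ticks.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative sketch (not ZooKeeper's own config loader): reads
// zoo.cfg-style key=value text and derives the default session timeout bounds.
class ZooCfgSketch {
    static Properties parse(String cfgText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // tickTime is ZooKeeper's basic time unit, in milliseconds.
    static int tickTime(Properties props) {
        return Integer.parseInt(props.getProperty("tickTime"));
    }

    // Unless overridden in the config, the server accepts session
    // timeouts between 2 x tickTime and 20 x tickTime.
    static int minSessionTimeout(Properties props) {
        return 2 * tickTime(props);
    }

    static int maxSessionTimeout(Properties props) {
        return 20 * tickTime(props);
    }

    public static void main(String[] args) {
        Properties props = parse(String.join("\n",
                "tickTime=2000",
                "dataDir=/home/xyz/zookeeper/data",
                "clientPort=2181"));
        System.out.println("clientPort = " + props.getProperty("clientPort"));
        System.out.println("session timeout range (ms) = "
                + minSessionTimeout(props) + " to " + maxSessionTimeout(props));
    }
}
```

With tickTime=2000 the range is 4000 to 40000 ms, which is consistent with the negotiated timeout = 30000 line in the zkCli.sh log further down.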

 

$> cd ../bin
$> ./zkServer.sh start
 JMX enabled by default
 Using config: /home/zyx/zookeeper-3.4.6/bin/../conf/zoo.cfg
 Starting zookeeper ... STARTED

 

Now that the ZooKeeper server has started, it is time to interact with it.

In another terminal/command window, go to the bin directory of your ZooKeeper installation.

bin$ ./zkCli.sh -server 127.0.0.1:2181
Connecting to 127.0.0.1:2181
2015-09-09 21:22:29,700 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2015-09-09 21:22:29,704 [myid:] - INFO [main:Environment@100] - Client environment:host.name=xxx..xxxx.xxx
2015-09-09 21:22:29,704 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.7.0_51
2015-09-09 21:22:29,707 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2015-09-09 21:22:29,707 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/jre
2015-09-09 21:22:29,707 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/Users/......
2015-09-09 21:22:29,727 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/Users/xyz/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
2015-09-09 21:22:29,727 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/var/folders/dt/p17rgljd56v_jd0hy9s73l3w0000gn/T/
2015-09-09 21:22:29,728 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2015-09-09 21:22:29,728 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Mac OS X
2015-09-09 21:22:29,728 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=x86_64
2015-09-09 21:22:29,728 [myid:] - INFO [main:Environment@100] - Client environment:os.version=10.10.5 
2015-09-09 21:22:29,729 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/Users/xyz
2015-09-09 21:22:29,729 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/Users/xyz/zookeeper/zookeeper-3.4.6/bin
2015-09-09 21:22:29,731 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=127.0.0.1:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@1d88a478
Welcome to ZooKeeper!
2015-09-09 21:22:29,766 [myid:] - INFO [main-SendThread(127.0.0.1:2181):ClientCnxn$SendThread@975] - Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2015-09-09 21:22:29,775 [myid:] - INFO [main-SendThread(127.0.0.1:2181):ClientCnxn$SendThread@852] - Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session
2015-09-09 21:22:29,806 [myid:] - INFO [main-SendThread(127.0.0.1:2181):ClientCnxn$SendThread@1235] - Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x14fb50eae000000, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null

Type help to list all the available commands. After that, we are going to use the “ls”, “create”, “get”, “set” and “delete” commands.

[zk: 127.0.0.1:2181(CONNECTED) 0] help
ZooKeeper -server host:port cmd args
connect host:port
get path [watch]
ls path [watch]
set path data [version]
rmr path
delquota [-n|-b] path
quit
printwatches on|off
create [-s] [-e] path data acl
stat path [watch]
close
ls2 path [watch]
history
listquota path
setAcl path acl
getAcl path
sync path
redo cmdno
addauth scheme auth
delete path [version]
setquota -n|-b val path
[zk: 127.0.0.1:2181(CONNECTED) 1]


[zk: 127.0.0.1:2181(CONNECTED) 1] ls /
[zookeeper]
[zk: 127.0.0.1:2181(CONNECTED) 2] create /blog_testing test_data
Created /blog_testing
[zk: 127.0.0.1:2181(CONNECTED) 3] ls /
[blog_testing, zookeeper]
[zk: 127.0.0.1:2181(CONNECTED) 4] get /blog_testing
test_data
cZxid = 0x2
ctime = Wed Sep 09 21:48:02 CDT 2015
mZxid = 0x2
mtime = Wed Sep 09 21:48:02 CDT 2015
pZxid = 0x2
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 9
numChildren = 0
[zk: 127.0.0.1:2181(CONNECTED) 5] set /blog_testing updated_text
cZxid = 0x2
ctime = Wed Sep 09 21:48:02 CDT 2015
mZxid = 0x3
mtime = Wed Sep 09 21:48:42 CDT 2015
pZxid = 0x2
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 12
numChildren = 0
[zk: 127.0.0.1:2181(CONNECTED) 6] get /blog_testing
updated_text
cZxid = 0x2
ctime = Wed Sep 09 21:48:02 CDT 2015
mZxid = 0x3
mtime = Wed Sep 09 21:48:42 CDT 2015
pZxid = 0x2
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 12
numChildren = 0
[zk: 127.0.0.1:2181(CONNECTED) 7] delete /blog_testing
[zk: 127.0.0.1:2181(CONNECTED) 8] ls /
[zookeeper]
[zk: 127.0.0.1:2181(CONNECTED) 9]

 

To shut down the ZooKeeper server, run the following in the bin directory:

$> ./zkServer.sh stop
JMX enabled by default
Using config: /Users/xyz/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED

ZooKeeper Programming

If you want to write programs that interact with ZooKeeper, the Apache Curator framework is strongly recommended. It wraps the raw ZooKeeper client and adds connection management, retries, and ready-made recipes.


Unit Testing with ZooKeeper

The Apache Curator project provides an embedded ZooKeeper instance (the TestingServer class in its curator-test module) that can be used for unit testing.

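import org.apache.curator.test.TestingServer;

// The no-argument constructor chooses a free port and starts the server,
// so no separate start() call is needed.
TestingServer testingServer = new TestingServer();
String zookeeperConnectionStr = testingServer.getConnectString();
// ... run the test against zookeeperConnectionStr ...
testingServer.close(); // stops the server and deletes its temporary data directory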

 

Stay Tuned!