What is ZooKeeper? Where to use it?

This forum covers the Hadoop ecosystem (HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Sqoop2, Avro, Solr, HCatalog, Impala, Oozie, ZooKeeper) and Hadoop distributions such as Cloudera and Hortonworks.


Post by pintuvirani » Mon Jul 21, 2014 8:50 pm

What is ZooKeeper? Where and how can I use it?


Re: What is ZooKeeper? Where to use it?

Post by Guest » Tue Jul 22, 2014 8:26 pm

ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. Distributed applications use all of these kinds of services in some form or another. Each time they are implemented, a lot of work goes into fixing the bugs and race conditions that are inevitable. Because these services are difficult to implement, applications usually skimp on them at first, which makes them brittle in the presence of change and difficult to manage. Even when done correctly, different implementations of these services lead to management complexity when the applications are deployed.

HBase uses ZooKeeper to coordinate activities that its "head node" was responsible for in earlier versions. Moving this coordination to ZooKeeper means the central control is no longer a single point of failure.

ZooKeeper runs on a cluster of servers, called an ensemble, that share the state of your data. These may be the same machines that run other Hadoop services or a separate cluster. Whenever a change is made, it is not considered successful until it has been written to a quorum (a majority, that is, more than half) of the servers in the ensemble. A leader is elected within the ensemble, and if two conflicting changes are made at the same time, the one that the leader processes first will succeed and the other will fail. ZooKeeper guarantees that writes from the same client will be processed in the order that client sent them. This guarantee, along with other ZooKeeper features, allows the system to be used to implement locks, queues, and other important primitives for distributed coordination. The outcome of a write operation allows a node to be certain that an identical write has not succeeded for any other node.
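To make the quorum rule concrete, here is a small arithmetic sketch in Python. It does not talk to ZooKeeper at all; the function names `quorum_size` and `tolerated_failures` are made up for illustration, and the only assumption is the majority rule described above.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# Print ensemble size, quorum size, and tolerated failures for a few sizes.
for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Note that 4 servers tolerate no more failures than 3 do, which is one reason ensembles are usually sized with an odd number of servers.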


[zk1: localhost:7781(CONNECTED)] ls /
[zk1: localhost:7781(CONNECTED)] help

Reading and writing data

[zk1: localhost:7781(CONNECTED)] create /zk1-demo ''
Created /zk1-demo
[zk1: localhost:7781(CONNECTED)] create /zk1-demo/my-node1 'Hello Hadoop!'
Created /zk1-demo/my-node1
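
The create commands also hint at how a lock can be built: creating a path that already exists fails, so only one client can win. The following is a toy, in-memory Python model of that behaviour, not a real ZooKeeper client. The class `ToyZNodeTree`, the helper `try_lock`, and the path `/zk1-demo/lock` are hypothetical names chosen for illustration.

```python
class NodeExistsError(Exception):
    """Raised when a path is created twice (models ZooKeeper's 'node exists' failure)."""


class NoNodeError(Exception):
    """Raised when a path or its parent is missing (models ZooKeeper's 'no node' failure)."""


class ToyZNodeTree:
    """In-memory stand-in for the znode tree; illustration only."""

    def __init__(self):
        self._nodes = {"/": b""}  # the root node always exists

    def create(self, path: str, data: bytes = b"") -> str:
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._nodes:
            raise NoNodeError(parent)
        if path in self._nodes:
            raise NodeExistsError(path)
        self._nodes[path] = data
        return path

    def get(self, path: str) -> bytes:
        if path not in self._nodes:
            raise NoNodeError(path)
        return self._nodes[path]


def try_lock(tree: ToyZNodeTree, lock_path: str, owner: str) -> bool:
    """Return True if this caller created the lock node, False if it already existed."""
    try:
        tree.create(lock_path, owner.encode())
        return True
    except NodeExistsError:
        return False


tree = ToyZNodeTree()
tree.create("/zk1-demo")
tree.create("/zk1-demo/my-node1", b"Hello Hadoop!")
print(tree.get("/zk1-demo/my-node1").decode())
print(try_lock(tree, "/zk1-demo/lock", "client-a"))  # first creator wins
print(try_lock(tree, "/zk1-demo/lock", "client-b"))  # second attempt is rejected
```

Real lock recipes typically use ephemeral nodes so that a lock disappears when the client holding it disconnects; this sketch leaves that out.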
