What is HCatalog? What are the applications of HCatalog?

This forum covers the Hadoop ecosystem (HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Sqoop2, Avro, Solr, HCatalog, Impala, Oozie, ZooKeeper) and Hadoop distributions such as Cloudera and Hortonworks.

What is HCatalog? What are the applications of HCatalog?

Postby hadoopuser » Mon Jul 21, 2014 9:06 pm

What is HCatalog? What are the applications of HCatalog? How is it used in Hadoop?



Re: What is HCatalog? What are the applications of HCatalog?

Postby Guest » Tue Jul 22, 2014 5:07 pm

HCatalog was developed by members of the Apache Pig, Hive, and Hadoop projects, plus new contributors. Much of the latest code has been written by engineers at Yahoo.

HCatalog is a table and storage management layer for Hadoop that enables users with different data processing tools (such as Pig and MapReduce) to more easily read and write data on the grid.

HCatalog supports reading and writing files in any format for which a Hive SerDe (serializer-deserializer) can be written. By default, HCatalog supports RCFile, CSV, JSON, and SequenceFile formats. To use a custom format, you must provide the InputFormat, OutputFormat, and SerDe.
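
Because table definitions live in the Hive metastore, you can register a table once with ordinary Hive DDL and then read it from any tool. A minimal sketch over JDBC; the HiveServer2 address, credentials, table schema, and the JsonSerDe class path below are placeholders that vary by cluster and version:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateJsonTable {
  public static void main(String[] args) throws Exception {
    // Placeholder HiveServer2 endpoint and credentials; adjust for your cluster.
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn = DriverManager.getConnection(
            "jdbc:hive2://localhost:10000/default", "hive", "");
         Statement stmt = conn.createStatement()) {
      // A JSON-backed table: once it is registered in the metastore,
      // Pig, MapReduce, and Hive all read it through HCatalog without
      // needing to know the file format or location.
      stmt.execute(
          "CREATE TABLE raw (url STRING, payload STRING) "
        + "PARTITIONED BY (ds STRING, region STRING, property STRING) "
        + "ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' "
        + "STORED AS TEXTFILE");
    }
  }
}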

HCatalog takes Hive's metastore and wraps additional layers around it to provide these services. It comes with HCatInputFormat and HCatOutputFormat for MapReduce users, and HCatLoader and HCatStorer for Pig users. For example, a Pig script that reads one day's worth of a table registered in HCatalog looks like this:

A = load 'raw' using HCatLoader();  -- only the table name; no paths or formats
B = filter A by ds='20110225' and region='us' and property='news';
-- writing back is symmetric ('processed' is a placeholder for an existing table)
store B into 'processed' using HCatStorer();
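
For MapReduce users, HCatInputFormat plays the same role as HCatLoader: the job names a table through the metastore instead of supplying file paths, and each row arrives as an HCatRecord already decoded by the table's SerDe. A rough sketch of a map-only job, assuming the hypothetical 'raw' table above with its property name in column 0:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hive.hcatalog.data.HCatRecord;
import org.apache.hive.hcatalog.mapreduce.HCatInputFormat;

public class RawReader {

  // Each input value is one table row, parsed for us by the table's SerDe.
  public static class ReadMapper
      extends Mapper<WritableComparable, HCatRecord, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(WritableComparable key, HCatRecord row, Context ctx)
        throws java.io.IOException, InterruptedException {
      // Column 0 is assumed to hold the property name.
      ctx.write(new Text(row.get(0).toString()), ONE);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "hcat-read-sketch");
    job.setJarByClass(RawReader.class);

    // Name the table through the metastore instead of supplying file paths.
    HCatInputFormat.setInput(job, "default", "raw");
    job.setInputFormatClass(HCatInputFormat.class);

    job.setMapperClass(ReadMapper.class);
    job.setNumReduceTasks(0); // map-only: emit rows straight to text files
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileOutputFormat.setOutputPath(job, new Path(args[0]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}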

Details on how to use HCatalog commands on Hortonworks can be found at the link below:
http://hortonworks.com/hadoop-tutorial/ ... -commands/


Re: What is HCatalog? What are the applications of HCatalog?

Postby Guest » Tue Jul 22, 2014 5:09 pm

HCatalog is a metadata abstraction layer for referencing data without using the underlying filenames or formats. It insulates users and scripts from how and where the data is physically stored.
Templeton provides a REST-like web API for HCatalog and related Hadoop components. Application developers make HTTP requests to access Hadoop MapReduce, Pig, Hive, and HCatalog DDL from within their applications. Data and code used by Templeton are maintained in HDFS. HCatalog DDL commands are executed directly when requested.
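
Since Templeton speaks plain HTTP, any client works. A minimal sketch that pings the server's status resource; the host below is a placeholder, and 50111 is the usual default port for Templeton/WebHCat:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class TempletonPing {
  public static void main(String[] args) throws Exception {
    // Placeholder host; the DDL resource (POST to /templeton/v1/ddl)
    // is reached the same way over HTTP.
    URL url = new URL("http://localhost:50111/templeton/v1/status");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");

    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        // Expect a small JSON body such as {"status":"ok","version":"v1"}
        System.out.println(line);
      }
    }
  }
}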


