GenericOptionsParser, Tool, and ToolRunner in Hadoop



Postby forum_admin » Tue Jul 29, 2014 9:15 pm

What are GenericOptionsParser, Tool, and ToolRunner? What are they used for in Hadoop?



Re: GenericOptionsParser, Tool, and ToolRunner in Hadoop

Postby Guest » Tue Jul 29, 2014 9:27 pm

By using ToolRunner.run(), any Hadoop application can handle the standard command-line options supported by Hadoop. ToolRunner uses GenericOptionsParser internally. In short, the Hadoop-specific options provided on the command line are parsed and set into the Configuration object of the application. For example, the following sets the property mapred.map.tasks to 5:
./hadoop YourHadoopCluster -D mapred.map.tasks=5

public class ToolRunner extends Object
ToolRunner can be used to run classes that implement the Tool interface. It works with GenericOptionsParser to parse the generic Hadoop command-line arguments and modify the Configuration of the Tool. The application-specific options are passed along unmodified.
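
To make the split between generic and application-specific arguments concrete, here is a minimal sketch in plain Java. It is not Hadoop's actual implementation: the class GenericOptionSketch and its Parsed holder are hypothetical names made up for illustration, and only the -D option is handled. It just shows the idea that -D pairs end up in a property map while everything else is handed on untouched.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified stand-in for generic-option handling; NOT Hadoop's real parser. */
public class GenericOptionSketch {

    /** Holds the parsed properties and the arguments left for the application. */
    public static final class Parsed {
        public final Map<String, String> properties = new LinkedHashMap<>();
        public final List<String> remainingArgs = new ArrayList<>();
    }

    /** Moves every "-D key=value" pair into properties; all other arguments pass through. */
    public static Parsed parse(String[] args) {
        Parsed parsed = new Parsed();
        for (int i = 0; i < args.length; i++) {
            if ("-D".equals(args[i]) && i + 1 < args.length) {
                String pair = args[++i];          // consume the token after -D
                int eq = pair.indexOf('=');
                if (eq <= 0) {
                    throw new IllegalArgumentException("Expected key=value but got: " + pair);
                }
                parsed.properties.put(pair.substring(0, eq), pair.substring(eq + 1));
            } else {
                parsed.remainingArgs.add(args[i]); // application-specific argument
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        Parsed parsed = parse(new String[] {"-D", "mapred.map.tasks=5", "input", "output"});
        System.out.println(parsed.properties);     // prints {mapred.map.tasks=5}
        System.out.println(parsed.remainingArgs);  // prints [input, output]
    }
}
```

In real Hadoop code you would not write this yourself; ToolRunner and GenericOptionsParser do the equivalent work and store the result in the Configuration object.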

public interface Tool extends Configurable

A tool interface that supports handling of generic command-line options.

Tool is the standard interface for any MapReduce tool/application. The tool/application should delegate the handling of standard command-line options to ToolRunner.run(Tool, String[]) and handle only its own custom arguments.
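
A typical way to follow this pattern looks roughly like the sketch below. It assumes the Hadoop client libraries are on the classpath. The class name PrintMapTasks is hypothetical, and it reads mapred.map.tasks only because that property appears in the command example earlier in this thread.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

/** Hypothetical example tool; the class name is an assumption for illustration. */
public class PrintMapTasks extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() returns the Configuration that ToolRunner filled in
        // from the generic options (for example -D mapred.map.tasks=5).
        Configuration conf = getConf();
        System.out.println("mapred.map.tasks = " + conf.get("mapred.map.tasks"));

        // args now contains only the application-specific arguments.
        for (String arg : args) {
            System.out.println("application argument: " + arg);
        }
        return 0; // exit code reported back through ToolRunner.run()
    }

    public static void main(String[] args) throws Exception {
        // Delegate generic-option handling to ToolRunner, then call run().
        int exitCode = ToolRunner.run(new Configuration(), new PrintMapTasks(), args);
        System.exit(exitCode);
    }
}
```

Extending Configured is a convenience, since it supplies the getConf() and setConf() methods that Tool inherits from Configurable.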

public class GenericOptionsParser extends Object
GenericOptionsParser is a utility for parsing command-line arguments that are generic to the Hadoop framework. It interprets the common Hadoop command-line options and sets them in the Configuration object.
GenericOptionsParser recognizes several standard command-line arguments, enabling applications to easily specify a namenode, a jobtracker, additional configuration resources, etc.
For more details, refer to the blog below:
http://randomzonein.blogspot.com/2013/0 ... l-and.html
A few of the supported generic options are:

-conf <configuration file>     specify a configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker

