This is for Hadoop eco system like HDFS, Map reduce, Hive, Hbase, Pig, sqoop,sqoop2, Avro, solr, hcatalog, impala, Oozie, Zoo Keeper and Hadoop distribution like Cloudera, Hortonwork etc.
2 posts • Page 1 of 1
Hadoop Streaming is a generic API which allows writing Mappers and Reduces in any language. But the basic concept remains the same. Mappers and Reducers receive their input and output on stdin and stdout as (key, value) pairs.
Hadoop streaming is a utility that comes with the Hadoop distribution. The utility allows you to create and run Map/Reduce jobs with any executable or script as the mapper and/or the reducer. For example:
Code: Select all
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-input myInputDirs \
-output myOutputDir \
-mapper /bin/cat \
Users browsing this forum: No registered users and 2 guests