Advantages of small input split size in Hadoop


Postby mohit123 » Sun Sep 21, 2014 3:47 am

What are the advantages and disadvantages of small input split size in Hadoop?


Re: Advantages of small input split size in Hadoop

Postby Guest » Sun Sep 21, 2014 11:49 pm

If the input splits are small, the processing is better load-balanced, since a faster node can process proportionally more splits than a slower node over the course of the job.

But if the splits are much smaller than the default HDFS block size, the overhead of managing the splits and creating map tasks starts to dominate the total job execution time.
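
To make the trade-off concrete, here is a rough toy model in Python. It is not Hadoop code, only a sketch under stated assumptions: the function name `simulate_job`, the node speeds, and the fixed per-task overhead are all made up for illustration, and each split is simply handed to whichever node becomes free first.

```python
import heapq
import math


def simulate_job(total_mb, split_mb, node_speeds_mb_s, task_overhead_s):
    """Estimate job completion time (seconds) in a toy scheduling model.

    Assumptions (illustrative only, not Hadoop behaviour):
      - every map task pays a fixed startup cost of task_overhead_s
      - each node processes one split at a time at a constant speed
      - the next split always goes to the node that is free earliest
    """
    num_splits = math.ceil(total_mb / split_mb)

    # Min-heap of (time the node becomes free, node index).
    free_at = [(0.0, i) for i in range(len(node_speeds_mb_s))]
    heapq.heapify(free_at)

    remaining = total_mb
    finish = 0.0
    for _ in range(num_splits):
        size = min(split_mb, remaining)  # the last split may be smaller
        remaining -= size
        t, node = heapq.heappop(free_at)
        t += task_overhead_s + size / node_speeds_mb_s[node]
        finish = max(finish, t)
        heapq.heappush(free_at, (t, node))
    return finish


if __name__ == "__main__":
    # Hypothetical cluster: one fast node (100 MB/s), one slow node (50 MB/s).
    for split in (512, 64, 1):
        t = simulate_job(1024, split, [100, 50], task_overhead_s=0.1)
        print(f"split={split:>4} MB  estimated job time={t:.2f} s")
```

With these made-up numbers, 64 MB splits finish sooner than 512 MB splits because the fast node picks up more of them, while 1 MB splits are the slowest of the three because the per-task startup cost is paid 1024 times.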

