java - MapReduce Multiple Outputs: File Could Only Be Replicated to 0 Nodes, Instead of 1


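I have a reduce job and I am getting the above error: the file could only be replicated to 0 nodes instead of 1. I have searched online and saw that the problem is usually with a data node, but I am running other MapReduce jobs in the same workflow and they are working. The only difference I see is that I am using multiple outputs and specifying a folder, and I am sure the path is correct. Here is my multiple outputs write line: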

mos.write("mosname", new LongWritable(key), value, outputfilepath);

The exact error I am getting is:

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File xxx could only be replicated to 0 nodes instead of minReplication (=1).  There are 7 datanode(s) running and no node(s) are excluded in this operation.

Any help is appreciated.

I've had the same issue, and it did not occur when writing output to the context instead of MultipleOutputs. As far as I can tell, it's caused by the fact that MultipleOutputs holds more data in memory for longer.

The solution was a combination of:

(1) Performing compression on the output

FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);

(2) Giving the job more memory (keep in mind that the JVM memory in java.opts has to be at 80% of the container memory)

-Dmapreduce.map.memory.mb=3072 -Dmapreduce.map.java.opts=-Xmx2048m
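
As a rough illustration of the sizing rule in (2), here is a minimal sketch that derives the -Xmx value from the container size and builds the matching flags. The class HeapSizing and its methods are hypothetical helpers, not part of Hadoop, and the 80 percent figure is simply the rule of thumb quoted above, passed in as a parameter. Note that the flags above use 2048 of 3072 MB, which is about 67%, so they sit below that figure. Since the failing job in the question is a reduce job, the sketch also accepts "reduce", which maps to the analogous mapreduce.reduce.memory.mb and mapreduce.reduce.java.opts properties.

```java
final class HeapSizing {
    // Hypothetical helper: heap size as a whole-number percentage of the container.
    // Integer division rounds down, so the heap never exceeds the requested share.
    static int heapMb(int containerMb, int heapPercent) {
        if (containerMb <= 0 || heapPercent <= 0 || heapPercent > 100) {
            throw new IllegalArgumentException("containerMb must be > 0 and heapPercent in 1..100");
        }
        return containerMb * heapPercent / 100;
    }

    // Builds the two -D flags for a phase ("map" or "reduce").
    static String flags(String phase, int containerMb, int heapPercent) {
        if (!phase.equals("map") && !phase.equals("reduce")) {
            throw new IllegalArgumentException("phase must be \"map\" or \"reduce\"");
        }
        return "-Dmapreduce." + phase + ".memory.mb=" + containerMb
                + " -Dmapreduce." + phase + ".java.opts=-Xmx"
                + heapMb(containerMb, heapPercent) + "m";
    }

    public static void main(String[] args) {
        System.out.println(flags("map", 3072, 80));
        System.out.println(flags("reduce", 3072, 80));
    }
}
```

For a 3072 MB container and an 80 percent share, this yields -Xmx2457m. Whether 80 percent is the right share for a given cluster is an assumption worth checking against its own configuration.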
