java - MapReduce Multiple Outputs: File Could Only Be Replicated to 0 Nodes, Instead of 1
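I have a reduce job and am getting the above error: the file could only be replicated to 0 nodes instead of 1. I have searched online and saw that this is usually a problem with a data node, but I am running other MapReduce jobs in the same workflow and they are working. The only difference I can see is that I am using MultipleOutputs and specifying a folder, and I am sure the path is correct. Here is my MultipleOutputs write line: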
mos.write("mosname", new LongWritable(key), value, outputfilepath);
The exact error I am getting is:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File xxx could only be replicated to 0 nodes instead of minReplication (=1). There are 7 datanode(s) running and no node(s) are excluded in this operation.
Any help is appreciated.
I've had the same issue, and it did not reproduce when writing the output through the context instead of MultipleOutputs. As far as I can tell, it's caused by the fact that MultipleOutputs holds more data in memory for longer.
The solution was a combination of:
(1) Performing compression on the output
FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
(2) Giving the job more memory (keep in mind that the JVM memory in java.opts has to be at 80% of the container memory)
-Dmapreduce.map.memory.mb=3072 -Dmapreduce.map.java.opts=-Xmx2048m
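To make the memory rule in (2) concrete, here is a minimal sketch that derives the heap size from the container size and prints the matching options. The class TaskMemoryArgs and its methods are hypothetical helpers written for this illustration, not part of Hadoop; the 80% ratio is the rule of thumb mentioned above, and the reduce-side property names are included on the assumption that the failing task is a reducer, as in the question.

```java
// Hypothetical helper (not a Hadoop class): builds -D options in which the
// JVM heap is a fixed percentage of the YARN container size.
final class TaskMemoryArgs {

    // Heap size in MB as an integer percentage of the container size, rounded down.
    static int heapMb(int containerMb, int heapPercent) {
        if (containerMb <= 0 || heapPercent <= 0 || heapPercent > 100) {
            throw new IllegalArgumentException("containerMb must be > 0 and heapPercent in 1..100");
        }
        return containerMb * heapPercent / 100;
    }

    // taskType is expected to be "map" or "reduce"; both property families exist in MapReduce 2.
    static String args(String taskType, int containerMb, int heapPercent) {
        int heap = heapMb(containerMb, heapPercent);
        return "-Dmapreduce." + taskType + ".memory.mb=" + containerMb
             + " -Dmapreduce." + taskType + ".java.opts=-Xmx" + heap + "m";
    }

    public static void main(String[] unused) {
        // Container of 3072 MB with the heap at 80% of it.
        System.out.println(args("map", 3072, 80));
        System.out.println(args("reduce", 3072, 80));
    }
}
```

With a 3072 MB container and an 80% ratio this yields -Xmx2457m, which is somewhat larger than the 2048m used in the options above; either value leaves headroom for non-heap memory inside the container. Note that -D options like these are only picked up when the driver parses generic options, for example by running through ToolRunner.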