How can I diagnose or solve this error in Hadoop streaming?
I got errors from a Hadoop MapReduce job. How can I identify the problem in Hadoop streaming?
Error: java.io.EOFException: Unexpected end of input stream
    at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:145)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
    at java.io.InputStream.read(InputStream.java:101)
    at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:209)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:47)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:63)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
(Unfortunately, I don't have permission to post the source code.)
These error logs are not much help on their own. Since you don't have permission to share the code, you can try the following steps.
- Check that the dependent libraries used in your code are present on all nodes of the Hadoop cluster. This is required because a task may execute on any of the worker nodes.
- Get a sample input file and execute your code locally before running the MapReduce program. You can execute it locally in the following way:
cat sampleinput | python mappercode.py
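One more thing worth checking: the trace fails inside DecompressorStream while LineRecordReader is reading input, which usually points to a compressed input file that is truncated or corrupt rather than to the mapper script. The sketch below assumes the input files are gzip-compressed (the trace does not say which codec is in use) and that a copy is available on the local filesystem; the function name is_gzip_intact is only illustrative.

```python
import gzip
import zlib


def is_gzip_intact(path, chunk_size=1024 * 1024):
    """Return True if the gzip file at `path` decompresses cleanly to the end.

    Assumption: the job input is gzip-compressed. Other codecs (bzip2,
    snappy, ...) would need their own reader.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read in chunks until EOF; a truncated file raises before that.
            while handle.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        # EOFError: stream ended early; OSError: not gzip or unreadable;
        # zlib.error: corrupt compressed data.
        return False
    return True
```

Calling is_gzip_intact on each input file after copying it out of HDFS should show which file, if any, ends early. A file that fails here would need to be re-uploaded or regenerated before rerunning the job.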