写于2016.7月
最近项目需要在hbase上做统计分析,在本机上装了hive,结果跑小批量数据sum时报错:
hive> select count(*) from page_view; Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapreduce.job.reduces=<number> Starting Job = job_1468481753104_0014, Tracking URL = http://wangkai8:8088/proxy/application_1468481753104_0014/ Kill Command = /Users/wangkai8/app/hadoop-2.3.0-cdh5/bin/hadoop job -kill job_1468481753104_0014 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2016-07-14 18:38:11,888 Stage-1 map = 100%, reduce = 100% Ended Job = job_1468481753104_0014 with errors Error during job, obtaining debugging information... Examining task ID: task_1468481753104_0014_m_000000 (and more) from job job_1468481753104_0014 Task with the most failures(1): ----- Task ID: task_1468481753104_0014_r_000000 URL: http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1468481753104_0014&tipid=task_1468481753104_0014_r_000000 ----- Diagnostic Messages for this Task: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:409) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.runSubtask(LocalContainerLauncher.java:351) at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.run(LocalContainerLauncher.java:232) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) ... 7 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:135) ... 12 more FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask MapReduce Jobs Launched: Job 0: Map: 1 Reduce: 1 HDFS Read: 499 HDFS Write: 218201 FAIL Total MapReduce CPU Time Spent: 0 msec hive>
查看yarn日志,一样也是空指针异常,还有个提示是No plan file found: hdfs://…
2016-07-14 15:37:15,211 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring retrieval request: __REDUCE_PLAN__
2016-07-14 15:37:15,211 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.Utilities: No plan file found: hdfs://wangkai8/tmp/hive-wangkai8/hive_2016-07-14_15-37-06_635_5512340937878595139-1/-mr-10004/1e60df26-2b73-4384-adbc-651342dd48f8/reduce.xml
2016-07-14 15:37:15,211 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring cache key: __REDUCE_PLAN__
2016-07-14 15:37:15,212 WARN [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Exception running local (uberized) 'child' : java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:409)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.runSubtask(LocalContainerLauncher.java:351)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.run(LocalContainerLauncher.java:232)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 7 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:132)
... 12 more
2016-07-14 15:37:15,212 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1468481753104_0002_r_000000_0
2016-07-14 15:37:15,212 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1468481753104_0002_r_000000_0 is : 0.0
2016-07-14 15:37:15,212 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2016-07-14 15:37:15,212 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1468481753104_0002_r_000000_0: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:409)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.runSubtask(LocalContainerLauncher.java:351)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.run(LocalContainerLauncher.java:232)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 7 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:132)
... 12 more
2016-07-14 15:37:15,213 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_1468481753104_0002_01_000001 taskAttempt attempt_1468481753104_0002_m_000000_0
2016-07-14 15:37:15,213 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1468481753104_0002_r_000000_0: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:409)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.runSubtask(LocalContainerLauncher.java:351)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.run(LocalContainerLauncher.java:232)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 7 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:132)
... 12 more
查看hive源码,发现gWork为null,导致空指针异常
//src: org.apache.hadoop.hive.ql.exec.mr.ExecReducer#configure
ObjectCache cache = ObjectCacheFactory.getCache(jc); ReduceWork gWork = (ReduceWork) cache.retrieve(PLAN_KEY); if (gWork == null) { gWork = Utilities.getReduceWork(job); cache.cache(PLAN_KEY, gWork); } else { Utilities.setReduceWork(job, gWork); } reducer = gWork.getReducer(); //gWork为null,导致空指针
接着查看Utilities.getReduceWork(job)方法,最后跟到org.apache.hadoop.hive.ql.exec.Utilities#getBaseWork:
/** * Returns the Map or Reduce plan * Side effect: the BaseWork returned is also placed in the gWorkMap * @param conf * @param name * @return BaseWork based on the name supplied will return null if name is null * @throws RuntimeException if the configuration files are not proper or if plan can not be loaded */ private static BaseWork getBaseWork(Configuration conf, String name) { BaseWork gWork = null; Path path = null; InputStream in = null; try { path = getPlanPath(conf, name); assert path != null; if (!gWorkMap.containsKey(path)) { Path localPath; if (ShimLoader.getHadoopShims().isLocalMode(conf)) { localPath = path; } else { localPath = new Path(name); } if (HiveConf.getBoolVar(conf, ConfVars.HIVE_RPC_QUERY_PLAN)) { LOG.debug("Loading plan from string: "+path.toUri().getPath()); String planString = conf.get(path.toUri().getPath()); if (planString == null) { LOG.info("Could not find plan string in conf"); return null; } byte[] planBytes = Base64.decodeBase64(planString); in = new ByteArrayInputStream(planBytes); in = new InflaterInputStream(in); } else { in = new FileInputStream(localPath.toUri().getPath()); } if(MAP_PLAN_NAME.equals(name)){ if (ExecMapper.class.getName().equals(conf.get(MAPRED_MAPPER_CLASS))){ gWork = deserializePlan(in, MapWork.class, conf); } else if(RCFileMergeMapper.class.getName().equals(conf.get(MAPRED_MAPPER_CLASS))) { gWork = deserializePlan(in, MergeWork.class, conf); } else if(ColumnTruncateMapper.class.getName().equals(conf.get(MAPRED_MAPPER_CLASS))) { gWork = deserializePlan(in, ColumnTruncateWork.class, conf); } else if(PartialScanMapper.class.getName().equals(conf.get(MAPRED_MAPPER_CLASS))) { gWork = deserializePlan(in, PartialScanWork.class,conf); } else { throw new RuntimeException("unable to determine work from configuration ." + MAPRED_MAPPER_CLASS + " was "+ conf.get(MAPRED_MAPPER_CLASS)) ; } } else if (REDUCE_PLAN_NAME.equals(name)) { if(ExecReducer.class.getName().equals(conf.get(MAPRED_REDUCER_CLASS))) { gWork = deserializePlan(in, ReduceWork.class, conf); } else { throw new RuntimeException("unable to determine work from configuration ." + MAPRED_REDUCER_CLASS +" was "+ conf.get(MAPRED_REDUCER_CLASS)) ; } } gWorkMap.put(path, gWork); } else { LOG.debug("Found plan in cache."); gWork = gWorkMap.get(path); } return gWork; } catch (FileNotFoundException fnf) { // happens. e.g.: no reduce work. LOG.info("No plan file found: "+path); return null; } catch (Exception e) { LOG.error("Failed to load plan: "+path, e); throw new RuntimeException(e); } finally { if (in != null) { try { in.close(); } catch (IOException cantBlameMeForTrying) { } } } }
再结合yarn报错信息:2016-07-14 15:37:15,211 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.Utilities: No plan file found: hdfs://wangkai8/tmp/hive-wangkai8/hive_2016-07-14_15-37-06_635_5512340937878595139-1/-mr-10004/1e60df26-2b73-4384-adbc-651342dd48f8/reduce.xml
很明显reduce.xml找不到,报FileNotFoundException:No plan file found,但通过dfs fs -ls hdfs://wangkai8/tmp/hive-wangkai8/hive_2016-07-14_15-37-06_635_5512340937878595139-1/-mr-10004/1e60df26-2b73-4384-adbc-651342dd48f8/reduce.xml查看文件,文件的确是存在的,文件存在还报这个错,那这个问题就很诡异了。
接着在hive中映射一张hbase表,此表记录较多,大概50万条,同样执行select count操作:
hive> select count(*) from qry_ins_user; Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapreduce.job.reduces=<number> Starting Job = job_1468481753104_0016, Tracking URL = http://wangkai8:8088/proxy/application_1468481753104_0016/ Kill Command = /Users/wangkai8/app/hadoop-2.3.0-cdh5/bin/hadoop job -kill job_1468481753104_0016 Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1 2016-07-14 19:42:16,196 Stage-1 map = 0%, reduce = 0% 2016-07-14 19:42:30,570 Stage-1 map = 50%, reduce = 0% 2016-07-14 19:42:32,627 Stage-1 map = 100%, reduce = 0% 2016-07-14 19:42:38,792 Stage-1 map = 100%, reduce = 100% Ended Job = job_1468481753104_0016 MapReduce Jobs Launched: Job 0: Map: 2 Reduce: 1 HDFS Read: 566 HDFS Write: 7 SUCCESS Total MapReduce CPU Time Spent: 0 msec OK 509844 Time taken: 34.023 seconds, Fetched: 1 row(s) hive>
这个居然能执行成功,3条记录执行就报错?这个问题不能不说更诡异了。后面各种调试,最后从AppMaster日志中得到关键信息:
Log Type: syslog Log Length: 155414 2016-07-14 18:40:17,962 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1468481753104_0015_000001 2016-07-14 18:40:18,211 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2016-07-14 18:40:18,230 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 2016-07-14 18:40:18,327 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2016-07-14 18:40:18,340 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens: 2016-07-14 18:40:18,340 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@6b7a5e49) 2016-07-14 18:40:18,363 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: The specific max attempts: 2 for application: 15. Attempt num: 1 is last retry: false 2016-07-14 18:40:18,452 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: dfs.datanode.data.dir; Ignoring. 2016-07-14 18:40:18,454 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2016-07-14 18:40:18,460 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: dfs.namenode.name.dir; Ignoring. 2016-07-14 18:40:18,464 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 2016-07-14 18:40:18,837 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config org.apache.hadoop.hive.shims.HadoopShimsSecure$NullOutputCommitter 2016-07-14 18:40:18,839 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.hive.shims.HadoopShimsSecure$NullOutputCommitter 2016-07-14 18:40:18,876 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler 2016-07-14 18:40:18,877 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher 2016-07-14 18:40:18,877 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher 2016-07-14 18:40:18,878 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher 2016-07-14 18:40:18,878 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler 2016-07-14 18:40:18,882 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher 2016-07-14 18:40:18,882 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter 2016-07-14 18:40:18,883 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter 2016-07-14 18:40:18,940 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler 2016-07-14 18:40:19,105 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2016-07-14 18:40:19,147 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 2016-07-14 18:40:19,147 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started 2016-07-14 18:40:19,153 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1468481753104_0015 to jobTokenSecretManager 2016-07-14 18:40:19,239 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Uberizing job job_1468481753104_0015: 1m+1r tasks (132 input bytes) will run sequentially on single node. 2016-07-14 18:40:19,257 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1468481753104_0015 = 132. Number of splits = 1 2016-07-14 18:40:19,258 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1468481753104_0015 = 1 2016-07-14 18:40:19,258 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1468481753104_0015Job Transitioned from NEW to INITED 2016-07-14 18:40:19,258 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster uberizing job job_1468481753104_0015 in local container ("uber-AM") on node wangkai8:59734. 2016-07-14 18:40:19,280 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue 2016-07-14 18:40:19,286 INFO [Socket Reader #1 for port 57486] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 57486 2016-07-14 18:40:19,301 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
2016-07-14 18:40:19,258 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster uberizing job job_1468481753104_0015 in local container ("uber-AM") on node wangkai8:59734.
reduce跑在local模式,难道50万记录的表跑在集群模式?查看50万条记录对应AppMaster日志:
Log Type: syslog Log Length: 47618 2016-07-14 19:42:13,007 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1468481753104_0016_000001 2016-07-14 19:42:13,254 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2016-07-14 19:42:13,277 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 2016-07-14 19:42:13,366 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2016-07-14 19:42:13,380 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens: 2016-07-14 19:42:13,380 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@5f4829f5) 2016-07-14 19:42:13,407 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: The specific max attempts: 2 for application: 16. Attempt num: 1 is last retry: false 2016-07-14 19:42:13,499 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: dfs.datanode.data.dir; Ignoring. 2016-07-14 19:42:13,502 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2016-07-14 19:42:13,508 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: dfs.namenode.name.dir; Ignoring. 2016-07-14 19:42:13,511 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 2016-07-14 19:42:14,025 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config org.apache.hadoop.hive.shims.HadoopShimsSecure$NullOutputCommitter 2016-07-14 19:42:14,027 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.hive.shims.HadoopShimsSecure$NullOutputCommitter 2016-07-14 19:42:14,071 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler 2016-07-14 19:42:14,072 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher 2016-07-14 19:42:14,073 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher 2016-07-14 19:42:14,073 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher 2016-07-14 19:42:14,074 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler 2016-07-14 19:42:14,080 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher 2016-07-14 19:42:14,081 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter 2016-07-14 19:42:14,082 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter 2016-07-14 19:42:14,159 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler 2016-07-14 19:42:14,409 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2016-07-14 19:42:14,480 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 2016-07-14 19:42:14,480 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started 2016-07-14 19:42:14,487 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1468481753104_0016 to jobTokenSecretManager 2016-07-14 19:42:14,603 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1468481753104_0016 because: too much input; 2016-07-14 19:42:14,628 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1468481753104_0016 = 446693376. Number of splits = 2 2016-07-14 19:42:14,630 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1468481753104_0016 = 1 2016-07-14 19:42:14,630 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1468481753104_0016Job Transitioned from NEW to INITED 2016-07-14 19:42:14,631 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1468481753104_0016. 2016-07-14 19:42:14,666 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue 2016-07-14 19:42:14,676 INFO [Socket Reader #1 for port 59418] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 59418 2016-07-14 19:42:14,698 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
2016-07-14 19:42:14,603 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1468481753104_0016 because: too much input;
好吧,果然是集群模式,那就查看org.apache.hadoop.mapreduce.v2.app.job.impl.JobImp源码,看看local、集群模式是怎么决定的:
/** * Decide whether job can be run in uber mode based on various criteria. * @param dataInputLength Total length for all splits */ private void makeUberDecision(long dataInputLength) { //FIXME: need new memory criterion for uber-decision (oops, too late here; // until AM-resizing supported, // must depend on job client to pass fat-slot needs) // these are no longer "system" settings, necessarily; user may override int sysMaxMaps = conf.getInt(MRJobConfig.JOB_UBERTASK_MAXMAPS, 9); int sysMaxReduces = conf.getInt(MRJobConfig.JOB_UBERTASK_MAXREDUCES, 1); long sysMaxBytes = conf.getLong(MRJobConfig.JOB_UBERTASK_MAXBYTES, fs.getDefaultBlockSize(this.remoteJobSubmitDir)); // FIXME: this is wrong; get FS from // [File?]InputFormat and default block size // from that long sysMemSizeForUberSlot = conf.getInt(MRJobConfig.MR_AM_VMEM_MB, MRJobConfig.DEFAULT_MR_AM_VMEM_MB); long sysCPUSizeForUberSlot = conf.getInt(MRJobConfig.MR_AM_CPU_VCORES, MRJobConfig.DEFAULT_MR_AM_CPU_VCORES); boolean uberEnabled = conf.getBoolean(MRJobConfig.JOB_UBERTASK_ENABLE, false); boolean smallNumMapTasks = (numMapTasks <= sysMaxMaps); boolean smallNumReduceTasks = (numReduceTasks <= sysMaxReduces); boolean smallInput = (dataInputLength <= sysMaxBytes); // ignoring overhead due to UberAM and statics as negligible here: boolean smallMemory = ( (Math.max(conf.getLong(MRJobConfig.MAP_MEMORY_MB, 0), conf.getLong(MRJobConfig.REDUCE_MEMORY_MB, 0)) <= sysMemSizeForUberSlot) || (sysMemSizeForUberSlot == JobConf.DISABLED_MEMORY_LIMIT)); boolean smallCpu = ( Math.max( conf.getInt( MRJobConfig.MAP_CPU_VCORES, MRJobConfig.DEFAULT_MAP_CPU_VCORES), conf.getInt( MRJobConfig.REDUCE_CPU_VCORES, MRJobConfig.DEFAULT_REDUCE_CPU_VCORES)) <= sysCPUSizeForUberSlot ); boolean notChainJob = !isChainJob(conf); // User has overall veto power over uberization, or user can modify // limits (overriding system settings and potentially shooting // themselves in the head). Note that ChainMapper/Reducer are // fundamentally incompatible with MR-1220; they employ a blocking // queue between the maps/reduces and thus require parallel execution, // while "uber-AM" (MR AM + LocalContainerLauncher) loops over tasks // and thus requires sequential execution. isUber = uberEnabled && smallNumMapTasks && smallNumReduceTasks && smallInput && smallMemory && smallCpu && notChainJob; if (isUber) { LOG.info("Uberizing job " + jobId + ": " + numMapTasks + "m+" + numReduceTasks + "r tasks (" + dataInputLength + " input bytes) will run sequentially on single node."); // make sure reduces are scheduled only after all map are completed conf.setFloat(MRJobConfig.COMPLETED_MAPS_FOR_REDUCE_SLOWSTART, 1.0f); // uber-subtask attempts all get launched on same node; if one fails, // probably should retry elsewhere, i.e., move entire uber-AM: ergo, // limit attempts to 1 (or at most 2? probably not...) conf.setInt(MRJobConfig.MAP_MAX_ATTEMPTS, 1); conf.setInt(MRJobConfig.REDUCE_MAX_ATTEMPTS, 1); // disable speculation conf.setBoolean(MRJobConfig.MAP_SPECULATIVE, false); conf.setBoolean(MRJobConfig.REDUCE_SPECULATIVE, false); } else { StringBuilder msg = new StringBuilder(); msg.append("Not uberizing ").append(jobId).append(" because:"); if (!uberEnabled) msg.append(" not enabled;"); if (!smallNumMapTasks) msg.append(" too many maps;"); if (!smallNumReduceTasks) msg.append(" too many reduces;"); if (!smallInput) msg.append(" too much input;"); if (!smallMemory) msg.append(" too much RAM;"); if (!notChainJob) msg.append(" chainjob;"); LOG.info(msg.toString()); } }
mapreduce是否跑在local模式由7个变量决定:uberEnabled && smallNumMapTasks && smallNumReduceTasks&& smallInput && smallMemory && smallCpu && notChainJob,只有这7个变量同时满足才能跑在local模式。
首先要满足uberEnabled为true,看看uberEnable定义:uberEnabled = conf.getBoolean(MRJobConfig.JOB_UBERTASK_ENABLE, false),MRJobConfig.JOB_UBERTASK_ENABLE对应配置项为mapreduce.job.ubertask.enable(mapred-site.xml),说明mapreduce.job.ubertask.enable配置为true,查看mapred-site.xml配置文件,果然是true
<property> <name>mapreduce.job.ubertask.enable</name> <value>true</value> <description>Whether to enable the small-jobs "ubertask" optimization, which runs "sufficiently small" jobs sequentially within a single JVM. "Small" is defined by the following maxmaps, maxreduces, and maxbytes settings. Users may override this value. </description> </property>
问题找到了就好办,第一种办法直接把mapreduce.job.ubertask.enable改为false,都走集群模式;
第二种办法是mapreduce.job.ubertask.enable还是为true,继续分析reduce.xml文件找不到的原因,还是查看读取reduce.xml那块:org.apache.hadoop.hive.ql.exec.Utilities#getBaseWork
if (HiveConf.getBoolVar(conf, ConfVars.HIVE_RPC_QUERY_PLAN)) { LOG.debug("Loading plan from string: "+path.toUri().getPath()); String planString = conf.get(path.toUri().getPath()); if (planString == null) { LOG.info("Could not find plan string in conf"); return null; } byte[] planBytes = Base64.decodeBase64(planString); in = new ByteArrayInputStream(planBytes); in = new InflaterInputStream(in); } else { in = new FileInputStream(localPath.toUri().getPath()); }
可见hive执行计划文件由ConfVars.HIVE_RPC_QUERY_PLAN决定是通过rpc读集群上的文件还是读本地文件,默认为本地文件。
转自:
https://www.cnblogs.com/yueweimian/p/5671497.html