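Running in local mode
This mode reads local files and runs entirely on the local machine, independent of the cluster. It is suited to validating logic during development.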

 

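Running in yarn mode
This is true cluster execution: package the program as a jar, upload it to a cluster server, and run it there with the hadoop jar command. Use this mode in production.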

 

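Submitting remotely to yarn from Windows or Linux
In this mode the jar is submitted to the cluster from a Windows or Linux machine; the submitting host does not need a Hadoop cluster installed. The steps are as follows:

1. Copy the following configuration files into the project's resources directory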

core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml
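
A missing configuration file is easy to overlook, so a quick check before submitting can help. The sketch below uses only the JDK and assumes the resources directory ends up on the classpath at run time; the class name ClientConfigCheck and its methods are hypothetical helpers for illustration, not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class ClientConfigCheck {
    // The four files named in step 1.
    static final List<String> REQUIRED = Arrays.asList(
            "core-site.xml", "hdfs-site.xml", "mapred-site.xml", "yarn-site.xml");

    // Returns the required file names that are absent from `present`.
    static List<String> missing(Collection<String> present) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!present.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    // Collects the required files that the class loader can actually see.
    static List<String> visibleOnClasspath() {
        ClassLoader loader = ClientConfigCheck.class.getClassLoader();
        List<String> found = new ArrayList<>();
        for (String name : REQUIRED) {
            if (loader.getResource(name) != null) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> missing = missing(visibleOnClasspath());
        System.out.println(missing.isEmpty()
                ? "All four configuration files are on the classpath."
                : "Missing from the classpath: " + missing);
    }
}
```

Running main from the same module as the job driver shows which files, if any, the client would not pick up.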

2. Specify the jar to run in the code

job.setJar("G:\\idea_workspace\\MapReduce\\out\\artifacts\\MapReduce_jar\\MapReduce.jar");
3. On Windows, enable cross-platform submission

There are two ways. The first is to add the following code to the program:

Configuration configuration = new Configuration();
configuration.set("mapreduce.app-submission.cross-platform", "true");
The other is to add the following to the mapred-site.xml configuration file:

<property>
<name>mapreduce.app-submission.cross-platform</name>
<value>true</value>
</property>
The following problems may come up along the way:

(1) Problem 1

2021-02-22 20:24:16,478 INFO  [main] client.RMProxy (RMProxy.java:createRMProxy(98)) – Connecting to ResourceManager at single/192.168.128.11:8032
2021-02-22 20:24:17,003 WARN  [main] mapreduce.JobResourceUploader (JobResourceUploader.java:uploadFiles(64)) – Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2021-02-22 20:24:17,185 INFO  [main] input.FileInputFormat (FileInputFormat.java:listStatus(283)) – Total input paths to process : 1
2021-02-22 20:24:17,236 INFO  [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(198)) – number of splits:1
2021-02-22 20:24:17,306 INFO  [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(287)) – Submitting tokens for job: job_1608473235348_0006
2021-02-22 20:24:17,730 INFO  [main] impl.YarnClientImpl (YarnClientImpl.java:submitApplication(273)) – Submitted application application_1608473235348_0006
2021-02-22 20:24:17,769 INFO  [main] mapreduce.Job (Job.java:submit(1294)) – The url to track the job: http://single:8088/proxy/application_1608473235348_0006/
2021-02-22 20:24:17,769 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1339)) – Running job: job_1608473235348_0006
2021-02-22 20:24:25,870 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1360)) – Job job_1608473235348_0006 running in uber mode : false
2021-02-22 20:24:25,872 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) –  map 0% reduce 0%
2021-02-22 20:24:25,885 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1380)) – Job job_1608473235348_0006 failed with state FAILED due to: Application application_1608473235348_0006 failed 2 times due to AM Container for appattempt_1608473235348_0006_000002 exited with  exitCode: 1
For more detailed output, check application tracking page:http://single:8088/cluster/app/application_1608473235348_0006Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1608473235348_0006_02_000001
Exit code: 1
Exception message: /bin/bash: line 0: fg: no job control

Stack trace: ExitCodeException exitCode=1: /bin/bash: line 0: fg: no job control

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
    at org.apache.hadoop.util.Shell.run(Shell.java:479)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
2021-02-22 20:24:25,901 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1385)) – Counters: 0

 

Solution to problem 1:

Add the following to mapred-site.xml:

<property>
<name>mapreduce.app-submission.cross-platform</name>
<value>true</value>
</property>
or set it in code:

conf.set("mapreduce.app-submission.cross-platform", "true");
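
A likely reason for the "fg: no job control" message is that a Windows client writes environment variable references in Windows syntax into the container launch command, and bash on the Linux node treats a word beginning with % as a job specification. The sketch below only illustrates the three reference styles involved. It is not Hadoop code: the class EnvReferenceDemo and its methods are hypothetical, and the assumption is that the cross-platform setting makes the client emit {{VAR}} placeholders that the NodeManager expands for its own platform.

```java
final class EnvReferenceDemo {
    enum Style { WINDOWS, UNIX, CROSS_PLATFORM }

    // Renders a reference to an environment variable in the given style.
    static String reference(String variable, Style style) {
        switch (style) {
            case WINDOWS:
                return "%" + variable + "%";
            case UNIX:
                return "$" + variable;
            default:
                // Placeholder form, assumed to be expanded on the node that runs the container.
                return "{{" + variable + "}}";
        }
    }

    // Builds the start of a java launch command in the given style.
    static String javaCommand(Style style) {
        return reference("JAVA_HOME", style) + "/bin/java";
    }

    public static void main(String[] args) {
        for (Style style : Style.values()) {
            System.out.println(style + ": " + javaCommand(style));
        }
    }
}
```

The WINDOWS form is the one bash cannot run, which would be consistent with the container exiting with code 1 before any task starts.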

 

(2) Problem 2

2021-02-22 20:30:34,703 INFO  [main] client.RMProxy (RMProxy.java:createRMProxy(98)) – Connecting to ResourceManager at single/192.168.128.11:8032
2021-02-22 20:30:35,206 WARN  [main] mapreduce.JobResourceUploader (JobResourceUploader.java:uploadFiles(64)) – Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2021-02-22 20:30:35,221 WARN  [main] mapreduce.JobResourceUploader (JobResourceUploader.java:uploadFiles(171)) – No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2021-02-22 20:30:35,229 INFO  [main] input.FileInputFormat (FileInputFormat.java:listStatus(283)) – Total input paths to process : 1
2021-02-22 20:30:35,425 INFO  [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(198)) – number of splits:1
2021-02-22 20:30:35,500 INFO  [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(287)) – Submitting tokens for job: job_1608473235348_0007
2021-02-22 20:30:35,607 INFO  [main] mapred.YARNRunner (YARNRunner.java:createApplicationSubmissionContext(371)) – Job jar is not present. Not adding any jar to the list of resources.
2021-02-22 20:30:35,646 INFO  [main] impl.YarnClientImpl (YarnClientImpl.java:submitApplication(273)) – Submitted application application_1608473235348_0007
2021-02-22 20:30:35,673 INFO  [main] mapreduce.Job (Job.java:submit(1294)) – The url to track the job: http://single:8088/proxy/application_1608473235348_0007/
2021-02-22 20:30:35,673 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1339)) – Running job: job_1608473235348_0007
2021-02-22 20:31:11,316 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1360)) – Job job_1608473235348_0007 running in uber mode : false
2021-02-22 20:31:11,319 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) –  map 0% reduce 0%
2021-02-22 20:31:25,813 INFO  [main] mapreduce.Job (Job.java:printTaskEvents(1406)) – Task Id : attempt_1608473235348_0007_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.leboop.www.wordcount.WordCountMapper not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class com.leboop.www.wordcount.WordCountMapper not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
    … 8 more

 

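Solution to problem 2: specify the jar path in the code

as follows:

job.setJar("G:\\idea_workspace\\MapReduce\\MapReduce.jar");
Note that MapReduce.jar must be the jar built after the line above was added to the code.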
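
If the ClassNotFoundException persists after setting the jar, it may be worth confirming that the jar on disk really contains the mapper class. The following is a minimal JDK-only sketch; JarClassCheck and its methods are hypothetical helper names, and the default path simply reuses the one from the example above.

```java
import java.io.IOException;
import java.util.jar.JarFile;

final class JarClassCheck {
    // Converts a fully qualified class name to its entry path inside a jar.
    static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar has an entry for the class; returns false if
    // the class is absent or the jar cannot be opened.
    static boolean jarContainsClass(String jarPath, String className) {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(toEntryName(className)) != null;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String jarPath = args.length > 0
                ? args[0]
                : "G:\\idea_workspace\\MapReduce\\MapReduce.jar";
        String mapper = "com.leboop.www.wordcount.WordCountMapper";
        System.out.println(jarContainsClass(jarPath, mapper)
                ? "Found " + mapper + " in " + jarPath
                : "Did not find " + mapper + " in " + jarPath);
    }
}
```

A negative result usually means the jar is stale or was built from a different artifact than the one passed to job.setJar.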
 

(3) Problem 3

2021-02-22 21:22:18,957 WARN  [main] shortcircuit.DomainSocketFactory (DomainSocketFactory.java:<init>(117)) – The short-circuit local reads feature cannot be used because UNIX Domain sockets are not available on Windows.
2021-02-22 21:22:21,418 INFO  [main] impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(297)) – Timeline service address: http://hdp22:8188/ws/v1/timeline/
2021-02-22 21:22:21,542 INFO  [main] client.RMProxy (RMProxy.java:createRMProxy(98)) – Connecting to ResourceManager at hdp22/192.168.128.22:8050
Exception in thread "main" java.lang.IllegalArgumentException: Unable to parse '/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework' as a URI, check the setting for mapreduce.application.framework.path
    at org.apache.hadoop.mapreduce.JobSubmitter.addMRFrameworkToDistributedCache(JobSubmitter.java:443)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:142)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
    at com.leboop.www.wordcount.WordCountMain.main(WordCountMain.java:42)
Caused by: java.net.URISyntaxException: Illegal character in path at index 11: /hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework
    at java.net.URI$Parser.fail(URI.java:2848)
    at java.net.URI$Parser.checkChars(URI.java:3021)
    at java.net.URI$Parser.parseHierarchical(URI.java:3105)
    at java.net.URI$Parser.parse(URI.java:3063)
    at java.net.URI.<init>(URI.java:588)
    at org.apache.hadoop.mapreduce.JobSubmitter.addMRFrameworkToDistributedCache(JobSubmitter.java:441)
    … 9 more

 

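Solution to problem 3: in mapred-site.xml, replace the unresolved ${hdp.version} placeholder with the actual version (2.6.3.0-235 on this cluster):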

<property>
<name>mapreduce.application.framework.path</name>
<value>/hdp/apps/2.6.3.0-235/mapreduce/mapreduce.tar.gz#mr-framework</value>
</property>
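
The stack trace for problem 3 can be reproduced with java.net.URI alone, which shows why the literal placeholder is rejected and the substituted value is accepted. This is a small sketch; the class name FrameworkPathCheck and the method errorIndex are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class FrameworkPathCheck {
    // Returns the index of the first illegal character reported by
    // java.net.URI, or -1 if the value parses as a URI.
    static int errorIndex(String value) {
        try {
            new URI(value);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        String raw = "/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework";
        String fixed = "/hdp/apps/2.6.3.0-235/mapreduce/mapreduce.tar.gz#mr-framework";
        System.out.println("raw:   " + errorIndex(raw));
        System.out.println("fixed: " + errorIndex(fixed));
    }
}
```

The index reported for the raw value points at the opening brace of the placeholder, matching the "index 11" in the log.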
(4) Problem 4

2021-02-22 21:25:23,677 WARN  [main] shortcircuit.DomainSocketFactory (DomainSocketFactory.java:<init>(117)) – The short-circuit local reads feature cannot be used because UNIX Domain sockets are not available on Windows.
2021-02-22 21:25:24,633 INFO  [main] impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(297)) – Timeline service address: http://hdp22:8188/ws/v1/timeline/
2021-02-22 21:25:24,643 INFO  [main] client.RMProxy (RMProxy.java:createRMProxy(98)) – Connecting to ResourceManager at hdp22/192.168.128.22:8050
Exception in thread "main" org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /user/root/.staging. Name node is in safe mode.
The reported blocks 0 needs additional 47 blocks to reach the threshold 1.0000 of total blocks 46.
The number of live datanodes 0 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.

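Solution to problem 4: leave safe mode. The log above reports 0 live datanodes, so it is also worth confirming that the DataNodes are running, since leaving safe mode does not by itself make the missing blocks available.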

hdfs dfsadmin -safemode leave

 

(5) Problem 5

2021-02-22 21:30:09,623 WARN  [main] shortcircuit.DomainSocketFactory (DomainSocketFactory.java:<init>(117)) – The short-circuit local reads feature cannot be used because UNIX Domain sockets are not available on Windows.
2021-02-22 21:30:10,492 INFO  [main] impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(297)) – Timeline service address: http://hdp22:8188/ws/v1/timeline/
2021-02-22 21:30:10,502 INFO  [main] client.RMProxy (RMProxy.java:createRMProxy(98)) – Connecting to ResourceManager at hdp22/192.168.128.22:8050
Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user/root/.staging":hdfs:hdfs:drwxr-xr-x
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:325)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:246)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1956)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1940)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1923)

 

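Solution to problem 5: this is a permissions problem

On Windows, set the system environment variable HADOOP_USER_NAME=hdfs

Which user name to use depends on your environment. See also the article "Java API 操作HDFS权限问题" (permission issues when operating HDFS through the Java API).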
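
To see which user the client would present, the lookup can be sketched with the JDK alone. The order used here (environment variable first, then a JVM system property of the same name, then the operating system user) is an assumption about how Hadoop behaves with simple, non-Kerberos authentication; HadoopUserLookup and resolve are hypothetical names, not Hadoop API.

```java
import java.util.Map;
import java.util.Properties;

final class HadoopUserLookup {
    static final String KEY = "HADOOP_USER_NAME";

    // Resolves the effective user name: environment variable first, then
    // system property, then the operating system user (assumed order).
    static String resolve(Map<String, String> env, Properties props, String osUser) {
        String user = env.get(KEY);
        if (user == null) {
            user = props.getProperty(KEY);
        }
        if (user == null) {
            user = osUser;
        }
        return user;
    }

    public static void main(String[] args) {
        String user = resolve(System.getenv(), System.getProperties(),
                System.getProperty("user.name"));
        System.out.println("Effective user under the assumed lookup order: " + user);
    }
}
```

If that assumption holds, passing -DHADOOP_USER_NAME=hdfs to the JVM would be an alternative to a system-wide environment variable when only one program needs the override.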

————————————————
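Copyright notice: this is an original article by CSDN blogger "leboop-L", released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/L_15156024189/article/details/113954410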