On the application details page in YARN

http://resourcemanager/cluster/app/$applicationId

or through the application command

yarn application -status $applicationId

you can only see the cumulative resource × time usage since the application started, for example:

Aggregate Resource Allocation : 3962853 MB-seconds, 1466 vcore-seconds
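
To make the cumulative nature of these figures concrete, here is a minimal Java sketch that parses the line above and converts it into an average over a given run time. The class name AggregateAllocation, the method names, and the elapsedSeconds parameter are hypothetical and used only for illustration; the elapsed time is not part of the status line and would have to come from elsewhere.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AggregateAllocation {
    // Matches e.g. "3962853 MB-seconds, 1466 vcore-seconds"
    private static final Pattern USAGE =
        Pattern.compile("(\\d+)\\s+MB-seconds,\\s+(\\d+)\\s+vcore-seconds");

    /** Returns {memorySeconds, vcoreSeconds} parsed from the status line. */
    static long[] parse(String line) {
        Matcher m = USAGE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("unrecognized line: " + line);
        }
        return new long[] {Long.parseLong(m.group(1)), Long.parseLong(m.group(2))};
    }

    /** Average allocation per second, given how long the application has run. */
    static double averageOver(long resourceSeconds, long elapsedSeconds) {
        if (elapsedSeconds <= 0) {
            throw new IllegalArgumentException("elapsedSeconds must be positive");
        }
        return (double) resourceSeconds / elapsedSeconds;
    }

    public static void main(String[] args) {
        long[] v = parse(
            "Aggregate Resource Allocation : 3962853 MB-seconds, 1466 vcore-seconds");
        // Assumption for illustration only: the application has run for 60 seconds.
        long elapsedSeconds = 60L;
        System.out.println("average MB: " + averageOver(v[0], elapsedSeconds));
        System.out.println("average vcores: " + averageOver(v[1], elapsedSeconds));
    }
}
```

Even with the elapsed time known, this yields only an average over the whole run, not what the application holds at this moment.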

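Neither place shows the application's current, real-time resource usage, that is, how much memory and how many cores it holds right now. Digging into the YARN source code shows that this statistic does in fact exist: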

org.apache.hadoop.yarn.api.records.ApplicationResourceUsageReport

  public static ApplicationResourceUsageReport newInstance(
      int numUsedContainers, int numReservedContainers, Resource usedResources,
      Resource reservedResources, Resource neededResources, long memorySeconds,
      long vcoreSeconds) {
    ApplicationResourceUsageReport report =
        Records.newRecord(ApplicationResourceUsageReport.class);
    report.setNumUsedContainers(numUsedContainers);
    report.setNumReservedContainers(numReservedContainers);
    report.setUsedResources(usedResources);
    report.setReservedResources(reservedResources);
    report.setNeededResources(neededResources);
    report.setMemorySeconds(memorySeconds);
    report.setVcoreSeconds(vcoreSeconds);
    return report;
  }

Here usedResources is the current, real-time resource usage, covering both memory and CPU. The report is returned through the YarnScheduler interface:

org.apache.hadoop.yarn.server.resourcemanager.scheduler.YarnScheduler

  /**
   * Get a resource usage report from a given app attempt ID.
   * @param appAttemptId the id of the application attempt
   * @return resource usage report for this given attempt
   */
  @LimitedPrivate("yarn")
  @Evolving
  ApplicationResourceUsageReport getAppResourceUsageReport(
      ApplicationAttemptId appAttemptId);

The getAppResourceUsageReport method is called by RMAppAttemptImpl.getApplicationResourceUsageReport:

org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl

  @Override
  public ApplicationResourceUsageReport getApplicationResourceUsageReport() {
    this.readLock.lock();
    try {
      ApplicationResourceUsageReport report =
          scheduler.getAppResourceUsageReport(this.getAppAttemptId());
      if (report == null) {
        report = RMServerUtils.DUMMY_APPLICATION_RESOURCE_USAGE_REPORT;
      }
      AggregateAppResourceUsage resUsage =
          this.attemptMetrics.getAggregateAppResourceUsage();
      report.setMemorySeconds(resUsage.getMemorySeconds());
      report.setVcoreSeconds(resUsage.getVcoreSeconds());
      return report;
    } finally {
      this.readLock.unlock();
    }
  }

RMAppAttemptImpl.getApplicationResourceUsageReport is called from two places:

First caller

org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl

  public ApplicationReport createAndGetApplicationReport(String clientUserName,
      boolean allowAccess) {
    ...
    appUsageReport = currentAttempt.getApplicationResourceUsageReport();
    ...

RMAppImpl.createAndGetApplicationReport is in turn called by ClientRMService.getApplications and ClientRMService.getApplicationReport, which correspond to the following commands respectively:

yarn application -list
yarn application -status $applicationId

Neither command includes usedResources in its output; perhaps the authors considered the real-time resource usage less important.

For details, see:
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService

Second caller

org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.AppInfo

  public AppInfo(RMApp app, Boolean hasAccess, String schemePrefix) {
    ...
    ApplicationResourceUsageReport resourceReport = attempt
        .getApplicationResourceUsageReport();
    if (resourceReport != null) {
      Resource usedResources = resourceReport.getUsedResources();
      allocatedMB = usedResources.getMemory();
      allocatedVCores = usedResources.getVirtualCores();
      runningContainers = resourceReport.getNumUsedContainers();
    }
    ...

This constructor is invoked from RMWebServices.getApp and RMWebServices.getApps. These are web service endpoints, and their URLs are, respectively:

http://resourcemanager/ws/v1/cluster/apps/$applicationId
http://resourcemanager/ws/v1/cluster/apps?state=RUNNING

The responses from both endpoints include the real-time resource usage, as follows:

<allocatedMB>56320</allocatedMB>
<allocatedVCores>21</allocatedVCores>

These are the real-time memory usage and the real-time CPU usage, respectively.

For details, see:
org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices
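
As a rough illustration of consuming these two fields, the following Java sketch extracts allocatedMB and allocatedVCores from an XML response body using only the JDK's built-in parser. The class name AppUsageParser, its method names, and the <app> wrapper element in the sample are assumptions made for this example; fetching the body from the ResourceManager is left out, so the input is a plain string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

final class AppUsageParser {
    /** Reads the first element with the given tag as a long; returns -1 if absent. */
    static long readLong(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            return -1L;
        }
        return Long.parseLong(nodes.item(0).getTextContent().trim());
    }

    /** Returns {allocatedMB, allocatedVCores} from an XML response body. */
    static long[] parseUsage(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return new long[] {
                readLong(doc, "allocatedMB"),
                readLong(doc, "allocatedVCores")
            };
        } catch (Exception e) {
            // Wrap parser errors so callers do not need checked-exception handling.
            throw new IllegalArgumentException("could not parse response", e);
        }
    }

    public static void main(String[] args) {
        // Sample body; the <app> root element is an assumption for this sketch.
        String sample = "<app><allocatedMB>56320</allocatedMB>"
            + "<allocatedVCores>21</allocatedVCores></app>";
        long[] usage = parseUsage(sample);
        // prints "allocatedMB=56320, allocatedVCores=21"
        System.out.println("allocatedMB=" + usage[0] + ", allocatedVCores=" + usage[1]);
    }
}
```

Note that for a response containing several applications, this sketch reads only the first occurrence of each tag.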

If you find that a Spark application is using more memory than you allocated, see: https://www.cnblogs.com/barneywill/p/10102353.html