On the YARN application detail page
http://resourcemanager/cluster/app/$applicationId
or through the application command
yarn application -status $applicationId
you can only see the resource*time totals the application has consumed since it started, for example:
Aggregate Resource Allocation : 3962853 MB-seconds, 1466 vcore-seconds
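To make the aggregate figure concrete, here is a minimal sketch that parses the line above and divides it by an elapsed time to get an average. The class AggregateAllocation is a hypothetical helper, not part of YARN, and the 70-second elapsed time in main is an assumed value for illustration only. An average over the whole run is not the same thing as the current usage, which is the gap the rest of this post looks at.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of YARN): parses the aggregate line printed by
// `yarn application -status` and derives averages from it.
final class AggregateAllocation {
    private static final Pattern LINE = Pattern.compile(
        "Aggregate Resource Allocation\\s*:\\s*(\\d+) MB-seconds,\\s*(\\d+) vcore-seconds");

    final long memorySeconds;
    final long vcoreSeconds;

    AggregateAllocation(long memorySeconds, long vcoreSeconds) {
        this.memorySeconds = memorySeconds;
        this.vcoreSeconds = vcoreSeconds;
    }

    // Extracts the two counters; throws if the line has a different shape.
    static AggregateAllocation parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not an aggregate allocation line: " + line);
        }
        return new AggregateAllocation(Long.parseLong(m.group(1)), Long.parseLong(m.group(2)));
    }

    // Average MB held over the run; the caller must supply the elapsed time.
    double averageMb(long elapsedSeconds) {
        return (double) memorySeconds / elapsedSeconds;
    }

    // Average vcores held over the run.
    double averageVcores(long elapsedSeconds) {
        return (double) vcoreSeconds / elapsedSeconds;
    }

    public static void main(String[] args) {
        AggregateAllocation a = parse(
            "Aggregate Resource Allocation : 3962853 MB-seconds, 1466 vcore-seconds");
        long elapsedSeconds = 70; // assumed value, for illustration only
        System.out.printf("average MB = %.1f, average vcores = %.2f%n",
            a.averageMb(elapsedSeconds), a.averageVcores(elapsedSeconds));
    }
}
```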
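Nowhere can you find the application's current real-time resource usage, such as how much memory and how many cores it holds right now. Tracing through the YARN code shows that this statistic does in fact exist: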
org.apache.hadoop.yarn.api.records.ApplicationResourceUsageReport
  public static ApplicationResourceUsageReport newInstance(
      int numUsedContainers, int numReservedContainers, Resource usedResources,
      Resource reservedResources, Resource neededResources, long memorySeconds,
      long vcoreSeconds) {
    ApplicationResourceUsageReport report =
        Records.newRecord(ApplicationResourceUsageReport.class);
    report.setNumUsedContainers(numUsedContainers);
    report.setNumReservedContainers(numReservedContainers);
    report.setUsedResources(usedResources);
    report.setReservedResources(reservedResources);
    report.setNeededResources(neededResources);
    report.setMemorySeconds(memorySeconds);
    report.setVcoreSeconds(vcoreSeconds);
    return report;
  }
Here usedResources is the current real-time resource usage, covering both memory and CPU. This statistic is returned by a method on the YarnScheduler interface:
org.apache.hadoop.yarn.server.resourcemanager.scheduler.YarnScheduler
  /**
   * Get a resource usage report from a given app attempt ID.
   * @param appAttemptId the id of the application attempt
   * @return resource usage report for this given attempt
   */
  @LimitedPrivate("yarn")
  @Evolving
  ApplicationResourceUsageReport getAppResourceUsageReport(
      ApplicationAttemptId appAttemptId);
The getAppResourceUsageReport method is called by RMAppAttemptImpl.getApplicationResourceUsageReport:
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl
  @Override
  public ApplicationResourceUsageReport getApplicationResourceUsageReport() {
    this.readLock.lock();
    try {
      ApplicationResourceUsageReport report =
          scheduler.getAppResourceUsageReport(this.getAppAttemptId());
      if (report == null) {
        report = RMServerUtils.DUMMY_APPLICATION_RESOURCE_USAGE_REPORT;
      }
      AggregateAppResourceUsage resUsage =
          this.attemptMetrics.getAggregateAppResourceUsage();
      report.setMemorySeconds(resUsage.getMemorySeconds());
      report.setVcoreSeconds(resUsage.getVcoreSeconds());
      return report;
    } finally {
      this.readLock.unlock();
    }
  }
RMAppAttemptImpl.getApplicationResourceUsageReport is called from two places:
First caller
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl
  public ApplicationReport createAndGetApplicationReport(String clientUserName,
      boolean allowAccess) {
    ...
    appUsageReport = currentAttempt.getApplicationResourceUsageReport();
    ...
RMAppImpl.createAndGetApplicationReport is called by ClientRMService.getApplications and ClientRMService.getApplicationReport, which correspond to the following commands respectively:
yarn application -list
yarn application -status $applicationId
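Neither command shows usedResources in its output; perhaps the authors did not consider the real-time resource usage statistic that important.
For details, see: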
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService
Second caller
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.AppInfo
  public AppInfo(RMApp app, Boolean hasAccess, String schemePrefix) {
    ...
    ApplicationResourceUsageReport resourceReport = attempt
        .getApplicationResourceUsageReport();
    if (resourceReport != null) {
      Resource usedResources = resourceReport.getUsedResources();
      allocatedMB = usedResources.getMemory();
      allocatedVCores = usedResources.getVirtualCores();
      runningContainers = resourceReport.getNumUsedContainers();
    }
    ...
This constructor is invoked from RMWebServices.getApp and RMWebServices.getApps. These are web service (REST) endpoints, and their URLs are, respectively:
http://resourcemanager/ws/v1/cluster/apps/$applicationId
http://resourcemanager/ws/v1/cluster/apps?state=RUNNING
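As a rough sketch of calling these two endpoints from code, the following uses the JDK's java.net.http client (Java 11 or later). The class RmAppsClient and its method names are hypothetical, and "http://resourcemanager" is the same placeholder host used in the URLs above; substitute your own ResourceManager web address. The Accept header asks for XML.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client (not part of YARN) for the two ResourceManager endpoints.
final class RmAppsClient {
    // URL of a single application, e.g. .../ws/v1/cluster/apps/$applicationId
    static URI appUri(String rmBase, String applicationId) {
        return URI.create(rmBase + "/ws/v1/cluster/apps/" + applicationId);
    }

    // URL listing all applications in the RUNNING state.
    static URI runningAppsUri(String rmBase) {
        return URI.create(rmBase + "/ws/v1/cluster/apps?state=RUNNING");
    }

    // Performs a GET and returns the response body as XML text.
    static String fetchXml(URI uri) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uri)
            .header("Accept", "application/xml")
            .GET()
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: RmAppsClient <rmBaseUrl> <applicationId>");
            return;
        }
        // args[0] is the ResourceManager base URL, args[1] the application id.
        System.out.println(fetchXml(appUri(args[0], args[1])));
    }
}
```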
The responses from these two endpoints include the real-time resource usage, as follows:
<allocatedMB>56320</allocatedMB>
<allocatedVCores>21</allocatedVCores>
These are the real-time memory usage (in MB) and the real-time CPU usage (in vcores), respectively.
For details, see:
org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices
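Pulling the two values out of such a response could look like the sketch below, which uses only the JDK's built-in XML parser. The class AppUsageXml is a hypothetical helper, and the sample document in main is trimmed down to the two elements shown above; the enclosing app element is an assumption about the response shape, and a real response carries many more fields.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of YARN): reads numeric fields from the XML
// returned by the ResourceManager apps endpoints.
final class AppUsageXml {
    // Returns the numeric text of the first element named `tag`.
    static long readLong(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tag);
            if (nodes.getLength() == 0) {
                throw new IllegalArgumentException("missing <" + tag + "> element");
            }
            return Long.parseLong(nodes.item(0).getTextContent().trim());
        } catch (IllegalArgumentException e) {
            throw e; // missing element or non-numeric text
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Trimmed sample; the <app> wrapper is an assumed response shape.
        String xml = "<app>"
            + "<allocatedMB>56320</allocatedMB>"
            + "<allocatedVCores>21</allocatedVCores>"
            + "</app>";
        System.out.println("allocatedMB = " + readLong(xml, "allocatedMB"));
        System.out.println("allocatedVCores = " + readLong(xml, "allocatedVCores"));
    }
}
```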
If you find that a Spark application uses more memory than you allocated, see: https://www.cnblogs.com/barneywill/p/10102353.html