OAL Basics

Introduction

OAL (Observability Analysis Language) is a language for analyzing streaming data.

Because OAL focuses only on metrics for Service, Service Instance, and Endpoint, it is easy to learn and use.

OAL relies on ANTLR and Javassist to turn OAL scripts into dynamically generated class files.

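Since version 6.3, the OAL engine has been embedded in the OAP server and can be thought of as oal-rt (OAL Runtime). OAL scripts live in the OAL configuration directory (/config/oal); users can edit a script and restart the server for the change to take effect. Note: OAL is still a compiled language; oal-rt generates the Java code dynamically.

If you set the environment variable SW_OAL_ENGINE_DEBUG=Y, the generated class files can be found in the oal-rt directory under the working directory.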

Syntax

    // Declare a metric
    METRICS_NAME = from(SCOPE.(* | [FIELD][,FIELD ...])) // read data from a SCOPE
    [.filter(FIELD OP [INT | STRING])] // optionally filter out part of the data
    .FUNCTION([PARAM][, PARAM ...]) // aggregate the data with an aggregation function

    // Disable a metric
    disable(METRICS_NAME);

Syntax example

oap-server/server-bootstrap/src/main/resources/oal/java-agent.oal

    // Read the used field of ServiceInstanceJVMMemory, keep only the records whose heapStatus is true, and take the long average
    instance_jvm_memory_heap = from(ServiceInstanceJVMMemory.used).filter(heapStatus == true).longAvg();
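To make the semantics of that OAL line concrete, here is a minimal plain-Java sketch of what it computes: keep the samples with heapStatus == true and average their used values. The class names LongAvgSketch and MemorySample are hypothetical stand-ins, not SkyWalking classes, and the integer division is an assumption about how the average is rounded.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of from(ServiceInstanceJVMMemory.used).filter(heapStatus == true).longAvg()
final class LongAvgSketch {

    // Stand-in for one ServiceInstanceJVMMemory record; only the two fields the OAL line touches.
    static final class MemorySample {
        final boolean heapStatus;
        final long used;

        MemorySample(boolean heapStatus, long used) {
            this.heapStatus = heapStatus;
            this.used = used;
        }
    }

    static long heapUsedAvg(List<MemorySample> samples) {
        long sum = 0;
        long count = 0;
        for (MemorySample sample : samples) {
            if (sample.heapStatus) {   // .filter(heapStatus == true)
                sum += sample.used;    // from(ServiceInstanceJVMMemory.used)
                count++;
            }
        }
        // longAvg(): sum divided by count; integer division is an assumption of this sketch
        return count == 0 ? 0 : sum / count;
    }

    public static void main(String[] args) {
        List<MemorySample> samples = Arrays.asList(
            new MemorySample(true, 100),
            new MemorySample(false, 900),  // non-heap sample, dropped by the filter
            new MemorySample(true, 300));
        System.out.println(heapUsedAvg(samples));
    }
}
```

In the real OAP server the aggregation is done per entity and per time bucket by generated classes; the sketch only shows the filter-then-average idea on a single list.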

org.apache.skywalking.oap.server.core.source.ServiceInstanceJVMMemory

    @ScopeDeclaration(id = SERVICE_INSTANCE_JVM_MEMORY, name = "ServiceInstanceJVMMemory", catalog = SERVICE_INSTANCE_CATALOG_NAME)
    @ScopeDefaultColumn.VirtualColumnDefinition(fieldName = "entityId", columnName = "entity_id", isID = true, type = String.class)
    public class ServiceInstanceJVMMemory extends Source {
        @Override
        public int scope() {
            return DefaultScopeDefine.SERVICE_INSTANCE_JVM_MEMORY;
        }

        @Override
        public String getEntityId() {
            return String.valueOf(id);
        }

        @Getter @Setter
        private String id;
        @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "name", requireDynamicActive = true)
        private String name;
        @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "service_name", requireDynamicActive = true)
        private String serviceName;
        @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "service_id")
        private String serviceId;
        @Getter @Setter
        private boolean heapStatus;
        @Getter @Setter
        private long init;
        @Getter @Setter
        private long max;
        @Getter @Setter
        private long used;
        @Getter @Setter
        private long committed;
    }

Official documentation for reference: Observability Analysis Language

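Analyzing how OAL works through a case study

The missing class-loading monitoring

The default APM/Instance page has no information about JVM classes (as shown in the figure below), so this case study adds it and uses the work to analyze how OAL operates.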

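"Skywalking-04: Extending Metric Monitoring Information" showed how to add metrics when a Source class already exists.

This time we define everything ourselves, including the Source class and the OAL lexer and parser keywords.

Official documentation for reference: Source and Scope extension for new metrics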

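Choosing the metrics to add

Based on the article "Java ManagementFactory Analysis", the metrics to monitor are three: the number of currently loaded classes, the number of unloaded classes, and the total number of classes loaded.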

    ClassLoadingMXBean classLoadingMXBean = ManagementFactory.getClassLoadingMXBean();
    // Number of classes currently loaded
    int loadedClassCount = classLoadingMXBean.getLoadedClassCount();
    // Number of classes unloaded so far
    long unloadedClassCount = classLoadingMXBean.getUnloadedClassCount();
    // Total number of classes loaded since the JVM started
    long totalLoadedClassCount = classLoadingMXBean.getTotalLoadedClassCount();
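The three calls above can be tried on any JVM without SkyWalking. The following is a small standalone sketch; the class name ClassLoadingSnapshot and the array layout are choices made for this example only.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Standalone sketch: read the three class-loading counters of the current JVM.
final class ClassLoadingSnapshot {

    // Returns {currently loaded, unloaded, total loaded}; the order is a convention of this sketch.
    static long[] read() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return new long[] {
            bean.getLoadedClassCount(),      // int in the API, widened to long here
            bean.getUnloadedClassCount(),
            bean.getTotalLoadedClassCount()
        };
    }

    public static void main(String[] args) {
        long[] counts = read();
        System.out.println("loaded=" + counts[0]
            + " unloaded=" + counts[1]
            + " totalLoaded=" + counts[2]);
    }
}
```

The values are read one after another rather than atomically, so they may be slightly inconsistent with each other if classes are loaded between the calls.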

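Defining the message exchanged between the agent and the OAP server

Add the following definitions to the protocol file apm-protocol/apm-network/src/main/proto/language-agent/JVMMetric.proto.

Running mvn clean package -DskipTests=true in the apm-protocol/apm-network directory generates the corresponding Java classes. org.apache.skywalking.apm.network.language.agent.v3.Class is the class we actually work with in code.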

    message Class {
        int64 loadedClassCount = 1;
        int64 unloadedClassCount = 3;
        int64 totalLoadedClassCount = 2;
    }

    message JVMMetric {
        int64 time = 1;
        CPU cpu = 2;
        repeated Memory memory = 3;
        repeated MemoryPool memoryPool = 4;
        repeated GC gc = 5;
        Thread thread = 6;
        // Add the Class definition to the JVM metrics
        Class clazz = 7;
    }
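Note that the fields of the Class message are declared in the order 1, 3, 2. This is harmless because Protocol Buffers identifies a field on the wire by its number, not by its position in the file. As a rough illustration, the sketch below computes the tag that precedes each field value, using the standard rule tag = (field number << 3) | wire type. ProtoTagSketch is a hypothetical helper written for this note; it is not part of the protobuf library.

```java
// Sketch of how a protobuf field number becomes the tag written before the field value.
final class ProtoTagSketch {
    static final int WIRETYPE_VARINT = 0;            // used by int64 fields
    static final int WIRETYPE_LENGTH_DELIMITED = 2;  // used by embedded messages such as Class

    static int tag(int fieldNumber, int wireType) {
        return (fieldNumber << 3) | wireType;
    }

    public static void main(String[] args) {
        System.out.println("loadedClassCount      tag=" + tag(1, WIRETYPE_VARINT));
        System.out.println("unloadedClassCount    tag=" + tag(3, WIRETYPE_VARINT));
        System.out.println("totalLoadedClassCount tag=" + tag(2, WIRETYPE_VARINT));
        System.out.println("JVMMetric.clazz       tag=" + tag(7, WIRETYPE_LENGTH_DELIMITED));
    }
}
```

The practical consequence is that agent and OAP server must be built from the same field numbers; renumbering a field later would change its tag and break compatibility between the two sides.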

Collecting the information in the agent and sending it to the OAP server

Collecting the Class-related metrics

    package org.apache.skywalking.apm.agent.core.jvm.clazz;

    // Note: this is the generated protobuf message, not java.lang.Class
    import org.apache.skywalking.apm.network.language.agent.v3.Class;
    import java.lang.management.ClassLoadingMXBean;
    import java.lang.management.ManagementFactory;

    public enum ClassProvider {
        /**
         * instance
         */
        INSTANCE;

        private final ClassLoadingMXBean classLoadingMXBean;

        ClassProvider() {
            this.classLoadingMXBean = ManagementFactory.getClassLoadingMXBean();
        }

        // Build the class metrics message
        public Class getClassMetrics() {
            int loadedClassCount = classLoadingMXBean.getLoadedClassCount();
            long unloadedClassCount = classLoadingMXBean.getUnloadedClassCount();
            long totalLoadedClassCount = classLoadingMXBean.getTotalLoadedClassCount();
            return Class.newBuilder().setLoadedClassCount(loadedClassCount)
                        .setUnloadedClassCount(unloadedClassCount)
                        .setTotalLoadedClassCount(totalLoadedClassCount)
                        .build();
        }
    }

In the method org.apache.skywalking.apm.agent.core.jvm.JVMService#run, set the class metrics on the JVM metric builder

    @Override
    public void run() {
        long currentTimeMillis = System.currentTimeMillis();
        try {
            JVMMetric.Builder jvmBuilder = JVMMetric.newBuilder();
            jvmBuilder.setTime(currentTimeMillis);
            jvmBuilder.setCpu(CPUProvider.INSTANCE.getCpuMetric());
            jvmBuilder.addAllMemory(MemoryProvider.INSTANCE.getMemoryMetricList());
            jvmBuilder.addAllMemoryPool(MemoryPoolProvider.INSTANCE.getMemoryPoolMetricsList());
            jvmBuilder.addAllGc(GCProvider.INSTANCE.getGCList());
            jvmBuilder.setThread(ThreadProvider.INSTANCE.getThreadMetrics());
            // Set the class metrics
            jvmBuilder.setClazz(ClassProvider.INSTANCE.getClassMetrics());
            // Put the JVM metrics into the blocking queue;
            // org.apache.skywalking.apm.agent.core.jvm.JVMMetricsSender#run then sends them to the OAP server
            sender.offer(jvmBuilder.build());
        } catch (Exception e) {
            LOGGER.error(e, "Collect JVM info fail.");
        }
    }
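The hand-off described in the comments (the collector offers a metric to a blocking queue, and the sender later ships whatever has accumulated) can be sketched with the JDK alone. MetricQueueSketch and its methods are hypothetical names, and String stands in for JVMMetric; the real JVMMetricsSender may handle a full buffer differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of a bounded buffer between a metrics collector and a metrics sender.
final class MetricQueueSketch {
    private final LinkedBlockingQueue<String> queue;

    MetricQueueSketch(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    // Collector side: offer() never blocks; it returns false when the buffer is full.
    boolean offer(String metric) {
        return queue.offer(metric);
    }

    // Sender side: move everything buffered so far into one batch to be sent.
    List<String> drainBatch() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) {
        MetricQueueSketch buffer = new MetricQueueSketch(2);
        System.out.println(buffer.offer("metric-1"));
        System.out.println(buffer.offer("metric-2"));
        System.out.println(buffer.offer("metric-3")); // buffer is full at this point
        System.out.println(buffer.drainBatch());
    }
}
```

Using a non-blocking offer keeps the collection thread from stalling when the OAP server is slow or unreachable, at the cost of dropping samples once the buffer is full.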

Creating the Source class

    public class DefaultScopeDefine {
        // Scope id for the new Source (other existing constants omitted)
        public static final int SERVICE_INSTANCE_JVM_CLASS = 11000;
        /** Catalog of scope, the metrics processor could use this to group all generated metrics by oal rt. */
        public static final String SERVICE_INSTANCE_CATALOG_NAME = "SERVICE_INSTANCE";
    }

    package org.apache.skywalking.oap.server.core.source;

    import lombok.Getter;
    import lombok.Setter;
    import static org.apache.skywalking.oap.server.core.source.DefaultScopeDefine.SERVICE_INSTANCE_CATALOG_NAME;
    import static org.apache.skywalking.oap.server.core.source.DefaultScopeDefine.SERVICE_INSTANCE_JVM_CLASS;

    @ScopeDeclaration(id = SERVICE_INSTANCE_JVM_CLASS, name = "ServiceInstanceJVMClass", catalog = SERVICE_INSTANCE_CATALOG_NAME)
    @ScopeDefaultColumn.VirtualColumnDefinition(fieldName = "entityId", columnName = "entity_id", isID = true, type = String.class)
    public class ServiceInstanceJVMClass extends Source {
        @Override
        public int scope() {
            return SERVICE_INSTANCE_JVM_CLASS;
        }

        @Override
        public String getEntityId() {
            return String.valueOf(id);
        }

        @Getter @Setter
        private String id;
        @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "name", requireDynamicActive = true)
        private String name;
        @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "service_name", requireDynamicActive = true)
        private String serviceName;
        @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "service_id")
        private String serviceId;
        @Getter @Setter
        private long loadedClassCount;
        @Getter @Setter
        private long unloadedClassCount;
        @Getter @Setter
        private long totalLoadedClassCount;
    }

Sending the information received from the agent to the SourceReceiver

Modify org.apache.skywalking.oap.server.analyzer.provider.jvm.JVMSourceDispatcher as follows

    public void sendMetric(String service, String serviceInstance, JVMMetric metrics) {
        long minuteTimeBucket = TimeBucket.getMinuteTimeBucket(metrics.getTime());
        final String serviceId = IDManager.ServiceID.buildId(service, NodeType.Normal);
        final String serviceInstanceId = IDManager.ServiceInstanceID.buildId(serviceId, serviceInstance);
        this.sendToCpuMetricProcess(
            service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getCpu());
        this.sendToMemoryMetricProcess(
            service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getMemoryList());
        this.sendToMemoryPoolMetricProcess(
            service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getMemoryPoolList());
        this.sendToGCMetricProcess(
            service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getGcList());
        this.sendToThreadMetricProcess(
            service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getThread());
        // Handle the class metrics
        this.sendToClassMetricProcess(
            service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getClazz());
    }

    // Class here is org.apache.skywalking.apm.network.language.agent.v3.Class, not java.lang.Class
    private void sendToClassMetricProcess(String service,
                                          String serviceId,
                                          String serviceInstance,
                                          String serviceInstanceId,
                                          long timeBucket,
                                          Class clazz) {
        // Assemble the Source object
        ServiceInstanceJVMClass serviceInstanceJVMClass = new ServiceInstanceJVMClass();
        serviceInstanceJVMClass.setId(serviceInstanceId);
        serviceInstanceJVMClass.setName(serviceInstance);
        serviceInstanceJVMClass.setServiceId(serviceId);
        serviceInstanceJVMClass.setServiceName(service);
        serviceInstanceJVMClass.setLoadedClassCount(clazz.getLoadedClassCount());
        serviceInstanceJVMClass.setUnloadedClassCount(clazz.getUnloadedClassCount());
        serviceInstanceJVMClass.setTotalLoadedClassCount(clazz.getTotalLoadedClassCount());
        serviceInstanceJVMClass.setTimeBucket(timeBucket);
        // Hand the Source object to the SourceReceiver for processing
        sourceReceiver.receive(serviceInstanceJVMClass);
    }
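The dispatcher stamps every Source with a minute-level time bucket, so that all samples from the same minute are aggregated together. The sketch below shows the idea under the assumption that a minute bucket is the timestamp truncated to the minute and written as a yyyyMMddHHmm number. MinuteBucketSketch is a hypothetical helper, not the SkyWalking TimeBucket class, and the explicit time zone parameter is an addition made here to keep the result deterministic.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Sketch: map an epoch-millisecond timestamp to a minute bucket such as 197001010000.
final class MinuteBucketSketch {
    private static final DateTimeFormatter MINUTE = DateTimeFormatter.ofPattern("yyyyMMddHHmm");

    // Assumption: the bucket is the local date-time, truncated to the minute, as a number.
    static long minuteBucket(long epochMillis, ZoneId zone) {
        String text = MINUTE.format(Instant.ofEpochMilli(epochMillis).atZone(zone));
        return Long.parseLong(text);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(minuteBucket(now, ZoneOffset.UTC));
    }
}
```

Two samples taken a few seconds apart within the same minute map to the same bucket, which is what lets an aggregation such as longAvg() combine them into one value per minute.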

Adding the Source to the OAL lexer and parser definitions

In oap-server/oal-grammar/src/main/antlr4/org/apache/skywalking/oal/rt/grammar/OALLexer.g4, define the Class keyword

    // Keywords
    FROM: 'from';
    FILTER: 'filter';
    DISABLE: 'disable';
    SRC_ALL: 'All';
    SRC_SERVICE: 'Service';
    SRC_SERVICE_INSTANCE: 'ServiceInstance';
    SRC_ENDPOINT: 'Endpoint';
    SRC_SERVICE_RELATION: 'ServiceRelation';
    SRC_SERVICE_INSTANCE_RELATION: 'ServiceInstanceRelation';
    SRC_ENDPOINT_RELATION: 'EndpointRelation';
    SRC_SERVICE_INSTANCE_JVM_CPU: 'ServiceInstanceJVMCPU';
    SRC_SERVICE_INSTANCE_JVM_MEMORY: 'ServiceInstanceJVMMemory';
    SRC_SERVICE_INSTANCE_JVM_MEMORY_POOL: 'ServiceInstanceJVMMemoryPool';
    SRC_SERVICE_INSTANCE_JVM_GC: 'ServiceInstanceJVMGC';
    SRC_SERVICE_INSTANCE_JVM_THREAD: 'ServiceInstanceJVMThread';
    SRC_SERVICE_INSTANCE_JVM_CLASS: 'ServiceInstanceJVMClass'; // new: the Class keyword in the OAL lexer
    SRC_DATABASE_ACCESS: 'DatabaseAccess';
    SRC_SERVICE_INSTANCE_CLR_CPU: 'ServiceInstanceCLRCPU';
    SRC_SERVICE_INSTANCE_CLR_GC: 'ServiceInstanceCLRGC';
    SRC_SERVICE_INSTANCE_CLR_THREAD: 'ServiceInstanceCLRThread';
    SRC_ENVOY_INSTANCE_METRIC: 'EnvoyInstanceMetric';

In oap-server/oal-grammar/src/main/antlr4/org/apache/skywalking/oal/rt/grammar/OALParser.g4, add the Class keyword

    source
        : SRC_ALL | SRC_SERVICE | SRC_DATABASE_ACCESS | SRC_SERVICE_INSTANCE | SRC_ENDPOINT |
          SRC_SERVICE_RELATION | SRC_SERVICE_INSTANCE_RELATION | SRC_ENDPOINT_RELATION |
          SRC_SERVICE_INSTANCE_JVM_CPU | SRC_SERVICE_INSTANCE_JVM_MEMORY | SRC_SERVICE_INSTANCE_JVM_MEMORY_POOL |
          SRC_SERVICE_INSTANCE_JVM_GC | SRC_SERVICE_INSTANCE_JVM_THREAD | SRC_SERVICE_INSTANCE_JVM_CLASS | // new: the keyword defined in the lexer
          SRC_SERVICE_INSTANCE_CLR_CPU | SRC_SERVICE_INSTANCE_CLR_GC | SRC_SERVICE_INSTANCE_CLR_THREAD |
          SRC_ENVOY_INSTANCE_METRIC |
          SRC_BROWSER_APP_PERF | SRC_BROWSER_APP_PAGE_PERF | SRC_BROWSER_APP_SINGLE_VERSION_PERF |
          SRC_BROWSER_APP_TRAFFIC | SRC_BROWSER_APP_PAGE_TRAFFIC | SRC_BROWSER_APP_SINGLE_VERSION_TRAFFIC
        ;

Running mvn clean package -DskipTests=true in the oap-server/oal-grammar directory generates the corresponding Java classes.
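Why the keyword has to be registered becomes clearer with a toy model of the grammar: a metric line is accepted only if the name after from( is one of the known source keywords. The sketch below uses a regular expression instead of ANTLR and handles only the simplest form, without filter() and without function parameters. OalLineSketch, its SOURCES set, and its pattern are all illustrative assumptions, not the generated parser.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model of "source must be a known keyword"; not the real ANTLR-generated parser.
final class OalLineSketch {
    // A few of the source keywords from OALLexer.g4, including the newly added one.
    static final Set<String> SOURCES = new HashSet<>(Arrays.asList(
        "ServiceInstance", "ServiceInstanceJVMThread", "ServiceInstanceJVMClass"));

    // Matches: name = from(Scope.field).function();
    static final Pattern LINE = Pattern.compile(
        "\\s*(\\w+)\\s*=\\s*from\\((\\w+)\\.(\\w+)\\)\\.(\\w+)\\(\\);\\s*");

    // Returns {metricName, scope, field, function} or throws if the line is not accepted.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a simple OAL metric line: " + line);
        }
        String scope = m.group(2);
        if (!SOURCES.contains(scope)) {
            throw new IllegalArgumentException("unknown source keyword: " + scope);
        }
        return new String[] {m.group(1), scope, m.group(3), m.group(4)};
    }

    public static void main(String[] args) {
        String[] parts = parse(
            "instance_jvm_class_loaded_class_count = from(ServiceInstanceJVMClass.loadedClassCount).longAvg();");
        System.out.println(Arrays.toString(parts));
    }
}
```

Removing "ServiceInstanceJVMClass" from SOURCES makes the same line fail, which mirrors what happens when a script references a Source that the grammar does not know about.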

Defining the OAL metrics

In oap-server/server-bootstrap/src/main/resources/oal/java-agent.oal, add the Class-related metric definitions written in OAL syntax

    // Number of classes currently loaded
    instance_jvm_class_loaded_class_count = from(ServiceInstanceJVMClass.loadedClassCount).longAvg();
    // Number of classes unloaded so far
    instance_jvm_class_unloaded_class_count = from(ServiceInstanceJVMClass.unloadedClassCount).longAvg();
    // Total number of classes loaded
    instance_jvm_class_total_loaded_class_count = from(ServiceInstanceJVMClass.totalLoadedClassCount).longAvg();

Configuring the UI dashboard

Import the following dashboard configuration into the APM dashboard

    {
      "name": "Instance",
      "children": [
        {
          "width": "3",
          "title": "Service Instance Load",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "service_instance_cpm",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "CPM - calls per minute"
        },
        {
          "width": 3,
          "title": "Service Instance Throughput",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "service_instance_throughput_received,service_instance_throughput_sent",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "Bytes"
        },
        {
          "width": "3",
          "title": "Service Instance Successful Rate",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "service_instance_sla",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "%",
          "aggregation": "/",
          "aggregationNum": "100"
        },
        {
          "width": "3",
          "title": "Service Instance Latency",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "service_instance_resp_time",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "ms"
        },
        {
          "width": 3,
          "title": "JVM CPU (Java Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_jvm_cpu",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "%",
          "aggregation": "+",
          "aggregationNum": ""
        },
        {
          "width": 3,
          "title": "JVM Memory (Java Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_jvm_memory_heap, instance_jvm_memory_heap_max,instance_jvm_memory_noheap, instance_jvm_memory_noheap_max",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "MB",
          "aggregation": "/",
          "aggregationNum": "1048576"
        },
        {
          "width": 3,
          "title": "JVM GC Time",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_jvm_young_gc_time, instance_jvm_old_gc_time",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "ms"
        },
        {
          "width": 3,
          "title": "JVM GC Count",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartBar",
          "metricName": "instance_jvm_young_gc_count, instance_jvm_old_gc_count"
        },
        {
          "width": 3,
          "title": "JVM Thread Count (Java Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "metricName": "instance_jvm_thread_live_count, instance_jvm_thread_daemon_count, instance_jvm_thread_peak_count,instance_jvm_thread_deadlocked,instance_jvm_thread_monitor_deadlocked"
        },
        {
          "width": 3,
          "title": "JVM Thread State Count (Java Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_jvm_thread_new_thread_count,instance_jvm_thread_runnable_thread_count,instance_jvm_thread_blocked_thread_count,instance_jvm_thread_wait_thread_count,instance_jvm_thread_time_wait_thread_count,instance_jvm_thread_terminated_thread_count",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartBar"
        },
        {
          "width": 3,
          "title": "JVM Class Count (Java Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_jvm_class_loaded_class_count,instance_jvm_class_unloaded_class_count,instance_jvm_class_total_loaded_class_count",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartArea"
        },
        {
          "width": 3,
          "title": "CLR CPU (.NET Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_clr_cpu",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "%"
        },
        {
          "width": 3,
          "title": "CLR GC (.NET Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_clr_gen0_collect_count, instance_clr_gen1_collect_count, instance_clr_gen2_collect_count",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartBar"
        },
        {
          "width": 3,
          "title": "CLR Heap Memory (.NET Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_clr_heap_memory",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "MB",
          "aggregation": "/",
          "aggregationNum": "1048576"
        },
        {
          "width": 3,
          "title": "CLR Thread (.NET Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "metricName": "instance_clr_available_completion_port_threads,instance_clr_available_worker_threads,instance_clr_max_completion_port_threads,instance_clr_max_worker_threads"
        }
      ]
    }
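Each panel lists its metrics as one comma-separated metricName string, and those names must match the metric names declared in the OAL script exactly. A quick way to cross-check a panel against the script is to split the string and compare the names. The helper below is a hypothetical sketch for that check; trimming the whitespace around each name is an assumption made here, since some entries in the configuration have a space after the comma.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: split a panel's metricName value into individual metric names.
final class MetricNameListSketch {

    static List<String> split(String metricName) {
        List<String> names = new ArrayList<>();
        for (String part : metricName.split(",")) {
            String name = part.trim();   // assumption: surrounding spaces are not significant
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Names declared in java-agent.oal for the new Class metrics.
        List<String> declared = Arrays.asList(
            "instance_jvm_class_loaded_class_count",
            "instance_jvm_class_unloaded_class_count",
            "instance_jvm_class_total_loaded_class_count");
        // metricName value of the "JVM Class Count (Java Service)" panel.
        List<String> configured = split(
            "instance_jvm_class_loaded_class_count,instance_jvm_class_unloaded_class_count,instance_jvm_class_total_loaded_class_count");
        System.out.println(declared.containsAll(configured) && configured.containsAll(declared));
    }
}
```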

Verifying the result

The imported dashboard now shows the Class-related metrics

