Spring Cloud Ribbon: A Pitfall and a Walkthrough of How It Works
Disclaimer: I did not write this code =_=
Symptoms
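A couple of days ago I ran into a Ribbon-related problem that seemed worth writing down. On the surface, a public-facing API returned an "internal error" response. That is our unified error message, which Spring's exception handler returns whenever it catches an otherwise uncaught exception. So I went to the logging platform to look at the actual exception: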
org.springframework.web.client.HttpClientErrorException: 404 null
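Some background first: the component with the problem is an open API gateway which, put simply, forwards incoming requests to the corresponding backend services. It uses Ribbon for client-side load balancing and Eureka for service discovery.
Given the 404, I first checked the request URL and its parameters and found nothing wrong, and the backend service had not received the request at all. That was odd. I began to suspect that a cache in Ribbon or Eureka was sending the request to the wrong IP or port, but the logs only printed the Eureka serviceId rather than the actual ip:port, so I first added some logging: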
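import java.io.IOException;

import lombok.extern.slf4j.Slf4j;
import org.springframework.http.HttpRequest;
import org.springframework.http.client.ClientHttpRequestExecution;
import org.springframework.http.client.ClientHttpRequestInterceptor;
import org.springframework.http.client.ClientHttpResponse;

@Slf4j
public class CustomHttpRequestInterceptor implements ClientHttpRequestInterceptor {

    @Override
    public ClientHttpResponse intercept(HttpRequest request, byte[] body,
            ClientHttpRequestExecution execution) throws IOException {
        // Log the URI as this interceptor sees it, then continue down the chain.
        log.info("Request , url:{},method:{}.", request.getURI(), request.getMethod());
        return execution.execute(request, body);
    }
}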
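This is done by adding an interceptor to the RestTemplate. Note, however, that Ribbon also resolves the serviceId to the actual ip:port through a RestTemplate interceptor, so the order matters: the logging interceptor has to be added after Ribbon's LoadBalancerInterceptor. I added it in the callback for Spring's initialization-complete event. I also added a second log statement: when this exception is caught, it uses Eureka's DiscoveryClient#getInstances to fetch the current instance information.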
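Why the order matters can be shown with a small plain-Java model of an interceptor chain. This is only a sketch: the Interceptor and Chain types are hypothetical stand-ins for Spring's ClientHttpRequestInterceptor and ClientHttpRequestExecution, and the address 10.0.0.5:8080 is made up.

```java
import java.util.ArrayList;
import java.util.List;

public class InterceptorOrderDemo {

    /** Hypothetical stand-in for ClientHttpRequestInterceptor. */
    interface Interceptor {
        String intercept(String url, Chain chain);
    }

    /** Walks the interceptor list in order; the end of the chain returns the URL that would be sent. */
    static final class Chain {
        private final List<Interceptor> interceptors;
        private final int index;

        Chain(List<Interceptor> interceptors, int index) {
            this.interceptors = interceptors;
            this.index = index;
        }

        String proceed(String url) {
            if (index == interceptors.size()) {
                return url;
            }
            return interceptors.get(index).intercept(url, new Chain(interceptors, index + 1));
        }
    }

    /** Runs the chain once and returns the URL that the logging interceptor observed. */
    public static String urlSeenByLogger(boolean loggerAfterLoadBalancer) {
        List<String> seen = new ArrayList<>();
        // Stand-in for LoadBalancerInterceptor: swaps the serviceId for a made-up address.
        Interceptor loadBalancer =
                (url, chain) -> chain.proceed(url.replace("user-service", "10.0.0.5:8080"));
        // Stand-in for the logging interceptor: records what it sees.
        Interceptor logger = (url, chain) -> {
            seen.add(url);
            return chain.proceed(url);
        };

        List<Interceptor> interceptors = new ArrayList<>();
        if (loggerAfterLoadBalancer) {
            interceptors.add(loadBalancer);
            interceptors.add(logger);
        } else {
            interceptors.add(logger);
            interceptors.add(loadBalancer);
        }
        new Chain(interceptors, 0).proceed("http://user-service/api/users");
        return seen.get(0);
    }

    public static void main(String[] args) {
        System.out.println(urlSeenByLogger(false)); // http://user-service/api/users
        System.out.println(urlSeenByLogger(true));  // http://10.0.0.5:8080/api/users
    }
}
```

A logger placed before the load-balancing step only ever sees the serviceId, which is exactly the information that was missing from the original logs.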
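I then reproduced the problem in the test environment. The logs showed that the instance information cached by Eureka was correct, but the address actually called belonged to a different service, which is what caused the 404.
Source Code Analysis
From the above we know the problem lies in Ribbon. The specific cause comes later; first, here is how Spring Cloud Ribbon is initialized.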
@Configuration
@ConditionalOnClass({ IClient.class, RestTemplate.class, AsyncRestTemplate.class, Ribbon.class })
@RibbonClients
@AutoConfigureAfter(name = "org.springframework.cloud.netflix.eureka.EurekaClientAutoConfiguration")
@AutoConfigureBefore({ LoadBalancerAutoConfiguration.class, AsyncLoadBalancerAutoConfiguration.class })
@EnableConfigurationProperties({ RibbonEagerLoadProperties.class, ServerIntrospectorProperties.class })
public class RibbonAutoConfiguration {
}
Note the @RibbonClients annotation. It is what you use to override the default Ribbon configuration provided by Spring Cloud. The class that ultimately processes it is:
public class RibbonClientConfigurationRegistrar implements ImportBeanDefinitionRegistrar {

    @Override
    public void registerBeanDefinitions(AnnotationMetadata metadata,
            BeanDefinitionRegistry registry) {
        Map<String, Object> attrs = metadata.getAnnotationAttributes(
                RibbonClients.class.getName(), true);
        if (attrs != null && attrs.containsKey("value")) {
            AnnotationAttributes[] clients = (AnnotationAttributes[]) attrs.get("value");
            for (AnnotationAttributes client : clients) {
                registerClientConfiguration(registry, getClientName(client),
                        client.get("configuration"));
            }
        }
        if (attrs != null && attrs.containsKey("defaultConfiguration")) {
            String name;
            if (metadata.hasEnclosingClass()) {
                name = "default." + metadata.getEnclosingClassName();
            } else {
                name = "default." + metadata.getClassName();
            }
            registerClientConfiguration(registry, name,
                    attrs.get("defaultConfiguration"));
        }
        Map<String, Object> client = metadata.getAnnotationAttributes(
                RibbonClient.class.getName(), true);
        String name = getClientName(client);
        if (name != null) {
            registerClientConfiguration(registry, name, client.get("configuration"));
        }
    }

    private String getClientName(Map<String, Object> client) {
        if (client == null) {
            return null;
        }
        String value = (String) client.get("value");
        if (!StringUtils.hasText(value)) {
            value = (String) client.get("name");
        }
        if (StringUtils.hasText(value)) {
            return value;
        }
        throw new IllegalStateException(
                "Either 'name' or 'value' must be provided in @RibbonClient");
    }

    private void registerClientConfiguration(BeanDefinitionRegistry registry,
            Object name, Object configuration) {
        BeanDefinitionBuilder builder = BeanDefinitionBuilder
                .genericBeanDefinition(RibbonClientSpecification.class);
        builder.addConstructorArgValue(name);
        builder.addConstructorArgValue(configuration);
        registry.registerBeanDefinition(name + ".RibbonClientSpecification",
                builder.getBeanDefinition());
    }
}
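attrs contains defaultConfiguration, so a bean of type RibbonClientSpecification is registered. Note that its name starts with default. followed by the name of the annotated class, which here is RibbonAutoConfiguration (as noted above, RibbonAutoConfiguration is annotated with @RibbonClients).
Now back to the source shown above: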
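The default. prefix matters because the context factory shown later in this article registers every specification whose name starts with default. into each client's context, in addition to the client's own specification. The following is a minimal plain-Java sketch of that naming and selection rule. The configuration names (SharedRibbonConfig and so on) and the com.example.GatewayApplication class are hypothetical, and plain strings stand in for configuration classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SpecificationSelectionDemo {

    /** Mirrors the naming rule in registerBeanDefinitions: "default." + annotated class name. */
    public static String defaultSpecName(String annotatedClassName) {
        return "default." + annotatedClassName;
    }

    /** Configurations that apply to one client: its own first, then every "default." entry. */
    public static List<String> configurationsFor(String clientName,
            Map<String, List<String>> specs) {
        List<String> result = new ArrayList<>();
        if (specs.containsKey(clientName)) {
            result.addAll(specs.get(clientName));
        }
        for (Map.Entry<String, List<String>> entry : specs.entrySet()) {
            if (entry.getKey().startsWith("default.")) {
                result.addAll(entry.getValue());
            }
        }
        return result;
    }

    /** Hypothetical specifications: one default and two client-specific ones. */
    public static Map<String, List<String>> sampleSpecs() {
        Map<String, List<String>> specs = new LinkedHashMap<>();
        specs.put(defaultSpecName("com.example.GatewayApplication"),
                Arrays.asList("SharedRibbonConfig"));
        specs.put("user-service", Arrays.asList("UserServiceRibbonConfig"));
        specs.put("order-service", Arrays.asList("OrderServiceRibbonConfig"));
        return specs;
    }

    public static void main(String[] args) {
        System.out.println(configurationsFor("user-service", sampleSpecs()));
        System.out.println(configurationsFor("unknown-service", sampleSpecs()));
    }
}
```

A client without its own specification still receives every default. configuration, which is why defaults declared through @RibbonClients reach all clients.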
public class RibbonAutoConfiguration {

    // As described above, classes annotated with @RibbonClients are parsed and a bean
    // of type RibbonClientSpecification is registered for each of them.
    // There are mainly two: RibbonAutoConfiguration and RibbonEurekaAutoConfiguration.
    @Autowired(required = false)
    private List<RibbonClientSpecification> configurations = new ArrayList<>();

    @Bean
    public SpringClientFactory springClientFactory() {
        // Create the SpringClientFactory and inject the configurations above.
        // This part is important.
        SpringClientFactory factory = new SpringClientFactory();
        factory.setConfigurations(this.configurations);
        return factory;
    }

    // Everything else just provides default bean definitions.
    @Bean
    @ConditionalOnMissingBean(LoadBalancerClient.class)
    public LoadBalancerClient loadBalancerClient() {
        return new RibbonLoadBalancerClient(springClientFactory());
    }

    @Bean
    @ConditionalOnClass(name = "org.springframework.retry.support.RetryTemplate")
    @ConditionalOnMissingBean
    public LoadBalancedRetryPolicyFactory loadBalancedRetryPolicyFactory(
            SpringClientFactory clientFactory) {
        return new RibbonLoadBalancedRetryPolicyFactory(clientFactory);
    }

    @Bean
    @ConditionalOnMissingClass(value = "org.springframework.retry.support.RetryTemplate")
    @ConditionalOnMissingBean
    public LoadBalancedRetryPolicyFactory neverRetryPolicyFactory() {
        return new LoadBalancedRetryPolicyFactory.NeverRetryFactory();
    }

    @Bean
    @ConditionalOnClass(name = "org.springframework.retry.support.RetryTemplate")
    @ConditionalOnMissingBean
    public LoadBalancedBackOffPolicyFactory loadBalancedBackoffPolicyFactory() {
        return new LoadBalancedBackOffPolicyFactory.NoBackOffPolicyFactory();
    }

    @Bean
    @ConditionalOnClass(name = "org.springframework.retry.support.RetryTemplate")
    @ConditionalOnMissingBean
    public LoadBalancedRetryListenerFactory loadBalancedRetryListenerFactory() {
        return new LoadBalancedRetryListenerFactory.DefaultRetryListenerFactory();
    }

    @Bean
    @ConditionalOnMissingBean
    public PropertiesFactory propertiesFactory() {
        return new PropertiesFactory();
    }

    @Bean
    @ConditionalOnProperty(value = "ribbon.eager-load.enabled", matchIfMissing = false)
    public RibbonApplicationContextInitializer ribbonApplicationContextInitializer() {
        return new RibbonApplicationContextInitializer(springClientFactory(),
                ribbonEagerLoadProperties.getClients());
    }

    @Configuration
    @ConditionalOnClass(HttpRequest.class)
    @ConditionalOnRibbonRestClient
    protected static class RibbonClientConfig {

        @Autowired
        private SpringClientFactory springClientFactory;

        @Bean
        public RestTemplateCustomizer restTemplateCustomizer(
                final RibbonClientHttpRequestFactory ribbonClientHttpRequestFactory) {
            return new RestTemplateCustomizer() {
                @Override
                public void customize(RestTemplate restTemplate) {
                    restTemplate.setRequestFactory(ribbonClientHttpRequestFactory);
                }
            };
        }

        @Bean
        public RibbonClientHttpRequestFactory ribbonClientHttpRequestFactory() {
            return new RibbonClientHttpRequestFactory(this.springClientFactory);
        }
    }

    // TODO: support for autoconfiguring restemplate to use apache http client or okhttp

    @Target({ ElementType.TYPE, ElementType.METHOD })
    @Retention(RetentionPolicy.RUNTIME)
    @Documented
    @Conditional(OnRibbonRestClientCondition.class)
    @interface ConditionalOnRibbonRestClient {
    }

    private static class OnRibbonRestClientCondition extends AnyNestedCondition {

        public OnRibbonRestClientCondition() {
            super(ConfigurationPhase.REGISTER_BEAN);
        }

        @Deprecated // remove in Edgware
        @ConditionalOnProperty("ribbon.http.client.enabled")
        static class ZuulProperty {}

        @ConditionalOnProperty("ribbon.restclient.enabled")
        static class RibbonProperty {}
    }
}
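Note the SpringClientFactory here. By default, Ribbon gives each Eureka serviceId (service) its own independent Spring context, i.e. an ApplicationContext, which contains the necessary beans such as ILoadBalancer and ServerListFilter. By default Spring Cloud wraps Ribbon calls in RestTemplate, and the core of that is an interceptor: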
@Bean
@ConditionalOnMissingBean
public RestTemplateCustomizer restTemplateCustomizer(
        final LoadBalancerInterceptor loadBalancerInterceptor) {
    return new RestTemplateCustomizer() {
        @Override
        public void customize(RestTemplate restTemplate) {
            List<ClientHttpRequestInterceptor> list = new ArrayList<>(
                    restTemplate.getInterceptors());
            list.add(loadBalancerInterceptor);
            restTemplate.setInterceptors(list);
        }
    };
}
So the load balancing itself is implemented by this interceptor:
public class LoadBalancerInterceptor implements ClientHttpRequestInterceptor {

    private LoadBalancerClient loadBalancer;
    private LoadBalancerRequestFactory requestFactory;

    @Override
    public ClientHttpResponse intercept(final HttpRequest request, final byte[] body,
            final ClientHttpRequestExecution execution) throws IOException {
        // The URL passed in here is still unresolved, i.e. of the form http://serviceId/path
        final URI originalUri = request.getURI();
        // Extract the serviceId from the host part
        String serviceName = originalUri.getHost();
        Assert.state(serviceName != null,
                "Request URI does not contain a valid hostname: " + originalUri);
        return this.loadBalancer.execute(serviceName,
                requestFactory.createRequest(request, body, execution));
    }
}
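The getHost() call is also why the Assert.state check exists. A minimal sketch using only java.net.URI (the service names are made up) shows the serviceId being read from the host part, and a case where the host comes back as null because the name contains an underscore, which java.net.URI does not accept in a server-based host name:

```java
import java.net.URI;

public class ServiceIdFromUriDemo {

    /** Returns the host part of the URL, which the interceptor treats as the serviceId. */
    public static String serviceIdOf(String url) {
        return URI.create(url).getHost();
    }

    public static void main(String[] args) {
        System.out.println(serviceIdOf("http://user-service/api/users")); // user-service
        System.out.println(serviceIdOf("http://user_service/api/users")); // null
    }
}
```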
The request is then handed to the LoadBalancerClient:
public class RibbonLoadBalancerClient implements LoadBalancerClient {

    @Override
    public <T> T execute(String serviceId, LoadBalancerRequest<T> request)
            throws IOException {
        // Get the load balancer for this serviceId
        ILoadBalancer loadBalancer = getLoadBalancer(serviceId);
        // Pick a server; this applies the load-balancing strategy,
        // e.g. round-robin or random
        Server server = getServer(loadBalancer);
        if (server == null) {
            throw new IllegalStateException("No instances available for " + serviceId);
        }
        RibbonServer ribbonServer = new RibbonServer(serviceId, server,
                isSecure(server, serviceId),
                serverIntrospector(serviceId).getMetadata(server));
        return execute(serviceId, ribbonServer, request);
    }
}
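Once a server has been chosen, the serviceId in the original URL has to be replaced by that server's host and port before the request goes out. The sketch below illustrates the idea with java.net.URI only. It is not Ribbon's actual implementation, and the helper name withServer and the address are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class ReconstructUriDemo {

    /** Copies the original URI, swapping in the chosen server's host and port. */
    public static URI withServer(URI original, String host, int port) {
        try {
            return new URI(original.getScheme(), original.getUserInfo(), host, port,
                    original.getPath(), original.getQuery(), original.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rebuild URI from " + original, e);
        }
    }

    public static void main(String[] args) {
        URI original = URI.create("http://user-service/api/users?id=1");
        System.out.println(withServer(original, "10.0.0.5", 8080));
    }
}
```

If the wrong load balancer supplies the host and port at this step, the path stays the same but lands on another service, which is consistent with the 404 described at the start.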
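The LoadBalancer here is obtained from the SpringClientFactory mentioned above. The factory creates a new Spring context, adds Ribbon's default configuration classes to it (for example RibbonAutoConfiguration and RibbonEurekaAutoConfiguration), sets the current Spring context as its parent, and then calls refresh to initialize it.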
public class SpringClientFactory extends NamedContextFactory<RibbonClientSpecification> {

    // Inherited from NamedContextFactory<C>; C is RibbonClientSpecification here.
    protected AnnotationConfigApplicationContext createContext(String name) {
        AnnotationConfigApplicationContext context = new AnnotationConfigApplicationContext();
        // Configuration registered specifically for this client name
        if (this.configurations.containsKey(name)) {
            for (Class<?> configuration : this.configurations.get(name)
                    .getConfiguration()) {
                context.register(configuration);
            }
        }
        // Configuration registered under a "default." name applies to every client
        for (Map.Entry<String, C> entry : this.configurations.entrySet()) {
            if (entry.getKey().startsWith("default.")) {
                for (Class<?> configuration : entry.getValue().getConfiguration()) {
                    context.register(configuration);
                }
            }
        }
        context.register(PropertyPlaceholderAutoConfiguration.class,
                this.defaultConfigType);
        context.getEnvironment().getPropertySources().addFirst(new MapPropertySource(
                this.propertySourceName,
                Collections.<String, Object> singletonMap(this.propertyName, name)));
        if (this.parent != null) {
            // Uses Environment from parent as well as beans
            context.setParent(this.parent);
        }
        context.refresh();
        return context;
    }
}
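This is the heart of it. Every serviceId has its own independent Spring context, and all Ribbon-related beans are initialized the first time that service is called. Only when a bean does not exist in the child context is it looked up in the parent context.
Now back to the code that picks the actual ip:port according to the load-balancing strategy: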
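The child-first lookup with a fallback to the parent can be modelled in a few lines of plain Java. This is a toy sketch, not Spring's ApplicationContext: the Context and ClientFactory classes are hypothetical, and the sketch assumes that a per-client default rule is only created when no rule is already visible through the parent.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ChildContextDemo {

    /** Toy bean container: checks its own beans first, then falls back to the parent. */
    public static final class Context {
        private final Map<String, Object> beans = new HashMap<>();
        private final Context parent;

        public Context(Context parent) {
            this.parent = parent;
        }

        public void register(String name, Object bean) {
            beans.put(name, bean);
        }

        public Object getBean(String name) {
            Object bean = beans.get(name);
            if (bean != null) {
                return bean;
            }
            return parent != null ? parent.getBean(name) : null;
        }
    }

    /** Lazily creates one child context per serviceId, the first time it is requested. */
    public static final class ClientFactory {
        private final Context parent;
        private final Map<String, Context> contexts = new ConcurrentHashMap<>();

        public ClientFactory(Context parent) {
            this.parent = parent;
        }

        public Context contextFor(String serviceId) {
            return contexts.computeIfAbsent(serviceId, id -> {
                Context child = new Context(parent);
                child.register("loadBalancer", new Object());
                // Assumption of this sketch: a default rule is only created
                // when no rule is already visible through the parent.
                if (child.getBean("rule") == null) {
                    child.register("rule", new Object());
                }
                return child;
            });
        }
    }

    /** Returns true when two different services end up with the same rule instance. */
    public static boolean rulesAreShared(boolean parentDefinesRule) {
        Context parent = new Context(null);
        if (parentDefinesRule) {
            parent.register("rule", new Object());
        }
        ClientFactory factory = new ClientFactory(parent);
        Object usersRule = factory.contextFor("user-service").getBean("rule");
        Object ordersRule = factory.contextFor("order-service").getBean("rule");
        return usersRule == ordersRule;
    }

    public static void main(String[] args) {
        System.out.println(rulesAreShared(false)); // false
        System.out.println(rulesAreShared(true));  // true
    }
}
```

As long as the parent defines nothing, each service gets its own beans. A single bean defined in the parent is enough for every child to resolve to the same instance.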
public class RibbonLoadBalancerClient implements LoadBalancerClient {

    @Override
    public <T> T execute(String serviceId, LoadBalancerRequest<T> request)
            throws IOException {
        // Get the load balancer for this serviceId
        ILoadBalancer loadBalancer = getLoadBalancer(serviceId);
        // Pick a server; this applies the load-balancing strategy,
        // e.g. round-robin or random
        Server server = getServer(loadBalancer);
        if (server == null) {
            throw new IllegalStateException("No instances available for " + serviceId);
        }
        RibbonServer ribbonServer = new RibbonServer(serviceId, server,
                isSecure(server, serviceId),
                serverIntrospector(serviceId).getMetadata(server));
        return execute(serviceId, ribbonServer, request);
    }

    protected Server getServer(ILoadBalancer loadBalancer) {
        if (loadBalancer == null) {
            return null;
        }
        // Choose the server
        return loadBalancer.chooseServer("default"); // TODO: better handling of key
    }
}
public class ZoneAwareLoadBalancer<T extends Server> extends DynamicServerListLoadBalancer<T> {

    @Override
    public Server chooseServer(Object key) {
        if (!ENABLED.get() || getLoadBalancerStats().getAvailableZones().size() <= 1) {
            logger.debug("Zone aware logic disabled or there is only one zone");
            // With no availability zones configured (the default), this branch is taken
            return super.chooseServer(key);
        }
        Server server = null;
        try {
            LoadBalancerStats lbStats = getLoadBalancerStats();
            Map<String, ZoneSnapshot> zoneSnapshot = ZoneAvoidanceRule.createSnapshot(lbStats);
            logger.debug("Zone snapshots: {}", zoneSnapshot);
            if (triggeringLoad == null) {
                triggeringLoad = DynamicPropertyFactory.getInstance().getDoubleProperty(
                        "ZoneAwareNIWSDiscoveryLoadBalancer." + this.getName()
                                + ".triggeringLoadPerServerThreshold", 0.2d);
            }
            if (triggeringBlackoutPercentage == null) {
                triggeringBlackoutPercentage = DynamicPropertyFactory.getInstance().getDoubleProperty(
                        "ZoneAwareNIWSDiscoveryLoadBalancer." + this.getName()
                                + ".avoidZoneWithBlackoutPercetage", 0.99999d);
            }
            Set<String> availableZones = ZoneAvoidanceRule.getAvailableZones(zoneSnapshot,
                    triggeringLoad.get(), triggeringBlackoutPercentage.get());
            logger.debug("Available zones: {}", availableZones);
            if (availableZones != null && availableZones.size() < zoneSnapshot.keySet().size()) {
                String zone = ZoneAvoidanceRule.randomChooseZone(zoneSnapshot, availableZones);
                logger.debug("Zone chosen: {}", zone);
                if (zone != null) {
                    BaseLoadBalancer zoneLoadBalancer = getLoadBalancer(zone);
                    server = zoneLoadBalancer.chooseServer(key);
                }
            }
        } catch (Exception e) {
            logger.error("Error choosing server using zone aware logic for load balancer={}", name, e);
        }
        if (server != null) {
            return server;
        } else {
            logger.debug("Zone avoidance logic is not invoked.");
            return super.chooseServer(key);
        }
    }
}

// The method actually reached via super.chooseServer(key): BaseLoadBalancer#chooseServer
public class BaseLoadBalancer extends AbstractLoadBalancer implements
        PrimeConnections.PrimeConnectionListener, IClientConfigAware {

    public Server chooseServer(Object key) {
        if (counter == null) {
            counter = createCounter();
        }
        counter.increment();
        if (rule == null) {
            return null;
        } else {
            try {
                return rule.choose(key);
            } catch (Exception e) {
                logger.warn("LoadBalancer [{}]:Error choosing server for key {}", name, key, e);
                return null;
            }
        }
    }
}
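In other words, an IRule is ultimately called to pick a node. Many strategies are supported, such as random, round-robin, and response-time weighting: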

public interface IRule {

    public Server choose(Object key);

    public void setLoadBalancer(ILoadBalancer lb);

    public ILoadBalancer getLoadBalancer();
}
The LoadBalancer here is set in the BaseLoadBalancer constructor. As mentioned above, the first time each serviceId is called, its Spring context is initialized, and that context contains all the Ribbon-related beans, including ILoadBalancer and IRule.
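As an illustration of what such a strategy does, here is a minimal round-robin selection in plain Java. It is a simplified, hypothetical sketch and not Ribbon's RoundRobinRule: servers are plain strings and the candidate list is passed in directly rather than read from a load balancer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinDemo {

    /** Hypothetical simplified rule that cycles through the given servers in order. */
    public static final class RoundRobin {
        private final AtomicInteger next = new AtomicInteger();

        public String choose(List<String> servers) {
            if (servers.isEmpty()) {
                return null;
            }
            // floorMod keeps the index non-negative even after the counter overflows.
            int index = Math.floorMod(next.getAndIncrement(), servers.size());
            return servers.get(index);
        }
    }

    public static void main(String[] args) {
        RoundRobin rule = new RoundRobin();
        List<String> servers = Arrays.asList("10.0.0.5:8080", "10.0.0.6:8080");
        for (int i = 0; i < 3; i++) {
            System.out.println(rule.choose(servers));
        }
    }
}
```

The real interface differs in one important way: the rule does not receive the server list as an argument but reads it from the load balancer it was attached to with setLoadBalancer, so which load balancer a rule points at determines which servers it can return.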
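Root Cause
Tracing through the stack showed that different serviceIds were using the same IRule instance. As described above, each serviceId is supposed to have its own context, with its own load balancer and IRule. Since the IRule was shared, I suspected that the bean was being resolved from the parent context; in other words, the application itself had defined such a bean. A look at the code confirmed it.
This causes a problem: the IRule is shared while the other beans are isolated. When a later serviceId is initialized, it changes the LoadBalancer held by that IRule, so services initialized earlier pick instances from the wrong load balancer, which leads to the 404.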
public class BaseLoadBalancer extends AbstractLoadBalancer implements
        PrimeConnections.PrimeConnectionListener, IClientConfigAware {

    public BaseLoadBalancer() {
        this.name = DEFAULT_NAME;
        this.ping = null;
        setRule(DEFAULT_RULE); // this also sets the IRule's load balancer
        setupPingTask();
        lbStats = new LoadBalancerStats(DEFAULT_NAME);
    }
}
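The effect of sharing one rule across load balancers can be reproduced with a toy model. The Rule and Balancer classes below are hypothetical simplifications of IRule and BaseLoadBalancer, and the addresses are made up. The only behavior carried over from the real code is that constructing a balancer attaches the rule to it.

```java
import java.util.Arrays;
import java.util.List;

public class SharedRuleDemo {

    /** Toy rule: picks the first server of whichever balancer it was last attached to. */
    public static final class Rule {
        private Balancer balancer;

        void setLoadBalancer(Balancer balancer) {
            this.balancer = balancer;
        }

        String choose() {
            return balancer.servers.get(0);
        }
    }

    /** Toy balancer: like setRule in the constructor above, it attaches the rule to itself. */
    public static final class Balancer {
        final List<String> servers;
        final Rule rule;

        Balancer(List<String> servers, Rule rule) {
            this.servers = servers;
            this.rule = rule;
            rule.setLoadBalancer(this);
        }

        public String chooseServer() {
            return rule.choose();
        }
    }

    /** One rule shared by both balancers: the second constructor re-points it. */
    public static String[] chooseWithSharedRule() {
        Rule shared = new Rule();
        Balancer users = new Balancer(Arrays.asList("10.0.0.5:8080"), shared);
        Balancer orders = new Balancer(Arrays.asList("10.0.0.9:9090"), shared);
        return new String[] { users.chooseServer(), orders.chooseServer() };
    }

    /** One rule per balancer: each balancer keeps choosing from its own servers. */
    public static String[] chooseWithOwnRules() {
        Balancer users = new Balancer(Arrays.asList("10.0.0.5:8080"), new Rule());
        Balancer orders = new Balancer(Arrays.asList("10.0.0.9:9090"), new Rule());
        return new String[] { users.chooseServer(), orders.chooseServer() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(chooseWithSharedRule()));
        System.out.println(Arrays.toString(chooseWithOwnRules()));
    }
}
```

With the shared rule, the balancer for the first service returns an address that belongs to the second service. With one rule per balancer, each service resolves to its own address.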
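Solution
The fix is simple. The easiest option is to remove the custom IRule bean. The more standard approach is to use the @RibbonClients annotation; see the documentation for details. The Spring Cloud documentation warns about exactly this situation: a Ribbon configuration class that is picked up by the main application context ends up shared by all Ribbon clients.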
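Summary
The underlying cause was an insufficient understanding of Spring Cloud: it was used incorrectly, which produced some rather strange behavior. For every component, framework, and even annotation you use, you should understand how it works and be able to state clearly what it does and what side effects it has, rather than focusing only on the problem in front of you.
Once again: I did not write this code =_=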