
Spark Scheduling Policies: FAIR


1. Overview

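Spark has two scheduling modes: FIFO and FAIR. FIFO is first in, first out and strongly ordered: a job that arrives later is processed only after the job ahead of it has been handled. FAIR is fair scheduling, where configuration controls which jobs are given priority. Spark uses FIFO by default. If the workload mixes many large queries with many small ones, FAIR mode is recommended, so that small queries can run first instead of waiting behind long-running queries.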
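The difference can be illustrated with a toy model. The sketch below is not Spark's scheduler: it assumes a single task slot, tasks that each take one time unit, and a simple round-robin as a stand-in for fair sharing. The job names and task counts are made up for illustration.

```python
from collections import deque

def fifo_finish_times(jobs):
    """Run jobs strictly in arrival order on one slot; return finish time per job."""
    clock = 0
    finish = {}
    for name, tasks in jobs:
        clock += tasks          # every task takes one time unit
        finish[name] = clock
    return finish

def round_robin_finish_times(jobs):
    """Give each unfinished job one task slot in turn (a crude stand-in for fair sharing)."""
    queue = deque(jobs)
    clock = 0
    finish = {}
    while queue:
        name, remaining = queue.popleft()
        clock += 1              # run one task of this job
        if remaining == 1:
            finish[name] = clock
        else:
            queue.append((name, remaining - 1))
    return finish

# Hypothetical workload: a large query arrives just before a small one.
jobs = [("large_query", 10), ("small_query", 2)]
fifo = fifo_finish_times(jobs)
fair = round_robin_finish_times(jobs)
print("FIFO:", fifo)
print("fair:", fair)
```

In this toy model the small query finishes at time 12 under FIFO but at time 4 under round-robin, while the large query finishes at 10 and 12 respectively.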

2. Configuration

A default Spark installation ships a fairscheduler.xml.template file in the conf directory. Make a copy of this file:

#cp fairscheduler.xml.template fairscheduler.xml

#cat fairscheduler.xml

<?xml version="1.0"?>

<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

<allocations>
  <pool name="default">
    <schedulingMode>FAIR</schedulingMode>
    <weight>5</weight>
    <minShare>22</minShare>
  </pool>
</allocations>

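Parameter explanation:

pool name: the name of the scheduling pool.

schedulingMode: the scheduling mode used inside the pool, either FIFO or FAIR.

weight: the pool's resource weight relative to other pools. The default is 1; setting it to 5 here means the default pool gets 5 times the resources of a pool with weight 1.

minShare: a minimum share (number of CPU cores) assigned to each scheduling pool. The fair scheduler always tries to satisfy the minimum share of every active pool before redistributing resources according to weight. The default is 0.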
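How weight and minShare interact can be sketched as a comparison between pools. The code below is a simplified model based on how the fair comparison is commonly described (pools below their minShare go first, otherwise the pool with the lowest running-tasks-to-weight ratio goes first); it is not Spark's source code. It assumes every pool always has pending tasks, and the `adhoc` pool is a hypothetical name added for illustration.

```python
from dataclasses import dataclass
from functools import cmp_to_key

@dataclass
class Pool:
    name: str
    weight: int = 1        # default weight, as stated above
    min_share: int = 0     # default minShare, as stated above
    running_tasks: int = 0

def compare_pools(a, b):
    """Return a negative number if pool `a` should be served before pool `b`."""
    a_needy = a.running_tasks < a.min_share
    b_needy = b.running_tasks < b.min_share
    # A pool still below its minShare is served before one that is not.
    if a_needy and not b_needy:
        return -1
    if b_needy and not a_needy:
        return 1
    if a_needy and b_needy:
        # Both below minShare: the one furthest from its minimum goes first.
        ka = a.running_tasks / max(a.min_share, 1)
        kb = b.running_tasks / max(b.min_share, 1)
    else:
        # Both satisfied: the one with fewer running tasks per unit of weight goes first.
        ka = a.running_tasks / a.weight
        kb = b.running_tasks / b.weight
    if ka != kb:
        return -1 if ka < kb else 1
    # Tie: fall back to name order so the result is deterministic.
    return (a.name > b.name) - (a.name < b.name)

def allocate(pools, slots):
    """Hand out `slots` task slots one at a time to the pool that sorts first."""
    for _ in range(slots):
        first = min(pools, key=cmp_to_key(compare_pools))
        first.running_tasks += 1
    return {p.name: p.running_tasks for p in pools}

# "default" uses the values from fairscheduler.xml above; "adhoc" is hypothetical.
pools = [Pool("default", weight=5, min_share=22), Pool("adhoc")]
result = allocate(pools, 24)
print(result)
```

With 24 slots, this model first gives 22 slots to `default` to satisfy its minShare and only then starts serving `adhoc`. With two pools that have no minShare and weights 5 and 1, twelve slots split 10 to 2.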

After editing fairscheduler.xml, you also need to configure spark-defaults.conf by adding the following:

#cat spark-defaults.conf

spark.scheduler.mode FAIR
spark.scheduler.allocation.file /data/spark-2.2.0-bin-hadoop2.7/conf/fairscheduler.xml
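Before restarting, it can help to confirm that both settings are actually present. The following is a minimal sketch that handles only the whitespace-separated `key value` form used above (Spark itself may accept other separators). The helper names `parse_spark_defaults` and `missing_fair_settings` are hypothetical and not part of Spark.

```python
def parse_spark_defaults(text):
    """Parse 'key value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)   # split on the first run of whitespace
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings

def missing_fair_settings(settings):
    """Return a list describing which of the two settings from this article are absent."""
    problems = []
    if settings.get("spark.scheduler.mode", "").upper() != "FAIR":
        problems.append("spark.scheduler.mode is not FAIR")
    if not settings.get("spark.scheduler.allocation.file"):
        problems.append("spark.scheduler.allocation.file is not set")
    return problems

# The two lines shown above, as they would appear in the file.
conf_text = """
spark.scheduler.mode FAIR
spark.scheduler.allocation.file /data/spark-2.2.0-bin-hadoop2.7/conf/fairscheduler.xml
"""
settings = parse_spark_defaults(conf_text)
print(settings)
print(missing_fair_settings(settings))
```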

3. Applying the configuration

#./stop-all.sh

#./start-all.sh

4. Running multiple jobs on the cluster

You can add multiple scheduling pools to fairscheduler.xml and control them with different weight and minShare values. The pool to use must be specified explicitly:

SET spark.sql.thriftserver.scheduler.pool=default;
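A file with several pools can also be generated rather than edited by hand. The sketch below builds the same `<allocations>` structure with Python's standard library. The `small_queries` pool and its values are hypothetical examples, and the fallback values (FIFO, weight 1, minShare 0) are the defaults Spark documents for a pool.

```python
import xml.etree.ElementTree as ET

def build_allocations(pools):
    """Build an <allocations> element with one <pool> child per dict in `pools`."""
    root = ET.Element("allocations")
    for pool in pools:
        node = ET.SubElement(root, "pool", name=pool["name"])
        ET.SubElement(node, "schedulingMode").text = pool.get("schedulingMode", "FIFO")
        ET.SubElement(node, "weight").text = str(pool.get("weight", 1))
        ET.SubElement(node, "minShare").text = str(pool.get("minShare", 0))
    return root

pools = [
    # Values taken from the fairscheduler.xml shown earlier.
    {"name": "default", "schedulingMode": "FAIR", "weight": 5, "minShare": 22},
    # Hypothetical second pool for short queries.
    {"name": "small_queries", "schedulingMode": "FAIR", "weight": 2, "minShare": 4},
]

root = build_allocations(pools)
if hasattr(ET, "indent"):       # ET.indent is available from Python 3.9
    ET.indent(root)
xml_text = '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")
print(xml_text)
```

A session would then select the second pool with `SET spark.sql.thriftserver.scheduler.pool=small_queries;`, in the same way as the `default` pool above.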
