[Deep Learning] Building a Game-Playing AI, Starting with Gomoku (1)

阿新 • Published: 2017-11-14


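It has been a long time since I last wrote a blog post. How long? About eight years, I think. I have recently picked writing back up, and since I have been tinkering with AI lately, I will write something AI-related for my teammates.

After so many years of machine learning, from classification to clustering, from naive Bayes to SVM, from neural networks to deep learning, I have used these techniques countless times in all sorts of mysterious projects, yet everything I did still felt far removed from everyday life. With the recent release of AlphaGo Zero, deep learning is hot again, and my teammates could not contain their excitement: they want to build a game AI. Fine, then let's start with Gomoku (five in a row), a game with simple rules that suits all ages.

Enough preamble. On to part one: implementing a game of Gomoku.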

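Teammate: (ten thousand words of grumbling omitted) You promised deep learning, so why are we suddenly writing a Gomoku program? Dude, you're not playing by the script...

Me: A craftsman who wants to do good work must first sharpen his tools. You want a Gomoku AI when we don't even have the game yet? AI, my hammer!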

Lao Luo (of 錘子科技, "Hammer Technology"): You called?

……

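Gomoku comes in variants with and without forbidden moves. We will implement the plain variant without forbidden moves as our example, because the choice does not affect how we build the AI. As a side note, without forbidden moves black (who moves first) has a forced win. As tournaments and analysis made this clear, people looked for ways to limit black's first-move advantage. That led to the forbidden-move rules, under which black may not play a double-three, a double-four, or an overline (more than five in a row). Continued study of tournament results then showed that even these restrictions do not stop black from winning from the opening: among the direct openings, 花月, 山月, 雲月, 溪月, and 寒星, and among the indirect openings, 名月, 浦月, 恒星, 峽月, and 嵐月, are all forced wins for black. Japanese players therefore went on to propose swapping and alternative moves, which later evolved into the international tournament rules of a swap after the third move and two candidate fifth moves, preventing the player holding black from playing a winning opening or a winning fifth move. The conclusion: under informal rules, or without forbidden moves, black does have a forced win.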

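(1) Implementing the Gomoku game logic

We will use Python, since the machine learning libraries we need later are also in Python, which keeps things convenient.

The UI and the logic must be separated and decoupled. That is beyond question, and since we will be training an AI later, the separation is a must. So we start with the Gomoku logic.

Gomoku is played on a 15*15 board, and each intersection (or cell) is in one of 3 states: empty, black, or white. So first create a file, consts.py,

with the following definitions: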

from enum import Enum

N = 15

class ChessboardState(Enum):
    EMPTY = 0
    BLACK = 1
    WHITE = 2

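For the board state, we start with a 15*15 two-dimensional array, chessMap, held by a class in gobang.py.

currentI, currentJ, and currentState hold the coordinates and color of the most recent move. Add a getter and a setter, and the basic skeleton is in place: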

from enum import Enum
from consts import *

class GoBang(object):
    def __init__(self):
        self.__chessMap = [[ChessboardState.EMPTY for j in range(N)] for i in range(N)]
        self.__currentI = -1
        self.__currentJ = -1
        self.__currentState = ChessboardState.EMPTY

    def get_chessMap(self):
        return self.__chessMap

    def get_chessboard_state(self, i, j):
        return self.__chessMap[i][j]

    def set_chessboard_state(self, i, j, state):
        self.__chessMap[i][j] = state
        self.__currentI = i
        self.__currentJ = j
        self.__currentState = state

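This way the UI can call the getters to read each cell's state and decide whether to draw a stone and which kind. Each time a move is made, the setter updates the board map at the corresponding coordinates.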
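To make the split concrete, here is a minimal sketch of how a front end might read the board through a getter and nothing else. It is only an illustration, not the renderer this series builds later: render_text, SYMBOLS, and the bare grid standing in for GoBang are all hypothetical names made up for this example.

```python
from enum import Enum

N = 15


class ChessboardState(Enum):
    EMPTY = 0
    BLACK = 1
    WHITE = 2


# Hypothetical symbols for a text-only view; not part of the original code.
SYMBOLS = {
    ChessboardState.EMPTY: ".",
    ChessboardState.BLACK: "X",
    ChessboardState.WHITE: "O",
}


def render_text(get_state, size=N):
    """Build a text picture of the board using only a getter callback."""
    rows = []
    for i in range(size):
        rows.append(" ".join(SYMBOLS[get_state(i, j)] for j in range(size)))
    return "\n".join(rows)


# Stand-in for GoBang: a bare grid plus a getter with the same (i, j) signature.
grid = [[ChessboardState.EMPTY for _ in range(N)] for _ in range(N)]


def get_state(i, j):
    return grid[i][j]


grid[7][7] = ChessboardState.BLACK  # black plays the center
grid[7][8] = ChessboardState.WHITE  # white answers next to it

picture = render_text(get_state)
print(picture)
```

The renderer never touches the grid directly, so swapping the text view for a graphical one later should not require any change to the game logic.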

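So for basic display and play, the logic above is enough. What next? After each move sets a cell's state, we need to check whether that move wins the game. That takes a couple more functions. The idea: from the current position, look in the 8 directions east, south, west, north, southeast, southwest, northwest, and northeast, that is, along 4 axes, and see whether there is a run of 5 or more consecutive stones of the same color. Assuming the move was made at the center of the board, the positions to check form the star shape (米) shown in the figure below.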

[Figure: the star-shaped (米) set of positions checked around a move at the center of the board]

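How do we write that? The dumbest possible way is to translate the description literally. Take the horizontal axis: count the consecutive same-colored stones to the left of the current position, then to the right. Left plus right plus the current stone is the run length on that axis, and if it is 5 or more, the move wins.

    def have_five(self, current_i, current_j):
        # Run-length counters for the four axes: vertical, horizontal, left diagonal, right diagonal
        hcount = 1

        temp = ChessboardState.EMPTY

        # H: left
        for j in range(current_j - 1, -1, -1):  # scan left, from (current_j - 1) down to 0
            temp = self.__chessMap[current_i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            hcount = hcount + 1
        # H: right
        for j in range(current_j + 1, N):  # scan right, from (current_j + 1) up to N - 1
            temp = self.__chessMap[current_i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            hcount = hcount + 1
        # H: result
        if hcount >= 5:
            return True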

Continue the same way for the vertical axis, then the left diagonal, then the right diagonal, and have_five becomes:

    def have_five(self, current_i, current_j):
        # Run-length counters for the four axes: vertical, horizontal, left diagonal, right diagonal
        hcount = 1
        vcount = 1
        lbhcount = 1
        rbhcount = 1

        temp = ChessboardState.EMPTY

        # H: left
        for j in range(current_j - 1, -1, -1):  # scan left, from (current_j - 1) down to 0
            temp = self.__chessMap[current_i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            hcount = hcount + 1
        # H: right
        for j in range(current_j + 1, N):  # scan right, from (current_j + 1) up to N - 1
            temp = self.__chessMap[current_i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            hcount = hcount + 1
        # H: result
        if hcount >= 5:
            return True

        # V: up
        for i in range(current_i - 1, -1, -1):  # from (current_i - 1) down to 0
            temp = self.__chessMap[i][current_j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            vcount = vcount + 1
        # V: down
        for i in range(current_i + 1, N):  # from (current_i + 1) up to N - 1
            temp = self.__chessMap[i][current_j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            vcount = vcount + 1
        # V: result
        if vcount >= 5:
            return True

        # LB: up (toward the upper left)
        for i, j in zip(range(current_i - 1, -1, -1), range(current_j - 1, -1, -1)):
            temp = self.__chessMap[i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            lbhcount = lbhcount + 1
        # LB: down (toward the lower right)
        for i, j in zip(range(current_i + 1, N), range(current_j + 1, N)):
            temp = self.__chessMap[i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            lbhcount = lbhcount + 1
        # LB: result
        if lbhcount >= 5:
            return True

        # RB: up (toward the upper right)
        for i, j in zip(range(current_i - 1, -1, -1), range(current_j + 1, N)):
            temp = self.__chessMap[i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            rbhcount = rbhcount + 1
        # RB: down (toward the lower left)
        for i, j in zip(range(current_i + 1, N), range(current_j - 1, -1, -1)):
            temp = self.__chessMap[i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            rbhcount = rbhcount + 1
        # RB: result
        if rbhcount >= 5:
            return True

        return False

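Is that it, then? The whole Gomoku logic implemented~

NO, don't celebrate too soon. Honestly, that code makes me sick. It is hideous. Look again: all those repeated for loops, all those ifs, all those duplicated blocks. Give me a moment to go throw up...

OK, let's think about how to fix it. For a start, there are 4 axes, and they repeat each other, right? Then each axis is counted in its positive and negative directions and the two are added up, so the two directions repeat each other too.

So can we write the code for a single direction, call it twice per axis, and repeat that for the 4 axes? 2*4=8, so let's try to get it done in 8 lines of calls.

Because of the two diagonal axes at 45° and 135°, the direction has to be controlled by a sign on each of the x and y axes. So first we write a function that counts along a given direction: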

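xdirection=0, ydirection=1 means counting along the positive y direction;

xdirection=0, ydirection=-1 means counting along the negative y direction;

xdirection=1, ydirection=1 means counting along the positive direction of the 45° diagonal;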

……

I won't list them all. Add the boundary checks, and we get the following function:

    def count_on_direction(self, i, j, xdirection, ydirection, color):
        count = 0
        for step in range(1, 5):  # besides the current position, look up to 4 more steps in this direction
            if xdirection != 0 and (j + xdirection * step < 0 or j + xdirection * step >= N):
                break
            if ydirection != 0 and (i + ydirection * step < 0 or i + ydirection * step >= N):
                break
            if self.__chessMap[i + ydirection * step][j + xdirection * step] == color:
                count += 1
            else:
                break
        return count
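To see why the boundary checks matter, here is a standalone sketch of the same counting idea on a toy position. Two things are assumptions made so the snippet runs on its own: the board is passed in as a parameter instead of living on the class, and plain integers stand in for ChessboardState. Without the bounds check, Python would read an index of -1 as the last row or column and quietly wrap around to the far side of the board.

```python
N = 15
EMPTY, BLACK, WHITE = 0, 1, 2  # plain ints standing in for ChessboardState


def count_on_direction(board, i, j, xdirection, ydirection, color):
    """Count consecutive `color` stones, starting one step away from (i, j)."""
    count = 0
    for step in range(1, 5):  # look at most 4 steps past the current stone
        row = i + ydirection * step
        col = j + xdirection * step
        # Stop at the edge; otherwise index -1 would wrap to the far side.
        if not (0 <= row < N and 0 <= col < N):
            break
        if board[row][col] != color:
            break
        count += 1
    return count


board = [[EMPTY] * N for _ in range(N)]
for col in range(5, 9):   # black stones on row 7, columns 5..8
    board[7][col] = BLACK
board[7][9] = WHITE       # white blocks the right end
for col in range(0, 3):   # black stones on row 0, columns 0..2
    board[0][col] = BLACK
board[0][14] = BLACK      # far-edge stone that a wrap-around would wrongly count

left = count_on_direction(board, 7, 7, -1, 0, BLACK)    # columns 6 and 5
right = count_on_direction(board, 7, 7, 1, 0, BLACK)    # column 8, then white
up = count_on_direction(board, 7, 7, 0, -1, BLACK)      # nothing above
corner = count_on_direction(board, 0, 2, -1, 0, BLACK)  # columns 1 and 0, then the edge

print(left, right, up, corner)  # 2 1 0 2
```

Counting the stone at (7, 7) itself, the horizontal run there is 1 + 2 + 1 = 4, one short of a win.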

With that, the earlier have_five looks a little better and becomes:

    def have_five(self, i, j, color):
        # Run-length counters for the four axes: vertical, horizontal, left diagonal, right diagonal
        hcount = 1
        vcount = 1
        lbhcount = 1
        rbhcount = 1

        hcount += self.count_on_direction(i, j, -1, 0, color)
        hcount += self.count_on_direction(i, j, 1, 0, color)
        if hcount >= 5:
            return True

        vcount += self.count_on_direction(i, j, 0, -1, color)
        vcount += self.count_on_direction(i, j, 0, 1, color)
        if vcount >= 5:
            return True

        lbhcount += self.count_on_direction(i, j, -1, 1, color)
        lbhcount += self.count_on_direction(i, j, 1, -1, color)
        if lbhcount >= 5:
            return True

        rbhcount += self.count_on_direction(i, j, -1, -1, color)
        rbhcount += self.count_on_direction(i, j, 1, 1, color)
        if rbhcount >= 5:
            return True

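Still a big pile of repeated code. I still think it's ugly. I'm really not a Virgo, but this function is genuinely ugly. Can we make it a bit more handsome? Of course: fold the 4 repeated blocks into a single block and loop over it 4 times. Right, let's do that, and have_five gets a little prettier again: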

    def have_five(self, i, j, color):
        # The four axes (horizontal, vertical, two diagonals), each as a pair of opposite directions
        directions = [[(-1, 0), (1, 0)],
                      [(0, -1), (0, 1)],
                      [(-1, 1), (1, -1)],
                      [(-1, -1), (1, 1)]]

        for axis in directions:
            axis_count = 1
            for (xdirection, ydirection) in axis:
                axis_count += self.count_on_direction(i, j, xdirection, ydirection, color)
                if axis_count >= 5:
                    return True

        return False

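Much better. We now have the logic for detecting 5 stones of the same color in a row. Add one more function that returns the result to the UI layer, and the logic code is nearly complete: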

    def get_chess_result(self):
        if self.have_five(self.__currentI, self.__currentJ, self.__currentState):
            return self.__currentState
        else:
            return ChessboardState.EMPTY

And with that the Gomoku logic is done. The complete gobang.py:

#coding:utf-8

from enum import Enum
from consts import *

class GoBang(object):
    def __init__(self):
        self.__chessMap = [[ChessboardState.EMPTY for j in range(N)] for i in range(N)]
        self.__currentI = -1
        self.__currentJ = -1
        self.__currentState = ChessboardState.EMPTY

    def get_chessMap(self):
        return self.__chessMap

    def get_chessboard_state(self, i, j):
        return self.__chessMap[i][j]

    def set_chessboard_state(self, i, j, state):
        self.__chessMap[i][j] = state
        self.__currentI = i
        self.__currentJ = j
        self.__currentState = state

    def get_chess_result(self):
        if self.have_five(self.__currentI, self.__currentJ, self.__currentState):
            return self.__currentState
        else:
            return ChessboardState.EMPTY

    def count_on_direction(self, i, j, xdirection, ydirection, color):
        count = 0
        for step in range(1, 5):  # besides the current position, look up to 4 more steps in this direction
            if xdirection != 0 and (j + xdirection * step < 0 or j + xdirection * step >= N):
                break
            if ydirection != 0 and (i + ydirection * step < 0 or i + ydirection * step >= N):
                break
            if self.__chessMap[i + ydirection * step][j + xdirection * step] == color:
                count += 1
            else:
                break
        return count

    def have_five(self, i, j, color):
        # The four axes (horizontal, vertical, two diagonals), each as a pair of opposite directions
        directions = [[(-1, 0), (1, 0)],
                      [(0, -1), (0, 1)],
                      [(-1, 1), (1, -1)],
                      [(-1, -1), (1, 1)]]

        for axis in directions:
            axis_count = 1
            for (xdirection, ydirection) in axis:
                axis_count += self.count_on_direction(i, j, xdirection, ydirection, color)
                if axis_count >= 5:
                    return True

        return False

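Voice from the background: Dude, after all that effort, you squeezed out fewer than 60 lines of code?

Me: Code need not be long; it only has to work...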
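Short as it is, the logic is easy to sanity-check by replaying a scripted game. The sketch below is one way to do that, under some loudly stated assumptions: MiniGoBang, AXES, and play are hypothetical names, and MiniGoBang is a condensed stand-in for the GoBang class (same counting idea, fewer lines) so that the snippet runs by itself.

```python
from enum import Enum

N = 15


class ChessboardState(Enum):
    EMPTY = 0
    BLACK = 1
    WHITE = 2


# The four axes, each as a pair of opposite (xdirection, ydirection) steps.
AXES = [[(-1, 0), (1, 0)], [(0, -1), (0, 1)], [(-1, 1), (1, -1)], [(-1, -1), (1, 1)]]


class MiniGoBang:
    """Condensed stand-in for GoBang (hypothetical, for this example only)."""

    def __init__(self):
        self.board = [[ChessboardState.EMPTY] * N for _ in range(N)]
        self.last = (-1, -1, ChessboardState.EMPTY)

    def set_chessboard_state(self, i, j, state):
        self.board[i][j] = state
        self.last = (i, j, state)

    def count_on_direction(self, i, j, xdirection, ydirection, color):
        count = 0
        for step in range(1, 5):
            row, col = i + ydirection * step, j + xdirection * step
            if not (0 <= row < N and 0 <= col < N) or self.board[row][col] != color:
                break
            count += 1
        return count

    def get_chess_result(self):
        i, j, color = self.last
        if color is ChessboardState.EMPTY:  # no move has been played yet
            return ChessboardState.EMPTY
        for axis in AXES:
            total = 1 + sum(self.count_on_direction(i, j, dx, dy, color) for dx, dy in axis)
            if total >= 5:
                return color
        return ChessboardState.EMPTY


def play(moves):
    """Alternate black and white over scripted (i, j) moves.

    Returns (winner, number_of_moves_played); winner is EMPTY if nobody won.
    """
    game = MiniGoBang()
    for turn, (i, j) in enumerate(moves):
        color = ChessboardState.BLACK if turn % 2 == 0 else ChessboardState.WHITE
        game.set_chessboard_state(i, j, color)
        result = game.get_chess_result()
        if result is not ChessboardState.EMPTY:
            return result, turn + 1
    return ChessboardState.EMPTY, len(moves)


# Black builds a diagonal from (3, 3) to (7, 7); white plays along row 0.
moves = [(3, 3), (0, 0), (4, 4), (0, 1), (5, 5), (0, 2), (6, 6), (0, 3), (7, 7)]
winner, played = play(moves)
print(winner.name, played)  # BLACK 9
```

White's four stones on row 0 are not enough to end the game; black's ninth move completes the diagonal and wins.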

Tomorrow we'll add a render to it, which gives us a front-end UI and a simple but complete game. As for the AI, patience.

OK, that's it for now...
