
TensorFlow + TensorFlow Serving + Docker + gRPC model deployment (no Bazel build required, code included)

System environment: Ubuntu 14.04 (a Parallels VM on a Mac)

Python 3.6

TensorFlow 1.8.0

TensorFlow Serving 1.9.0 (the 1.8 release does not officially support Python 3)

Docker 18.03.1-ce

grpc

tensorflow-model-server

1. Install TensorFlow

pip3 install tensorflow

2. Install TensorFlow Serving

First install the gRPC-related dependencies:

sudo apt-get update && sudo apt-get install -y \

        automake \

        build-essential \

        curl \

        libcurl3-dev \

        git \

        libtool \

        libfreetype6-dev \

        libpng12-dev \

        libzmq3-dev \

        pkg-config \

        python-dev \

        python-numpy \

        python-pip \

        software-properties-common \

        swig \

        zip \

        zlib1g-dev

Install gRPC:

pip3 install grpcio

Install tensorflow-serving-api:

pip install tensorflow-serving-api (tensorflow-serving-api 1.9 supports both Python 2 and Python 3, so using pip or pip3 should not matter)

Install tensorflow-model-server (this step replaces building TensorFlow Serving with Bazel; the Bazel build is often hard to get through completely. I tried Bazel for three days without success, probably just bad luck.)

echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list

curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -

These two commands add the tensorflow-model-server package repository and its signing key to the system, so that apt-get can reach it.

sudo apt-get update && sudo apt-get install tensorflow-model-server

Note: this command sometimes fails, most likely for network reasons; just retry a few times. It took me several days of retries before it installed, so be patient.

3. Install Docker

Step 1:

sudo apt-get install \

    linux-image-extra-$(uname -r) \

    linux-image-extra-virtual

Step 2:

sudo dpkg -i /path/to/package.deb

Step 3:

sudo docker run hello-world

If the test passes, Docker is installed successfully, as shown in the figure below:

4. With the environment set up, we can now deploy the model.

Pull the serving image from the registry:

docker pull tensorflow/serving:latest-devel

Since I had already pulled it, it reports that the image is installed. The first pull will take quite a while; the whole image is about 1.17 GB (devel version), so it depends on your network speed. The figure below shows my image.

Create a container from the serving image:

docker run -it -p 8500:8500 tensorflow/serving:latest-devel

This drops you into the container, as shown below.

You can run ls to look around. Inside the container everything works just like a normal system terminal, and the usual shell commands are all available. Conceptually a Docker container is similar to a virtual machine: it is an isolated environment of its own. If you cd into /root and explore, you will find the file system is almost identical to an ordinary Linux system.

In an Ubuntu terminal (open a new one; do not run this inside the Docker container), copy your model files into the container. The directory directly under model must be a model version number. My local model path is /media/psf/AllFiles/Users/daijie/Downloads/docker_file/model/1533369504, and 1533369504 contains the .pb file and the variables folder.

docker cp /media/psf/AllFiles/Users/daijie/Downloads/docker_file/model  acfcf6826643:/online_model

Note: acfcf6826643 is the container ID. This step is easy to get wrong, so make sure you understand the paths. docker cp works like cp in Linux: if you give the target a name that does not exist yet, the source is copied and renamed; if you only point at an existing directory, the source is copied straight into it.
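The copy-versus-rename behaviour described above can be sketched with Python's shutil on hypothetical local paths (this is only an analogy for how docker cp treats its target; the directory names mirror the model layout used here):

```python
import os
import shutil
import tempfile

# Build a fake model directory: model/1533369504/{saved_model.pb, variables/}
src_root = tempfile.mkdtemp()
model_dir = os.path.join(src_root, 'model', '1533369504')
os.makedirs(os.path.join(model_dir, 'variables'))
open(os.path.join(model_dir, 'saved_model.pb'), 'w').close()

dst_root = tempfile.mkdtemp()

# Like `docker cp .../model <id>:/online_model` when /online_model does not
# exist yet: the source directory is copied *and renamed* to online_model.
renamed = shutil.copytree(os.path.join(src_root, 'model'),
                          os.path.join(dst_root, 'online_model'))
print(sorted(os.listdir(renamed)))  # ['1533369504']

# Like copying into a directory that already exists: the source keeps
# its own name `model` inside the target.
os.makedirs(os.path.join(dst_root, 'existing'))
shutil.copytree(os.path.join(src_root, 'model'),
                os.path.join(dst_root, 'existing', 'model'))
print(sorted(os.listdir(os.path.join(dst_root, 'existing'))))  # ['model']
```

Either way, what matters is that the version directory 1533369504 ends up one level below the path you later pass as model_base_path.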

Run the tensorflow_model_server service inside the container:

tensorflow_model_server --port=8500 --model_name=dnn --model_base_path=/online_model

If you see output like the figure below, the server is running.

This completes the server-side deployment.
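By default, tensorflow_model_server watches model_base_path and serves the numerically largest version directory it finds there. A plain-Python sketch of that selection rule (temporary directories stand in for /online_model; the second version number is made up):

```python
import os
import tempfile

base = tempfile.mkdtemp()  # stands in for /online_model

# two version directories, as produced by two successive model exports
for version in ('1533369504', '1533370000'):
    os.makedirs(os.path.join(base, version, 'variables'))

# the server loads the highest numeric subdirectory by default
versions = [int(d) for d in os.listdir(base) if d.isdigit()]
latest = str(max(versions))
print(latest)  # 1533370000
```

This is why the version number must sit between model_base_path and the saved_model.pb/variables files: exporting a new version into /online_model is enough to roll the served model forward.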

5. Run client.py from the client machine

# -*- coding: utf-8 -*-
import tensorflow as tf
from tensorflow_serving.apis import classification_pb2
from tensorflow_serving.apis import prediction_service_pb2
from grpc.beta import implementations


def get_input(a_list):
	# build a tf.train.Example from one row of the adult.test CSV

	def _float_feature(value):
		if value == '':
			value = 0.0
		return tf.train.Feature(float_list=tf.train.FloatList(value=[float(value)]))

	def _byte_feature(value):
		return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

	'''
	Column order in adult.test:
	age,workclass,fnlwgt,education,education_num,marital_status,occupation,
	relationship,race,gender,capital_gain,capital_loss,hours_per_week,
	native_country,income_bracket
	'''
	feature_dict = {
		'age': _float_feature(a_list[0]),
		'workclass': _byte_feature(a_list[1].encode()),
		'education': _byte_feature(a_list[3].encode()),
		'education_num': _float_feature(a_list[4]),
		'marital_status': _byte_feature(a_list[5].encode()),
		'occupation': _byte_feature(a_list[6].encode()),
		'relationship': _byte_feature(a_list[7].encode()),
		'capital_gain': _float_feature(a_list[10]),
		'capital_loss': _float_feature(a_list[11]),
		'hours_per_week': _float_feature(a_list[12]),
	}
	return tf.train.Example(features=tf.train.Features(feature=feature_dict))


def main():
	# the IP and port of your serving host
	channel = implementations.insecure_channel('10.211.44.8', 8500)
	stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

	# build one tf.train.Example per test sample
	examples = []
	with open('adult.test', 'r') as f:
		for line in f:
			fields = line.strip('\n').strip('.').split(',')
			examples.append(get_input(fields))

	request = classification_pb2.ClassificationRequest()
	request.model_spec.name = 'dnn'  # the model_name you passed to tensorflow_model_server
	request.input.example_list.examples.extend(examples)

	# 20.0 is the RPC timeout in seconds
	response = stub.Classify(request, 20.0)

	for index in range(len(examples)):
		max_class = max(response.result.classifications[index].classes, key=lambda c: c.score)
		print(max_class.label, max_class.score)  # predicted class and its score


if __name__ == '__main__':
	main()

This completes the model deployment.
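The client builds one tf.train.Example per CSV row, and the row handling can be checked without TensorFlow installed. The sample row below is made up but follows the UCI Adult format, and to_float mirrors the empty-string fallback inside _float_feature:

```python
# a hypothetical row in the adult.test format: 15 comma-separated fields,
# with the trailing '.' after the label that the client's strip('.') removes
row = ('39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, '
       'Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K.')

fields = row.strip('\n').strip('.').split(',')

def to_float(value):
    # same fallback as _float_feature: an empty string becomes 0.0
    return 0.0 if value == '' else float(value)

# the numeric columns get_input() reads, by index
numeric = {'age': fields[0], 'education_num': fields[4],
           'capital_gain': fields[10], 'capital_loss': fields[11],
           'hours_per_week': fields[12]}

print({k: to_float(v) for k, v in numeric.items()})
```

Note that float() tolerates the leading spaces left by split(','), so the client gets away without stripping each field.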

Possible errors:

When running client.py you may see:

Traceback (most recent call last):
  File "client.py", line 70, in <module>
    main()
  File "client.py", line 57, in main
    response = stub.Classify(request, 20.0)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/grpc/beta/_client_adaptations.py", line 309, in __call__
    self._request_serializer, self._response_deserializer)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/grpc/beta/_client_adaptations.py", line 195, in _blocking_unary_unary
    raise _abortion_error(rpc_error_call)
grpc.framework.interfaces.face.face.ExpirationError: ExpirationError(code=StatusCode.DEADLINE_EXCEEDED, details="Deadline Exceeded")

Possible causes:

1) The IP address or port number is set incorrectly; double-check them.

2) The port may already be occupied. Find the offending process with:

ps -ef | grep <port number>

then kill it (you may need sudo or kill -9). If you don't know how to kill it, simply rebooting the server works too.
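Before digging through ps output, a quick way to check cause 2) is to try connecting to the port from Python. This is just a convenience sketch using the standard library; the host and port are whatever you passed to tensorflow_model_server:

```python
import socket

def port_in_use(host, port):
    # returns True if something is already listening on host:port
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0

print(port_in_use('127.0.0.1', 8500))
```

If this prints True while your server is supposedly down, some other process is holding the port and needs to be killed first.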

I was stuck here for a very long time myself. When nothing else worked, I rebooted the machine in desperation, ran it again, and it went through perfectly.

If you have questions, leave a comment below. I log into CSDN almost every day and will definitely reply.