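TensorFlow officially recommends building and installing from source with Bazel, but many companies use CMake as their build tool. TensorFlow does not provide an official CMake build example, but it does ship a Makefile, so the source can be built directly with make. Separately, once a model has been trained, the official option for hosting predictions is TensorFlow Serving, which is a fairly complex solution. Many machine learning teams already run their own model hosting and prediction services, and adopting TensorFlow Serving would be too intrusive for their existing systems. Loading the model through the TensorFlow C++ API and serving predictions from it can be embedded easily into most existing setups, which makes it a better fit for these teams.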

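This article uses a simple network to walk through the whole flow from offline training to online prediction. It covers the following:
* Training a model with the Python API
* Building the TensorFlow source with make to obtain a static library
* Writing prediction code against the TensorFlow C++ API and building the prediction service with CMake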

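Training the model with the Python API

A simple network is used here. The main goal is to save the network structure and parameters so they can be used for prediction later.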

import tensorflow as tf
import numpy as np

with tf.Session() as sess:
    a = tf.Variable(5.0, name='a')
    b = tf.Variable(6.0, name='b')
    c = tf.multiply(a, b, name='c')

    sess.run(tf.global_variables_initializer())

    print(a.eval()) # 5.0
    print(b.eval()) # 6.0
    print(c.eval()) # 30.0

    tf.train.write_graph(sess.graph_def, 'simple_model/', 'graph.pb', as_text=False)

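This network has two inputs, a and b, and one output, c. The last line saves the model to the simple_model directory. After the script runs, a binary protobuf file named graph.pb appears under simple_model; it stores the network structure. Although a and b are declared as variables, they are fed as inputs at prediction time, so this example has no trained parameters and no checkpoint file is saved.

Building TensorFlow from source

For details, see the official guide on building TensorFlow from source. The process is straightforward. Taking macOS as an example, running the commands below is enough; other operating systems have corresponding scripts. The build takes roughly half an hour, and on success a libtensorflow-core.a static library of a little over 100 MB appears under tensorflow/tensorflow/contrib/makefile/gen/lib. On macOS you need to use build_all_linux.sh, and the build only works with clang, because some third-party dependencies hardcode clang in their build.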

git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
tensorflow/contrib/makefile/build_all_linux.sh

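If you later develop against the TensorFlow headers and static library, pay attention to these directories under tensorflow/tensorflow/contrib/makefile:
* downloads: headers and static libraries of third-party dependencies, such as nsync and Eigen
* gen: the C++ protobuf headers generated for TensorFlow, the TensorFlow static library, the ProtoBuf headers and static library, and so on

Writing prediction code with the TensorFlow C++ API

The prediction code consists of the following steps:
* Create a Session
* Load the model generated earlier
* Attach the model to the created Session
* Set the model inputs and outputs, then call the Session's Run to predict
* Close the Session

Creating a Session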

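#include <iostream>
#include "tensorflow/core/public/session.h"
#include "tensorflow/core/platform/env.h"

using namespace tensorflow;  // the snippets below use unqualified TensorFlow names

Session* session;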
Status status = NewSession(SessionOptions(), &session);
if (!status.ok()) {
  std::cout << status.ToString() << std::endl;
} else {
  std::cout << "Session created successfully" << std::endl;
}

Loading the model

GraphDef graph_def;
Status status = ReadBinaryProto(Env::Default(), "../demo/simple_model/graph.pb", &graph_def);
if (!status.ok()) {
  std::cout << status.ToString() << std::endl;
} else {
  std::cout << "Load graph protobuf successfully" << std::endl;
}

Attaching the model to the created Session

Status status = session->Create(graph_def);
if (!status.ok()) {
  std::cout << status.ToString() << std::endl;
} else {
  std::cout << "Add graph to session successfully" << std::endl;
}

Setting the model inputs

The inputs and outputs of a model are all Tensors or Sparse Tensors.

Tensor a(DT_FLOAT, TensorShape()); // input a
a.scalar<float>()() = 3.0;

Tensor b(DT_FLOAT, TensorShape()); // input b
b.scalar<float>()() = 2.0;

Running the prediction

std::vector<std::pair<string, tensorflow::Tensor>> inputs = {
  { "a", a },
  { "b", b },
}; // input

std::vector<tensorflow::Tensor> outputs; // output

Status status = session->Run(inputs, {"c"}, {}, &outputs);
if (!status.ok()) {
  std::cout << status.ToString() << std::endl;
} else {
  std::cout << "Run session successfully" << std::endl;
}

Inspecting the prediction result

auto c = outputs[0].scalar<float>();
std::cout << "output value: " << c() << std::endl;

Closing the Session

session->Close();

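The complete code is at https://github.com/formath/tensorflow-predictor-cpp, in src/simple_model.cc.

Building the prediction code with CMake

The main concern here is getting the header and static library paths right, for both TensorFlow and its third-party dependencies. The paths below are for macOS; they differ on other platforms.

Header paths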

tensorflow // TensorFlow headers
tensorflow/tensorflow/contrib/makefile/gen/proto // pb.h headers generated from the TensorFlow proto files
tensorflow/tensorflow/contrib/makefile/gen/protobuf-host/include // ProtoBuf headers
tensorflow/tensorflow/contrib/makefile/downloads/eigen // Eigen headers
tensorflow/tensorflow/contrib/makefile/downloads/nsync/public // nsync headers

Static library paths

tensorflow/tensorflow/contrib/makefile/gen/lib // TensorFlow static library
tensorflow/tensorflow/contrib/makefile/gen/protobuf-host/lib // protobuf static library
tensorflow/tensorflow/contrib/makefile/downloads/nsync/builds/default.macos.c++11 // nsync static library

The build needs the following static libraries

libtensorflow-core.a
libprotobuf.a
libnsync.a
Others: pthread m z
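
Because most build failures at this stage come down to a wrong path, a quick pre-flight check before running CMake can save time. The following is a minimal sketch in Python, not part of the project. The function name missing_build_paths and the constant names are hypothetical. It assumes the paths listed above are relative to the directory that contains the tensorflow checkout and that the nsync build directory uses the macOS name shown above. It checks only the listed directories plus libtensorflow-core.a.

```python
import os

# Location of the Makefile build tree, relative to the directory that
# contains the `tensorflow` checkout (assumption based on the paths above).
MAKEFILE_DIR = "tensorflow/tensorflow/contrib/makefile"

# Header directories, library directories, and the TensorFlow static library.
# The nsync build directory name below is macOS-specific.
REQUIRED_PATHS = [
    "tensorflow",
    MAKEFILE_DIR + "/gen/proto",
    MAKEFILE_DIR + "/gen/protobuf-host/include",
    MAKEFILE_DIR + "/downloads/eigen",
    MAKEFILE_DIR + "/downloads/nsync/public",
    MAKEFILE_DIR + "/gen/lib/libtensorflow-core.a",
    MAKEFILE_DIR + "/gen/protobuf-host/lib",
    MAKEFILE_DIR + "/downloads/nsync/builds/default.macos.c++11",
]


def missing_build_paths(base_dir):
    """Return the entries of REQUIRED_PATHS that do not exist under base_dir."""
    return [p for p in REQUIRED_PATHS
            if not os.path.exists(os.path.join(base_dir, p))]


if __name__ == "__main__":
    # Hypothetical usage: run from the directory that contains `tensorflow`.
    for path in missing_build_paths("."):
        print("missing:", path)
```

An empty result means every listed path exists; any printed entry points at a step of the make build that did not complete or at a platform-specific directory name that differs from the macOS one.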

Building with CMake

git clone https://github.com/formath/tensorflow-predictor-cpp.git
cd tensorflow-predictor-cpp
mkdir build && cd build
cmake ..
make

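After the build finishes, a simple_model executable appears under the bin directory. Running ./simple_model prints output value: 6. Note that the build options must include -undefined dynamic_lookup -all_load; otherwise errors occur at build time and at run time. For the reasons, see dynamic_lookup and Error issues.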
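
To show how the paths, libraries, and extra options from this section fit together, here is a rough sketch in Python that assembles them into one flag list for a clang++ invocation. It is an illustration only, not the project's CMakeLists.txt. The function name build_flags and its macos parameter are hypothetical, and it assumes the two extra options are handed to the compiler driver, which forwards them to the macOS linker.

```python
def build_flags(base_dir, macos=True):
    """Assemble include, library-path, and library flags for the predictor.

    base_dir is the directory that contains the `tensorflow` checkout.
    """
    makefile_dir = base_dir + "/tensorflow/tensorflow/contrib/makefile"

    include_dirs = [
        base_dir + "/tensorflow",
        makefile_dir + "/gen/proto",
        makefile_dir + "/gen/protobuf-host/include",
        makefile_dir + "/downloads/eigen",
        makefile_dir + "/downloads/nsync/public",
    ]
    lib_dirs = [
        makefile_dir + "/gen/lib",
        makefile_dir + "/gen/protobuf-host/lib",
        # macOS-specific nsync build directory name.
        makefile_dir + "/downloads/nsync/builds/default.macos.c++11",
    ]
    # libtensorflow-core.a, libprotobuf.a, libnsync.a, plus pthread, m, z.
    libs = ["tensorflow-core", "protobuf", "nsync", "pthread", "m", "z"]

    flags = ["-I" + d for d in include_dirs]
    flags += ["-L" + d for d in lib_dirs]
    flags += ["-l" + name for name in libs]
    if macos:
        # The options the article says must be present on macOS.
        flags += ["-undefined", "dynamic_lookup", "-all_load"]
    return flags


if __name__ == "__main__":
    print(" ".join(build_flags(".")))
```

Keeping the macOS-only options behind a switch makes it easier to reuse the same path list when porting the build to another platform, where the nsync directory name also needs to change.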

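The walkthrough above uses the simple network c = a * b, and only minor changes are needed to apply it to more complex models. For a more complex example, see src/deep_model.cc.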

Blog

Building an online prediction service with the TensorFlow C++ API - Part 1