Author: iversonluo, application development engineer at Tencent WXG

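Some backend developers call themselves "SQL Boys" because their work consists mostly of create, read, update, and delete operations on databases. Do developers who deal with Proto all day call themselves "PB Boys", since most of their work is calling SET and GET on Proto messages? Faced with large amounts of repetitive, ugly code, is there a better solution than macros? This article uses PB reflection to present some code-optimization practices from my work developing operations systems.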

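1. Background

Protobuf (PB below) is a common data serialization format, often used to pass data between backend microservices.

Most of my current work involves forms, which generally take a large amount of input data. The caller of a form usually serializes the data as JSON and sends it to a CGI, while the CGI and the backend services, and the backend services among themselves, exchange data as PB.

When writing code, I often run into PB-handling code that is ugly, has high cyclomatic complexity, and is hard to maintain:

Required-field checks are hard-coded: changing a validation rule means changing code;
One if per check drives up cyclomatic complexity: every incoming field is validated against several rules (length, XSS, regular expression, and so on), and each check is its own if statement;
Collecting all non-empty fields of a PB into a map<string,string> takes many if checks and a lot of repeated code;
When data passes between backend services, modules written by different people name the same field differently, so copying part of one PB into another takes a large amount of GET and SET code.

Is there a way to solve these problems?

Yes: PB reflection.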

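2. Using PB reflection

Reflection is generally defined as the ability of a program to access, inspect, and modify its own state or behavior at run time.

The protobuf class diagram (not reproduced here) shows that the Message class inherits from MessageLite, and that a business-defined class such as Person inherits from Message.

The Descriptor and Reflection classes are both aggregated by Message, which is a weak dependency.

Class: description
Descriptor: describes a Message, including the message's name, the descriptions of all its fields, and the contents of the original proto file
FieldDescriptor: describes a single field of a Message, including the field name, field attributes, and the original field definition
Reflection: provides the ability to read and write a single field of a message dynamically

The usual steps for using PB reflection are therefore:

1. Get the FieldDescriptor of a single field through the Message
2. Get the Reflection through the Message
3. Use the Reflection to operate on the FieldDescriptor, reading or modifying that field dynamically

Functions for getting the Descriptor and the Reflection: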

const google::protobuf::Reflection* pReflection = pMessage->GetReflection();
const google::protobuf::Descriptor* pDescriptor = pMessage->GetDescriptor();

Function for getting a FieldDescriptor:

const google::protobuf::FieldDescriptor * pFieldDesc = pDescriptor->FindFieldByName(id);

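The three classes are described in turn below.

2.1 The Descriptor class

The Descriptor class describes a Message, including the message's name, the descriptions of all its fields, and the contents of the original proto file. Its member functions are introduced below.

First, the functions that return information about the message itself: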

const std::string & name() const; // name of the message itself
int field_count() const; // number of fields in the message
const FileDescriptor* file() const; // The .proto file in which this message type was defined. Never nullptr.

A Descriptor can return FieldDescriptor objects through the following functions:

const FieldDescriptor* field(int index) const; // by index in declaration order, from 0 to field_count() - 1
const FieldDescriptor* FindFieldByNumber(int number) const; // by the field number declared in the message (in `optional string name = 3`, the number is 3)
const FieldDescriptor* FindFieldByName(const std::string & name) const; // by field name
const FieldDescriptor* FindFieldByLowercaseName(const std::string & lowercase_name) const; // by lower-cased field name
const FieldDescriptor* FindFieldByCamelcaseName(const std::string & camelcase_name) const; // by camel-cased field name

Note that index in field(int index) and number in FindFieldByNumber(int number) mean different things, as shown below:

message Student{
  optional string name = 1;
  optional string gender = 2;
  optional string phone = 5;
}

Here the field phone has index 2 (it is the third field declared, counting from 0), while its number is 5.
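To make the distinction concrete, here is a minimal sketch that models the Student message with a plain struct. FieldInfo, StudentFields, FieldByIndex, and FieldByNumber are hypothetical names for illustration and are not part of the protobuf API; they only mimic how Descriptor::field(index) looks up by position and FindFieldByNumber(number) looks up by tag.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a field descriptor; NOT the protobuf API.
struct FieldInfo {
    std::string name;
    int number;  // tag number written after '=' in the .proto file
};

// Fields of Student in declaration order, so the vector position is the index.
inline const std::vector<FieldInfo>& StudentFields() {
    static const std::vector<FieldInfo> fields = {
        {"name", 1}, {"gender", 2}, {"phone", 5}};
    return fields;
}

// Positional lookup, like Descriptor::field(index). Returning nullptr when the
// index is out of range is a choice of this sketch.
inline const FieldInfo* FieldByIndex(int index) {
    const std::vector<FieldInfo>& fields = StudentFields();
    if (index < 0 || index >= static_cast<int>(fields.size())) return nullptr;
    return &fields[index];
}

// Lookup by tag number, like Descriptor::FindFieldByNumber(number);
// returns nullptr when no field carries that number.
inline const FieldInfo* FieldByNumber(int number) {
    for (const FieldInfo& field : StudentFields()) {
        if (field.number == number) return &field;
    }
    return nullptr;
}
```

In this model FieldByIndex(2) and FieldByNumber(5) both return phone, while FieldByNumber(2) returns gender and FieldByNumber(3) finds nothing.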

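There is also a function that is used frequently when debugging:

std::string DebugString() const; // returns the message definition held by this descriptor as human-readable text in .proto syntax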

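2.2 The FieldDescriptor class

The FieldDescriptor class describes a single field of a Message, including the field name, field attributes, and the original field definition.

Functions that return information about the field itself: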

const std::string & name() const; // Name of this field within the message.
const std::string & lowercase_name() const; // Same as name() except converted to lower-case.
const std::string & camelcase_name() const; // Same as name() except converted to camel-case.
CppType cpp_type() const; //C++ type of this field.

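cpp_type() returns the C++ type of the field as a FieldDescriptor::CppType value. The related function type() returns the type declared in the .proto file, a FieldDescriptor::Type, whose values are: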

enum FieldDescriptor::Type {
  TYPE_DOUBLE = 1,
  TYPE_FLOAT = 2,
  TYPE_INT64 = 3,
  TYPE_UINT64 = 4,
  TYPE_INT32 = 5,
  TYPE_FIXED64 = 6,
  TYPE_FIXED32 = 7,
  TYPE_BOOL = 8,
  TYPE_STRING = 9,
  TYPE_GROUP = 10,
  TYPE_MESSAGE = 11,
  TYPE_BYTES = 12,
  TYPE_UINT32 = 13,
  TYPE_ENUM = 14,
  TYPE_SFIXED32 = 15,
  TYPE_SFIXED64 = 16,
  TYPE_SINT32 = 17,
  TYPE_SINT64 = 18,
  MAX_TYPE = 18
};
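The value returned by cpp_type() is a FieldDescriptor::CppType, which is coarser than the Type list above: several declared types share one C++ representation. The sketch below shows that collapse. It re-declares both enums locally in a hypothetical namespace sketch so that it compiles without protobuf, and ToCppType is an illustrative helper written for this article.

```cpp
#include <cassert>
#include <stdexcept>

namespace sketch {
// Local copies of the enum values, for illustration only.
enum Type {
    TYPE_DOUBLE = 1, TYPE_FLOAT = 2, TYPE_INT64 = 3, TYPE_UINT64 = 4,
    TYPE_INT32 = 5, TYPE_FIXED64 = 6, TYPE_FIXED32 = 7, TYPE_BOOL = 8,
    TYPE_STRING = 9, TYPE_GROUP = 10, TYPE_MESSAGE = 11, TYPE_BYTES = 12,
    TYPE_UINT32 = 13, TYPE_ENUM = 14, TYPE_SFIXED32 = 15, TYPE_SFIXED64 = 16,
    TYPE_SINT32 = 17, TYPE_SINT64 = 18
};
enum CppType {
    CPPTYPE_INT32 = 1, CPPTYPE_INT64 = 2, CPPTYPE_UINT32 = 3,
    CPPTYPE_UINT64 = 4, CPPTYPE_DOUBLE = 5, CPPTYPE_FLOAT = 6,
    CPPTYPE_BOOL = 7, CPPTYPE_ENUM = 8, CPPTYPE_STRING = 9,
    CPPTYPE_MESSAGE = 10
};

// Maps a declared proto type to the C++ type used to read and write it.
inline CppType ToCppType(Type type) {
    switch (type) {
        case TYPE_INT32: case TYPE_SINT32: case TYPE_SFIXED32:
            return CPPTYPE_INT32;
        case TYPE_INT64: case TYPE_SINT64: case TYPE_SFIXED64:
            return CPPTYPE_INT64;
        case TYPE_UINT32: case TYPE_FIXED32:
            return CPPTYPE_UINT32;
        case TYPE_UINT64: case TYPE_FIXED64:
            return CPPTYPE_UINT64;
        case TYPE_DOUBLE: return CPPTYPE_DOUBLE;
        case TYPE_FLOAT: return CPPTYPE_FLOAT;
        case TYPE_BOOL: return CPPTYPE_BOOL;
        case TYPE_ENUM: return CPPTYPE_ENUM;
        case TYPE_STRING: case TYPE_BYTES:
            return CPPTYPE_STRING;
        case TYPE_GROUP: case TYPE_MESSAGE:
            return CPPTYPE_MESSAGE;
    }
    throw std::invalid_argument("unknown field type");
}
}  // namespace sketch
```

This is why code that switches on cpp_type() needs only ten cases: one GetInt32 branch, for instance, covers int32, sint32, and sfixed32 fields.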

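FieldDescriptor can also tell whether a field is required, optional, or repeated:

bool is_required() const; // whether the field is required
bool is_optional() const; // whether the field is optional
bool is_repeated() const; // whether the field is repeated

FieldDescriptor can also return a field's index or tag number: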

int number() const; // Declared tag number.
int index() const; //Index of this field within the message's field array, or the file or extension scope's extensions array.

FieldDescriptor also has a function that supports extensions:

// Get the FieldOptions for this field.  This includes things listed in
// square brackets after the field definition.  E.g., the field:
//   optional string text = 1 [ctype=CORD];
// has the "ctype" option set.  Allowed options are defined by FieldOptions in
// descriptor.proto, and any available extensions of that message.
const FieldOptions & FieldDescriptor::options() const

This function is covered in detail in section 2.4.

2.3 The Reflection class

This class provides the ability to read and write individual fields of a message dynamically.

Functions that read a singular field:

// Only part of the API is listed here and in the listings below, to save space; see the source code for the rest.
int32 GetInt32(const Message & message, const FieldDescriptor * field) const;

std::string GetString(const Message & message, const FieldDescriptor * field) const;

const Message & GetMessage(const Message & message, const FieldDescriptor * field, MessageFactory * factory = nullptr) const; // reads a singular message-typed field

Functions that write a singular field:

void SetInt32(Message * message, const FieldDescriptor * field, int32 value) const;

void SetString(Message * message, const FieldDescriptor * field, std::string value) const;

Functions that read one element of a repeated field:

int32 GetRepeatedInt32(const Message & message, const FieldDescriptor * field, int index) const;

std::string GetRepeatedString(const Message & message, const FieldDescriptor * field, int index) const;

const Message & GetRepeatedMessage(const Message & message, const FieldDescriptor * field, int index) const;

Functions that overwrite one element of a repeated field:

void SetRepeatedInt32(Message * message, const FieldDescriptor * field, int index, int32 value) const;

void SetRepeatedString(Message * message, const FieldDescriptor * field, int index, std::string value) const;

void SetRepeatedEnumValue(Message * message, const FieldDescriptor * field, int index, int value) const; // Set an enum field's value with an integer rather than EnumValueDescriptor.

Functions that append an element to a repeated field:

void AddInt32(Message * message, const FieldDescriptor * field, int32 value) const;

void AddString(Message * message, const FieldDescriptor * field, std::string value) const;

Another important function collects field descriptors in bulk into a vector. It lists only the fields that are currently set:

void Reflection::ListFields(const Message & message, std::vector< const FieldDescriptor * > * output) const;

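2.4 Options

PB lets you define custom options in a proto file and use them. When defining a message field you can therefore specify not only the field itself but also attributes of the field, such as validation rules or a description. Combined with reflection, this enables a wide range of applications.

An example follows: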

import "google/protobuf/descriptor.proto";

extend google.protobuf.FieldOptions {
  optional uint32 attr_id              = 50000; // field id
  optional bool is_need_encrypt        = 50001 [default = false]; // whether the field is encrypted: false means not encrypted, true means encrypted
  optional string naming_conventions1  = 50002; // merchant group naming convention
  optional uint32 length_min           = 50003 [default = 0]; // minimum field length
  optional uint32 length_max           = 50004 [default = 1024]; // maximum field length
  optional string regex                = 50005; // regular expression for the field
}

message SubMerchantInfo {
  // merchant name
  optional string merchant_name = 1 [
    (attr_id) = 1,
    (is_need_encrypt) = false,
    (naming_conventions1) = "company_name",
    (length_min) = 1,
    (length_max) = 80,
    (regex) = "[a-zA-Z0-9]"
  ];
}

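Usage:

#include <google/protobuf/descriptor.h>
#include <google/protobuf/message.h>

// pFieldDesc is the const FieldDescriptor* of merchant_name, obtained as shown earlier.
// The extension identifiers come from the header generated from the proto above.
std::string strRegex = pFieldDesc->options().GetExtension(regex);

uint32_t dwLengthMin = pFieldDesc->options().GetExtension(length_min);

bool bIsNeedEncrypt = pFieldDesc->options().GetExtension(is_need_encrypt);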

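3. Advanced use of PB reflection

Chapter 2 introduced PB reflection and the details of using it. This chapter draws on my day-to-day code to present some use cases, and takes the development of a form system as an example of more advanced use.

3.1 Getting all non-empty fields of a PB

Business code often needs to collect all non-empty fields of a Message into a map<string,string>. With PB reflection it can be written as follows: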

#include "pb_util.h"

#include <cstdint>
#include <map>
#include <sstream>
#include <string>

#include <google/protobuf/descriptor.h>
#include <google/protobuf/message.h>

namespace comm_tools {
// Copies every set, singular scalar field of `message` into `out` as name -> text.
// Returns 0 on success, -1 on a non-empty repeated field or an unsupported type.
int PbToMap(const google::protobuf::Message &message,
            std::map<std::string, std::string> &out) {
#define CASE_FIELD_TYPE(cpptype, method, valuetype)                            \
  case google::protobuf::FieldDescriptor::CPPTYPE_##cpptype: {                 \
    valuetype value = reflection->Get##method(message, field);                 \
    std::ostringstream oss;                                                    \
    oss << value;                                                              \
    out[field->name()] = oss.str();                                            \
    break;                                                                     \
  }

#define CASE_FIELD_TYPE_ENUM()                                                 \
  case google::protobuf::FieldDescriptor::CPPTYPE_ENUM: {                      \
    int value = reflection->GetEnum(message, field)->number();                 \
    std::ostringstream oss;                                                    \
    oss << value;                                                              \
    out[field->name()] = oss.str();                                            \
    break;                                                                     \
  }

#define CASE_FIELD_TYPE_STRING()                                               \
  case google::protobuf::FieldDescriptor::CPPTYPE_STRING: {                    \
    std::string value = reflection->GetString(message, field);                 \
    out[field->name()] = value;                                                \
    break;                                                                     \
  }

  const google::protobuf::Descriptor *descriptor = message.GetDescriptor();
  const google::protobuf::Reflection *reflection = message.GetReflection();

  for (int i = 0; i < descriptor->field_count(); i++) {
    const google::protobuf::FieldDescriptor *field = descriptor->field(i);

    // HasField() must not be called on repeated fields, so handle them first.
    if (field->is_repeated()) {
      if (reflection->FieldSize(message, field) > 0) {
        return -1; // converting repeated fields is not supported
      }
      continue;
    }

    if (!reflection->HasField(message, field)) {
      continue; // skip fields that are not set
    }

    switch (field->cpp_type()) {
      CASE_FIELD_TYPE(INT32, Int32, int32_t);
      CASE_FIELD_TYPE(UINT32, UInt32, uint32_t);
      CASE_FIELD_TYPE(FLOAT, Float, float);
      CASE_FIELD_TYPE(DOUBLE, Double, double);
      CASE_FIELD_TYPE(BOOL, Bool, bool);
      CASE_FIELD_TYPE(INT64, Int64, int64_t);
      CASE_FIELD_TYPE(UINT64, UInt64, uint64_t);
      CASE_FIELD_TYPE_ENUM();
      CASE_FIELD_TYPE_STRING();
    default:
      return -1; // unsupported type, for example a nested message
    }
  }

#undef CASE_FIELD_TYPE
#undef CASE_FIELD_TYPE_ENUM
#undef CASE_FIELD_TYPE_STRING

  return 0;
}
} // namespace comm_tools

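With the code above, adding a field to the proto no longer requires changing existing code.

3.2 Keeping field validation rules in the Proto

After a backend service receives fields from the frontend, it validates them: required checks, length checks, regex checks, XSS checks, and so on. These rules are often hard-coded. As the number of backend fields grows, the validation code grows with it and becomes harder and harder to maintain. Would it not be easier to maintain if each field's validation rules lived next to its definition?

An example proto: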

syntax = "proto2";

package student;

import "google/protobuf/descriptor.proto";

message FieldRule {
    optional uint32 length_min = 1; // minimum field length
    optional uint32 id         = 2; // mapped field id
}

extend google.protobuf.FieldOptions {
    optional FieldRule field_rule = 50000;
}

message Student {
    optional string name  = 1 [(field_rule).length_min = 5, (field_rule).id = 1];
    optional string email = 2 [(field_rule).length_min = 10, (field_rule).id = 2];
}

We then implement the XSS check, required check, length check, option check, and other checks ourselves.

Example code for the minimum-length check:

#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

#include "student.pb.h"
#include <google/protobuf/descriptor.h>
#include <google/protobuf/message.h>

using namespace std;
using namespace student;
using namespace google::protobuf;

// Returns true when strValue is shorter than dwLengthMin, i.e. the check fails.
bool isTooShort(const std::string &strValue, uint32_t dwLengthMin) {
    return strValue.size() < dwLengthMin;
}

int allCheck(const google::protobuf::Message &oMessage) {
    const auto *poReflect = oMessage.GetReflection();

    // ListFields returns only the fields that are currently set.
    vector<const FieldDescriptor *> vecFD;
    poReflect->ListFields(oMessage, &vecFD);

    for (const auto *poField : vecFD) {
        const auto &oFieldRule = poField->options().GetExtension(student::field_rule);
        if (poField->cpp_type() == google::protobuf::FieldDescriptor::CPPTYPE_STRING && !poField->is_repeated()) {
            // Only non-repeated string fields get the length check.
            const std::string strValue = poReflect->GetString(oMessage, poField);
            const std::string strName(poField->name());

            if (oFieldRule.has_length_min()) {
                // Check only when the rule is present; otherwise skip.
                if (isTooShort(strValue, oFieldRule.length_min())) {
                    cout << "the length of " << strName << " is lower than " << oFieldRule.length_min() << endl;
                } else {
                    cout << "check min length pass" << endl;
                }
            }
        }
    }
    return 0;
}

int main() {
    Student oStudent1;
    oStudent1.set_name("xiao"); // 4 characters, below length_min = 5

    Student oStudent2;
    oStudent2.set_name("xiaowei"); // 7 characters, satisfies length_min = 5

    allCheck(oStudent1);
    allCheck(oStudent2);

    return 0;
}

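As shown above, supporting maximum-length, required, or XSS checks only takes extending the code, for example with the factory pattern.

Adding a field or changing a field's validation rules then only requires editing the Proto, not the code, which avoids errors introduced by code changes.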
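One possible shape for the factory-style extension mentioned above is a registry that binds each rule name to a checker function. The sketch below is an assumption rather than the production code: the FieldRule struct is a plain stand-in for the option message (in real code the rule would come from poField->options().GetExtension(student::field_rule)), its length_max and regex members are illustrative additions that the proto above does not define, and Checkers and FailedRules are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the FieldRule option message. The has_* flags
// mimic the has_xxx() accessors of generated code.
struct FieldRule {
    bool has_length_min = false;
    uint32_t length_min = 0;
    bool has_length_max = false;
    uint32_t length_max = 0;
    bool has_regex = false;
    std::string regex;
};

// A checker returns true when the value satisfies its rule or the rule is unset.
using Checker = std::function<bool(const std::string&, const FieldRule&)>;

// Factory: the single place where rule names are bound to their checkers.
inline const std::vector<std::pair<std::string, Checker>>& Checkers() {
    static const std::vector<std::pair<std::string, Checker>> checkers = {
        {"length_min", [](const std::string& v, const FieldRule& r) {
             return !r.has_length_min || v.size() >= r.length_min;
         }},
        {"length_max", [](const std::string& v, const FieldRule& r) {
             return !r.has_length_max || v.size() <= r.length_max;
         }},
        {"regex", [](const std::string& v, const FieldRule& r) {
             // regex_match requires the whole value to match the pattern.
             return !r.has_regex || std::regex_match(v, std::regex(r.regex));
         }},
    };
    return checkers;
}

// Runs every registered checker and returns the names of the rules that failed.
inline std::vector<std::string> FailedRules(const std::string& value,
                                            const FieldRule& rule) {
    std::vector<std::string> failed;
    for (const auto& entry : Checkers()) {
        if (!entry.second(value, rule)) failed.push_back(entry.first);
    }
    return failed;
}
```

With this layout the loop in allCheck would not grow as rules are added: a new rule means one new entry in the registry and one new option in the proto.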

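3.3 Generating frontend pages automatically with PB reflection

Operations systems commonly involve all sorts of form pages. When a field is added or its validation rules change, the frontend and backend face the following work:

Frontend: write HTML for the new field and modify the page;
Backend: receive each field and validate it.

Every added or changed field requires changes on both sides, which is a lot of work, and frequent changes invite mistakes. Is there a way around this? Yes: PB's reflection capability.

The backend obtains the description of each field in the Message and returns it to the frontend, which renders the page from the field descriptions and validates the fields. This way the frontend and backend also share a single set of form validation rules.

With this approach, adding a field or changing its validation rules only requires editing the field in the Proto, which saves a great deal of work and avoids the risk that comes with releases.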
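The article does not show the payload that the backend returns to the frontend, so the following is only a sketch of what the serialization step could look like. FieldSchema, SchemaToJson, and the JSON key names are assumptions; in real code each FieldSchema would be filled from a FieldDescriptor and its field_rule option.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical description of one form field, as the backend might assemble it
// from FieldDescriptor::name() and the field_rule option of section 3.2.
struct FieldSchema {
    std::string name;
    uint32_t id;
    uint32_t length_min;
};

// Serializes the field descriptions as a JSON array for the frontend.
// Proto field names contain only letters, digits, and underscores, so this
// sketch writes them without escaping.
inline std::string SchemaToJson(const std::vector<FieldSchema>& fields) {
    std::ostringstream out;
    out << "[";
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i > 0) out << ",";
        out << "{\"name\":\"" << fields[i].name << "\""
            << ",\"id\":" << fields[i].id
            << ",\"length_min\":" << fields[i].length_min << "}";
    }
    out << "]";
    return out.str();
}
```

A frontend receiving such an array could render one input per entry and apply length_min before submitting, which is how the two sides end up sharing one set of rules.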

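3.4 A generic storage system

In an operations system, the frontend collects fields and passes them to the backend, and after validating them the backend usually has to store the data in a database.

Some operations systems want to onboard new data quickly, and traditional development often runs into these questions:

How can data be onboarded quickly without adding or altering table schemas?
How can frequent requests such as adding fields or new channels be met with zero development?
How can different businesses and different data protocols (for example, different messages in PB) be accommodated?

The answer is to use PB reflection to convert structured data into unstructured data and store it in a non-relational database (on the WeChat Pay side, usually table kv).

Taking the Proto from section 3.2 as an example, the Student message defines two fields, name and email. The original data is: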

Student oStudent;
oStudent.set_name("xiaowei");
oStudent.set_email("test@tencent.com");

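Through PB reflection it can be converted into a flat structure:

[{"id":"1","value":"xiaowei"},{"id":"2","value":"test@tencent.com"}]

Once flattened, the data can be stored quickly. If the student record later needs an additional field, address, it can be stored without changing the table schema. PB reflection thus converts between structured and unstructured data, decoupling storage from business logic.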
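The flattening step can be sketched as follows. The sketch covers only the serialization: it takes (id, value) pairs that real code would gather by walking ListFields and reading each field's (field_rule).id option. FlatField, JsonEscape, and FlattenToJson are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Escapes quotes, backslashes, and common control characters for use inside a
// JSON string. Other control characters are not handled in this sketch.
inline std::string JsonEscape(const std::string& in) {
    std::string out;
    for (char c : in) {
        switch (c) {
            case '"': out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n"; break;
            case '\r': out += "\\r"; break;
            case '\t': out += "\\t"; break;
            default: out += c;
        }
    }
    return out;
}

// One flattened row: the field's mapped id and its value as text.
using FlatField = std::pair<unsigned, std::string>;

// Produces the flat [{"id":"...","value":"..."}] form shown above.
inline std::string FlattenToJson(const std::vector<FlatField>& fields) {
    std::string out = "[";
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i > 0) out += ",";
        out += "{\"id\":\"" + std::to_string(fields[i].first) +
               "\",\"value\":\"" + JsonEscape(fields[i].second) + "\"}";
    }
    out += "]";
    return out;
}
```

Adding an address field with a hypothetical id of 3 only appends one more {"id":"3",...} row to the array; the stored shape, and therefore the table, stays the same.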

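4. Summary

This article first introduced the PB reflection functions and then, drawing on my own work, presented more advanced uses. These can noticeably improve development and maintenance efficiency and make the code more elegant. Readers who want to dig deeper into PB can read its source code; reading good code is an effective way to improve one's programming.

Note that PB reflection costs more CPU than calling the generated accessors directly, so CPU usage deserves attention wherever PB is used intensively.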
