Using Protocol Buffers in Android (Part 1)

Overview

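Let's first look at the overall flow: what the FlatBuffers project already provides for us, and what we need to do ourselves to use FlatBuffers in our project, as shown in the figure below: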

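To use FlatBuffers, we define our structured data in a special format and save it as a .fbs file. The FlatBuffers project provides a compiler that compiles .fbs files into Java files, C++ files, and so on, for use in our project. The FlatBuffers compiler runs on our development machine, such as Ubuntu or Mac. The generated source files are built on the Java library that FlatBuffers provides, and we also need some of that library's interfaces to serialize or parse data.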

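Once we import the Java files generated by the FlatBuffers compiler, together with the FlatBuffers Java library, into our project, we can use FlatBuffers to serialize and deserialize our structured data. Running the FlatBuffers compiler by hand every time to generate the Java files is tedious, but unlike Protocol Buffers, there is currently no official gradle plugin from Google. We have developed a simple FlatBuffers gradle plugin, which is introduced briefly later; you are welcome to use it.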

Next, let's look at each part of the flow above in more detail.


Downloading and building the FlatBuffers compiler

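Officially released packages are available at:

https://github.com/google/flatbuffers/releases

For Windows there is a prebuilt executable; for other platforms the release is still a source archive. We can also clone the repo and build it by hand, which is what we do here: clone the code from GitHub and build the compiler manually: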

$ git clone https://github.com/google/flatbuffers.git
Cloning into 'flatbuffers'...
remote: Counting objects: 7340, done.
remote: Compressing objects: 100% (46/46), done.
remote: Total 7340 (delta 16), reused 0 (delta 0), pack-reused 7290
Receiving objects: 100% (7340/7340), 3.64 MiB | 115.00 KiB/s, done.
Resolving deltas: 100% (4692/4692), done.
Checking connectivity... done.

After downloading the code, we use cmake to generate the Makefile for flatbuffers, and then build:

$ cd flatbuffers/
$ cmake CMakeLists.txt 
-- The C compiler identification is AppleClang 7.3.0.7030031
-- The CXX compiler identification is AppleClang 7.3.0.7030031
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/netease/Projects/OpenSource/flatbuffers
$ make && make install

After installation, run the following command to confirm that flatc is installed:

$ flatc --version
flatc version 1.4.0 (Dec  7 2016)

This version of flatc (1.4.0) does not provide a --help option, but when given an unrecognized argument it prints its detailed usage:

$ flatc --help
flatc: unknown commandline argument: --help
usage: flatc [OPTION]... FILE... [-- FILE...]
  --binary     -b Generate wire format binaries for any data definitions.
  --json       -t Generate text output for any data definitions.
  --cpp        -c Generate C++ headers for tables/structs.
  --go         -g Generate Go files for tables/structs.
  --java       -j Generate Java classes for tables/structs.
  --js         -s Generate JavaScript code for tables/structs.
  --csharp     -n Generate C# classes for tables/structs.
  --python     -p Generate Python files for tables/structs.
  --php           Generate PHP files for tables/structs.
  -o PATH            Prefix PATH to all generated files.
  -I PATH            Search for includes in the specified path.
  -M                 Print make rules for generated files.
  --version          Print the version number of flatc and exit.
  --strict-json      Strict JSON: field names must be / will be quoted,
                     no trailing commas in tables/vectors.
  --allow-non-utf8   Pass non-UTF-8 input through parser and emit nonstandard
                     \x escapes in JSON. (Default is to raise parse error on
                     non-UTF-8 input.)
  --defaults-json    Output fields whose value is the default when
                     writing JSON
  --unknown-json     Allow fields in JSON that are not defined in the
                     schema. These fields will be discared when generating
                     binaries.
  --no-prefix        Don't prefix enum values with the enum type in C++.
  --scoped-enums     Use C++11 style scoped and strongly typed enums.
                     also implies --no-prefix.
  --gen-includes     (deprecated), this is the default behavior.
                     If the original behavior is required (no include
                     statements) use --no-includes.
  --no-includes      Don't generate include statements for included
                     schemas the generated file depends on (C++).
  --gen-mutable      Generate accessors that can mutate buffers in-place.
  --gen-onefile      Generate single output file for C#.
  --gen-name-strings Generate type name functions for C++.
  --escape-proto-ids Disable appending '_' in namespaces names.
  --gen-object-api   Generate an additional object-based API.
  --cpp-ptr-type T   Set object API pointer type (default std::unique_ptr)
  --raw-binary       Allow binaries without file_indentifier to be read.
                     This may crash flatc given a mismatched schema.
  --proto            Input is a .proto, translate to .fbs.
  --schema           Serialize schemas instead of JSON (use with -b)
  --conform FILE     Specify a schema the following schemas should be
                     an evolution of. Gives errors if not.
  --conform-includes Include path for the schema given with --conform
    PATH             
FILEs may be schemas, or JSON files (conforming to preceding schema)
FILEs after the -- must be binary flatbuffer format files.
Output files are named using the base file name of the input,
and written to the current directory or the path given by -o.
example: flatc -c -b schema1.fbs schema2.fbs data.json


Creating the .fbs file

flatc can convert a .proto file written for Protocol Buffers into a .fbs file, for example:

$ ls
addressbook.proto
$ flatc --proto addressbook.proto 
$ ls -l
total 16
-rw-r--r--  1 netease  staff  431 12  7 17:21 addressbook.fbs
-rw-r--r--@ 1 netease  staff  486 12  1 15:18 addressbook.proto

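The FlatBuffers compiler does not yet handle some Protocol Buffers constructs well, such as option java_package, option java_outer_classname, and nested types. Here we take the .fbs file that the FlatBuffers compiler converted from the .proto file as the basis for our own .fbs file: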

// Generated from addressbook.proto

namespace com.example.tutorial;

enum PhoneType : int {
  MOBILE = 0,
  HOME = 1,
  WORK = 2,
}

namespace com.example.tutorial;

table Person {
  name:string (required);
  id:int;
  email:string;
  phone:[com.example.tutorial._Person.PhoneNumber];
}

namespace com.example.tutorial._Person;

table PhoneNumber {
  number:string (required);
  type:int;
}

namespace com.example.tutorial;

table AddressBook {
  person:[com.example.tutorial.Person];
}

root_type AddressBook;

See the official documentation for the detailed syntax of .fbs files.


Compiling the .fbs file

The .fbs file can be compiled with the following command:

$ flatc --java -o out addressbook.fbs

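--java specifies the target programming language. The -o option specifies the output path; if it is omitted, the current directory is used as the output directory. The FlatBuffers compiler generates a directory structure that follows the namespace declared for each data structure. For the example above, it generates these files: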

$ find out/
out/
out//com
out//com/example
out//com/example/tutorial
out//com/example/tutorial/_Person
out//com/example/tutorial/_Person/PhoneNumber.java
out//com/example/tutorial/AddressBook.java
out//com/example/tutorial/Person.java
out//com/example/tutorial/PhoneType.java
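
The mapping from schema namespace to output directory seen in the listing above can be sketched in a few lines of Java. This is only an illustration of the layout, not how flatc itself is implemented; the class NamespacePaths and the helper generatedFile are hypothetical names introduced here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class NamespacePaths {
    // Hypothetical helper: maps a schema namespace and a type name to the
    // relative path where the generated Java file would be expected.
    static Path generatedFile(String outDir, String namespace, String typeName) {
        // Each dot-separated namespace component becomes one directory level.
        String[] parts = namespace.split("\\.");
        Path dir = Paths.get(outDir, parts);
        return dir.resolve(typeName + ".java");
    }

    public static void main(String[] args) {
        System.out.println(generatedFile("out", "com.example.tutorial", "Person"));
        System.out.println(generatedFile("out", "com.example.tutorial._Person", "PhoneNumber"));
    }
}
```

This also shows why PhoneNumber.java lands in its own _Person directory: the converted schema declares it in the separate namespace com.example.tutorial._Person.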


Using FlatBuffers in an Android project

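We copy the Java files generated from the .fbs file into our project. The FlatBuffers Java library mentioned earlier is fairly thin, and it is not currently published officially to a maven repository such as jcenter, so we need to copy that code into our project as well. The FlatBuffers Java library lives in the java directory of its repo. We have packaged this code and put it in our company's maven repository; to reference it, modify the application's build.gradle: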

repositories {
    maven {
        url "http://mvn.hz.netease.com/artifactory/libs-releases/"
    }
    maven {
        url "http://mvn.hz.netease.com/artifactory/libs-snapshots/"
    }
}

dependencies {
    compile fileTree(dir: 'libs', include: ['*.jar'])
    compile project(':netlib')

    testCompile 'junit:junit:4.12'

    compile 'com.netease.hearttouch:ht-flatbuffers:0.0.1-SNAPSHOT'
}

Add a class that accesses FlatBuffers:

package com.netease.volleydemo;

import com.example.tutorial.AddressBook;
import com.example.tutorial.Person;
import com.example.tutorial._Person.PhoneNumber;
import com.google.flatbuffers.FlatBufferBuilder;

import java.nio.ByteBuffer;

/**
 * Created by hanpfei0306 on 16-12-5.
 */

public class AddressBookFlatBuffers {
    public static byte[] encodeTest(String[] names) {
        FlatBufferBuilder builder = new FlatBufferBuilder(0);

        int[] personOffsets = new int[names.length];

        for (int i = 0; i < names.length; ++ i) {
            int name = builder.createString(names[i]);
            int email = builder.createString("zhangsan@gmail.com");

            int number1 = builder.createString("0157-23443276");
            int type1 = 1;
            int phoneNumber1 = PhoneNumber.createPhoneNumber(builder, number1, type1);

            int number2 = builder.createString("136183667387");
            int type2 = 0;
            int phoneNumber2 = PhoneNumber.createPhoneNumber(builder, number2, type2);

            int[] phoneNumbers = new int[2];
            phoneNumbers[0] = phoneNumber1;
            phoneNumbers[1] = phoneNumber2;

            int phoneNumbersPos = Person.createPhoneVector(builder, phoneNumbers);

            int person = Person.createPerson(builder, name, 13958235, email, phoneNumbersPos);

            personOffsets[i] = person;
        }
        int persons = AddressBook.createPersonVector(builder, personOffsets);

        AddressBook.startAddressBook(builder);
        AddressBook.addPerson(builder, persons);
        int eab = AddressBook.endAddressBook(builder);
        builder.finish(eab);
        byte[] data = builder.sizedByteArray();
        return data;
    }

    public static byte[] encodeTest(String[] names, int times) {
        for (int i = 0; i < times - 1; ++ i) {
            encodeTest(names);
        }
        return encodeTest(names);
    }

    public static AddressBook decodeTest(byte[] data) {
        AddressBook addressBook = null;
        ByteBuffer byteBuffer = ByteBuffer.wrap(data);
        addressBook = AddressBook.getRootAsAddressBook(byteBuffer);
        return addressBook;
    }

    public static AddressBook decodeTest(byte[] data, int times) {
        AddressBook addressBook = null;
        for (int i = 0; i < times; ++ i) {
            addressBook = decodeTest(data);
        }
        return addressBook;
    }
}


Using flatbuf-gradle-plugin

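We have developed a gradle plugin for FlatBuffers to make development easier (project location). Its design borrows from Google's protobuf-gradle-plugin, and its features and usage are similar to those of protobuf-gradle-plugin.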

Applying flatbuf-gradle-plugin

Modify the application's build.gradle to apply flatbuf-gradle-plugin:

  1. Add the flatbuf-gradle-plugin dependency to buildscript:
    buildscript {
     repositories {
         maven {
             url "http://mvn.hz.netease.com/artifactory/libs-releases/"
         }
         maven {
             url "http://mvn.hz.netease.com/artifactory/libs-snapshots/"
         }
     }
     dependencies {
         classpath 'com.netease.hearttouch:ht-flatbuf-gradle-plugin:0.0.1-SNAPSHOT'
     }
    }
    
  2. Apply the flatbuf plugin after apply plugin: 'com.android.application':
    apply plugin: 'com.android.application'
    apply plugin: 'com.netease.flatbuf'
    
  3. Add a flatbuf block to configure how flatbuf-gradle-plugin runs:

    flatbuf {
     flatc {
         path = '/usr/local/bin/flatc'
     }
    
     generateFlatTasks {
         all().each { task ->
             task.builtins {
                 remove java
             }
             task.builtins {
                 java { }
             }
         }
     }
    }
    

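    The flatc block configures the FlatBuffers compiler; here we point it at the compiler we built by hand earlier. The task.builtins block is required: it specifies which programming languages to generate code for. Here we generate code for Java.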

  4. Specify the path of the .fbs files:
     sourceSets {
         main {
             flat {
                 srcDir 'src/main/flat'
             }
         }
     }
    
    We put the FlatBuffers IDL files in the src/main/flat directory.

With this in place, we no longer have to run flatc by hand every time.


FlatBuffers, Protobuf, and JSON comparison test

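How does FlatBuffers perform relative to Protobuf? Let the data speak: here we compare the FlatBuffers format, the JSON format, and Protobuf. As before, the test uses fastjson as the JSON encoder and decoder.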

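The data structures used in the test, the Protobuf test code, and the JSON test code are the same as described in the article Using Protocol Buffers in Android; the FlatBuffers test code is the AddressBookFlatBuffers class shown above.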

The test is run with the following code:

    private class ProtoTestTask extends AsyncTask<Void, Void, Void> {
        private static final int BUFFER_LEN = 8192;

        private void compress(InputStream is, OutputStream os)
                throws Exception {

            GZIPOutputStream gos = new GZIPOutputStream(os);

            int count;
            byte data[] = new byte[BUFFER_LEN];
            while ((count = is.read(data, 0, BUFFER_LEN)) != -1) {
                gos.write(data, 0, count);
            }

            gos.finish();
            gos.close();
        }

        private int getCompressedDataLength(byte[] data) {
            ByteArrayInputStream bais =new ByteArrayInputStream(data);
            ByteArrayOutputStream baos = new ByteArrayOutputStream();

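            try {
                compress(bais, baos);
            } catch (Exception e) {
                // Log the failure instead of silently reporting a misleading length.
                Log.e(TAG, "GZIP compression failed", e);
            }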

            return baos.toByteArray().length;
        }

        private void dumpDataLengthInfo(byte[] protobufData, String jsonData, byte[] flatbufData) {
            int compressedProtobufLength = getCompressedDataLength(protobufData);
            int compressedJSONLength = getCompressedDataLength(jsonData.getBytes());
            int compressedFlatbufLength = getCompressedDataLength(flatbufData);
            Log.i(TAG, String.format("%-120s", "Data length"));
            Log.i(TAG, String.format("%-20s%-20s%-20s%-20s%-20s%-20s", "Protobuf", "Protobuf (GZIP)",
                    "JSON", "JSON (GZIP)", "Flatbuf", "Flatbuf (GZIP)"));
            Log.i(TAG, String.format("%-20s%-20s%-20s%-20s%-20s%-20s",
                    String.valueOf(protobufData.length), compressedProtobufLength,
                    String.valueOf(jsonData.getBytes().length), compressedJSONLength,
                    String.valueOf(flatbufData.length), compressedFlatbufLength));
        }

        private void doEncodeTest(String[] names, int times) {
            long startTime = System.nanoTime();
            byte[] protobufData = AddressBookProtobuf.encodeTest(names, times);
            long protobufTime = System.nanoTime();
            protobufTime = protobufTime - startTime;

            startTime = System.nanoTime();
            String jsonData = AddressBookJson.encodeTest(names, times);
            long jsonTime = System.nanoTime();
            jsonTime = jsonTime - startTime;

            startTime = System.nanoTime();
            byte[] flatbufData = AddressBookFlatBuffers.encodeTest(names, times);
            long flatbufTime = System.nanoTime();
            flatbufTime = flatbufTime - startTime;

            dumpDataLengthInfo(protobufData, jsonData, flatbufData);

            Log.i(TAG, String.format("%-20s%-20s%-20s%-20s", "Encode Times", String.valueOf(times),
                    "Names Length", String.valueOf(names.length)));

            Log.i(TAG, String.format("%-20s%-20s%-20s%-20s%-20s%-20s",
                    "ProtobufTime", String.valueOf(protobufTime),
                    "JsonTime", String.valueOf(jsonTime),
                    "FlatbufTime", String.valueOf(flatbufTime)));
        }

        private void doEncodeTest10(int times) {
            doEncodeTest(TestUtils.sTestNames10, times);
        }

        private void doEncodeTest50(int times) {
            doEncodeTest(TestUtils.sTestNames50, times);
        }

        private void doEncodeTest100(int times) {
            doEncodeTest(TestUtils.sTestNames100, times);
        }

        private void doEncodeTest(int times) {
            doEncodeTest10(times);
            doEncodeTest50(times);
            doEncodeTest100(times);
        }

        private void doDecodeTest(String[] names, int times) {
            byte[] protobufBytes = AddressBookProtobuf.encodeTest(names);
            ByteArrayInputStream bais = new ByteArrayInputStream(protobufBytes);
            long startTime = System.nanoTime();
            AddressBookProtobuf.decodeTest(bais, times);
            long protobufTime = System.nanoTime();
            protobufTime = protobufTime - startTime;

            String jsonStr = AddressBookJson.encodeTest(names);
            startTime = System.nanoTime();
            AddressBookJson.decodeTest(jsonStr, times);
            long jsonTime = System.nanoTime();
            jsonTime = jsonTime - startTime;

            byte[] flatbufData = AddressBookFlatBuffers.encodeTest(names);
            startTime = System.nanoTime();
            AddressBookFlatBuffers.decodeTest(flatbufData, times);
            long flatbufTime = System.nanoTime();
            flatbufTime = flatbufTime - startTime;

            Log.i(TAG, String.format("%-20s%-20s%-20s%-20s", "Decode Times", String.valueOf(times),
                    "Names Length", String.valueOf(names.length)));
            Log.i(TAG, String.format("%-20s%-20s%-20s%-20s%-20s%-20s",
                    "ProtobufTime", String.valueOf(protobufTime),
                    "JsonTime", String.valueOf(jsonTime),
                    "FlatbufTime", String.valueOf(flatbufTime)));
        }

        private void doDecodeTest10(int times) {
            doDecodeTest(TestUtils.sTestNames10, times);
        }

        private void doDecodeTest50(int times) {
            doDecodeTest(TestUtils.sTestNames50, times);
        }

        private void doDecodeTest100(int times) {
            doDecodeTest(TestUtils.sTestNames100, times);
        }

        private void doDecodeTest(int times) {
            doDecodeTest10(times);
            doDecodeTest50(times);
            doDecodeTest100(times);
        }

        @Override
        protected Void doInBackground(Void... params) {
            TestUtils.initTest();
            doEncodeTest(5000);

            doDecodeTest(5000);
            return null;
        }

        @Override
        protected void onPostExecute(Void aVoid) {
            super.onPostExecute(aVoid);
        }
    }

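Here we run three groups of encoding tests and three groups of decoding tests. In the encoding tests, a single message contains 10 Person entries in the first group, 50 in the second, and 100 in the third; each message is encoded 5000 times.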

In the decoding tests, a single message likewise contains 10, 50, and 100 Person entries across the three groups; each message is decoded 5000 times.

Running the tests above on a Galaxy Nexus with Android 4.4.4 (CM) produced the following results:

Encoded data length comparison (bytes)

Persons   Protobuf   Protobuf(GZIP)   JSON    JSON(GZIP)   Flatbuf   Flatbuf(GZIP)
10        860        288              1703    343          1532      513
50        4300       986              8463    1048         7452      1814
100       8600       1841             16913   1918         14852     3416

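For the same data, JSON produces the longest encoding before compression. The FlatBuffers encoding is roughly 10% to 12% shorter than JSON, while the Protobuf encoding is only about half the length of JSON. After GZIP compression, Protobuf ends up close to JSON (slightly smaller), whereas the FlatBuffers data is roughly 1.5 to 1.9 times the size of either.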
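
The ratios behind that summary can be recomputed directly from the table. The sketch below is plain arithmetic on the numbers reported above; the class name SizeRatios and its helper ratio are hypothetical and exist only for this illustration.

```java
import java.util.Locale;

class SizeRatios {
    // Rows copied from the table:
    // persons, Protobuf, Protobuf(GZIP), JSON, JSON(GZIP), Flatbuf, Flatbuf(GZIP)
    static final int[][] ROWS = {
        {10, 860, 288, 1703, 343, 1532, 513},
        {50, 4300, 986, 8463, 1048, 7452, 1814},
        {100, 8600, 1841, 16913, 1918, 14852, 3416},
    };

    // Size of a relative to b, e.g. 0.5 means "half as long".
    static double ratio(int a, int b) {
        return (double) a / b;
    }

    public static void main(String[] args) {
        for (int[] r : ROWS) {
            System.out.printf(Locale.ROOT,
                    "%3d persons: Protobuf/JSON=%.2f  Flatbuf/JSON=%.2f  "
                            + "Flatbuf(GZIP)/JSON(GZIP)=%.2f  Flatbuf(GZIP)/Protobuf(GZIP)=%.2f%n",
                    r[0], ratio(r[1], r[3]), ratio(r[5], r[3]),
                    ratio(r[6], r[4]), ratio(r[6], r[2]));
        }
    }
}
```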

Encoding performance comparison (s)

Persons   Protobuf   JSON      FlatBuffers
10        6.000      8.952     12.464
50        26.847     45.782    56.752
100       50.602     73.688    108.426

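In encoding, Protobuf is considerably faster than JSON (about 30% to 40% less time), while FlatBuffers is considerably slower than JSON (about 25% to 50% more time).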
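
To put the totals in perspective, they can be converted into a per-call cost. The sketch below assumes the table values are total seconds for the 5000 iterations run by doEncodeTest(5000); the class PerCallCost and its helper millisPerCall are hypothetical names used only here.

```java
import java.util.Locale;

class PerCallCost {
    // Iterations per measurement, taken from doEncodeTest(5000) in the test code.
    static final int TIMES = 5000;

    // Converts a total time in seconds into milliseconds per call.
    static double millisPerCall(double totalSeconds, int times) {
        return totalSeconds * 1000.0 / times;
    }

    public static void main(String[] args) {
        // {persons, Protobuf, JSON, FlatBuffers}: totals copied from the table,
        // assumed to be in seconds.
        double[][] rows = {
            {10, 6.000, 8.952, 12.464},
            {50, 26.847, 45.782, 56.752},
            {100, 50.602, 73.688, 108.426},
        };
        for (double[] r : rows) {
            System.out.printf(Locale.ROOT,
                    "%3.0f persons: Protobuf %.2f ms, JSON %.2f ms, FlatBuffers %.2f ms per encode%n",
                    r[0], millisPerCall(r[1], TIMES), millisPerCall(r[2], TIMES),
                    millisPerCall(r[3], TIMES));
        }
    }
}
```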

Decoding performance comparison (s)

Persons   Protobuf   JSON      FlatBuffers
10        0.255      10.766    0.014
50        0.245      51.134    0.014
100       0.323      101.070   0.006

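In decoding, Protobuf shows a striking improvement over JSON. Protobuf's decode time barely grows with the data length, whereas JSON's decode time keeps growing as the data gets longer. FlatBuffers, because it needs no decoding step, improves on both by a very large margin.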
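
The idea of reading without a decoding step can be illustrated with a plain ByteBuffer. The sketch below uses a toy layout invented for this example, not the real FlatBuffers wire format, and the class OffsetRead is hypothetical. What it shows is that "opening" the data is only a wrap call, and each field is read from a known offset when it is requested. Note that decodeTest in the test code above likewise only wraps the byte array and calls getRootAsAddressBook, so the measured FlatBuffers time does not include reading any fields.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class OffsetRead {
    // Toy layout (NOT the real FlatBuffers format):
    // bytes 0..3 = int32 id, bytes 4..7 = int32 name length, bytes 8.. = UTF-8 name.
    static byte[] write(int id, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer bb = ByteBuffer.allocate(8 + nameBytes.length).order(ByteOrder.LITTLE_ENDIAN);
        bb.putInt(id).putInt(nameBytes.length).put(nameBytes);
        return bb.array();
    }

    // "Decoding" is just wrapping the bytes; no fields are parsed here.
    static ByteBuffer wrap(byte[] data) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
    }

    // Fields are read on demand from fixed offsets, using absolute gets.
    static int id(ByteBuffer bb) {
        return bb.getInt(0);
    }

    static String name(ByteBuffer bb) {
        int len = bb.getInt(4);
        byte[] out = new byte[len];
        for (int i = 0; i < len; i++) {
            out[i] = bb.get(8 + i);
        }
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer bb = wrap(write(13958235, "zhangsan"));
        System.out.println(id(bb) + " " + name(bb));
    }
}
```

The trade-off is that the cost moves from a one-time parse to each field access, so a benchmark that also read every field would give a fuller picture.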



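This article comes from the NetEase Practitioner Community (网易实践者社区) and is published with the authorization of its author, Han Pengfei (韩鹏飞).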