A few months ago I have started learning Erlang, mostly for fun but it was right about time to jump on the functional bandwagon. The best way to learn a new language is to find an engaging project, in my case its been something what has been on my mind for quite a while: a cloud communication protocol / framework for distributed computing. Some basic principles of what it is about can be found here: CloudDDS.
While learning Erlang, I decided to also learn a number of other technologies, or using other words - apply them in practice. One of those is Apache Thrift. Apache Thrift is a binary cross-language framework for distributed service development. It is a serialization protocol with its own RPC stack.
Thrift RPC stack comes with a number of supported protocols, referred to as transports
. Erlang implementation provides HTTP, JSON, file, disk log and framed transports. The project I was working on uses UDP for all communication. Below, I am going to describe how I implemented Thrift over UDP in Erlang. I will walk the reader through building a simple active-active setup of two clients communicating via UDP using Thrift messages.
§Prerequisites
Apache Thrift uses .thrift
file to describe services and types to used by the software. To generate the language code, Apache Thrift must be installed first. There is a lot of examples available how this can be achieved. I have created a bash script for Ubuntu 12.04 for my own reference, available here: Install Thrift 0.9.1 with all language support.
§Design
I will follow the standard OTP principles while developing the example.
application -> supervisor -> UDP messaging worker
| -> serialization worker
While discussing the implementation, I will not focus on the details of the messaging code, neither I will dive into the details of the standard OTP components. I will focus only on the Thrift part.
§Basic OTP modules
First step is to create a basic Erlang application. All the code can be found below. Additionally, at the bottom of the article I am publishing the link to the sources available on github.
§rebar.config
A bare minimum required for installing dependencies.
|
|
I am using my own fork of Thrift, it contains updated JSX dependencies.
§src/udp_thrift.app.src
|
|
§src/udp_thrift_app.erl
|
|
§src/udp_thrift_common.erl
|
|
§src/udp_thrift_sup.erl
|
|
The supervisor expects a number of environment properties to be set. This will be provided later on, in Running and testing section. If one or more of these is not provided, the application will not start or fail at a worker start step.
§src/udp_thrift_messaging.erl
|
|
§private/udp_thrift.thrift
The task of this example is for every client, at two seconds interval, send a Digest
to the other client. On the other side of the wire, upon receiving the Digest
, send a DigestAck
message back. Every message is wrapped in a DigestEnvelope
container so the program can easily establish what is the digest type sent. The thrift
file looks like this:
namespace erl udp_thrift
struct DigestEnvelope {
1: required string payload_type;
2: required string bin_payload;
}
struct Digest {
1: required string name;
2: required i32 port;
3: required i64 heartbeat;
4: required string id;
}
struct DigestAck {
1: required string name;
2: required i64 heartbeat;
3: required string reply_id;
}
§Generate code from Thrift and make it available
Once thrift is installed on the system, the code can be generated from the thrift
file. To do so:
|
|
The source code on github contains a Vagrant box which can be used to generate the code:
|
|
§Thrift serialisation and deserialisation
The module responsible for these tasks has been already started in the supervisor. Let’s go through the code and discuss the parts in detail.
|
|
First, these records are required. As the serialisation happens within this single module, I include them directly by copying from Erlang Thrift sources.
|
|
Following is the code for the module setup. For serialisation, a single protocol instance may be used for all serialized messages.
|
|
§Serialization
|
|
First handle_info
receives the data to serialize, it uses udp_thrift_types:struct_info
provided by the thrift generated modules to get the thrift
type definition. This is forwarded to the the second handle_info
. digest_to_binary
is where all the action happens (presented shortly). The inner digest_to_binary
call serializes the digest received from messaging
component. This is then wrapped inside of the digestEnvelope
digest which has to be converted to binary data for UDP delivery.
§Deserialization
|
|
Upon received a deserialization request, the first thing that happens is deserialize_from_binary
. This function assumes that the data received is Thrift and the data is of type digestEnvelope
. The details will be presented shortly. Once the payload carried by the envelope has been read, the program detects if the payload is of a supported type. If so, it attempts deserializing the content. If all of the above succeeds, the result is passed back to messaging (caller of the serialization). Majority of that function is simple error handling.
§digest_to_binary, digest_from_binary and payload_type_as_known_atom
|
|
digest_to_binary
serializes the payload. It uses the copy of the protocol defined in init
to write the thrift binary representation of data to be sent. All what has to be done afterwards is converting iolist
contained in the resulting OutTransport
to binary
data. This is returned to CallerPid
and sent via UDP to the destination.
|
|
digest_from_binary
receives an atom
of an expected digest type and a binary digest as an input. First, a memory_buffer
transport is constructed from the binary data. It is then converted into a binary protocol and an attempt to read the data from thrift
is made using a structure provided by udp_thrift_types:struct_info
.
|
|
The above function is rather self explanatory. The deserialization will be attempted if an atom
can be found in the KnownDigestTypes
for given digest type. Otherwise an error is returned.
The module ends with the rest of required gen_server
behaviour.
|
|
§Running and testing
Preparation:
|
|
To run, two terminal windows are required. In the first window, start Erlang with this command:
|
|
Run the application:
|
|
In the second terminal window, start Erlang:
|
|
Run the application:
|
|
Both windows should be showing output similar to this:
…
Sending digest to 6667.
=INFO REPORT==== 12-Oct-2014::21:43:43 ===
Sending digest to 6667.
=INFO REPORT==== 12-Oct-2014::21:43:43 ===
Received digestAck to digest <<"#Ref<0.0.0.95>">>.
=INFO REPORT==== 12-Oct-2014::21:43:45 ===
Received digest from <<"memberB">>. Replying to 6667
=INFO REPORT==== 12-Oct-2014::21:43:45 ===
Sending digest to 6667.
=INFO REPORT==== 12-Oct-2014::21:43:45 ===
Received digestAck to digest <<"#Ref<0.0.0.100>">>.
=INFO REPORT==== 12-Oct-2014::21:43:47 ===
Received digest from <<"memberB">>. Replying to 6667
=INFO REPORT==== 12-Oct-2014::21:43:47 ===
Sending digest to 6667.
=INFO REPORT==== 12-Oct-2014::21:43:47 ===
Received digestAck to digest <<"#Ref<0.0.0.105>">>.
…
If that is the case, both clients are communicating via UDP with Thrift protocol.
§Code and license
All code discussed above can be found on github: Using Apache Thrift with UDP in Erlang. The code is available under MIT license.