RLP SERIALIZATION
Recursive Length Prefix (RLP) serialization is used extensively in Ethereum’s execution clients. RLP standardizes the transfer of data between nodes in a space-efficient format.
The purpose of RLP is to encode arbitrarily nested arrays of binary data, and RLP is the primary encoding method used to serialize objects in Ethereum’s execution layer.
The only purpose of RLP is to encode structure; encoding specific data types (e.g. strings, floats) is left up to higher-order protocols.
You can read more of the definitions in the Ethereum Yellow Paper.
In this guide, we’ll use an interactive way to show how RLP works.
Let’s take a first look at the RLP encoding.
RLP encoding is defined as follows:
-
If the byte array contains solely a single byte and that single byte is less than 128, then the input is exactly equal to the output.
The character "a", for example, is encoded as 0x61, which is less than 128.
Another key point is that RLP does not encode data types; encoding specific data types (e.g. strings, floats) is left up to higher-order protocols.
In this example, "a" is a Unicode character that is assigned the byte 0x61 in UTF-8. UTF-8 is a higher-order protocol, and RLP doesn’t care about it. Only the byte array matters.
To make it easier to understand, each example shows the original characters alongside their encoded bytes.
Example: "a" is encoded as 0x61.
-
If the byte array contains fewer than 56 bytes, then the output is equal to the input prefixed by the byte equal to the length of the byte array plus 128.
Example (length < 56): the 3-byte string "abc" is encoded as 0x83 0x61 0x62 0x63, where the prefix 0x83 = 0x80 + 0x03.
-
Otherwise, provided that the input contains fewer than 2^64 bytes, the output is equal to the input prefixed by the minimal-length byte array which, when interpreted as a big-endian integer, is equal to the length of the input byte array. That length is itself prefixed by the byte equal to the number of bytes required to faithfully encode this length value plus 183.
Byte arrays containing 2^64 or more bytes cannot be encoded. This restriction ensures that the first byte of the encoding of a byte array is always below 192, and thus it can be readily distinguished from the encodings of sequences in L.
Example (length ≥ 56): the 63-byte string "ETHEREUM: A SECURE DECENTRALISED GENERALISED TRANSACTION LEDGER" is encoded as 0xb8 0x3f followed by the 63 bytes of the string. The prefix 0xb8 = 0xb7 + 0x01, because the length 0x3f (63) fits in one byte.
-
If the input is a sequence (a list of items) and the concatenated serialisations of its items are less than 56 bytes in length, then the output is equal to that concatenation prefixed by the byte equal to the length of the concatenation plus 192.
Example (payload length < 56): the list ["a", "b", "c"] is encoded as 0xc3 0x61 0x62 0x63, where the prefix 0xc3 = 0xc0 + 0x03.
-
Otherwise, provided that the concatenated serialisations contain fewer than 2^64 bytes, the output is equal to the concatenated serialisations prefixed by the minimal-length byte array which, when interpreted as a big-endian integer, is equal to the length of the concatenated serialisations. That length is itself prefixed by the byte equal to the number of bytes required to faithfully encode this length value plus 247.
Example (payload length ≥ 56): a list holding "a", "b", and a 90-byte string beginning "The length of this sentence is more than 55 bytes" is encoded as 0xf8 0x5e 0x61 0x62 0xb8 0x5a followed by the 90 bytes of the string. The list prefix 0xf8 = 0xf7 + 0x01, and the payload length 0x5e (94) is 1 + 1 + 92 bytes. The long string carries its own prefix 0xb8 0x5a, since 0xb8 = 0xb7 + 0x01 and its length is 0x5a (90).
Sequences whose concatenated serialized items contain 2^64 or more bytes cannot be encoded. This restriction ensures that the first byte of the encoding does not exceed 255 (otherwise it would not be a byte).
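The five rules above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the names rlp_encode and _length_prefix are hypothetical, and the sketch assumes the input is either a bytes value or a (possibly nested) list of such values. Converting strings or integers to bytes is left to the caller, in line with the point that RLP does not encode data types.

```python
from typing import List, Union

# An RLP item is either a byte array or a list of RLP items.
RLPItem = Union[bytes, List["RLPItem"]]


def _length_prefix(length: int, base: int) -> bytes:
    """Build the prefix for a payload of `length` bytes.

    `base` is 0x80 (128) for byte arrays and 0xc0 (192) for lists.
    (Hypothetical helper name, used only for this sketch.)
    """
    if length < 56:
        # Short form: one byte equal to base + length.
        return bytes([base + length])
    if length >= 2**64:
        raise ValueError("payloads of 2^64 or more bytes cannot be encoded")
    # Long form: minimal big-endian encoding of the length, itself
    # prefixed by base + 55 + (number of length bytes).
    # base + 55 is 183 (0xb7) for byte arrays and 247 (0xf7) for lists.
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([base + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item: RLPItem) -> bytes:
    """Encode a byte array or a nested list of byte arrays."""
    if isinstance(item, (bytes, bytearray)):
        data = bytes(item)
        # Rule 1: a single byte below 128 is its own encoding.
        if len(data) == 1 and data[0] < 128:
            return data
        # Rules 2 and 3: length prefix followed by the raw bytes.
        return _length_prefix(len(data), 0x80) + data
    if isinstance(item, (list, tuple)):
        # Rules 4 and 5: encode every item, concatenate, then prefix.
        payload = b"".join(rlp_encode(child) for child in item)
        return _length_prefix(len(payload), 0xC0) + payload
    raise TypeError("RLP input must be bytes or a list of RLP items")


if __name__ == "__main__":
    print(rlp_encode(b"a").hex())                 # 61
    print(rlp_encode(b"abc").hex())               # 83616263
    print(rlp_encode([b"a", b"b", b"c"]).hex())   # c3616263
```

The byte array and list branches share one prefix helper because the two cases differ only in their base value (128 versus 192); the long-form offsets 183 and 247 are simply the base plus 55.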