An Interactive Guide to RLP

RLP SERIALIZATION

Recursive Length Prefix (RLP) serialization is used extensively in Ethereum’s execution clients. RLP standardizes the transfer of data between nodes in a space-efficient format.

The purpose of RLP is to encode arbitrarily nested arrays of binary data, and RLP is the primary encoding method used to serialize objects in Ethereum’s execution layer.

RLP encodes structure only; encoding specific data types (e.g. strings, floats) is left to higher-order protocols.

You can read the full definitions in the Ethereum Yellow Paper.

In this guide, we’ll use an interactive approach to show how RLP works.

Let’s take a first look at the RLP encoding.

HEX("abcdefghijklm") = 0x6162636465666768696a6b6c6d

PBS("abcdefghijklm") = 0x0d6162636465666768696a6b6c6d

RLP("abcdefghijklm") = 0x8d6162636465666768696a6b6c6d

Try changing the value of the range input and watch how the output changes. We will explain the encoding process in detail later.
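The three outputs above can be reproduced with a few lines of Python. This is a minimal sketch: it assumes that PBS means a plain one-byte length prefix (which matches the 0x0d shown above, since the string is 13 bytes long), and the variable names are illustrative only.

```python
# Reproduce the HEX, PBS and RLP lines for the 13-byte example string.
data = "abcdefghijklm".encode("utf-8")

# HEX: the raw bytes, written in hexadecimal.
hex_form = data.hex()

# PBS (assumed here to be a plain one-byte length prefix): 0x0d + data.
pbs_form = (bytes([len(data)]) + data).hex()

# RLP for a byte array shorter than 56 bytes: (0x80 + length) + data.
rlp_form = (bytes([0x80 + len(data)]) + data).hex()

print("HEX = 0x" + hex_form)
print("PBS = 0x" + pbs_form)
print("RLP = 0x" + rlp_form)
```

The only difference between the last two lines is the first byte: 0x0d versus 0x8d, that is, the length with and without the 0x80 base added.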

RLP encoding is defined as follows:

If the byte array contains solely a single byte and that single byte is less than 128, then the input is exactly equal to the output.

The character a, for example, is the single byte 0x61 (97), which is less than 128, so its RLP encoding is 0x61 itself.

Another key point is that RLP does not encode data types; encoding specific data types (e.g. strings, floats) is left to higher-order protocols.

In this example, a is a Unicode character that UTF-8 encodes as the byte 0x61. UTF-8 is a higher-order protocol, and RLP doesn’t care about it. Only the byte array matters.

To make it easier to understand, we’ll use a square block to represent the original character and its encoding.

Block: byte 0x61, character a.
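The single-byte rule boils down to one check. The sketch below uses a hypothetical helper name, `is_self_encoding`, to show which inputs are left unchanged by RLP and which are not.

```python
def is_self_encoding(data: bytes) -> bool:
    """Return True when RLP leaves `data` unchanged.

    That is the case only for a byte array of exactly one byte
    whose value is below 128 (0x80).
    """
    return len(data) == 1 and data[0] < 0x80


print(is_self_encoding("a".encode("utf-8")))  # one byte, 0x61
print(is_self_encoding(b"\x7f"))              # one byte, largest value below 0x80
print(is_self_encoding(b"\x80"))              # one byte, but not below 0x80
print(is_self_encoding(b"ab"))                # two bytes
print(is_self_encoding(b""))                  # zero bytes
```

Only the first two inputs pass the check. Everything else, including the empty byte array, needs a length prefix.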
Otherwise, if the byte array contains fewer than 56 bytes, then the output is equal to the input prefixed by the byte equal to the length of the byte array plus 128.

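Example (length < 0x38, i.e. fewer than 56 bytes): "abc" is encoded as 0x83616263. The prefix is 0x83 = 0x80 (base) + 0x03 (length), followed by the bytes 0x61 (a), 0x62 (b), 0x63 (c).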
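The short byte array rule can be sketched as follows. The function name `rlp_encode_short_bytes` is made up for this guide, and the function deliberately rejects inputs of 56 bytes or more, which belong to a different rule.

```python
def rlp_encode_short_bytes(data: bytes) -> bytes:
    """Encode a byte array of 0 to 55 bytes (illustrative sketch)."""
    if len(data) > 55:
        raise ValueError("56 bytes or more: use the long byte array rule")
    # A single byte below 0x80 is its own encoding.
    if len(data) == 1 and data[0] < 0x80:
        return data
    # Otherwise prefix the data with 0x80 + length.
    return bytes([0x80 + len(data)]) + data


print(rlp_encode_short_bytes(b"abc").hex())     # 83616263
print(rlp_encode_short_bytes(b"").hex())        # 80
print(rlp_encode_short_bytes(b"\x80").hex())    # 8180
print(hex(rlp_encode_short_bytes(b"x" * 55)[0]))  # 0xb7
```

Note that the empty byte array is encoded as the single byte 0x80, and the largest prefix this rule can produce is 0x80 + 55 = 0xb7.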
Otherwise, the output is equal to the input, provided that it contains fewer than 2⁶⁴ bytes, prefixed by the minimal-length byte array which when interpreted as a big-endian integer is equal to the length of the input byte array, which is itself prefixed by the number of bytes required to faithfully encode this length value plus 183.

Byte arrays containing 2⁶⁴ or more bytes cannot be encoded. This restriction ensures that the first byte of the encoding of a byte array is always below 192, and thus it can be readily distinguished from the encodings of sequences (the set L in the Yellow Paper).

Example (length ≥ 0x38, i.e. 56 bytes or more): the 63-byte string "ETHEREUM: A SECURE DECENTRALISED GENERALISED TRANSACTION LEDGER" is encoded with the prefix 0xb8 = 0xb7 (base) + 0x01 (number of bytes needed to write the length), then the length byte 0x3f (63), then the 63 bytes of the string itself (0x45 for E, 0x54 for T, 0x48 for H, and so on).
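For byte arrays of 56 bytes or more, the only new ingredient is writing the length as a minimal big-endian integer. A minimal sketch, with the hypothetical name `rlp_encode_long_bytes`, using Python's built-in `int.to_bytes`:

```python
def rlp_encode_long_bytes(data: bytes) -> bytes:
    """Encode a byte array of 56 bytes or more (illustrative sketch)."""
    n = len(data)
    if n < 56:
        raise ValueError("fewer than 56 bytes: use the short byte array rule")
    if n >= 2**64:
        raise ValueError("byte arrays of 2**64 or more bytes cannot be encoded")
    # Minimal big-endian representation of the length (no leading zero bytes).
    length_bytes = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # First byte: 0xb7 + number of length bytes, then the length, then the data.
    return bytes([0xB7 + len(length_bytes)]) + length_bytes + data


text = b"ETHEREUM: A SECURE DECENTRALISED GENERALISED TRANSACTION LEDGER"
encoded = rlp_encode_long_bytes(text)
print(len(text))           # 63
print(encoded[:2].hex())   # b83f

# A 1024-byte input needs two length bytes (0x0400), so the prefix is 0xb9.
print(rlp_encode_long_bytes(b"\x00" * 1024)[:3].hex())  # b90400
```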
For a sequence (list) of items, the rules are analogous. If the concatenated serialisations of the contained items are less than 56 bytes in length in total, then the output is equal to that concatenation prefixed by the byte equal to the length of this byte array plus 192.

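Example (list payload < 0x38, i.e. fewer than 56 bytes): the list ["a", "b", "c"] is encoded as 0xc3616263. The prefix is 0xc3 = 0xc0 (base) + 0x03 (payload length), followed by the encoded items 0x61 (a), 0x62 (b), 0x63 (c).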
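The short list rule operates on items that are already RLP-encoded: it concatenates them and adds one prefix byte. The sketch below assumes exactly that, and the helper name `rlp_wrap_short_list` is hypothetical.

```python
from typing import Iterable


def rlp_wrap_short_list(encoded_items: Iterable[bytes]) -> bytes:
    """Wrap already-encoded items into a short RLP list (illustrative sketch)."""
    payload = b"".join(encoded_items)
    if len(payload) > 55:
        raise ValueError("payload of 56 bytes or more: use the long list rule")
    # Prefix the concatenated items with 0xc0 + payload length.
    return bytes([0xC0 + len(payload)]) + payload


# "a", "b" and "c" are single bytes below 0x80, so each is its own encoding.
print(rlp_wrap_short_list([b"a", b"b", b"c"]).hex())  # c3616263

# The empty list has an empty payload.
empty = rlp_wrap_short_list([])
print(empty.hex())  # c0

# Lists nest: [ [], [[]], [ [], [[]] ] ]
one = rlp_wrap_short_list([empty])
two = rlp_wrap_short_list([empty, one])
print(rlp_wrap_short_list([empty, one, two]).hex())  # c7c0c1c0c3c0c1c0
```

Because the prefix depends only on the total payload length, nesting works without any extra rule: an encoded inner list is just another item in the outer payload.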
Otherwise, the output is equal to the concatenated serialisations, provided that they contain fewer than 2⁶⁴ bytes, prefixed by the minimal-length byte array which when interpreted as a big-endian integer is equal to the length of the concatenated serialisations byte array, which is itself prefixed by the number of bytes required to faithfully encode this length value plus 247.

Example (list payload ≥ 0x38, i.e. 56 bytes or more): a list of 3 items, "a", "b" and a 90-byte string. The list prefix is 0xf8 = 0xf7 (base) + 0x01 (number of bytes needed to write the payload length), followed by the payload length 0x5e (94). The payload consists of 0x61 (a), 0x62 (b), and the encoding of the long string, which is itself a long byte array: prefix 0xb8 = 0xb7 (base) + 0x01, length byte 0x5a (90), then the 90 bytes of the string:

The length of this sentence is more than 55 bytes, ", "I know it because I pre-designed it
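Putting the byte array rules and the list rules together gives a small recursive encoder. This is a sketch under stated assumptions rather than a reference implementation: items are modelled as Python `bytes` (byte arrays) or `list`/`tuple` (sequences), and the names `rlp_encode` and `_length_prefix` are chosen for this guide.

```python
def _length_prefix(length: int, base: int) -> bytes:
    """Build the prefix for a payload of `length` bytes.

    `base` is 0x80 for byte arrays and 0xc0 for lists.
    """
    if length < 56:
        return bytes([base + length])
    if length >= 2**64:
        raise ValueError("payloads of 2**64 or more bytes cannot be encoded")
    # Minimal big-endian length; base + 55 is 0xb7 or 0xf7 respectively.
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([base + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Recursively encode bytes or (nested) lists of bytes."""
    if isinstance(item, (bytes, bytearray)):
        data = bytes(item)
        if len(data) == 1 and data[0] < 0x80:
            return data  # a single byte below 0x80 is its own encoding
        return _length_prefix(len(data), 0x80) + data
    if isinstance(item, (list, tuple)):
        payload = b"".join(rlp_encode(child) for child in item)
        return _length_prefix(len(payload), 0xC0) + payload
    raise TypeError("only bytes and lists of bytes can be RLP-encoded")


# The 90-byte item from the example above, quotes included.
long_item = (
    b'The length of this sentence is more than 55 bytes, ", '
    b'"I know it because I pre-designed it'
)
encoded = rlp_encode([b"a", b"b", long_item])
print(len(long_item))      # 90
print(encoded[:6].hex())   # f85e6162b85a
```

The first six bytes match the example: 0xf8 and 0x5e for the list, 0x61 and 0x62 for the two single-byte items, then 0xb8 and 0x5a for the long string.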

Sequences whose concatenated serialized items contain 2⁶⁴ or more bytes cannot be encoded. This restriction ensures that the first byte of the encoding does not exceed 255 (otherwise it would not be a byte).
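Both size limits can be checked with simple arithmetic: a length below 2⁶⁴ needs at most 8 bytes, so the largest possible prefixes are 0xb7 + 8 for byte arrays and 0xf7 + 8 for lists. The sketch below verifies this and shows how the first byte alone identifies which rule produced an encoding. The function name `classify_first_byte` and its labels are illustrative.

```python
# Any length below 2**64 fits in at most 8 bytes.
max_length_bytes = ((2**64 - 1).bit_length() + 7) // 8

largest_string_prefix = 0xB7 + max_length_bytes  # stays below 192 (0xc0)
largest_list_prefix = 0xF7 + max_length_bytes    # stays within one byte


def classify_first_byte(first: int) -> str:
    """Name the rule that produced an encoding, given its first byte."""
    if not 0 <= first <= 0xFF:
        raise ValueError("not a byte value")
    if first < 0x80:
        return "single byte"
    if first <= 0xB7:
        return "short byte array"
    if first <= 0xBF:
        return "long byte array"
    if first <= 0xF7:
        return "short list"
    return "long list"


print(max_length_bytes, largest_string_prefix, largest_list_prefix)
for first in (0x61, 0x83, 0xB8, 0xC3, 0xF8):
    print(hex(first), classify_first_byte(first))
```

The five ranges do not overlap and together cover every value from 0x00 to 0xff, which is why a decoder can decide how to proceed after reading a single byte.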