# Eth

Ethereum-related utilities written in Nim. Includes things like Bloom filters, private/public key utilities, RLP, devp2p, and more.

# rlp

# Introduction

A Nim implementation of the Recursive Length Prefix encoding (RLP) as specified in Ethereum's Yellow Paper and Wiki.

# Reading RLP data

The Rlp type provided by this library represents a cursor over an RLP-encoded byte stream.

```nim
proc rlpFromBytes*(data: openArray[byte]): Rlp
```

# Streaming API

Once created, the Rlp object will offer procs such as isList, isBlob, getType, listLen, blobLen to determine the type of the value under the cursor. The contents of blobs can be extracted with procs such as toString, toBytes and toInt without advancing the cursor.

Lists can be traversed with the standard items iterator, which will advance the cursor to each sub-item position and yield the Rlp object at that point. As an alternative, listElem can return a new Rlp object adjusted to a particular sub-item position without advancing the original cursor. Keep in mind that copying Rlp objects is cheap and you can create as many cursors pointing to different positions in the RLP stream as necessary.

skipElem will advance the cursor to the next position in the current list. hasData will indicate whether there are more bytes in the stream that can be consumed.

Another way to extract data from the stream is through the universal read proc that accepts a type as a parameter. You can pass any supported type such as string, int, seq[T], etc, including composite user-defined types (see Object Serialization). The cursor will be advanced just past the end of the consumed object.

The toXX and read family of procs may raise a RlpTypeMismatch in case of a type mismatch with the stream contents under the cursor. A corrupted RLP stream or an attempt to read past the stream end will be signaled with the MalformedRlpError exception. If the RLP stream includes data that cannot be processed on the current platform (e.g. an integer value that is too large), the library will raise an UnsupportedRlpError exception.
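
For example, a stream containing an RLP list can be consumed like this (a minimal sketch, assuming the module is imported as eth/rlp):

```nim
import eth/rlp

# Encode a list of two strings, then walk it with the cursor API.
let bytes = encode(@["cat", "dog"])
var rlp = rlpFromBytes(bytes)

doAssert rlp.isList
doAssert rlp.listLen == 2

for item in rlp:       # advances the cursor to each sub-item
  echo item.toString   # extracts the blob under the cursor

# Alternatively, consume the whole value with the universal read proc:
rlp = rlpFromBytes(bytes)
doAssert rlp.read(seq[string]) == @["cat", "dog"]
```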

# DOM API

Calling Rlp.toNodes at any position within the stream will return a tree of RlpNode objects representing the collection of values starting at that position:

```nim
type
  RlpNodeType* = enum
    rlpBlob
    rlpList

  RlpNode* = object
    case kind*: RlpNodeType
    of rlpBlob:
      bytes*: seq[byte]
    of rlpList:
      elems*: seq[RlpNode]
```

As a short-cut, you can also call decode directly on a byte sequence to avoid creating a Rlp object when obtaining the nodes. For debugging purposes, you can also create a human-readable representation of the RLP nodes by calling the inspect proc:

```nim
proc inspect*(self: Rlp, indent = 0): string
```
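
For instance, building on the earlier list example (a small sketch):

```nim
import eth/rlp

let bytes = encode(@["cat", "dog"])

var rlp = rlpFromBytes(bytes)
let nodes = rlp.toNodes        # tree of RlpNode values
doAssert nodes.kind == rlpList

rlp = rlpFromBytes(bytes)
echo rlp.inspect               # human-readable dump of the stream
```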

# Creating RLP data

The RlpWriter type can be used to encode RLP data. Instances are created with the initRlpWriter proc. This should be followed by one or more calls to append which is overloaded to accept arbitrary values. Finally, you can call finish to obtain the final seq[byte].

If the end result should be an RLP list of a particular length, you can replace the initial call to initRlpWriter with initRlpList(n). Calling finish before writing a sufficient number of elements will then result in an assertion failure.

As an alternative short-cut, you can also call encode on an arbitrary value (including sequences and user-defined types) to execute all of the steps at once and directly obtain the final RLP bytes. encodeList(varargs) is another short-cut for creating RLP lists.
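
A minimal sketch of both styles:

```nim
import eth/rlp

# Explicit writer: a list that must contain exactly two elements.
var writer = initRlpList(2)
writer.append("foo")
writer.append(42)
let bytes = writer.finish()

# The encodeList short-cut performs the same steps in one call:
doAssert encodeList("foo", 42) == bytes
```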

# Object serialization

As previously explained, generic procs such as read, append, encode and decode can be used with arbitrary user-defined object types. By default, the library will serialize all of the fields of the object using the fields iterator, but you can also serialize only a subset of the fields, or change their order, by employing the rlpIgnore pragma or the rlpFields macro:

```nim
macro rlpFields*(T: typedesc, fields: varargs[untyped])

# example usage:

type
  Transaction = object
    amount: int
    time: DateTime
    sender: string
    receiver: string

rlpFields Transaction,
  sender, receiver, amount

...

var t1 = rlp.read(Transaction)
var bytes = encode(t1)
var t2 = bytes.decode(Transaction)
```

By default, sub-fields within objects are wrapped in RLP lists. You can avoid this behavior by adding the custom pragma rlpInline on a particular field. In rare circumstances, you may need to serialize the same field type differently depending on the enclosing object type. You can use the rlpCustomSerialization pragma to achieve this.
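
For illustration, both pragmas are attached to individual fields (a hedged sketch; treat the exact semantics as an assumption and consult the library source for details):

```nim
type
  Signature = object
    r: int
    s: int

  Envelope = object
    # rlpInline: the nested object's fields are serialized inline,
    # without being wrapped in their own RLP list (assumed semantics).
    sig {.rlpInline.}: Signature
    # rlpIgnore: excluded from serialization entirely.
    cache {.rlpIgnore.}: string
```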

# Contributing / Testing

To test the correctness of any modifications to the library, please execute nimble test_rlp at the root of the repo.

# p2p

# Introduction

This library implements the DevP2P family of networking protocols used in the Ethereum world.

# Connecting to the Ethereum network

A connection to the Ethereum network can be created by instantiating the EthereumNode type:

```nim
proc newEthereumNode*(keys: KeyPair,
                      listeningAddress: Address,
                      networkId: uint,
                      chain: AbstractChainDB,
                      clientId = "nim-eth-p2p",
                      addAllCapabilities = true): EthereumNode
```

# Parameters:

keys: A pair of public and private keys used to authenticate the node on the network and to determine its node ID. See the keys library for utilities that will help you generate and manage such keys.

listeningAddress: The network interface and port where your client will be accepting incoming connections.

networkId: The Ethereum network ID. The client will disconnect immediately from any peers who don't use the same network.

chain: An abstract instance of the Ethereum blockchain associated with the node. This library allows you to plug any instance conforming to the abstract interface defined in the eth_common (opens new window) package.

clientId: A name used to identify the software package connecting to the network (i.e. similar to the User-Agent string in a browser).

addAllCapabilities: By default, the node will support all RLPx protocols imported in your project. You can specify false if you prefer to create a node with a more limited set of protocols. Use one or more calls to node.addCapability to specify the desired set:

```nim
node.addCapability(eth)
node.addCapability(shh)
```

Each supplied protocol identifier is a name of a protocol introduced by the p2pProtocol macro discussed later in this document.

Instantiating an EthereumNode does not immediately connect you to the network. To start the connection process, call node.connectToNetwork:

```nim
proc connectToNetwork*(node: var EthereumNode,
                       bootstrapNodes: openarray[ENode],
                       startListening = true,
                       enableDiscovery = true)
```

The EthereumNode will automatically find and maintain a pool of peers using the Ethereum node discovery protocol. You can access the pool as node.peers.
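
Putting the pieces together, a set-up might look like this (a hypothetical sketch; myKeys, myAddress, myChain and bootnodes are placeholders your application must supply):

```nim
# Hypothetical sketch; `myKeys`, `myAddress`, `myChain` and `bootnodes`
# are placeholders, not part of the library API.
var node = newEthereumNode(myKeys, myAddress, networkId = 1,
                           chain = myChain)

# connectToNetwork is asynchronous, hence the waitFor.
waitFor node.connectToNetwork(bootstrapNodes = bootnodes)
```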

# Communicating with Peers using RLPx

RLPx is the high-level protocol for exchanging messages between peers in the Ethereum network. Most of the client code of this library should not be concerned with the implementation details of the underlying protocols and should use the high-level APIs described in this section.

The RLPx protocols are defined as a collection of strongly-typed messages, which are grouped into sub-protocols multiplexed over the same TCP connection.

This library represents each such message as a regular Nim function call over the Peer object. Certain messages act only as notifications, while others fit the request/response pattern.

To understand more about how messages are defined and used, let's look at the definition of an RLPx protocol:

# RLPx sub-protocols

The sub-protocols are defined with the p2pProtocol macro. It will accept a short identifier for the protocol and the current protocol version:

Here is how the DevP2P wire protocol might look:

```nim
p2pProtocol DevP2P(version = 0, rlpxName = "p2p"):
  proc hello(peer: Peer,
             version: uint,
             clientId: string,
             capabilities: openarray[Capability],
             listenPort: uint,
             nodeId: P2PNodeId) =
    peer.id = nodeId

  proc disconnect(peer: Peer, reason: DisconnectionReason)

  proc ping(peer: Peer) =
    await peer.pong()

  proc pong(peer: Peer) =
    echo "received pong from ", peer.id
```

As seen in the example above, a protocol definition determines both the available messages that can be sent to another peer (e.g. as in peer.pong()) and the asynchronous code responsible for handling the incoming messages.

# Protocol state

The protocol implementations are expected to maintain a state and to act like a state machine handling the incoming messages. You are allowed to define an arbitrary state type that can be specified in the peerState protocol option. Later, instances of the state object can be obtained through the state pseudo-field of the Peer object:

```nim
type AbcPeerState = object
  receivedMsgsCount: int

p2pProtocol abc(version = 1,
                peerState = AbcPeerState):

  proc incomingMessage(p: Peer) =
    p.state.receivedMsgsCount += 1
```


Besides the per-peer state demonstrated above, there is also support for maintaining a network-wide state. It's enabled by specifying the networkState option of the protocol; the state object can then be obtained through an accessor of the same name.
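
A sketch mirroring the per-peer example above (assuming the networkState accessor is available on the Peer object inside message handlers):

```nim
type AbcNetworkState = object
  totalMsgsCount: int

p2pProtocol abc(version = 1,
                networkState = AbcNetworkState):

  proc incomingMessage(p: Peer) =
    # One state instance is shared by all peers speaking this protocol.
    p.networkState.totalMsgsCount += 1
```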

The state objects are initialized to zero by default, but you can modify this behaviour by overriding the following procs for your state types:

```nim
proc initProtocolState*(state: MyPeerState, p: Peer)
proc initProtocolState*(state: MyNetworkState, n: EthereumNode)
```

Sometimes, you'll need to access the state of another protocol. To do this, specify the protocol identifier to the state accessors:

  echo "ABC protocol messages: ", peer.state(abc).receivedMsgCount

While the state machine approach may be a particularly robust way of implementing sub-protocols (it is more amenable to proving the correctness of the implementation through formal verification methods), sometimes it may be more convenient to use a more imperative style of communication, where the code is able to wait for a particular response after sending a particular request. The library provides two mechanisms for achieving this:

# Waiting for particular messages with nextMsg

The nextMsg helper proc can be used to pause the execution of an async proc until a particular incoming message from a peer arrives:

```nim
proc helloExample(peer: Peer) =
  ...
  # send a hello message
  await peer.hello(...)

  # wait for a matching hello response, might want to add a timeout here
  let response = await peer.nextMsg(p2p.hello)
  echo response.clientId # print the name of the Ethereum client
                         # used by the other peer (Geth, Parity, Nimbus, etc)
```

There are a few things to note in the above example:

  1. The p2pProtocol definition created a pseudo-variable named after the protocol, holding various properties of the protocol.

  2. Each message defined in the protocol received a corresponding type name, matching the message name (e.g. p2p.hello). This type will have fields matching the parameter names of the message. If the message has openarray params, these will be remapped to seq types.

If the designated message also has an attached handler, the future returned by nextMsg will be resolved only after the handler has been fully executed (so you can count on any side effects produced by the handler to have taken place). If there are multiple outstanding calls to nextMsg, they will complete together. Any other messages received in the meantime will still be dispatched to their respective handlers.

Please also note that the p2pProtocol macro will make this helloExample proc async. In practice, treat it as proc helloExample(peer: Peer) {.async.}, and thus never use waitFor inside it, but rather await.

For implementing protocol handshakes with nextMsg there are specific helpers which are explained below.

# requestResponse pairs

```nim
p2pProtocol les(version = 2):
  ...

  requestResponse:
    proc getProofs(p: Peer, proofs: openarray[ProofRequest])
    proc proofs(p: Peer, BV: uint, proofs: openarray[Blob])

  ...
```

Two or more messages within the protocol may be grouped into a requestResponse block. The last message in the group is assumed to be the response while all other messages are considered requests.

When a request message is sent, the return type will be a Future that will be completed once the response is received. Please note that there is a mandatory timeout parameter, so the actual return type is Future[Option[MessageType]]. The timeout parameter can be specified for each individual call, and the default value can be overridden at the level of an individual message or the entire protocol:

```nim
p2pProtocol abc(version = 1,
                useRequestIds = false,
                timeout = 5000): # value in milliseconds
  requestResponse:
    proc myReq(dataId: int, timeout = 3000)
    proc myRes(data: string)
```
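
Sending a request then looks like an ordinary call that yields an optional response (a hedged sketch, using the abc protocol defined above from within an async proc):

```nim
import std/options

# `peer` is assumed to speak the `abc` protocol defined above.
let response = await peer.myReq(42)
if response.isSome:
  echo "received: ", response.get.data   # fields match the myRes params
else:
  echo "request timed out"
```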

By default, the library will take care of inserting a hidden reqId parameter as used in the LES protocol, but you can disable this behavior by overriding the protocol setting useRequestIds.

# Implementing handshakes and reacting to other events

Besides message definitions and implementations, a protocol specification may also include handlers for certain important events such as newly connected peers or misbehaving or disconnecting peers:

```nim
p2pProtocol foo(version = fooVersion):
  onPeerConnected do (peer: Peer):
    let m = await peer.status(fooVersion,
                              timeout = chronos.milliseconds(5000))

    if m.protocolVersion == fooVersion:
      debug "Foo peer", peer, fooVersion
    else:
      raise newException(UselessPeerError, "Incompatible Foo version")

  onPeerDisconnected do (peer: Peer, reason: DisconnectionReason):
    debug "peer disconnected", peer

  handshake:
    proc status(peer: Peer,
                protocolVersion: uint)
```

For handshake messages, where the same type of message needs to be sent to and received from the peer, a handshake helper is introduced, as you can see in the code example above.

Thanks to the handshake helper, the status message will both be sent to the peer and awaited from it, within the defined timeout. If no status message is received within the defined timeout, an error will be raised, resulting in a disconnect from the peer.

Note: currently, if one of the sub-protocol onPeerConnected calls fails, the peer will be disconnected as a UselessPeer, but no onPeerDisconnected calls are run.

# Checking the other peer's supported sub-protocols

Upon establishing a connection, RLPx will automatically negotiate the list of mutually supported protocols by the peers. To check whether a particular peer supports a particular sub-protocol, use the following code:

```nim
if peer.supports(les): # `les` is the identifier of the light clients sub-protocol
  peer.getReceipts(nextReqId(), neededReceipts())
```

# keys

This library is a Nim re-implementation of eth-keys: the common API for working with Ethereum's public and private keys, signatures, and addresses.

By default, Nim eth-keys uses Bitcoin's libsecp256k1 as a backend. Make sure libsecp256k1 is available on your system.

An experimental pure Nim backend (Warning ⚠: do not use in production) is available with the compilation switch -d:backend_native.
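
Typical usage (a minimal sketch; newRng and PrivateKey.random appear in the discovery example later in this document, while toPublicKey is an assumption based on the common eth-keys API):

```nim
import eth/keys

# Create a secure RNG, generate a private key and derive its public key.
let rng = newRng()
let privKey = PrivateKey.random(rng[])
let pubKey = privKey.toPublicKey()
```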

# keyfile

# Introduction

This library is a Nim reimplementation of ethereum/eth-keyfile: tools for creating, loading, and handling keyfiles in the Ethereum keyfile format, used for storing private keys. Currently, the library supports only the PBKDF2 method and does not support the Scrypt method.
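
A hedged sketch of creating and loading a keyfile; createKeyFileJson and decodeKeyFileJson are assumptions based on the upstream eth-keyfile API, and their exact names and result types may differ between versions:

```nim
import eth/keys, eth/keyfile

# Assumed API: both calls return a Result that must be checked.
let rng = newRng()
let seckey = PrivateKey.random(rng[])
let kf = createKeyFileJson(seckey, "my password")
if kf.isOk:
  let recovered = decodeKeyFileJson(kf.get, "my password")
  doAssert recovered.isOk
```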

# trie

# Nim Implementation of the Ethereum Trie structure

# Hexary Trie

# Binary Trie

Binary-trie is a dictionary-like data structure for storing key-value pairs. Much like its sibling Hexary-trie, the key-value pairs are stored in a key-value flat-db. The primary difference is that each Binary-trie node has only one or two children, while a Hexary-trie node can contain up to 16 or 17 child nodes.

Unlike Hexary-trie, Binary-trie stores its data in the flat-db without using RLP encoding; it encodes its values using a simple Node-Types encoding. The encoded node is hashed with keccak_256 and the hash value becomes the flat-db key. Each entry in the flat-db looks like:

| key | value |
| --- | ----- |
| 32-bytes-keccak-hash | encoded-node (KV or BRANCH or LEAF encoded) |

# Node-Types

  • KV = [0, encoded-key-path, 32 bytes hash of child]
  • BRANCH = [1, 32 bytes hash of left child, 32 bytes hash of right child]
  • LEAF = [2, value]

A KV node can have a BRANCH node or a LEAF node as its child, but not another KV node. The internal algorithm will merge a KV(parent)->KV(child) pair into one KV node. Every KV node contains an encoded keypath to reduce the number of blank nodes.

The BRANCH node can have KV, BRANCH, or LEAF nodes as its children.

The LEAF node is the terminal node; it contains the value of a key.

# encoded-key-path

While Hexary-trie encodes the path using Hex-Prefix encoding, Binary-trie encodes the path using binary encoding. The scheme is shown in the table below.

```
            |--------- odd --------|
       00mm yyyy xxxx xxxx xxxx xxxx
            |------ even -----|
  1000 00mm yyyy xxxx xxxx xxxx
```

| symbol | explanation |
| ------ | ----------- |
| xxxx | nibble of binary keypath in bits, 0 = left, 1 = right |
| yyyy | nibble containing 0-3 padding bits plus binary keypath bits |
| mm | number of binary keypath bits modulo 4 (0-3) |
| 00 | zero-zero prefix |
| 1000 | prefix for an even number of nibbles |

If there is no padding, the yyyy bit sequence is absent and mm is also zero. Otherwise, yyyy consists of mm keypath bits plus the padding bits, and must be exactly 4 bits long.

# The API

The primary API for Binary-trie is set and get.

  • set(key, value) --- store a value associated with a key
  • get(key): value --- get a value using a key

Both key and value are of seq[byte] type, and they cannot have zero length.

Getting a non-existent key will return a zero-length seq[byte].

Binary-trie also provides a dictionary-style API for set and get.

  • trie[key] = value -- same as set
  • value = trie[key] -- same as get
  • contains(key) a.k.a. in operator

Additional APIs are:

  • exists(key) -- returns bool, to check key-value existence -- same as contains
  • delete(key) -- remove a key-value from the trie
  • deleteSubtrie(key) -- remove a key-value pair from the trie, plus all of its subtrie that starts with the same key prefix
  • rootNode() -- get root node
  • rootNode(node) -- replace the root node
  • getRootHash(): KeccakHash with seq[byte] type
  • getDB(): DB -- get flat-db pointer

Constructor API:

  • initBinaryTrie(DB, rootHash[optional]) -- rootHash has seq[byte] or KeccakHash type
  • init(BinaryTrie, DB, rootHash[optional])

Normally you would not set the rootHash when constructing an empty Binary-trie. Setting the rootHash is useful when you have a populated DB with an existing trie structure and you know the rootHash, and you want to continue/resume the trie operations.

# Examples

```nim
import
  eth/trie/[db, binary, utils]

var db = newMemoryDB()
var trie = initBinaryTrie(db)
trie.set("key1", "value1")
trie.set("key2", "value2")
doAssert trie.get("key1") == "value1".toBytes
doAssert trie.get("key2") == "value2".toBytes

# delete all subtries with the key prefix "key"
trie.deleteSubtrie("key")
doAssert trie.get("key1") == []
doAssert trie.get("key2") == []

trie["moon"] = "sun"
doAssert "moon" in trie
doAssert trie["moon"] == "sun".toBytes
```

Remember, set and get are trie operations. A single set operation may invoke more than one store/lookup operation into the underlying DB. The same applies to the get operation: it may perform more than one flat-db lookup before returning the requested value.

# The truth behind a lie

What kind of lie? Actually, delete and deleteSubtrie don't remove the 'deleted' nodes from the underlying DB; they only make the nodes inaccessible to users of the trie. The same happens when you update the value of a key: the old value node is not removed from the underlying DB. A more subtle lie happens when you add a new entry to the trie using the set operation: the previous hash of the affected branch becomes obsolete and is replaced by a new hash, and the old hash becomes inaccessible to the user. You may think this is a waste of storage space. Luckily, we also provide some utilities to deal with this situation: the branch utils.

# The branch utils

The branch utils consist of these APIs:

  • checkIfBranchExist(DB; rootHash; keyPrefix): bool
  • getBranch(DB; rootHash; key): branch
  • isValidBranch(branch, rootHash, key, value): bool
  • getWitness(DB; nodeHash; key): branch
  • getTrieNodes(DB; nodeHash): branch

keyPrefix, key, and value are byte containers with length greater than zero; they can be openArray[byte].

rootHash and nodeHash are also byte containers, but with a constraint: they must be 32 bytes long and must be a keccak_256 hash value.

branch is a list of nodes; in this case, a seq[seq[byte]]. A list? Yes, the structure is stored along with the encoded nodes, so a list is enough to reconstruct the entire trie/branch.

```nim
import
  eth/trie/[db, binary, utils]

var db = newMemoryDB()
var trie = initBinaryTrie(db)
trie.set("key1", "value1")
trie.set("key2", "value2")

doAssert checkIfBranchExist(db, trie.getRootHash(), "key") == true
doAssert checkIfBranchExist(db, trie.getRootHash(), "key1") == true
doAssert checkIfBranchExist(db, trie.getRootHash(), "ken") == false
doAssert checkIfBranchExist(db, trie.getRootHash(), "key123") == false
```

The tree will look like:

```
    root --->  A(kvnode, *common key prefix*)
                         |
                         |
                         |
                    B(branchnode)
                     /         \
                    /           \
                   /             \
C1(kvnode, *remain keypath*) C2(kvnode, *remain keypath*)
            |                           |
            |                           |
            |                           |
  D1(leafnode, b'value1')       D2(leafnode, b'value2')
```

```nim
var branchA = getBranch(db, trie.getRootHash(), "key1")
# ==> [A, B, C1, D1]

var branchB = getBranch(db, trie.getRootHash(), "key2")
# ==> [A, B, C2, D2]

doAssert isValidBranch(branchA, trie.getRootHash(), "key1", "value1") == true
# wrong key, returns zero bytes
doAssert isValidBranch(branchA, trie.getRootHash(), "key5", "") == true

doAssert isValidBranch(branchB, trie.getRootHash(), "key1", "value1") # InvalidNode

var x = getBranch(db, trie.getRootHash(), "key")
# ==> [A]

x = getBranch(db, trie.getRootHash(), "key123") # InvalidKeyError
x = getBranch(db, trie.getRootHash(), "key5") # there is still a branch for a non-existent key
# ==> [A]

var branch = getWitness(db, trie.getRootHash(), "key1")
# equivalent to `getBranch(db, trie.getRootHash(), "key1")`
# ==> [A, B, C1, D1]

branch = getWitness(db, trie.getRootHash(), "key")
# this will include additional nodes of "key2"
# ==> [A, B, C1, D1, C2, D2]

var wholeTrie = getWitness(db, trie.getRootHash(), "")
# this will return the whole trie
# ==> [A, B, C1, D1, C2, D2]

var node = branch[1] # B
let nodeHash = keccak256.digest(node.baseAddr, uint(node.len))
var nodes = getTrieNodes(db, nodeHash)
doAssert nodes.len == wholeTrie.len - 1
# ==> [B, C1, D1, C2, D2]
```

# Remember the lie?

Because the delete, deleteSubtrie and set operations leave inaccessible nodes in the underlying DB, we may need to remove them. As we have already seen, wholeTrie = getWitness(db, trie.getRootHash(), "") returns the whole trie: a list of accessible nodes. We can then write this clean tree into a new DB instance to replace the old one.
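
A sketch of this compaction, reusing the hashing call from the example above (the put call on the flat-db is an assumption):

```nim
# Copy only the reachable nodes into a fresh DB, leaving the garbage behind.
var wholeTrie = getWitness(db, trie.getRootHash(), "")
var cleanDb = newMemoryDB()
for node in wholeTrie:
  let nodeHash = keccak256.digest(node.baseAddr, uint(node.len))
  cleanDb.put(nodeHash.data, node)   # assumed flat-db put(key, value) API
var cleanTrie = initBinaryTrie(cleanDb, trie.getRootHash())
```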

# Sparse Merkle Trie

Sparse Merkle Trie (SMT) is a variant of Binary Trie which uses binary encoding to represent the path during trie traversal. While Binary Trie uses three types of node, SMT uses only one type of node, without any additional special encoding to store its key-path.

In fact, unlike Binary Trie, it doesn't explicitly store its key-path anywhere; the key-path is implicit in the trie structure built during key-value insertion.

Because the key-path is not encoded in any special ways, the bits can be extracted directly from the key without any conversion.

However, the key is restricted to a fixed length because the algorithm demands a fixed-height trie to work properly. In this case, the trie height is limited to 160 levels, i.e. the key has a fixed length of 20 bytes (8 bits x 20 = 160).

To support variable-length keys, the algorithm can be adapted slightly by hashing the key before constructing the binary key-path. For example, using keccak256 as the hashing function, the height of the tree becomes 256, but the key itself can be any length.

# The API

The primary API for Sparse Merkle Trie is set and get.

  • set(key, value, rootHash[optional]) --- store a value associated with a key
  • get(key, rootHash[optional]): value --- get a value using a key

Both key and value are of BytesRange type, and they cannot have zero length. You can also use the convenience APIs get and set which accept Bytes or string (a string is conceptually wrong in this context and may be costlier than a BytesRange, but it is good for testing purposes).

rootHash is an optional parameter. When used, get will look up a key starting from the given root, and set will set a key at the given root.

Getting a non-existent key will return a zero-length BytesRange (a zeroBytesRange).

Sparse Merkle Trie also provides a dictionary-style API for set and get.

  • trie[key] = value -- same as set
  • value = trie[key] -- same as get
  • contains(key) a.k.a. in operator

Additional APIs are:

  • exists(key) -- returns bool, to check key-value existence -- same as contains
  • delete(key) -- remove a key-value from the trie
  • getRootHash(): KeccakHash with BytesRange type
  • getDB(): DB -- get flat-db pointer
  • prove(key, rootHash[optional]): proof -- useful for merkling

Constructor API:

  • initSparseBinaryTrie(DB, rootHash[optional])
  • init(SparseBinaryTrie, DB, rootHash[optional])

Normally you would not set the rootHash when constructing an empty Sparse Merkle Trie. Setting the rootHash is useful when you have a populated DB with an existing trie structure and you know the rootHash, and you want to continue/resume the trie operations.

# Examples

```nim
import
  eth/trie/[db, sparse_binary, utils]

var
  db = newMemoryDB()
  trie = initSparseBinaryTrie(db)

let
  key1 = "01234567890123456789"
  key2 = "abcdefghijklmnopqrst"

trie.set(key1, "value1")
trie.set(key2, "value2")
doAssert trie.get(key1) == "value1".toBytes
doAssert trie.get(key2) == "value2".toBytes

trie.delete(key1)
doAssert trie.get(key1) == []

trie.delete(key2)
doAssert trie[key2] == []
```

Remember, set and get are trie operations. A single set operation may invoke more than one store/lookup operation into the underlying DB. The same applies to the get operation: it may perform more than one flat-db lookup before returning the requested value. While Binary Trie performs a variable number of lookup and store operations, Sparse Merkle Trie performs a constant number of lookup and store operations for each get and set operation.

# Merkle Proofing

Using the prove and verifyProof APIs, we can do some merkling with SMT.

```nim
let
  value1 = "hello world"
  badValue = "bad value"

trie[key1] = value1
var proof = trie.prove(key1)

doAssert verifyProof(proof, trie.getRootHash(), key1, value1) == true
doAssert verifyProof(proof, trie.getRootHash(), key1, badValue) == false
doAssert verifyProof(proof, trie.getRootHash(), key2, value1) == false
```

# bloom: an Ethereum Bloom Filter

# Introduction

A Nim implementation of the bloom filter used by Ethereum.

# Description

Bloom filters are data structures that use hash functions to test whether an element is a member of a set. They work like other data structures but are probabilistic in nature: that is, they allow false positive matches but not false negatives. Bloom filters use less storage space than other data structures.

Ethereum bloom filters are implemented with the Keccak-256 cryptographic hash function.

To see the bloom filter used in the context of Ethereum, please refer to the Ethereum Yellow Paper.

# Usage

```nim
import eth/bloom, stint
var f: BloomFilter
f.incl("test1")
doAssert("test1" in f)
doAssert("test2" notin f)
f.incl("test2")
doAssert("test2" in f)
doAssert(f.value.toHex == "80000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000200000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000040000000000000000000000000000000000000000000000000000000000000000000")
```

# Node Discovery Protocol v5

# Introduction

The eth/p2p/discoveryv5 directory holds a Nim implementation of the Node Discovery Protocol v5.1.

The implemented specification is Protocol version v5.1.

This implementation does not support "Topic Advertisement" yet as this part of the specification is not complete.

The implementation relies on other modules in the eth package, namely: keys, rlp and async_utils.

# How to use

```nim
let
  rng = keys.newRng()
  privKey = PrivateKey.random(rng[])
  (ip, tcpPort, udpPort) = setupNat(config) # Or fill in external IP/ports manually
  d = newProtocol(privKey, ip, tcpPort, udpPort, rng = rng)

d.open() # Start listening
```

This will initialize the Protocol and start listening. However, as no bootstrap nodes were passed in the newProtocol call, the created ENR will need to be advertised somehow ("out of band"), so that the node can become known to other nodes in the network.

To initialize with a bootnode or a set of bootnodes, the ENRs need to be passed as a parameter to newProtocol.

```nim
d = newProtocol(privKey, ip, tcpPort, udpPort,
      bootstrapRecords = bootnodes)
d.open() # Start listening and add bootstrap nodes to the routing table.
```

Next there are two ways to run the protocol.

One can call d.start() and two loops will be started:

  1. Refresh loop
  2. Revalidation loop

At a specific interval, the first loop will do a query with a random NodeId if no manual queries were done for more than that interval period. This query will add discovered nodes to the routing table. At a randomly ranged interval, the second loop will send a ping to the least recently seen node in a random bucket. This keeps the routing table cleared of unreachable/dead nodes.

Now, within the application, manual queries or lookups can be done, for which the discovered nodes can be used. The protocol will also attempt to add nodes discovered during this process to the routing table. One can use the query, queryRandom or lookup calls for this. randomNodes can also be used to find nodes, but it will only look into the current routing table and not actively search for nodes on the network.
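
For example, a manual query might look like this (a sketch; the exact return type of queryRandom is an assumption):

```nim
import chronos

proc discoverSomeNodes(d: Protocol) {.async.} =
  # Walk the DHT from a random target and report what was found.
  let discovered = await d.queryRandom()
  echo "discovered ", discovered.len, " nodes"
```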

Or, one can decide not to run d.start() and do this manually within the application by using the available calls:

  • query, queryRandom or lookup for discovering more peers.
  • revalidateNode or directly ping for revalidating nodes.

Of course, in either scenario, lookups can still be done to actually find a specific node. There is a resolve call that can help with this: it will first look in the local routing table and, if it finds the node, it will try to contact the node directly to check whether the ENR is up to date. If any of this fails, a lookup will be done.

# Test suite

To run the test suite specifically for discovery v5 related tests (discovery v5 + its nim-eth dependencies), one can run the following commands:

```sh
# Install required modules
nimble install
# Run only discovery v5 related test suite
nimble tests_discv5_full
```

# dcli

This is a small command line application that allows you to run a discovery node. It also has options to do a ping or findNode request to a specific node by providing its ENR.

# Example usage

```sh
# Install required modules
# Make sure you have the latest modules, do NOT trust nimble on this.
nimble install
# Build dcli
nim c -d:chronicles_log_level:trace -d:release --threads:on eth/p2p/discoveryv5/dcli
# See all options
./eth/p2p/discoveryv5/dcli --help
# Ping another node
./eth/p2p/discoveryv5/dcli ping enr:<base64 encoding of ENR>
# Run discovery node
./eth/p2p/discoveryv5/dcli --log-level:debug --bootnode:enr:<base64 encoding of ENR>
```

# Metrics

To run dcli with metrics enabled provide the metrics flag:

```sh
# Run dcli with metrics
./eth/p2p/discoveryv5/dcli --metrics --bootnode:enr:<base64 encoding of ENR>
```

You can now see the metrics at http://localhost:8008/metrics, or use e.g. Prometheus to grab the data.

# Prerequisites

  • Nim & Nimble
  • RocksDB, SQLite, LMDB (required for the trie backend tests)

E.g. on Ubuntu one can run:

```sh
apt install -y librocksdb-dev liblmdb-dev sqlite3
```

# Building & Testing

```sh
# Install required modules
nimble install
# Run full test suite
nimble test
```

You can also run specific parts of the test suite, e.g.:

```sh
# Test p2p functionality
nimble test_p2p
# Test rlp functionality
nimble test_rlp
```

# Fuzzing

Besides the test suite, there are also several fuzzing test cases available. How these can be run is explained in the fuzzing readme.