EtherDiffer: Differential Testing on RPC Services of Ethereum Nodes
Blockchain is a distributed ledger that records transactions among users on top of a peer-to-peer network. While various blockchain platforms exist, Ethereum is the most popular general-purpose platform and its support of smart contracts led to a new form of applications called decentralized applications (DApps). A typical DApp has an off-chain frontend and on-chain backend architecture, and the frontend often needs interactions with the backend network, e.g., to acquire chain data or make transactions. Therefore, Ethereum nodes implement the official RPC specification and expose a uniform set of RPC methods to the frontend. However, the specification is not sufficient in two points: (1) lack of clarification for non-deterministic event handling, and (2) lack of specification for invalid-as-themselves arguments. To effectively disclose any deviations caused by the insufficiency, this paper introduces EtherDiffer that automatically performs differential testing on four major node implementations in terms of their RPC services. EtherDiffer first generates a non-deterministic chain by multi-concurrent transactions and propagation delay. Then, it applies our key techniques called property-based generation and type-preserving mutation to generate both semantically-valid and semantically-invalid-yet-executable test cases. EtherDiffer executes the test cases on target nodes and reports any deviations in error handling or return values. The evaluation showed the effectiveness of our test case generation techniques with the success ratios of 98.8% and 95.4%, respectively. Also, EtherDiffer detected 48 different classes of deviations including 11 implementation bugs such as crash and denial-of-service bugs. We reported 44 of the detected classes to the specification and node developers and received acknowledgements as well as bug patches. Lastly, it significantly outperformed the official node testing tool in every technical aspect. We believe that our research findings can contribute to more stable DApp ecosystem by reducing the inconsistencies among nodes.