Slide 1

Slide 1 text

Building scalable monitoring infrastructure from scratch
Arseniy Reutov, CTO, Decurity
@theRaz0r

Slide 2

Slide 2 text

Agenda
[0x0] Let's catch smart contract exploits
[0x1] Architecture
[0x2] Approaches to implementation
[0x3] Exploit detection techniques
[0x4] Results

Slide 3

Slide 3 text

Let's catch smart contract exploits
Goal: detect DeFi exploits
Requirements: declarative, concise rules that are unit-tested and easy to maintain
Approach: monitor the activity of addresses with anonymous funding and look for specific patterns in call traces

Slide 4

Slide 4 text

Let's catch smart contract exploits
But…
● No Snort/YARA/ModSecurity equivalents in web3
● No open standards for threat intelligence

Slide 5

Slide 5 text

[0x1] Monitoring infra architecture

Slide 6

Slide 6 text

ETL - Extract, Transform, Load
ethereum-etl
● can push data to message queues
● disassembles bytecode with evm-dasm
● written in Python
cryo
● relatively new project
● extracts storage and balance diffs
● written in Rust

Slide 7

Slide 7 text

Data enrichment
So we forked ethereum-etl:
• added bulk extraction with eth_getBlockReceipts
• integrated the DefiLlama API to calculate balance changes
• added support for Geth-style traces in streaming mode
• integrated gigahorse-toolchain instead of evm-dasm
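
Bulk receipt extraction can be sketched as a JSON-RPC batch with one request per block. This is only an illustration, not the fork's actual code: it assumes the node exposes the method under the name `eth_getBlockReceipts` (as Geth and Erigon do), and the helper names `block_receipts_request` and `batch_for_range` are hypothetical. No network call is made; the sketch only builds the payload.

```python
import json


def block_receipts_request(block_number: int, request_id: int = 1) -> dict:
    """Build a JSON-RPC payload asking for every receipt in one block."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        # Assumed method name; check what your node actually exposes.
        "method": "eth_getBlockReceipts",
        "params": [hex(block_number)],
    }


def batch_for_range(start: int, end: int) -> list[dict]:
    """One request per block in [start, end], to be sent as a single batch."""
    return [
        block_receipts_request(number, request_id=i)
        for i, number in enumerate(range(start, end + 1), start=1)
    ]


if __name__ == "__main__":
    # The block numbers here are arbitrary placeholders.
    print(json.dumps(batch_for_range(18_000_000, 18_000_002), indent=2))
```

Fetching all receipts of a block in one call replaces one `eth_getTransactionReceipt` request per transaction, which is where the bulk speed-up comes from.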

Slide 8

Slide 8 text

Bytecode analysis
gigahorse-toolchain
● powers Dedaub's bytecode decompiler
● written in Datalog
● allows creating custom rules (yay)
heimdall
● fairly new but powerful bytecode analysis tool
● written in Rust
● generates Solidity code

Slide 9

Slide 9 text

Exploit bytecode analysis
● Replaced evm-dasm with gigahorse-toolchain in ethereum-etl
● Created custom analysis rules in Datalog
● Detect exploits based on specific bytecode features

Slide 10

Slide 10 text

Exploit features
● Not many functions (usually < 10)
● High rate of unknown selectors
● Presence of flashloan selectors
● Lots of external calls
● No emitted events
● Debugging symbols (e.g. console.log)
● SELFDESTRUCT
● CREATE2
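
The features above could be combined into a simple score, roughly as follows. This is a sketch under stated assumptions, not the Datalog rules from the talk: the `BytecodeFeatures` record and `exploit_score` function are hypothetical, the features are assumed to be already extracted by the bytecode analysis step, and the thresholds for "high rate" and "lots of" are invented for illustration (only the `< 10` functions threshold comes from the slide).

```python
from dataclasses import dataclass


@dataclass
class BytecodeFeatures:
    """Hypothetical per-contract output of the bytecode analysis step."""
    function_count: int
    unknown_selector_ratio: float  # share of selectors with no known signature
    has_flashloan_selector: bool
    external_call_count: int
    emitted_event_count: int
    has_debug_symbols: bool        # e.g. console.log
    has_selfdestruct: bool
    has_create2: bool


def exploit_score(f: BytecodeFeatures) -> int:
    """Count how many of the eight listed features are present."""
    checks = [
        f.function_count < 10,           # threshold from the slide
        f.unknown_selector_ratio > 0.5,  # assumed threshold
        f.has_flashloan_selector,
        f.external_call_count > 5,       # assumed threshold
        f.emitted_event_count == 0,
        f.has_debug_symbols,
        f.has_selfdestruct,
        f.has_create2,
    ]
    return sum(checks)


# Made-up feature vectors for illustration.
suspicious = BytecodeFeatures(3, 0.9, True, 12, 0, True, True, False)
benign = BytecodeFeatures(40, 0.1, False, 2, 6, False, False, False)
```

A plain count treats every feature as equally informative; weighting the features would be a natural refinement, but the slides do not say how they are combined.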

Slide 11

Slide 11 text

Stream analysis
Apache Kafka
• Supported by ethereum-etl
• Straightforward and reliable
Apache Flink
• Enables real-time stream processing
• Performs stateful computations expressed in SQL
• Supports Complex Event Processing (CEP)

Slide 12

Slide 12 text

Data processing workflow
1) Describe sources using DDL: transactions, contracts, logs, token_transfers, etc.
2) Define data sinks for alerts
3) Execute continuous SQL queries

Slide 13

Slide 13 text


Slide 14

Slide 14 text

Source DDL example
CREATE TABLE {{ NETWORK }}_logs (
    log_index int,
    transaction_hash string,
    transaction_index int,
    address varchar,
    data string,
    topics array<string>,
    block_timestamp int,
    block_number int,
    block_hash varchar,
    proc_time as PROCTIME(),
    event_time as TO_TIMESTAMP_LTZ(block_timestamp, 0),
    WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
    'connector' = 'kafka',
    'topic' = '{{ NETWORK }}.logs',
    'properties.bootstrap.servers' = '{{ KAFKA_BROKER }}',
    'properties.group.id' = 'defimon',
    'scan.startup.mode' = 'latest-offset',
    'format' = 'json'
)
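
The `{{ NETWORK }}` and `{{ KAFKA_BROKER }}` placeholders suggest that the DDL is rendered once per network before being submitted to Flink. The slides do not name the templating engine, so the following is only a standard-library stand-in for Jinja-style substitution; the `render` helper, the shortened template, and the parameter values are all hypothetical.

```python
import re

# Matches Jinja-style placeholders such as {{ NETWORK }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, params: dict[str, str]) -> str:
    """Replace every {{ NAME }} with params[NAME]; unknown names raise KeyError."""
    return PLACEHOLDER.sub(lambda match: params[match.group(1)], template)


# Shortened, illustrative version of the DDL (column list left out).
TEMPLATE = (
    "CREATE TABLE {{ NETWORK }}_logs (log_index int) WITH ("
    "'connector' = 'kafka', "
    "'topic' = '{{ NETWORK }}.logs', "
    "'properties.bootstrap.servers' = '{{ KAFKA_BROKER }}')"
)

# Placeholder values; the real network names and broker address are not in the slides.
rendered = render(TEMPLATE, {"NETWORK": "ethereum", "KAFKA_BROKER": "kafka:9092"})
```

Rendering the same template with a different `NETWORK` value yields a second, independent source table, which is one way a single rule set can cover several chains.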

Slide 15

Slide 15 text

What can we do now?
● Join streams, e.g. traces with logs
● Aggregate data using tumble, hop and session windows
● Continuously calculate top N
● Detect sequences of events (CEP)
● and more
● Track withdrawals from Tornado, Fixed Float, Railgun, etc.
● Track new contract deployments by these addresses
● Track any intermediate addresses in between
● Calculate total transfer sums within a transaction or some period of time
● Match specific patterns in call traces
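
The address-tracking items above can be illustrated outside Flink with a small in-memory tracker. This is a sketch only: the `FundingTracker` class, its method names, and the placeholder addresses are hypothetical, and the pipeline described in the talk expresses this kind of logic as continuous SQL queries rather than Python.

```python
from dataclasses import dataclass, field


@dataclass
class FundingTracker:
    """Tracks addresses whose funds trace back to anonymous sources."""
    anonymous_sources: set[str]
    # Maps a tracked address to the address that funded or deployed it.
    tainted: dict[str, str] = field(default_factory=dict)

    def on_transfer(self, sender: str, receiver: str) -> None:
        """Mark the receiver when funds come from a source or a tracked address."""
        if sender in self.anonymous_sources or sender in self.tainted:
            # setdefault keeps the first known funding parent.
            self.tainted.setdefault(receiver, sender)

    def on_deployment(self, deployer: str, contract: str) -> bool:
        """Return True when a tracked address deploys a contract."""
        if deployer in self.tainted:
            self.tainted.setdefault(contract, deployer)
            return True
        return False


# Placeholder addresses: a mixer funds 0xa, which forwards funds to 0xb.
tracker = FundingTracker(anonymous_sources={"0xmixer"})
tracker.on_transfer("0xmixer", "0xa")
tracker.on_transfer("0xa", "0xb")
alert = tracker.on_deployment("0xb", "0xc")
```

Because every hop is recorded with its parent, an alert on a deployment can be traced back through the intermediate addresses to the original anonymous source.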

Slide 16

Slide 16 text

Exploit detection example: selfdestruct after proxy upgrade
SELECT
    '{{ NETWORK }}' AS network,
    'CRITICAL' AS severity,
    'selfdestruct_after_upgraded' AS attack_type,
    /* ... */
FROM {{ NETWORK }}_logs AS l
JOIN {{ NETWORK }}_traces AS t
    ON t.transaction_hash = l.transaction_hash
/* keccak('Upgraded(address)') */
WHERE l.topics[1] = '0xbc7cd75a20ee27fd9adebab32041f755214dbc6bffa90cc0225b39da2e5c2d3b'
    AND t.trace_type = 'suicide'
    AND l.address = t.from_address
    AND l.event_time = t.event_time
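
The same join can be sketched over in-memory records to make the matching condition explicit. This is an illustration, not the production query: the function name and the sample records are hypothetical, and the `event_time` equality is left out on the assumption that rows sharing a transaction hash also share a block timestamp.

```python
# keccak('Upgraded(address)'), taken from the query above.
UPGRADED_TOPIC = "0xbc7cd75a20ee27fd9adebab32041f755214dbc6bffa90cc0225b39da2e5c2d3b"


def selfdestruct_after_upgraded(logs: list[dict], traces: list[dict]) -> list[dict]:
    """Alert when a contract emits Upgraded and self-destructs in the same transaction."""
    # Python lists are 0-based, so topics[0] corresponds to topics[1] in Flink SQL.
    upgraded = {
        (log["transaction_hash"], log["address"])
        for log in logs
        if log["topics"] and log["topics"][0] == UPGRADED_TOPIC
    }
    return [
        {
            "severity": "CRITICAL",
            "attack_type": "selfdestruct_after_upgraded",
            "transaction_hash": trace["transaction_hash"],
            "address": trace["from_address"],
        }
        for trace in traces
        if trace["trace_type"] == "suicide"
        and (trace["transaction_hash"], trace["from_address"]) in upgraded
    ]


# Made-up records: the proxy in tx 0x01 is upgraded and then destroyed.
sample_logs = [
    {"transaction_hash": "0x01", "address": "0xproxy", "topics": [UPGRADED_TOPIC]},
    {"transaction_hash": "0x02", "address": "0xother", "topics": ["0xdead"]},
]
sample_traces = [
    {"transaction_hash": "0x01", "from_address": "0xproxy", "trace_type": "suicide"},
    {"transaction_hash": "0x02", "from_address": "0xother", "trace_type": "suicide"},
    {"transaction_hash": "0x01", "from_address": "0xproxy", "trace_type": "call"},
]
alerts = selfdestruct_after_upgraded(sample_logs, sample_traces)
```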

Slide 17

Slide 17 text

Exploit detection example: reentrancy
MATCH_RECOGNIZE (
    PARTITION BY transaction_hash
    PATTERN (ERC_CALLBACK ANY_CALL*? REENTER_CALL ANY_CALL_AGAIN*? ERC_CALLBACK_AGAIN) WITHIN INTERVAL '2' SECOND
    DEFINE
        /* 1) ERC callback call */
        ERC_CALLBACK AS call_type = 'call' AND (
            SUBSTRING(input FROM 1 FOR 10) = '0x150b7a02' OR /* ERC721: onERC721Received */
            SUBSTRING(input FROM 1 FOR 10) = '0xf23a6e61' OR /* ERC1155: onERC1155Received */
            SUBSTRING(input FROM 1 FOR 10) = '0xb124c41b' OR /* ERC677: callAfterTransfer */
            SUBSTRING(input FROM 1 FOR 10) = '0x0023de29' /* ERC777: tokensReceived */
        ),
        /* 2) any contract calls in between */
        ANY_CALL AS POSITION(ERC_CALLBACK.trace_address IN ANY_CALL.trace_address) = 1,
        /* 3) exploit contract re-enters from ERC callback */
        REENTER_CALL AS call_type = 'call'
            AND REENTER_CALL.to_address = ERC_CALLBACK.from_address
            AND POSITION(ERC_CALLBACK.trace_address IN REENTER_CALL.trace_address) = 1,
        /* 4) any contract calls in between */
        ANY_CALL_AGAIN AS POSITION(REENTER_CALL.trace_address IN ANY_CALL_AGAIN.trace_address) = 1,
        /* 5) vulnerable contract calls ERC callback again */
        ERC_CALLBACK_AGAIN AS call_type = 'call'
            AND SUBSTRING(input FROM 1 FOR 10) = SUBSTRING(ERC_CALLBACK.input FROM 1 FOR 10)
            AND POSITION(REENTER_CALL.trace_address IN ERC_CALLBACK_AGAIN.trace_address) = 1
            AND ERC_CALLBACK_AGAIN.to_address = ERC_CALLBACK.to_address
)
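
The pattern can be followed more easily as a plain loop over the ordered call traces of one transaction. This is a simplified sketch, not an implementation of Flink's MATCH_RECOGNIZE: the function names and sample traces are hypothetical, `trace_address` is assumed to be a comma-separated string, and the `WITHIN INTERVAL '2' SECOND` bound is left out.

```python
# Callback selectors copied from the pattern above.
CALLBACK_SELECTORS = {"0x150b7a02", "0xf23a6e61", "0xb124c41b", "0x0023de29"}


def selector(trace: dict) -> str:
    """First 10 characters of the input: '0x' plus the 4-byte selector."""
    return trace["input"][:10]


def is_under(parent: dict, child: dict) -> bool:
    """Mirrors POSITION(parent.trace_address IN child.trace_address) = 1."""
    return child["trace_address"].startswith(parent["trace_address"])


def find_reentrancy(traces: list[dict]):
    """Return indices (callback, reenter, callback_again) or None."""
    for i, callback in enumerate(traces):
        if callback["call_type"] != "call" or selector(callback) not in CALLBACK_SELECTORS:
            continue
        for j in range(i + 1, len(traces)):
            reenter = traces[j]
            if not is_under(callback, reenter):
                break  # row j is neither ANY_CALL nor REENTER_CALL
            if reenter["call_type"] != "call" or reenter["to_address"] != callback["from_address"]:
                continue  # treat row j as ANY_CALL
            for k in range(j + 1, len(traces)):
                again = traces[k]
                if not is_under(reenter, again):
                    break  # row k is not below the re-entering call
                if (again["call_type"] == "call"
                        and selector(again) == selector(callback)
                        and again["to_address"] == callback["to_address"]):
                    return (i, j, k)
    return None


# Made-up traces: victim calls onERC721Received, attacker re-enters, victim calls back again.
sample = [
    {"call_type": "call", "from_address": "0xvictim", "to_address": "0xattacker",
     "input": "0x150b7a02aa", "trace_address": "0,1"},
    {"call_type": "call", "from_address": "0xattacker", "to_address": "0xvictim",
     "input": "0x12345678", "trace_address": "0,1,0"},
    {"call_type": "call", "from_address": "0xvictim", "to_address": "0xattacker",
     "input": "0x150b7a02bb", "trace_address": "0,1,0,0"},
]
match = find_reentrancy(sample)
```

One caveat of string-prefix matching, which applies to this sketch as written: a `trace_address` of `1` is also a prefix of `10,0`, so comparing path components rather than raw strings may be worth considering.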

Slide 18

Slide 18 text


Slide 19

Slide 19 text

Results
● Scalable pipeline deployed in AWS
● Real-time alerts in dashboard and Telegram
● Extracting and analyzing all transactions from Ethereum and BSC networks
● Detected >50 exploits since May, including the Curve 0-day

Slide 20

Slide 20 text

Results
Prevented a hack on Fortress protocol

Slide 21

Slide 21 text

Thank you
Arseniy Reutov, CTO, Decurity
@theRaz0r