Indexer Architecture
This document provides a detailed overview of the indexer's architecture and how it processes data from the Solana blockchain.
Overview
The indexer is a Rust application built on the Carbon framework. It ingests data from a Solana Geyser source (Yellowstone gRPC), processes it, and stores it in a ClickHouse database. The indexer is designed to be highly performant and reliable, with a focus on data accuracy and completeness.
Data Flow
The data flow can be summarized as follows:
- Data Ingestion: The indexer connects to a Solana Geyser gRPC stream to receive real-time account and transaction updates.
- Decoding: The raw data from Geyser is decoded into specific account and instruction types using the carbon-raydium-clmm-decoder, carbon-orca-whirlpool-decoder, and carbon-pump-swap-decoder crates.
- Processing: The decoded data is then processed by a series of processors, each responsible for handling a specific type of data (e.g., pool state updates, swaps, tick array updates).
- Data Storage: The processed data is batched and written to a ClickHouse database using the HTTP interface.
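The batching step in the last bullet can be sketched as a size-bounded buffer. This is a minimal illustration, not the indexer's actual sink: the Batcher type, its max_rows threshold, and the method names are hypothetical, and a real sink would presumably also flush on a timer and send each batch over ClickHouse's HTTP interface.

```rust
/// Hypothetical size-bounded batcher; the name and API are assumptions
/// made for illustration and are not taken from the indexer's code.
struct Batcher<T> {
    max_rows: usize,
    pending: Vec<T>,
}

impl<T> Batcher<T> {
    fn new(max_rows: usize) -> Self {
        assert!(max_rows > 0, "a batch must hold at least one row");
        Self { max_rows, pending: Vec::with_capacity(max_rows) }
    }

    /// Buffers one row. Returns a full batch once `max_rows` rows are pending,
    /// leaving the internal buffer empty for the next batch.
    fn push(&mut self, row: T) -> Option<Vec<T>> {
        self.pending.push(row);
        if self.pending.len() >= self.max_rows {
            Some(std::mem::take(&mut self.pending))
        } else {
            None
        }
    }

    /// Drains whatever is pending, e.g. on shutdown or on a timer tick.
    fn flush(&mut self) -> Option<Vec<T>> {
        if self.pending.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.pending))
        }
    }
}
```

Returning the batch to the caller keeps the buffer logic separate from the network write, so the write path can be retried or swapped out without touching the batching code.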
Processors
The core of the indexer is its set of processors. Each processor is responsible for a specific part of the data processing pipeline.
Pool State Processor
The pool_state.rs processor handles updates to Raydium and Orca CLMM pool accounts.
- It uses the ClmmPoolAccount trait to abstract over the different pool account structures.
- For each account update, it creates a PoolStateDelta struct, which contains the new state of the pool, including sqrt_price_x64, liquidity, and tick_current.
- It calculates the human-readable price from the sqrt_price_x64 value using the formula: price = (sqrt_price_x64 / 2^64)^2 * 10^(decimals_a - decimals_b).
- The PoolStateDelta is then serialized and sent to the ClickHouse sink.
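The trait abstraction and the price formula above can be sketched as follows. The type names ClmmPoolAccount and PoolStateDelta and the three state fields come from the text; the trait's method names, the ExamplePool struct, the price field, and both helper functions are assumptions made for illustration, and the real definitions may differ.

```rust
/// Sketch of a trait in the role the text gives to ClmmPoolAccount.
/// The method names are assumptions.
trait ClmmPoolAccount {
    fn sqrt_price_x64(&self) -> u128;
    fn liquidity(&self) -> u128;
    fn tick_current(&self) -> i32;
}

/// Hypothetical stand-in for a decoded Raydium or Orca pool account.
struct ExamplePool {
    sqrt_price_x64: u128,
    liquidity: u128,
    tick_current: i32,
}

impl ClmmPoolAccount for ExamplePool {
    fn sqrt_price_x64(&self) -> u128 { self.sqrt_price_x64 }
    fn liquidity(&self) -> u128 { self.liquidity }
    fn tick_current(&self) -> i32 { self.tick_current }
}

/// Illustrative delta row; the `price` field is an assumption.
#[derive(Debug, Clone, PartialEq)]
struct PoolStateDelta {
    sqrt_price_x64: u128,
    liquidity: u128,
    tick_current: i32,
    price: f64,
}

/// price = (sqrt_price_x64 / 2^64)^2 * 10^(decimals_a - decimals_b)
fn price_from_sqrt_price_x64(sqrt_price_x64: u128, decimals_a: u8, decimals_b: u8) -> f64 {
    // The value is Q64.64 fixed point, so dividing by 2^64 yields the real square root.
    // The cast to f64 rounds very large values, which is acceptable for a display price.
    let sqrt_price = sqrt_price_x64 as f64 / 2f64.powi(64);
    let raw_price = sqrt_price * sqrt_price;
    // Scale from raw token units to human-readable units.
    raw_price * 10f64.powi(i32::from(decimals_a) - i32::from(decimals_b))
}

/// Builds a delta from any pool type that implements the trait.
fn pool_state_delta<P: ClmmPoolAccount>(pool: &P, decimals_a: u8, decimals_b: u8) -> PoolStateDelta {
    PoolStateDelta {
        sqrt_price_x64: pool.sqrt_price_x64(),
        liquidity: pool.liquidity(),
        tick_current: pool.tick_current(),
        price: price_from_sqrt_price_x64(pool.sqrt_price_x64(), decimals_a, decimals_b),
    }
}
```

For example, a sqrt_price_x64 of 2^65 represents a square root of 2, so with decimals_a = 9 and decimals_b = 6 the price is 2^2 * 10^3 = 4000.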
Swap Processor
The pump.rs processor handles swap instructions for the Pump.fun and PumpSwap programs.
- It processes Buy and Sell instructions.
- For each swap, it creates a SwapRecord struct, which contains information about the swap, such as the amounts, mints, and side.
- It uses a BlockMetadataCache to get the timestamp for the slot in which the swap occurred.
- The SwapRecord is then enqueued to the ConfirmedSink, which handles batching and writing to ClickHouse.
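As a rough sketch of the timestamp lookup described above, the cache can be modeled as a map from slot to block time. The names SwapRecord and BlockMetadataCache come from the text, but every field, method, and the build_swap_record helper below are assumptions, including the choice to represent a cache miss as None.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Side {
    Buy,
    Sell,
}

/// Illustrative record; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct SwapRecord {
    slot: u64,
    /// Unix seconds, or None when the slot's metadata has not been seen yet.
    block_time: Option<i64>,
    base_mint: String,
    quote_mint: String,
    base_amount: u64,
    quote_amount: u64,
    side: Side,
}

/// Hypothetical slot -> block time cache standing in for BlockMetadataCache.
#[derive(Default)]
struct BlockMetadataCache {
    block_times: HashMap<u64, i64>,
}

impl BlockMetadataCache {
    fn insert(&mut self, slot: u64, block_time: i64) {
        self.block_times.insert(slot, block_time);
    }

    fn block_time(&self, slot: u64) -> Option<i64> {
        self.block_times.get(&slot).copied()
    }
}

/// Builds a record for one Buy or Sell instruction, attaching the slot's timestamp.
fn build_swap_record(
    cache: &BlockMetadataCache,
    slot: u64,
    side: Side,
    base_mint: &str,
    quote_mint: &str,
    base_amount: u64,
    quote_amount: u64,
) -> SwapRecord {
    SwapRecord {
        slot,
        block_time: cache.block_time(slot),
        base_mint: base_mint.to_string(),
        quote_mint: quote_mint.to_string(),
        base_amount,
        quote_amount,
        side,
    }
}
```

How the real processor handles a swap whose slot is not yet in the cache is not stated in this document; the Option here only makes that case explicit.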
Tick Array Processor
The ticks.rs processor handles updates to Raydium and Orca tick array accounts.
- It has functions to process tick arrays from both protocols (raydium_tick_updates, orca_fixed_tick_updates, orca_dynamic_tick_updates).
- For each initialized tick in the array, it creates a TickUpdate struct.
- The TickUpdate contains information about the tick, such as liquidity_net, liquidity_gross, and fee growth.
- The TickUpdates are then sent to the ClickHouse sink.
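The "one TickUpdate per initialized tick" rule can be sketched like this. The Tick layout, the field names, and the generic tick_updates function are assumptions rather than the signatures of the functions named above. The index arithmetic (start index plus offset times tick spacing) follows the layout of a fixed-size tick array and may not apply to every variant.

```rust
/// Hypothetical decoded tick; the field names are assumptions.
#[derive(Debug, Clone, Copy)]
struct Tick {
    initialized: bool,
    liquidity_net: i128,
    liquidity_gross: u128,
    fee_growth_outside_a: u128,
    fee_growth_outside_b: u128,
}

/// Illustrative output row for one initialized tick.
#[derive(Debug, Clone, PartialEq)]
struct TickUpdate {
    tick_index: i32,
    liquidity_net: i128,
    liquidity_gross: u128,
    fee_growth_outside_a: u128,
    fee_growth_outside_b: u128,
}

/// Emits one update per initialized tick and skips uninitialized slots.
fn tick_updates(start_tick_index: i32, tick_spacing: u16, ticks: &[Tick]) -> Vec<TickUpdate> {
    ticks
        .iter()
        .enumerate()
        .filter(|(_, tick)| tick.initialized)
        .map(|(offset, tick)| TickUpdate {
            // Assumed layout: slot `offset` holds the tick at start + offset * spacing.
            tick_index: start_tick_index + offset as i32 * i32::from(tick_spacing),
            liquidity_net: tick.liquidity_net,
            liquidity_gross: tick.liquidity_gross,
            fee_growth_outside_a: tick.fee_growth_outside_a,
            fee_growth_outside_b: tick.fee_growth_outside_b,
        })
        .collect()
}
```

Filtering before mapping means uninitialized slots never produce rows, which keeps the append-only log proportional to the number of active ticks rather than to the array size.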
ClickHouse Schema
The indexer uses a set of tables in ClickHouse to store the processed data. The schema is defined in Docs/spec.md and includes tables for:
- pool_state_updates: Append-only log of pool state changes.
- tick_updates: Append-only log of tick array changes.
- pump_swaps: Append-only log of swaps.
- pool_state_latest: Materialized view for the latest state of each pool.
- pool_price_1s: Materialized view for 1-second price candles.
- pool_price_1m: Materialized view for 1-minute price candles.
The use of ReplacingMergeTree and materialized views allows for efficient querying of both historical and real-time data.
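To make the candle views concrete, here is a sketch in Rust of the kind of aggregation a view such as pool_price_1s would perform. It is not the view definition: the actual schema lives in Docs/spec.md and is not reproduced here, and the open/high/low/close shape, the Candle type, and the candles function are all assumptions.

```rust
use std::collections::BTreeMap;

/// Assumed candle shape: open, high, low, close.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Candle {
    open: f64,
    high: f64,
    low: f64,
    close: f64,
}

/// Groups (timestamp_ms, price) updates into fixed-width buckets keyed by bucket start.
/// Assumes the updates arrive in time order, so the first price seen in a bucket
/// is its open and the last is its close.
fn candles(updates: &[(i64, f64)], bucket_ms: i64) -> BTreeMap<i64, Candle> {
    let mut out: BTreeMap<i64, Candle> = BTreeMap::new();
    for &(ts_ms, price) in updates {
        // div_euclid floors toward negative infinity, so buckets stay aligned.
        let bucket_start = ts_ms.div_euclid(bucket_ms) * bucket_ms;
        out.entry(bucket_start)
            .and_modify(|c| {
                c.high = c.high.max(price);
                c.low = c.low.min(price);
                c.close = price;
            })
            .or_insert(Candle { open: price, high: price, low: price, close: price });
    }
    out
}
```

Calling candles with bucket_ms = 1000 corresponds to 1-second candles and 60_000 to 1-minute candles. In ClickHouse the equivalent work happens incrementally as rows are inserted into the source table.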