ApexStore Configuration Guide

This guide explains all configuration parameters available in ApexStore. All settings can be configured via environment variables without recompilation.

Table of Contents

  1. Quick Start
  2. Server Configuration
  3. LSM Engine Configuration
  4. Performance Tuning
  5. Tuning Profiles
  6. Troubleshooting
  7. Best Practices
  8. Environment-Specific Configs

Quick Start

# 1. Copy the example configuration
cp .env.example .env

# 2. Edit values as needed
nano .env

# 3. Run the server
cargo run --release --features api --bin apexstore-server

The server will load .env automatically and display all active configuration on startup.

Server Configuration

Network Settings

Variable | Default | Description
HOST     | 0.0.0.0 | Server bind address (0.0.0.0 = all interfaces)
PORT     | 8080    | Server port

Payload Limits

Variable              | Default         | Description
MAX_JSON_PAYLOAD_SIZE | 52428800 (50MB) | Maximum JSON request/response size
MAX_RAW_PAYLOAD_SIZE  | 52428800 (50MB) | Maximum raw payload size

Recommendations:

  • Development/Testing: 50-100MB
  • Production with pagination: 10MB
  • Stress testing: 100-200MB
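Both limits take raw byte counts, so the MB figures above need converting. A quick shell sketch of the arithmetic (the 50MB value shown matches the table default):

```shell
# Convert a megabyte figure into the byte value the .env file expects
MB=50
echo "MAX_JSON_PAYLOAD_SIZE=$((MB * 1024 * 1024))"   # MAX_JSON_PAYLOAD_SIZE=52428800
```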

HTTP Server Tuning

Variable                | Default       | Description
SERVER_WORKERS          | 0 (CPU cores) | Number of worker threads
SERVER_KEEP_ALIVE       | 75            | Keep-alive timeout (seconds)
SERVER_CLIENT_TIMEOUT   | 60            | Client request timeout (seconds)
SERVER_SHUTDOWN_TIMEOUT | 30            | Graceful shutdown timeout (seconds)
SERVER_BACKLOG          | 2048          | Maximum pending connections
SERVER_MAX_CONNECTIONS  | 25000         | Max concurrent connections per worker

Recommendations:

  • High-traffic: Increase SERVER_WORKERS to 8-16
  • Low-latency: Set SERVER_KEEP_ALIVE to 5-15
  • Memory-constrained: Reduce SERVER_MAX_CONNECTIONS to 5000-10000

LSM Engine Configuration

Storage Settings

Variable | Default     | Description
DATA_DIR | ./.lsm_data | Data storage directory path

MemTable

Variable          | Default       | Description
MEMTABLE_MAX_SIZE | 4194304 (4MB) | Size threshold before flush to disk

Impact:

  • Larger (8-16MB): Fewer flushes, better compression, higher memory usage
  • Smaller (1-2MB): More flushes, lower memory usage, faster recovery

Recommendations:

  • Write-heavy: 8-16MB
  • Memory-constrained: 2MB
  • Balanced: 4MB (default)
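To see what the threshold means in practice, here is a rough flush-frequency estimate (illustrative arithmetic only; real flush counts also depend on per-entry overhead):

```shell
# Approximate MemTable flushes per GiB of writes at a given threshold
MEMTABLE_MAX_SIZE=4194304                 # 4MB default
GIB=$((1024 * 1024 * 1024))
echo "flushes per GiB: $((GIB / MEMTABLE_MAX_SIZE))"   # 256 at the default
```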

SSTable Block Configuration

Variable              | Default    | Description
BLOCK_SIZE            | 4096 (4KB) | Block size for SSTables
BLOCK_CACHE_SIZE_MB   | 64         | In-memory cache for blocks (MB)
SPARSE_INDEX_INTERVAL | 16         | Blocks between index entries

Block Size Impact:

  • Larger (8KB): Better compression ratio, higher read latency
  • Smaller (2KB): Lower latency, less compression

Cache Size Recommendations:

  • Read-heavy: 256-512MB
  • Balanced: 64-128MB
  • Memory-constrained: 32MB

Sparse Index:

  • Dense (8): More memory, faster lookups
  • Sparse (32): Less memory, slower lookups
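The memory cost of the sparse index is roughly the block count divided by the interval. A sketch for a hypothetical 64MB SSTable with the defaults above (per-entry size and exact layout are implementation details):

```shell
# Sparse index entries per SSTable = (sstable_bytes / BLOCK_SIZE) / SPARSE_INDEX_INTERVAL
SSTABLE_BYTES=$((64 * 1024 * 1024))   # hypothetical 64MB SSTable
BLOCK_SIZE=4096
SPARSE_INDEX_INTERVAL=16
echo "index entries: $(( (SSTABLE_BYTES / BLOCK_SIZE) / SPARSE_INDEX_INTERVAL ))"   # 1024
```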

Bloom Filter

Variable                  | Default   | Description
BLOOM_FALSE_POSITIVE_RATE | 0.01 (1%) | False positive probability

Impact:

  • Lower (0.001 = 0.1%): More accurate, more memory
  • Higher (0.05 = 5%): Less accurate, less memory

Recommendations:

  • Read-heavy: 0.001-0.005
  • Balanced: 0.01
  • Memory-constrained: 0.05
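The memory side of this trade-off follows the textbook Bloom filter sizing formula, bits per key ≈ -ln(p) / (ln 2)². This is the standard estimate, not necessarily ApexStore's exact layout:

```shell
# Bits per key needed for a target false positive rate p (textbook formula)
p=0.01
awk -v p="$p" 'BEGIN { printf "%.1f bits per key\n", -log(p) / (log(2) ^ 2) }'
# 0.01 -> ~9.6 bits per key; 0.001 -> ~14.4; 0.05 -> ~6.2
```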

Write-Ahead Log (WAL)

Variable            | Default         | Description
MAX_WAL_RECORD_SIZE | 33554432 (32MB) | Maximum single record size
WAL_BUFFER_SIZE     | 65536 (64KB)    | Write buffer size
WAL_SYNC_MODE       | always          | Fsync strategy

Sync Modes:

  • always: Safest, slowest (every write synced)
  • every_second: Balanced (1s of data loss possible)
  • manual: Fastest, least safe (crash = data loss)

Recommendations:

  • Production: always
  • High-throughput: every_second
  • Testing/Dev: manual

Compaction

Variable                    | Default       | Description
COMPACTION_STRATEGY         | lazy_leveling | Compaction algorithm
SIZE_RATIO                  | 10            | Size ratio between levels
LEVEL0_COMPACTION_THRESHOLD | 4             | L0 file count trigger
MAX_LEVEL_COUNT             | 7             | Maximum LSM tree levels
COMPACTION_THREADS          | 2             | Background compaction threads

Compaction Strategies:

  • leveled: Best read performance
  • tiered: Best write performance
  • lazy_leveling: Balanced (default)

Recommendations:

  • Read-heavy: leveled, SIZE_RATIO=4-6
  • Write-heavy: tiered, SIZE_RATIO=15-20
  • High-throughput: COMPACTION_THREADS=4-8
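SIZE_RATIO sets how quickly level capacities grow. A sketch of the resulting per-level capacities, assuming level 1 starts near the MemTable flush size (illustrative numbers only; actual level sizing is implementation-defined):

```shell
# Each level holds roughly SIZE_RATIO times the previous one
CAP_MB=4          # assume level 1 starts near the 4MB MemTable flush size
SIZE_RATIO=10
for LEVEL in 1 2 3 4; do
  echo "L$LEVEL: ~$CAP_MB MB"
  CAP_MB=$((CAP_MB * SIZE_RATIO))
done
```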

Feature Flags

Variable          | Default | Description
FEATURE_CACHE_TTL | 10      | Cache TTL in seconds

Performance Tuning

Memory vs. Performance Trade-offs

High Memory → High Performance:
MEMTABLE_MAX_SIZE=16777216         # 16MB
BLOCK_CACHE_SIZE_MB=512            # 512MB
BLOOM_FALSE_POSITIVE_RATE=0.001    # 0.1%
SPARSE_INDEX_INTERVAL=8            # Dense

Low Memory → Acceptable Performance:
MEMTABLE_MAX_SIZE=2097152          # 2MB
BLOCK_CACHE_SIZE_MB=32             # 32MB
BLOOM_FALSE_POSITIVE_RATE=0.05     # 5%
SPARSE_INDEX_INTERVAL=32           # Sparse

Latency vs. Throughput

Low Latency:
BLOCK_SIZE=2048                    # 2KB
WAL_SYNC_MODE=every_second
SERVER_KEEP_ALIVE=5

High Throughput:
BLOCK_SIZE=8192                    # 8KB
MEMTABLE_MAX_SIZE=16777216         # 16MB
COMPACTION_THREADS=8
WAL_BUFFER_SIZE=262144             # 256KB

Tuning Profiles

Stress Testing Profile

MAX_JSON_PAYLOAD_SIZE=104857600    # 100MB
MAX_RAW_PAYLOAD_SIZE=104857600
MEMTABLE_MAX_SIZE=16777216         # 16MB
BLOCK_SIZE=8192
BLOCK_CACHE_SIZE_MB=256
WAL_SYNC_MODE=every_second
COMPACTION_THREADS=4

High Write Throughput

MEMTABLE_MAX_SIZE=8388608          # 8MB
BLOCK_SIZE=8192
BLOOM_FALSE_POSITIVE_RATE=0.05
WAL_SYNC_MODE=every_second
WAL_BUFFER_SIZE=262144             # 256KB
COMPACTION_THREADS=4
LEVEL0_COMPACTION_THRESHOLD=8
COMPACTION_STRATEGY=tiered

High Read Throughput

BLOCK_CACHE_SIZE_MB=512
BLOOM_FALSE_POSITIVE_RATE=0.001
SPARSE_INDEX_INTERVAL=8
COMPACTION_STRATEGY=leveled
SIZE_RATIO=4

Memory Constrained

MEMTABLE_MAX_SIZE=2097152          # 2MB
BLOCK_CACHE_SIZE_MB=32
BLOOM_FALSE_POSITIVE_RATE=0.05
SPARSE_INDEX_INTERVAL=32
SERVER_MAX_CONNECTIONS=5000
COMPACTION_THREADS=1

Balanced Production

MEMTABLE_MAX_SIZE=4194304          # 4MB
BLOCK_SIZE=4096
BLOCK_CACHE_SIZE_MB=128
BLOOM_FALSE_POSITIVE_RATE=0.01
WAL_SYNC_MODE=always
COMPACTION_THREADS=2
SERVER_WORKERS=4
SERVER_MAX_CONNECTIONS=10000

Troubleshooting

High Memory Usage

  1. Reduce MEMTABLE_MAX_SIZE
  2. Reduce BLOCK_CACHE_SIZE_MB
  3. Increase BLOOM_FALSE_POSITIVE_RATE
  4. Increase SPARSE_INDEX_INTERVAL
  5. Reduce SERVER_MAX_CONNECTIONS

Slow Writes

  1. Increase MEMTABLE_MAX_SIZE
  2. Change WAL_SYNC_MODE to every_second
  3. Increase WAL_BUFFER_SIZE
  4. Increase COMPACTION_THREADS
  5. Use COMPACTION_STRATEGY=tiered

Slow Reads

  1. Increase BLOCK_CACHE_SIZE_MB
  2. Decrease BLOOM_FALSE_POSITIVE_RATE
  3. Decrease SPARSE_INDEX_INTERVAL
  4. Use COMPACTION_STRATEGY=leveled
  5. Decrease SIZE_RATIO

Payload Too Large Errors

  1. Increase MAX_JSON_PAYLOAD_SIZE
  2. Increase MAX_RAW_PAYLOAD_SIZE
  3. Consider implementing pagination

Too Many Open Files

  1. Increase system file descriptor limit: ulimit -n 65536
  2. Reduce MAX_LEVEL_COUNT
  3. Decrease LEVEL0_COMPACTION_THRESHOLD
  4. Increase SIZE_RATIO

Monitoring Configuration Impact

# Enable detailed logging
RUST_LOG=debug cargo run --features api --bin apexstore-server

# Watch server startup for configuration values
# The server prints all active config on startup

# Monitor metrics via /stats endpoint
curl http://localhost:8080/stats/all

Best Practices

  1. Start with defaults - They work well for most workloads
  2. Profile first - Understand your workload before tuning
  3. Change one at a time - Easier to understand impact
  4. Monitor metrics - Use /stats/all endpoint
  5. Test in dev - Before applying to production
  6. Document changes - Track what works and what doesn't
  7. Use profiles - Quick starting points for common patterns

Environment-Specific Configs

Development

RUST_LOG=debug
MAX_JSON_PAYLOAD_SIZE=104857600
WAL_SYNC_MODE=manual

Staging

RUST_LOG=info
MAX_JSON_PAYLOAD_SIZE=52428800
WAL_SYNC_MODE=every_second

Production

RUST_LOG=warn
MAX_JSON_PAYLOAD_SIZE=10485760
WAL_SYNC_MODE=always
SERVER_WORKERS=8

References

Development Setup Guide

This guide will help you set up a development environment for ApexStore.

💻 System Requirements

Minimum Requirements

  • OS: Linux, macOS, or Windows (WSL2 recommended)
  • RAM: 4GB (8GB+ recommended for large workloads)
  • Disk: 2GB free space
  • CPU: Any modern CPU (multi-core recommended for testing)

Recommended Setup

  • OS: Ubuntu 22.04 LTS or macOS 13+
  • RAM: 16GB
  • Disk: 10GB free space (SSD preferred)
  • CPU: 4+ cores

๐Ÿ› ๏ธ Installing Prerequisites

1. Rust Toolchain

Installation

# Install Rust using rustup (recommended)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Follow the prompts, then reload your shell
source $HOME/.cargo/env

Verify Installation

# Check Rust version (should be 1.70+)
rustc --version
# Output: rustc 1.75.0 (or higher)

# Check Cargo version
cargo --version
# Output: cargo 1.75.0 (or higher)

Additional Components

# Install Clippy (linter)
rustup component add clippy

# Install rustfmt (formatter)
rustup component add rustfmt

# Install rust-src (for IDE support)
rustup component add rust-src

2. Git

Linux (Ubuntu/Debian)

sudo apt-get update
sudo apt-get install git

macOS

# Using Homebrew
brew install git

# Or install Xcode Command Line Tools
xcode-select --install

Windows

Download from git-scm.com

Verify Installation

git --version
# Output: git version 2.x.x

Install VS Code

  • Download from code.visualstudio.com
  • Or use package manager:
    # Ubuntu/Debian
    sudo snap install code --classic
    
    # macOS
    brew install --cask visual-studio-code
    

Install Extensions

Essential:

  1. rust-analyzer - Rust language support

    • ID: rust-lang.rust-analyzer
    • Features: Auto-completion, go-to-definition, inline errors
  2. CodeLLDB (optional) - Debugging support

    • ID: vadimcn.vscode-lldb

Recommended:

  3. Better TOML - TOML syntax highlighting

    • ID: bungcip.better-toml
  4. Error Lens - Inline error highlighting

    • ID: usernamehw.errorlens
  5. crates - Cargo.toml dependency management

    • ID: serayuzgur.crates

VS Code Settings

Create .vscode/settings.json in project root:

{
  "rust-analyzer.checkOnSave.command": "clippy",
  "rust-analyzer.cargo.features": "all",
  "editor.formatOnSave": true,
  "[rust]": {
    "editor.defaultFormatter": "rust-lang.rust-analyzer"
  }
}

📦 Project Setup

1. Clone the Repository

# Clone your fork
git clone https://github.com/YOUR_USERNAME/ApexStore.git
cd ApexStore

# Add upstream remote
git remote add upstream https://github.com/ElioNeto/ApexStore.git

# Verify remotes
git remote -v

2. Build the Project

# Debug build (faster compilation)
cargo build

# Release build (optimized)
cargo build --release

# Build with API feature
cargo build --release --features api

First build may take 5-10 minutes as Cargo downloads and compiles dependencies.

3. Run Tests

# Run all tests
cargo test

# Run tests with output
cargo test -- --nocapture

# Run specific test
cargo test test_memtable_insert

4. Configuration

# Copy environment template
cp .env.example .env

# Edit configuration (optional)
nano .env

Example Development Configuration (.env):

# Server
HOST=127.0.0.1
PORT=8080

# LSM Engine
DATA_DIR=.lsm_data_dev
MEMTABLE_MAX_SIZE=2097152  # 2MB for faster flushing in dev
BLOCK_CACHE_SIZE_MB=32

# Logging
RUST_LOG=debug
ENABLE_METRICS=true

🧪 Development Workflow

Running the CLI

# Debug mode
cargo run

# Release mode (faster)
cargo run --release

Available REPL Commands:

> help                 # Show available commands
> put key value        # Insert or update key
> get key              # Retrieve value
> delete key           # Delete key (tombstone)
> stats                # Show statistics
> exit                 # Exit REPL

Running the API Server

# Debug mode
cargo run --features api --bin apexstore-server

# Release mode
cargo run --release --features api --bin apexstore-server

# With custom port
PORT=3000 cargo run --release --features api --bin apexstore-server

Testing the API:

# Insert a key
curl -X POST http://localhost:8080/keys \
  -H "Content-Type: application/json" \
  -d '{"key": "user:1", "value": "Alice"}'

# Get a key
curl http://localhost:8080/keys/user:1

# Get statistics
curl http://localhost:8080/stats/all

Code Quality Checks

# Format code
cargo fmt

# Check formatting (CI mode)
cargo fmt -- --check

# Run Clippy linter
cargo clippy

# Clippy with strict mode (CI mode)
cargo clippy -- -D warnings

# Check for unused dependencies
cargo machete  # Requires: cargo install cargo-machete

Running Benchmarks

# (Optional) Install cargo-criterion for enhanced benchmark reports
cargo install cargo-criterion

# Run all benchmarks
cargo bench

# Run specific benchmark
cargo bench memtable_insert

# Open the HTML report generated by `cargo bench`
open target/criterion/report/index.html

๐Ÿณ Docker Development

Build Docker Image

# Build production image
docker build -t apexstore:dev .

# Run in development mode with volume mount
docker run -d \
  --name apexstore-dev \
  -p 8080:8080 \
  -v $(pwd)/data:/data \
  -v $(pwd)/.env:/app/.env \
  apexstore:dev

Docker Compose Development

# Start services
docker-compose up -d

# View logs
docker-compose logs -f apexstore

# Restart after code changes
docker-compose restart

# Stop services
docker-compose down

๐Ÿ› ๏ธ Debugging

Using LLDB (VS Code)

  1. Install CodeLLDB extension
  2. Create .vscode/launch.json:
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "lldb",
      "request": "launch",
      "name": "Debug CLI",
      "cargo": {
        "args": ["build", "--bin=apexstore"]
      },
      "args": [],
      "cwd": "${workspaceFolder}"
    },
    {
      "type": "lldb",
      "request": "launch",
      "name": "Debug Server",
      "cargo": {
        "args": ["build", "--features", "api", "--bin=apexstore-server"]
      },
      "args": [],
      "cwd": "${workspaceFolder}"
    },
    {
      "type": "lldb",
      "request": "launch",
      "name": "Debug Test",
      "cargo": {
        "args": ["test", "--no-run", "--lib"]
      },
      "args": ["test_memtable_insert"],
      "cwd": "${workspaceFolder}"
    }
  ]
}
  3. Set breakpoints in code
  4. Press F5 to start debugging

Using println! Debugging

// In your code
pub fn put(&mut self, key: &[u8], value: &str) -> Result<()> {
    println!("[DEBUG] Inserting key: {:?}", String::from_utf8_lossy(key));
    self.wal.append(key, value)?;
    println!("[DEBUG] WAL append successful");
    self.memtable.insert(key, value);
    Ok(())
}

Run with:

cargo run 2>&1 | grep DEBUG

Using RUST_LOG

# Set log level
export RUST_LOG=debug

# Or inline
RUST_LOG=trace cargo run

# Filter by module
RUST_LOG=apexstore::core::engine=debug cargo run

📈 Performance Profiling

CPU Profiling (Linux)

# Install flamegraph
cargo install flamegraph

# Generate flamegraph (may need elevated perf permissions on Linux)
cargo flamegraph --bin apexstore-server

# Open flamegraph.svg in browser
firefox flamegraph.svg

Memory Profiling

# Using valgrind (Linux)
valgrind --leak-check=full --track-origins=yes \
  ./target/debug/apexstore

# Using heaptrack (Linux)
heaptrack ./target/debug/apexstore
heaptrack_gui heaptrack.apexstore.*.gz

Benchmark Profiling

# Profile specific benchmark
cargo bench --bench memtable_bench -- --profile-time=10

# With flamegraph
cargo flamegraph --bench memtable_bench

🧰 Testing

Unit Tests

# Run all unit tests
cargo test --lib

# Run tests in specific module
cargo test --lib core::memtable

# Run single test
cargo test test_memtable_insert

Integration Tests

# Run all integration tests
cargo test --test '*'

# Run specific integration test file
cargo test --test recovery_test

Test Coverage

# Install tarpaulin (Linux only)
cargo install cargo-tarpaulin

# Generate coverage report
cargo tarpaulin --out Html

# Open report
firefox tarpaulin-report.html

Stress Testing

# Create stress test script
cat > stress_test.sh << 'EOF'
#!/bin/bash
for i in {1..10000}; do
  curl -X POST http://localhost:8080/keys \
    -H "Content-Type: application/json" \
    -d "{\"key\": \"key_$i\", \"value\": \"value_$i\"}" &
done
wait
EOF

chmod +x stress_test.sh

# Run stress test
./stress_test.sh

📚 Documentation

Generating Documentation

# Generate and open docs
cargo doc --open

# Include private items
cargo doc --document-private-items --open

# Generate for all features
cargo doc --all-features --open

Writing Documentation

All public APIs should have documentation:

/// Brief description (appears in summary).
///
/// Longer description with more details.
///
/// # Arguments
///
/// * `key` - Description of key parameter
/// * `value` - Description of value parameter
///
/// # Returns
///
/// Description of return value
///
/// # Errors
///
/// Description of possible errors
///
/// # Examples
///
/// ```
/// let mut engine = LsmEngine::new(config)?;
/// engine.put(b"key", "value")?;
/// ```
///
/// # Panics
///
/// Description of panic conditions (if any)
///
/// # Safety
///
/// Description of safety requirements (for unsafe functions)
pub fn put(&mut self, key: &[u8], value: &str) -> Result<()> {
    // Implementation
}

๐Ÿž Troubleshooting

Common Issues

Issue: Compilation Errors After Update

# Clean build artifacts
cargo clean

# Update dependencies
cargo update

# Rebuild
cargo build

Issue: Tests Failing Randomly

# Run tests serially (not in parallel)
cargo test -- --test-threads=1

Issue: Port Already in Use

# Find process using port 8080
lsof -i :8080  # macOS/Linux

# Kill process
kill -9 <PID>

# Or use different port
PORT=3000 cargo run --features api --bin apexstore-server

Issue: Out of Disk Space

# Clean target directory (safe, can be rebuilt)
rm -rf target/

# Clean cargo cache
cargo cache --autoclean  # Requires: cargo install cargo-cache

Issue: Slow Compilation

# Use faster linker (Linux)
sudo apt-get install lld
export RUSTFLAGS="-C link-arg=-fuse-ld=lld"

# Or use mold (even faster; install via your system package manager)
sudo apt-get install mold
export RUSTFLAGS="-C link-arg=-fuse-ld=mold"

# Incremental compilation (already the default for the dev profile; in Cargo.toml)
[profile.dev]
incremental = true

🚀 Next Steps

  1. ✅ Read CONTRIBUTING.md for contribution guidelines
  2. ✅ Explore the codebase structure
  3. ✅ Pick an issue to work on
  4. ✅ Join discussions on GitHub

💬 Getting Help


Happy Coding! 🦀

Last updated: March 2026

Contributing to ApexStore

First off, thank you for considering contributing to ApexStore! 🎉

This document provides guidelines and instructions for contributing to the project. Following these guidelines helps maintain code quality and makes the review process smoother.

📜 Table of Contents


๐Ÿค Code of Conduct

Our Pledge

We are committed to providing a welcoming and inspiring community for all. We pledge to:

  • Be respectful and inclusive
  • Accept constructive criticism gracefully
  • Focus on what is best for the community
  • Show empathy towards other community members

Expected Behavior

  • Use welcoming and inclusive language
  • Be respectful of differing viewpoints and experiences
  • Gracefully accept constructive criticism
  • Focus on what is best for the community

Unacceptable Behavior

  • Trolling, insulting/derogatory comments, and personal or political attacks
  • Public or private harassment
  • Publishing others' private information without explicit permission
  • Other conduct which could reasonably be considered inappropriate

🚀 Getting Started

Prerequisites

  1. Rust Toolchain (1.70 or later)

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    
  2. Git

    # Ubuntu/Debian
    sudo apt-get install git
    
    # macOS
    brew install git
    
  3. Code Editor (Recommended: VS Code with rust-analyzer)

Initial Setup

  1. Fork the Repository

  2. Clone Your Fork

    git clone https://github.com/YOUR_USERNAME/ApexStore.git
    cd ApexStore
    
  3. Add Upstream Remote

    git remote add upstream https://github.com/ElioNeto/ApexStore.git
    
  4. Install Dependencies

    cargo build
    
  5. Run Tests

    cargo test
    

For detailed setup instructions, see SETUP.md.


🔄 Development Workflow

Automated Workflow Overview

ApexStore uses GitHub Actions to automate the development workflow:

  • Feature/Fix branches → Auto-create PR to develop + run tests
  • Develop → Auto-create release PR to main
  • Main → Create release tag + close related issues

See WORKFLOWS.md for complete documentation.

1. Create a Feature Branch

# Update your fork
git checkout main
git pull upstream main

# Create a new branch from main
git checkout -b feature/your-feature-name

Branch Naming Conventions:

  • feature/ - New features (e.g., feature/compaction-strategy)
  • fix/ - Bug fixes (e.g., fix/wal-corruption)
  • docs/ - Documentation changes (e.g., docs/api-guide)
  • refactor/ - Code refactoring (e.g., refactor/codec-interface)
  • test/ - Test additions/improvements (e.g., test/integration-suite)
  • perf/ - Performance improvements (e.g., perf/bloom-filter-optimization)

2. Make Your Changes

# Make changes to the code
vim src/core/engine.rs

# Test your changes
cargo test --all-features

# Format code
cargo fmt

# Check for issues
cargo clippy --all-features -- -D warnings

3. Commit Your Changes

git add .
git commit -m "feat: add compaction strategy interface (#55)"

Important: Reference issues in commit messages using #issue_number for automatic tracking.

See Commit Messages for formatting guidelines.

4. Push to Your Fork

git push origin feature/your-feature-name

What Happens Next:

  • ✅ GitHub Actions automatically runs tests
  • ✅ Auto-creates PR to develop branch
  • ✅ Adds comment to referenced issues (if any)
  • ✅ Runs Clippy and format checks

5. Pull Request Review

Once the automated PR is created:

  1. Review the PR description
  2. Wait for CI checks to pass
  3. Address any reviewer feedback
  4. PR will be merged to develop once approved

Note: You don't need to manually create PRs - the workflow handles this automatically!


๐Ÿ“ Coding Standards

Rust Style Guide

We follow the Rust API Guidelines and Rust Style Guide.

Key Principles:

  1. Use cargo fmt - All code must be formatted

    cargo fmt --all
    
  2. Pass cargo clippy - Zero warnings policy

    cargo clippy --all-features -- -D warnings
    
  3. Write Documentation - Public APIs must have doc comments

    /// Retrieves a value from the store by key.
    ///
    /// # Arguments
    ///
    /// * `key` - The key to look up
    ///
    /// # Returns
    ///
    /// * `Ok(Some(value))` - Key found
    /// * `Ok(None)` - Key not found
    /// * `Err(e)` - Error occurred
    ///
    /// # Example
    ///
    /// ```
    /// let value = engine.get(b"user:123")?;
    /// ```
    pub fn get(&self, key: &[u8]) -> Result<Option<String>> {
        // Implementation
    }

SOLID Principles

This project follows SOLID principles:

  • Single Responsibility: Each module/struct has one clear purpose
  • Open/Closed: Extend behavior through traits, not modification
  • Liskov Substitution: Implementations must be interchangeable
  • Interface Segregation: Small, focused traits
  • Dependency Inversion: Depend on abstractions, not concretions

Example:

// ✅ Good - depends on trait
pub struct LsmEngine<W: WriteAheadLog> {
    wal: W,
}

// ❌ Bad - depends on concrete type
pub struct LsmEngine {
    wal: FileBasedWal,
}

Error Handling

  1. Use Result<T, LsmError> for fallible operations

    pub fn put(&mut self, key: &[u8], value: &str) -> Result<()> {
        self.wal.append(key, value)?;
        self.memtable.insert(key, value);
        Ok(())
    }
  2. Provide Context with error types

    use thiserror::Error;

    #[derive(Error, Debug)]
    pub enum LsmError {
        #[error("WAL corruption at offset {0}")]
        WalCorruption(u64),

        #[error("Key too large: {size} bytes (max: {max})")]
        KeyTooLarge { size: usize, max: usize },
    }
  3. Don't Panic in library code (use Result instead)

Performance Considerations

  1. Minimize Allocations

    // ✅ Good - reuse buffer
    let mut buffer = Vec::with_capacity(1024);
    for item in items {
        buffer.clear();
        serialize_into(&mut buffer, item)?;
    }

    // ❌ Bad - allocate each iteration
    for item in items {
        let buffer = serialize(item)?;
    }
  2. Use Appropriate Data Structures

    • BTreeMap for sorted data
    • HashMap for fast lookups
    • Vec for sequential access
  3. Benchmark Changes

    cargo bench
    

🧪 Testing Guidelines

Test Types

  1. Unit Tests - Test individual functions/modules

    #[cfg(test)]
    mod tests {
        use super::*;

        #[test]
        fn test_memtable_insert() {
            let mut memtable = MemTable::new();
            memtable.insert(b"key", "value");
            assert_eq!(memtable.get(b"key"), Some("value".to_string()));
        }
    }
  2. Integration Tests - Test component interactions

    // tests/integration_test.rs
    #[test]
    fn test_engine_recovery() {
        let config = LsmConfig::default();
        let mut engine = LsmEngine::new(config.clone()).unwrap();

        engine.put(b"key", "value").unwrap();
        drop(engine);

        let engine = LsmEngine::new(config).unwrap();
        assert_eq!(engine.get(b"key").unwrap(), Some("value".to_string()));
    }
  3. Property Tests - Test invariants (optional, using proptest)

Test Requirements

  • All new code must have tests
  • Tests must pass on all platforms
  • Test coverage should increase, not decrease
  • Use descriptive test names
    #[test]
    fn test_get_returns_none_for_nonexistent_key() { /* ... */ }

Running Tests

# Run all tests
cargo test --all-features

# Run specific test
cargo test test_memtable_insert

# Run with output
cargo test -- --nocapture

# Run integration tests only
cargo test --test '*'

# Run with coverage (requires tarpaulin)
cargo tarpaulin --out Html

๐Ÿ“ Commit Messages

We follow the Conventional Commits specification.

Format

<type>(<scope>): <subject> (#issue)

<body>

<footer>

Types

  • feat - New feature
  • fix - Bug fix
  • docs - Documentation changes
  • style - Code style changes (formatting, etc.)
  • refactor - Code refactoring
  • perf - Performance improvements
  • test - Test additions/modifications
  • chore - Build process, dependencies, tooling
  • ci - CI/CD changes

Examples

Simple commit:

feat: add bloom filter to SSTable reader

With scope and issue:

fix(wal): prevent corruption on unclean shutdown (#42)

With body:

feat(compaction): implement leveled compaction strategy (#47)

Adds a new LeveledCompaction struct that implements the Compaction
trait. This strategy reduces read amplification by maintaining
sorted levels with exponentially increasing sizes.

Closes #47

Breaking change:

feat(api)!: change SSTable format to V2

BREAKING CHANGE: SSTable V2 is incompatible with V1.
Migration tool will be provided in v1.4.

Referencing Issues:

  • #123 - Reference issue
  • fixes #123, closes #123 - Will auto-close issue when PR merges
  • resolves #123 - Alternative close syntax

๐Ÿ” Pull Request Process

Before Pushing

  • Code compiles without errors
  • All tests pass (cargo test --all-features)
  • No clippy warnings (cargo clippy --all-features -- -D warnings)
  • Code is formatted (cargo fmt)
  • Documentation is updated (if applicable)
  • Tests are added for new functionality

Automated PR Creation

When you push to a feature/* or fix/* branch:

  1. ✅ Tests run automatically (CI/CD)
  2. ✅ PR is auto-created to develop branch
  3. ✅ Issues are commented (if referenced in commits)
  4. ✅ Checks must pass before merge

Manual Steps (If Needed)

If you need to create a PR manually:

  1. Go to your fork on GitHub
  2. Click "New Pull Request"
  3. Select develop as the base branch
  4. Fill out the PR template
  5. Submit the PR

PR Template

When creating a PR manually, use this template:

## Description

Brief description of changes.

## Type of Change

- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update

## Related Issues

Closes #123
Related to #456

## Testing

Describe how you tested your changes:
- [ ] Unit tests added
- [ ] Integration tests added
- [ ] Manual testing performed

## Checklist

- [ ] Code compiles
- [ ] Tests pass
- [ ] Clippy checks pass
- [ ] Code is formatted
- [ ] Documentation updated

## Screenshots (if applicable)

## Additional Notes

Review Process

  1. Automated Checks - CI runs tests and linters
  2. Code Review - Maintainer reviews code
  3. Feedback - Address review comments
  4. Approval - Maintainer approves PR
  5. Merge - Squash and merge to develop

Review Timeline

  • Simple PRs: 1-3 days
  • Complex PRs: 3-7 days
  • Breaking Changes: 7-14 days

๐Ÿ“ Project Structure

ApexStore/
├── src/
│   ├── core/              # Core domain logic
│   │   ├── engine.rs      # LSM engine orchestration
│   │   ├── memtable.rs    # In-memory storage
│   │   └── log_record.rs  # Data model
│   ├── storage/           # Persistence layer
│   │   ├── wal.rs         # Write-ahead log
│   │   ├── sstable.rs     # SSTable reader
│   │   └── builder.rs     # SSTable writer
│   ├── infra/             # Infrastructure
│   │   ├── codec.rs       # Serialization
│   │   ├── error.rs       # Error types
│   │   └── config.rs      # Configuration
│   ├── api/               # HTTP API (feature-gated)
│   ├── cli/               # CLI interface
│   └── features/          # Feature flags
├── tests/                 # Integration tests
├── benches/               # Benchmarks
└── docs/                  # Documentation

Module Guidelines

  • core/ - Domain logic, no external dependencies
  • storage/ - File I/O, persistence
  • infra/ - Cross-cutting concerns
  • api/ - External interfaces (feature-gated)

🎯 Areas for Contribution

High Priority

  1. CI/CD Testing Pipeline (#55)

    • Difficulty: Easy
    • Impact: High
    • Skills: GitHub Actions, YAML
  2. Compaction Implementation (#47)

    • Difficulty: Hard
    • Impact: High
    • Skills: Rust, algorithms, file I/O
  3. Efficient Iterators (#21, #22, #23)

    • Difficulty: Medium
    • Impact: High
    • Skills: Rust, data structures

Medium Priority

  1. Benchmarking Suite (#48)

    • Difficulty: Easy
    • Impact: Medium
    • Skills: Rust, criterion
  2. CLI Command Equalization (#65)

    • Difficulty: Medium
    • Impact: Medium
    • Skills: Rust, CLI design
  3. Checksums & Integrity (#25)

    • Difficulty: Easy
    • Impact: High
    • Skills: Rust, CRC32

Good First Issues

  1. Add More Tests

    • Difficulty: Easy
    • Impact: Medium
    • Skills: Rust, testing
  2. Binary Search Optimization (#37)

    • Difficulty: Easy
    • Impact: Medium
    • Skills: Rust, algorithms
  3. Documentation Improvements

    • Difficulty: Easy
    • Impact: Medium
    • Skills: Technical writing

Advanced Topics

  1. Replication Support

    • Difficulty: Very Hard
    • Impact: Very High
    • Skills: Distributed systems, Raft
  2. Snapshot Isolation

    • Difficulty: Hard
    • Impact: High
    • Skills: Concurrency, MVCC

โ“ Questions?


🚀 Ready to Contribute?

  1. Find an issue or create one
  2. Comment that you're working on it
  3. Fork the repo and create a branch from main
  4. Make your changes with tests
  5. Push to your fork (automated PR will be created)
  6. Wait for review and address feedback

Thank you for contributing! 🎉


Last updated: March 2026