Performance

Benchmarks

sqry is built for speed at every layer. Key measured numbers:

| Metric | Value | Notes |
| --- | --- | --- |
| Graph query latency | 12–21 ms | vs 1,300–2,400 ms for embedding-based search (100–200x faster) |
| JavaScript indexing throughput | 760K LOC/s | Measured with parallel graph indexing enabled |
| C++ indexing throughput | 1.1M LOC/s | |
| Python indexing throughput | ~500K LOC/s | |
| Cache speedup | 113x (452 ms → 4 ms) | Warm multiprocess-safe AST cache vs cold parse |
| C indirect-call indexing fix | 10.8x faster | v16.0.6 Pass 5b fix, verified on a Linux kernel drivers/net subset |
| Fuzzy candidate reduction | 99.8% | Jaccard trigram pre-filter on short queries |
| NL intent classification | 2.1 ms (P50) | all-MiniLM-L6-v2, INT8 quantized, 99.75% accuracy |

The 113x cache speedup applies to repeated queries against an already indexed codebase. The first query after sqry index loads the snapshot; subsequent queries (including queries from other processes, via the persistent cache) return in about 4 ms.
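
The speedup figures in the table are plain ratios of two timings. As a minimal sketch, the `speedup` helper below (a hypothetical name, not part of sqry) reproduces them from the published numbers:

```shell
# Hypothetical helper (not part of sqry): ratio of a slow timing to a fast timing.
speedup() {
  # $1 = slow time in ms, $2 = fast time in ms
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.0fx\n", slow / fast }'
}

speedup 452 4      # cold parse vs warm cache     -> 113x
speedup 1300 12    # lower bounds of both ranges  -> 108x
speedup 2400 21    # upper bounds of both ranges  -> 114x
```

The same helper can be pointed at timings you collect yourself, for example from `time sqry query ...` on a cold and a warm cache.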


Linux Kernel Benchmark

sqry has been tested against the Linux kernel source tree (~28M LOC, ~70,000 files) to validate performance at extreme scale.

| Metric | Value |
| --- | --- |
| Index time | 1m48s (24-core machine) |
| Nodes indexed | 11,205,544 |
| Edges resolved | 18,292,255 (deduplicated) |
| Raw edges | 22,243,307 |
| Snapshot size | 1.8 GB |
| Files indexed | 63,074 C, 1,077 Shell, 354 Python, 344 Rust, 58 Perl |
| Caller query (printk, 100 results) | ~85 ms |

# Index the Linux kernel
git clone --depth 1 https://github.com/torvalds/linux.git
sqry index linux/

# Trace the write() syscall path through VFS, ext4, and block layer
cd linux && sqry graph trace-path __x64_sys_write submit_bio --json

# Security audit: find all copy_from_user callers in drivers
sqry query "callers:copy_from_user AND path:drivers/**" --json --limit 1000

# Dead code detection in staging drivers
sqry unused drivers/staging --scope function --lang c --json

# Cross-subsystem cycle detection
sqry cycles --type calls --max-results 500 --json

Benchmark run on Linux kernel commit a75cb869a8cc with sqry v10.0.2 (historical baseline). Current sqry releases retain the unified graph architecture, and v16.0.6 adds a Pass 5b repair for quadratic C indirect-call resolution, verified as a 10.8x speedup on a Linux kernel drivers/net subset. To verify full-tree numbers in your environment, rerun against the same kernel commit with your target release.
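
A few derived figures can be read off the Linux kernel benchmark table. The sketch below uses the approximate ~28M LOC figure and the 1m48s index time, and it assumes the gap between raw and resolved edges is entirely due to deduplication, so treat the results as rough estimates rather than measured values. The helper names are illustrative only.

```shell
# Figures copied from the Linux kernel benchmark table.
loc=28000000              # approximate (~28M LOC)
index_seconds=108         # 1m48s
raw_edges=22243307
resolved_edges=18292255

# Average full-tree indexing throughput in LOC per second.
throughput() {
  awk -v loc="$1" -v s="$2" 'BEGIN { printf "%.0f\n", loc / s }'
}

# Percentage of raw edges that did not survive deduplication.
dedup_share() {
  awk -v raw="$1" -v res="$2" 'BEGIN { printf "%.1f\n", (raw - res) * 100 / raw }'
}

echo "throughput: $(throughput "$loc" "$index_seconds") LOC/s"
echo "deduplicated: $(dedup_share "$raw_edges" "$resolved_edges")% of raw edges"
```

With the table's numbers this comes out at roughly 259K LOC/s averaged over the whole tree, with about 17.8% of raw edges removed as duplicates.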


Cache Environment Variables

sqry uses a two-layer cache: an in-memory layer governed by an eviction policy, plus a persisted .sqry-cache/ directory. All cache settings are controlled by environment variables.

| Variable | Default | Description |
| --- | --- | --- |
| SQRY_CACHE_ROOT | .sqry-cache | Cache directory location |
| SQRY_CACHE_MAX_BYTES | 50 MB | Maximum total cache size in bytes |
| SQRY_CACHE_DISABLE_PERSIST | 0 | Set to 1 to disable disk persistence (memory-only; useful in CI/containers) |
| SQRY_CACHE_POLICY | lru | Eviction policy: lru, tiny_lfu, or hybrid |
| SQRY_CACHE_POLICY_WINDOW | 0.20 | Protected window ratio for TinyLFU and hybrid policies (range: 0.05–0.95) |
| SQRY_CACHE_DEBUG | 0 | Set to 1 to emit CacheStats{...} on stderr without modifying CLI flags |
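
As an illustration, the variables above could be combined for an ephemeral CI container roughly as follows. The chosen values are examples, and the sketch assumes SQRY_CACHE_MAX_BYTES accepts a plain integer byte count, which the table does not state explicitly.

```shell
# Memory-only cache for a throwaway CI container (example values).
export SQRY_CACHE_DISABLE_PERSIST=1        # do not write .sqry-cache/ to disk
export SQRY_CACHE_POLICY=tiny_lfu          # one of: lru, tiny_lfu, hybrid
export SQRY_CACHE_POLICY_WINDOW=0.20       # must stay within 0.05–0.95
# Assumption: the limit is given as an integer number of bytes.
export SQRY_CACHE_MAX_BYTES=$((100 * 1024 * 1024))   # 100 MiB
export SQRY_CACHE_DEBUG=1                  # print CacheStats{...} on stderr

# Run a query only when sqry is actually installed on this machine.
if command -v sqry >/dev/null 2>&1; then
  sqry query "kind:function" .
fi
```

Because everything is set through the environment, the same block can be dropped into a CI job definition without changing any sqry command lines.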

Choosing an eviction policy:


Lexer Pool Tuning

The query lexer uses thread-local buffer pooling to reduce allocations on repeated queries. The defaults work for most workloads.

| Variable | Default | Description |
| --- | --- | --- |
| SQRY_LEXER_POOL_MAX | 4 | Pool size per thread. Set to 0 to disable pooling entirely. |
| SQRY_LEXER_POOL_MAX_CAP | 256 | Buffer capacity limit in tokens before the pool shrinks a buffer |
| SQRY_LEXER_POOL_SHRINK_RATIO | 8 | Shrink ratio applied when a buffer exceeds SQRY_LEXER_POOL_MAX_CAP |

Increase SQRY_LEXER_POOL_MAX for high-concurrency server workloads. Set it to 0 only when micro-benchmarking or when sub-millisecond latency requirements make per-allocation overhead relevant.


Cache Management Commands

# Show cache statistics (hit rates, size, entry counts)
sqry cache stats

# Show cache statistics as JSON (for monitoring/automation)
sqry cache stats --json

# Remove entries older than 30 days
sqry cache prune --days 30

# Cap cache to 1 GB, removing oldest entries first
sqry cache prune --size 1GB

# Preview what would be removed without deleting
sqry cache prune --days 7 --dry-run

# Clear all cached entries (requires --confirm to prevent accidental deletion)
sqry cache clear --confirm

Watch Mode

Watch mode keeps the index current with real-time file monitoring. sqry uses OS-level file system events (inotify on Linux, FSEvents on macOS, ReadDirectoryChangesW on Windows) with a debounce window to coalesce rapid saves.

# Start watching with an initial index build
sqry watch --build

# Show statistics for each incremental update
sqry watch --stats

# Set a custom debounce window in milliseconds (default: 100–400 ms, platform-dependent)
sqry watch --debounce 500

Watch mode is useful during active development: the index stays warm so sqry query returns up-to-date results without a manual sqry update step. Index update latency after a file save is typically under 1 ms for the file-detection step; the graph rebuild time depends on the number of changed files.

The debounce window (--debounce) controls how long sqry waits after the last file event before triggering a rebuild. Lower values give faster updates; higher values reduce redundant rebuilds when many files are saved in quick succession (for example, during a git checkout).
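
The effect of the debounce window can be modelled with a few lines of shell. This is a simplified sketch of trailing-edge debouncing, not sqry's implementation: `count_rebuilds` is a hypothetical helper that takes a window in milliseconds followed by ascending event timestamps, and counts how many rebuilds would fire.

```shell
# Hypothetical model of trailing-edge debouncing (not taken from sqry).
# Usage: count_rebuilds WINDOW_MS T1 T2 ...   (event times in ms, ascending)
count_rebuilds() {
  window_ms=$1
  shift
  if [ "$#" -eq 0 ]; then
    echo 0
    return
  fi
  printf '%s\n' "$@" | awk -v w="$window_ms" '
    # A quiet gap of at least w ms after the previous event fires a rebuild.
    NR > 1 && ($1 - prev) >= w { rebuilds++ }
    { prev = $1 }
    # The final burst fires once the window elapses after its last event.
    END { print rebuilds + 1 }'
}

# Five saves: one burst at 0–120 ms and another at 900–950 ms.
count_rebuilds 100 0 50 120 900 950    # prints 2 (one rebuild per burst)
count_rebuilds 1000 0 50 120 900 950   # prints 1 (both bursts coalesced)
count_rebuilds 40 0 50 120 900 950     # prints 5 (every save rebuilds)
```

The three calls show the trade-off described above: a short window reacts to every save, while a long one folds a whole checkout into a single rebuild at the cost of a longer wait.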


Index Validation

--validate and --auto-rebuild are global flags accepted on every sqry subcommand, but they only take effect when sqry search or sqry query loads an existing snapshot. Other subcommands parse the flags without consuming them: sqry index (the build path) always writes a fresh snapshot, and the analysis and graph subcommands, which load through the shared snapshot loader, resolve plugins without invoking the validation pass.

# Run a query with strict validation; abort (or auto-rebuild) when the index is stale
sqry query "kind:function" . --validate=fail --auto-rebuild

With --validate=fail --auto-rebuild, sqry walks the file registry in the loaded snapshot and counts how many files have since been deleted on disk. When the orphaned-file ratio exceeds 20%, validation marks the index as stale; --auto-rebuild then triggers a full rebuild and the query continues against the fresh snapshot. Without --auto-rebuild, strict mode aborts with a “stale index” error instead. This is the recommended setting for CI pipelines where a stale index would otherwise return results for files that no longer exist in the workspace.
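
The staleness rule can be approximated outside sqry to see how the 20% threshold behaves. The sketch below is modelled on the description above and is not sqry's code: `index_is_stale` is a hypothetical helper that reads a list of registered paths from stdin and succeeds when more than 20% of them no longer exist on disk.

```shell
# Hypothetical staleness check modelled on the description above (not from sqry).
# Reads registered file paths on stdin, one per line; exits 0 when more than
# 20% of them are missing on disk.
index_is_stale() {
  total=0
  missing=0
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    total=$((total + 1))
    [ -e "$path" ] || missing=$((missing + 1))
  done
  # Integer form of missing / total > 0.20; an empty registry is never stale.
  [ "$total" -gt 0 ] && [ $((missing * 100)) -gt $((total * 20)) ]
}

# Demo in a throwaway directory: 5 registered files, 2 deleted afterwards (40%).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.c" "$demo_dir/b.c" "$demo_dir/c.c" "$demo_dir/d.c" "$demo_dir/e.c"
ls "$demo_dir"/*.c > "$demo_dir/registry.txt"
rm "$demo_dir/d.c" "$demo_dir/e.c"

if index_is_stale < "$demo_dir/registry.txt"; then
  echo "stale: rebuild needed"
else
  echo "fresh"
fi
```

Note that the comparison is strict: exactly 20% orphaned files does not count as stale in this model, matching the "exceeds 20%" wording.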

Other validation levels:

# Warn on validation errors but continue (default)
sqry query --validate=warn "kind:function" .

# Skip validation entirely (fastest, no safety net)
sqry query --validate=off "kind:function" .

Exit codes with --validate=fail: