Open Source · Kubernetes-Native · AI-Powered

Upgrade Fearlessly.
Validate Everything.
Lose Nothing.

NebulaCB is the complete Couchbase management platform. Orchestrate rolling upgrades, validate XDCR replication integrity, monitor multi-cluster health, and get AI-powered root cause analysis — all from a mission-control cockpit dashboard.

Zero Data Loss Validation · Kubernetes-Aware Upgrades · Bidirectional XDCR · Local AI with Ollama
🚀
Rolling Upgrades
Helm-based orchestration with pause, resume, abort, and rollback. Track node-by-node progress in real time.
🔍
Data Integrity
SHA-256 hash validation, sequence gap detection, and continuous doc-count monitoring across clusters.
🤖
AI Analysis
Local Ollama integration. Ask AI about cluster health, get root cause analysis, and auto-train on your logs.
🌍
Multi-Region
Manage clusters across regions with cross-region XDCR, automatic failover, and region-aware monitoring.

Everything You Need to Manage Couchbase

From rolling upgrades to AI-powered troubleshooting, NebulaCB covers every aspect of Couchbase cluster lifecycle management.

⚙️
Upgrade Orchestrator
Automate Couchbase rolling upgrades via the Kubernetes Operator. NebulaCB patches the CouchbaseCluster CR image, monitors pod rollover, and tracks rebalance completion.
  • Helm-based rolling upgrade via CouchbaseCluster CR
  • Pre-check validation (all nodes ready)
  • Node-by-node progress tracking (10s poll)
  • Downgrade button to roll back to previous version
  • Abort to stop tracking mid-upgrade
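As a rough illustration of the CR patching step, the sketch below builds a JSON merge patch that changes the server image. It assumes the image lives at `spec.image` on the CouchbaseCluster resource, which should be checked against your operator version; `imagePatch` is a hypothetical helper, not NebulaCB's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imagePatch builds a JSON merge patch that sets spec.image.
// Assumption: the CouchbaseCluster CR stores the server image at spec.image.
func imagePatch(image string) ([]byte, error) {
	patch := map[string]any{
		"spec": map[string]any{
			"image": image,
		},
	}
	return json.Marshal(patch)
}

func main() {
	body, err := imagePatch("couchbase/server:7.6.0")
	if err != nil {
		panic(err)
	}
	// The body could be applied with a merge patch, for example:
	//   kubectl patch couchbasecluster <name> --type merge -p '<body>'
	fmt.Println(string(body))
}
```

A merge patch touches only the listed field, so the rest of the cluster spec stays as deployed.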
🔁
XDCR Replication Management
Monitor cross-datacenter replication in real time. Track replication lag, pipeline restarts, topology changes, and GOXDCR delay with a 5-minute countdown timer.
  • Bidirectional XDCR monitoring (source ↔ target)
  • Pipeline pause / resume / restart / stop controls
  • GOXDCR delay detection with countdown timer
  • Topology change tracking during upgrades
  • Integrated XDCR troubleshooting modal
Data Integrity Validation
Continuous proof that zero data loss occurs during upgrades. Compare source and target clusters with hash verification and sequence gap detection.
  • Document count timeline with delta convergence chart
  • SHA-256 hash sampling for content verification
  • Sequence gap detection for ordered streams
  • Full audit on demand (all keys compared)
  • Real-time convergence status (ZERO LOSS CONFIRMED)
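The two checks listed above can be sketched in a few lines of Go. This is a minimal illustration over in-memory data, not NebulaCB's validator: `compareByHash` and `findGaps` are hypothetical names, and a real audit would read documents and sequence numbers from both clusters.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// compareByHash reports keys missing from target and keys whose SHA-256
// digests differ between source and target. Results are sorted.
func compareByHash(source, target map[string][]byte) (missing, mismatched []string) {
	for key, srcDoc := range source {
		tgtDoc, ok := target[key]
		if !ok {
			missing = append(missing, key)
			continue
		}
		if sha256.Sum256(srcDoc) != sha256.Sum256(tgtDoc) {
			mismatched = append(mismatched, key)
		}
	}
	sort.Strings(missing)
	sort.Strings(mismatched)
	return missing, mismatched
}

// findGaps returns the sequence numbers absent from an ascending,
// duplicate-free list of observed sequence numbers.
func findGaps(seqs []uint64) []uint64 {
	var gaps []uint64
	for i := 1; i < len(seqs); i++ {
		for s := seqs[i-1] + 1; s < seqs[i]; s++ {
			gaps = append(gaps, s)
		}
	}
	return gaps
}

func main() {
	source := map[string][]byte{"a": []byte("1"), "b": []byte("2"), "c": []byte("3")}
	target := map[string][]byte{"a": []byte("1"), "b": []byte("X")}
	missing, mismatched := compareByHash(source, target)
	fmt.Println("missing:", missing, "mismatched:", mismatched)
	fmt.Println("gaps:", findGaps([]uint64{1, 2, 5, 6, 9}))
}
```

Hash sampling would run `compareByHash` on a random subset of keys, while a full audit would run it on all of them.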
Storm Load Generator
Simulate production traffic during upgrades. Generate writes, reads, and deletes with configurable rates, burst patterns, and hot-key distributions.
  • Configurable writes/reads per second and doc sizes
  • Burst mode with multiplier and interval
  • Hot key percentage for realistic access patterns
  • Standalone xdcr-loadtest script for dual-cluster writes
  • Real-time latency P50/P95/P99 tracking
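For readers unfamiliar with P50/P95/P99, the sketch below computes percentiles with the nearest-rank method. Which method the Storm generator actually uses is not stated here, so treat `percentile` as one plausible, hypothetical implementation.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0 < p <= 100) of samples using the
// nearest-rank method. It returns 0 for an empty input.
func percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)

	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	// Synthetic latencies of 1..100 ms, inserted in reverse order.
	latenciesMs := make([]float64, 0, 100)
	for i := 100; i >= 1; i-- {
		latenciesMs = append(latenciesMs, float64(i))
	}
	fmt.Println("P50:", percentile(latenciesMs, 50))
	fmt.Println("P95:", percentile(latenciesMs, 95))
	fmt.Println("P99:", percentile(latenciesMs, 99))
}
```

P99 matters most during an upgrade, because a node rollover tends to show up in tail latency before it moves the median.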
🛡️
HA & Automatic Failover
Configure automatic failover with health checks, timeout thresholds, and recovery modes. Supports manual and graceful failover between clusters.
  • Auto-failover with configurable timeout
  • Manual and graceful failover triggers
  • Failover history and event timeline
  • Cross-region failover support
  • Preserve data mode for safe recovery
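One simple way to model a failover timeout is to count consecutive failed health checks, so a 30-second timeout with 10-second checks becomes a threshold of 3. The sketch below assumes that model; `failoverTracker` is a hypothetical type and NebulaCB's real logic may differ.

```go
package main

import "fmt"

// failoverTracker counts consecutive failed health checks for one node.
type failoverTracker struct {
	threshold int // consecutive failures that trigger failover
	failed    int // current run of failed checks
}

// observe records one health-check result and reports whether the node has
// now failed enough consecutive checks to trigger failover.
func (t *failoverTracker) observe(healthy bool) bool {
	if healthy {
		t.failed = 0 // any healthy check resets the run
		return false
	}
	t.failed++
	return t.failed >= t.threshold
}

func main() {
	tracker := failoverTracker{threshold: 3}
	checks := []bool{true, false, false, true, false, false, false}
	for i, healthy := range checks {
		if tracker.observe(healthy) {
			fmt.Printf("check %d: trigger failover\n", i+1)
			return
		}
	}
	fmt.Println("no failover needed")
}
```

Resetting on a healthy check keeps a single slow response from causing a failover.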
💾
Backup & Restore
Scheduled backups with retention policies, compression, and encryption. Restore to any cluster with progress tracking.
  • Cron-based backup scheduling
  • Compression and encryption options
  • Configurable retention (days)
  • Point-in-time restore to any cluster
  • Repository size and health monitoring
📦
Data Migration
Migrate data between clusters with parallel workers, batch processing, and optional transformation rules. Validates integrity after migration.
  • Parallel worker pool (configurable)
  • Batch processing with retry logic
  • Transform rules (rename, convert, filter)
  • Post-migration validation
  • Progress tracking with ETA
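The worker-pool and retry pattern described above might look like the following sketch. All names here (`migrateBatches`, `splitBatches`, `errTransient`) are hypothetical, and the copy function is a stand-in for real cluster writes.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errTransient stands in for a retryable write error in this sketch.
var errTransient = errors.New("transient write failure")

// splitBatches cuts keys into batches of at most size keys.
func splitBatches(keys []string, size int) [][]string {
	if size < 1 {
		size = 1
	}
	var batches [][]string
	for start := 0; start < len(keys); start += size {
		end := start + size
		if end > len(keys) {
			end = len(keys)
		}
		batches = append(batches, keys[start:end])
	}
	return batches
}

// migrateBatches copies batches with a fixed pool of workers. Each batch is
// attempted up to maxAttempts times. It returns how many batches succeeded
// and how many still failed after all attempts.
func migrateBatches(batches [][]string, workers, maxAttempts int, copyBatch func([]string) error) (ok, failed int) {
	if workers < 1 {
		workers = 1
	}
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	jobs := make(chan []string)
	var mu sync.Mutex
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for batch := range jobs {
				var err error
				for attempt := 0; attempt < maxAttempts; attempt++ {
					if err = copyBatch(batch); err == nil {
						break
					}
				}
				mu.Lock()
				if err == nil {
					ok++
				} else {
					failed++
				}
				mu.Unlock()
			}
		}()
	}
	for _, batch := range batches {
		jobs <- batch
	}
	close(jobs)
	wg.Wait()
	return ok, failed
}

func main() {
	var mu sync.Mutex
	seen := map[string]bool{}
	// Hypothetical copy function: fails the first time it sees each batch.
	flaky := func(batch []string) error {
		mu.Lock()
		defer mu.Unlock()
		if !seen[batch[0]] {
			seen[batch[0]] = true
			return errTransient
		}
		return nil
	}
	keys := []string{"k1", "k2", "k3", "k4", "k5"}
	ok, failed := migrateBatches(splitBatches(keys, 2), 2, 3, flaky)
	fmt.Printf("ok=%d failed=%d\n", ok, failed)
}
```

An unbuffered job channel means batches are handed out only as workers become free, which keeps memory use flat for large migrations.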
☸️
Kubernetes Operator Integration
Deep integration with the Couchbase Autonomous Operator. Auto-discovers pods, manages NodePort exposure, and patches CRDs for upgrades.
  • Auto port-forwarding for k8s clusters
  • Direct NodePort access with kv_port config
  • CouchbaseCluster CR patching for upgrades
  • Pod discovery and health monitoring
  • Helm chart for deploying NebulaCB itself

Mission Control Dashboard

A cockpit-style interface with real-time WebSocket updates, live cluster health, XDCR flow visualization, and one-click controls.

📊
Cluster Health & Metrics
Node status, CPU/memory, ops/sec, doc counts, version, edition, rebalance state per cluster.
🔁
XDCR Replication Flow
Visual pipeline: source → target with lag, restarts, topology changes, mutation queue, GOXDCR delay.
Data Loss Proof Panel
Doc count timeline, delta convergence chart, hash sampling results, monitoring duration, zero-loss verdict.
🎮
Control Panel
16 action buttons: load, upgrade, downgrade, XDCR, audit, AI analyze, backup, failover, chaos injection.

Tab Navigation

Dashboard
Main cockpit view with all panels
🤖
Ask AI
Chat with AI about cluster issues
🔍
RCA
Root cause analysis reports
📚
Knowledge Base
12+ built-in troubleshooting guides
📊
AI Insights
History of all AI analyses

AI-Powered Analysis with Ollama

Run AI locally with Ollama — no cloud API keys needed. NebulaCB learns from your cluster logs and metrics to provide context-aware recommendations.

💬
Ask AI
Chat with NebulaCB AI about any cluster issue. It has full context of your cluster state, XDCR status, alerts, and metrics. Ask questions in natural language and get actionable answers.
> "Why is XDCR replication lag increasing?"
> "Is the cluster ready for an upgrade?"
> "Analyze performance bottlenecks"
> "What's the best backup strategy?"
🔍
Root Cause Analysis
Trigger AI-powered RCA for specific categories: XDCR issues, upgrade failures, performance problems, failover events, backup errors, or data integrity concerns. Get structured reports with evidence chains and remediation steps.
Result: severity, root cause, evidence chain,
remediation steps with risk levels + commands
📚
Knowledge Base
Built-in library of 12+ common Couchbase issues covering XDCR lag, pipeline restarts, stuck rebalances, auto-failover, backup failures, memory pressure, disk queues, and Kubernetes NodePort configuration. Searchable and filterable.
Categories: XDCR, Upgrade, Failover, Backup,
Performance, Data Integrity, Configuration
🧠
Ollama Integration
Run AI 100% locally with Ollama. No data leaves your network. Supports llama3, llama4, and any Ollama-compatible model. Also supports Anthropic Claude and OpenAI as cloud providers.
config.json:
"ai": {
  "enabled": true,
  "provider": "ollama",
  "model": "llama3",
  "api_endpoint": "http://127.0.0.1:11434"
}
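For anyone scripting against this file, the "ai" block above maps naturally onto a Go struct. The sketch below uses only the four keys shown; the type names, `parseAIConfig`, and the provider check are assumptions for illustration, not NebulaCB's internal types.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// AIConfig mirrors the "ai" block of config.json (illustrative type name).
type AIConfig struct {
	Enabled     bool   `json:"enabled"`
	Provider    string `json:"provider"`
	Model       string `json:"model"`
	APIEndpoint string `json:"api_endpoint"`
}

// config holds only the part of config.json this sketch cares about.
type config struct {
	AI AIConfig `json:"ai"`
}

// parseAIConfig extracts the "ai" block from a full config.json document.
// The provider check is an assumed validation rule, added for illustration.
func parseAIConfig(raw []byte) (AIConfig, error) {
	var cfg config
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return AIConfig{}, fmt.Errorf("parse config: %w", err)
	}
	if cfg.AI.Enabled && cfg.AI.Provider == "" {
		return AIConfig{}, errors.New("ai.provider is required when ai.enabled is true")
	}
	return cfg.AI, nil
}

func main() {
	raw := []byte(`{
	  "ai": {
	    "enabled": true,
	    "provider": "ollama",
	    "model": "llama3",
	    "api_endpoint": "http://127.0.0.1:11434"
	  }
	}`)
	ai, err := parseAIConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s model %s at %s\n", ai.Provider, ai.Model, ai.APIEndpoint)
}
```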

How It Works

From setup to production-grade upgrade validation in four steps.

1. Deploy Clusters
Set up source and target Couchbase clusters via Docker Compose, k3s with the Couchbase Operator, or existing infrastructure. Configure XDCR replication between them.
2. Configure NebulaCB
Edit config.json with cluster addresses, credentials, and NodePort KV ports. Enable AI with Ollama. Start the server with make run.
3. Generate Load & Upgrade
Start the Storm generator or xdcr-loadtest script to simulate production traffic. Trigger a rolling upgrade from the dashboard. Monitor XDCR and data integrity in real time.
4. Validate & Report
Run a full data audit after upgrade. Use AI to analyze any issues. Generate a comprehensive report with upgrade timeline, XDCR gap analysis, and zero-loss proof.

Installation

Three ways to get started with NebulaCB.

Build from Source (Go 1.24+)

# Clone and build
git clone https://github.com/bwalia/nebulacb.git
cd nebulacb
make build

# Configure your clusters
vim config.json

# Start the server
make run

# Open dashboard
open http://localhost:8899
# Login: admin / nebulacb

Run the XDCR Load Test

# Send random writes to both clusters during upgrades
go run ./cmd/xdcr-loadtest/ -rate 100 -duration 30m

# Custom: 70% to source, 30% to target, larger docs
go run ./cmd/xdcr-loadtest/ -rate 500 -ratio 0.7 -doc-min 1024 -doc-max 8192
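To make the split concrete, the sketch below shows one way a ratio like 0.7 could route each write, assuming the value is the fraction of writes sent to the source cluster (as the "70% to source" example suggests). `pickCluster` is a hypothetical helper, not the script's actual code.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickCluster returns "source" with probability ratio and "target" otherwise.
// next must return a value in [0, 1), such as rand.Float64.
func pickCluster(next func() float64, ratio float64) string {
	if next() < ratio {
		return "source"
	}
	return "target"
}

func main() {
	const ratio = 0.7 // assumed meaning: fraction of writes sent to the source
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pickCluster(rand.Float64, ratio)]++
	}
	fmt.Printf("source=%d target=%d\n", counts["source"], counts["target"])
}
```

Writing to both sides at once is what exercises bidirectional XDCR, since each cluster has to replicate mutations that originated on the other.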

Enable AI with Ollama

# Install Ollama (macOS)
brew install ollama

# Pull a model
ollama pull llama3

# Update config.json
"ai": {
  "enabled": true,
  "provider": "ollama",
  "model": "llama3",
  "api_endpoint": "http://127.0.0.1:11434"
}

Docker Compose (includes two Couchbase clusters)

# Start everything: NebulaCB + Couchbase 7.2.2 + Couchbase 7.6.0
docker-compose up -d

# Open dashboard at http://localhost:8080
# Source cluster: http://localhost:8091
# Target cluster: http://localhost:9091

# Tear down
docker-compose down -v

After starting, initialize both Couchbase clusters, create the test bucket, and set up XDCR replication. See the README for step-by-step instructions.

Deploy to Kubernetes with Helm

# Install NebulaCB
helm install nebulacb deploy/helm/nebulacb \
  -n nebulacb --create-namespace

# Upgrade
helm upgrade nebulacb deploy/helm/nebulacb -n nebulacb

# Uninstall
helm uninstall nebulacb -n nebulacb

Expose Couchbase Clusters via NodePort

# Patch Couchbase Operator for external access
kubectl patch couchbasecluster cb-local -n couchbase --type merge \
  -p '{"spec":{"networking":{"exposedFeatures":["client","admin"]}}}'

# Set kv_port in config.json to skip port-forwarding
"source": {
  "host": "192.168.1.193:32451",
  "kv_port": 32419,
  ...
}

Architecture

NebulaCB runs on your machine and connects to Couchbase clusters via REST API and the gocb SDK.

                             React Dashboard (:8899)
                                    |
                            WebSocket + REST API
                                    |
                          NebulaCB Go Server
                    /    |    |    |    |    |    \
              Storm  XDCR  Validator  Orchestrator  Monitor  AI   Failover
                |      |      |           |           |      |      |
             ClientPool (gocb SDK + REST + NodePort connections)
              /                                                   \
   Couchbase Source                                      Couchbase Target
   (k8s / docker / native)                              (k8s / docker / native)
              \_____________________ XDCR _____________________/
                              (bidirectional)

   Local AI: Ollama (llama3) at 127.0.0.1:11434
   Metrics: Prometheus endpoint at :9090/metrics

CLI Commands

Command                       Description
nebulacb-cli status           Full dashboard status (clusters, upgrade, XDCR, load, integrity, alerts)
nebulacb-cli start-load       Start the Storm load generator
nebulacb-cli stop-load        Stop load generation
nebulacb-cli start-upgrade    Trigger rolling upgrade
nebulacb-cli abort-upgrade    Stop tracking the upgrade
nebulacb-cli restart-xdcr     Restart XDCR pipeline
nebulacb-cli run-audit        Run full data integrity audit
nebulacb-cli report           Generate post-upgrade report

Project Structure

nebulacb/
  cmd/
    nebulacb/              # Main server
    cli/                   # CLI client
    xdcr-loadtest/         # Dual-cluster load test
  internal/
    ai/                    # AI analyzer (Ollama/Claude/OpenAI)
    api/                   # HTTP + WebSocket server
    storm/                 # Load generator
    xdcr/                  # XDCR replication engine
    validator/             # Data integrity validation
    orchestrator/          # Upgrade + downgrade
    failover/              # HA & failover
    backup/                # Backup & restore
    migration/             # Data migration
    monitor/               # Multi-cluster polling
    region/                # Multi-region management
  pkg/
    couchbase/             # SDK client + connection pool
    kubernetes/            # K8s client + port-forward
  web/nebulacb-ui/         # React dashboard
  deploy/helm/nebulacb/    # Helm chart

Ready to Upgrade Fearlessly?

Start managing your Couchbase clusters with confidence, and validate zero data loss on every upgrade.
