The Challenge: Account Model vs UTXO Model
If you've ever designed a custodian solution for Ethereum, you might have found it surprisingly straightforward. Creating wallets from mnemonics, managing deposits and withdrawals, and even implementing multisig solutions using Safe (onchain) or MPC/TSS (offchain) are relatively simple tasks. This ease stems from Ethereum's account model, which treats balances as fungible assets in individual accounts.
Bitcoin, however, operates fundamentally differently due to its UTXO (Unspent Transaction Output) architecture. This creates unique challenges for custodians managing Bitcoin at scale.
Understanding the UTXO Model
In Bitcoin's UTXO model:
- Every deposit requires a unique address (locking script)
- Money behaves like physical currency notes rather than a simple account balance
- UTXOs are atomic units - when you spend one, you must spend it entirely
- Partial spending requires change - similar to receiving change when paying with cash
While Bitcoin is certainly fungible, the mechanics of how it works differ significantly from Ethereum. Think of UTXOs as individual checks: if you want to spend a check, you must spend it in full. To spend a partial amount, you write a new check back to yourself for the change.
This creates a unique challenge for custodians: efficiently managing potentially millions of "checks" of varying amounts.
The Scale Problem: Managing Millions of UTXOs
Imagine managing millions of UTXOs across thousands of user deposits, processing thousands of withdrawals daily. When a user requests a withdrawal of 0.1 BTC, how do you efficiently select which UTXOs to spend?
This isn't just a theoretical problem—it's the core challenge for exchanges, bridges, and payment processors handling Bitcoin at scale.
Why Standard Algorithms Fail at Scale
Approach 1: Branch and Bound (Bitcoin Core's Method)
Bitcoin Core uses the Branch and Bound algorithm for optimal UTXO selection:
def branch_and_bound(utxos, target_amount, fee_rate):
# Recursively search for exact match combinations
# Time complexity: O(2^n) with pruning
# Works well for wallet-sized UTXO sets (hundreds to thousands)
pass
Why this fails at scale:
- Bitcoin Core wallets: 100-1,000 UTXOs
- Custodian platform: 10,000,000+ UTXOs
- At millions of UTXOs: O(2^n) becomes computationally impossible
- Even with aggressive pruning and timeouts, selection would take seconds or minutes
- Would create a denial-of-service vulnerability under load
The fundamental insight: We need O(log n) or O(1) selection, not O(2^n).
Approach 2: The Greedy Algorithm
The simplest approach is to find the largest UTXO and use it:
withdrawal_amount = 0.1 BTC
selected_utxo = max(utxos) // e.g., 0.5 BTC
change = 0.4 BTC
Problems:
- Creates severe UTXO fragmentation over time
- Eventually, you'll have many small UTXOs
- Large withdrawals become difficult as UTXOs get smaller
Approach 3: Binary Search on Sorted UTXOs
Sort UTXOs and use binary search to find an optimal match:
sorted_utxos = sort(utxos, ascending)
selected_utxo = binary_search(sorted_utxos, withdrawal_amount)
Problems:
- Works reasonably well for small withdrawals
- Fails when the requested amount exceeds the largest UTXO
- Example: User requests 5 BTC, but largest UTXO is 2 BTC
Approach 4: Consolidation (Combining Smallest UTXOs)
Combine multiple small UTXOs to fulfill the withdrawal:
selected_utxos = []
total = 0
for utxo in sorted(utxos, ascending):
selected_utxos.append(utxo)
total += utxo.amount
if total >= withdrawal_amount:
break
Problems:
- More inputs = larger transaction size = higher fees
- Can become prohibitively expensive
- Doesn't scale practically for regular operations
The Solution: bucket-Based Cascading Selection
None of the approaches above work efficiently at scale. The solution combines the cascading greedy approach with strategic consolidation, mimicking how physical currency denominations work.
How Physical Currency Works
When paying $50 in cash, you might use:
- 5×$10 notes
- 1×$50 note
- 1×$100 note and receive $50 change
The choice depends on what denominations you have available. We can apply this same principle programmatically to Bitcoin UTXOs.
Implementation: Gaussian Distribution Buckets
Step 1: Define UTXO Tiers
TIER_HIERARCHY = [
{"name": "dust", "min": 0.0001, "max": 0.001}, # 10k-100k sats
{"name": "tiny", "min": 0.001, "max": 0.005}, # 100k-500k sats
{"name": "small", "min": 0.005, "max": 0.01}, # 500k-1M sats
{"name": "medium", "min": 0.01, "max": 0.02}, # 1M-2M sats
{"name": "large", "min": 0.02, "max": 0.05}, # 2M-5M sats
{"name": "xlarge", "min": 0.05, "max": 0.2}, # 5M-20M sats
{"name": "whale", "min": 0.2, "max": None}, # 20M+ sats
]
Tier ranges are not arbitrary - they're derived from:
- Historical withdrawal distribution analysis
- Fee optimization modeling
- Consolidation opportunity windows
- Minimum relay fee and dust thresholds
Why this works at scale:
- Reduces search space from millions to constant T tiers
- Each tier independently manageable
- Enables efficient database indexing
- Natural distribution boundaries
Step 2: Tier Cascading Strategy
Here's the critical difference from naive approaches—we don't just pick a tier, we **cascade through tiers intelligently
def find_utxos_for_withdrawal(amount):
"""
Tier cascading: Start at preferred tier, cascade UP to larger tiers
Key insight: Cascading UP prevents fragmentation
"""
# Determine which tier this amount naturally belongs to
preferred_tier = determine_tier(amount)
# Start from preferred tier and go UP through larger tiers
tier_index = TIER_HIERARCHY.index(preferred_tier)
for i in range(tier_index, len(TIER_HIERARCHY)):
current_tier = TIER_HIERARCHY[i]
# Check if tier has sufficient total balance
tier_balance = get_tier_balance(current_tier)
if tier_balance >= amount:
# Try to select UTXOs from this tier
utxos = select_utxos_in_tier(current_tier, amount)
if utxos:
return utxos
# If selection fails, continue to next tier
raise InsufficientFundsError("No tier has sufficient balance")
Why cascade UP, not DOWN:
Example: 0.03 BTC withdrawal
Cascading DOWN (naive approach):
medium (0.01-0.05) → small (0.005-0.01) → tiny → dust
Result: Many small UTXOs = fragmentation + high fees
Cascading UP (our approach):
medium (0.01-0.05) → large (0.05-0.1) → xlarge
Result: Fewer larger UTXOs = consolidation + low fees
Step 3: UTXO Selection Within Tier
Once we've selected a tier, we need to pick specific UTXOs. This is where the knapsack problem comes in:
def select_utxos_in_tier(tier, amount, max_inputs=5):
"""
Within a tier, select the smallest set of UTXOs that covers the amount.
This is a bounded knapsack approximation.
"""
# Get available UTXOs in tier, sorted by amount
available_utxos = get_available_utxos(tier, sort_by="amount_asc", limit_by=max_inputs)
# Greedy knapsack: Select smallest UTXOs first
selected = []
total = 0
for utxo in available_utxos:
selected.append(utxo)
total += utxo.amount
if total >= amount:
# Atomically reserve these UTXOs in database
reserve_utxos(selected)
return selected
# Tier doesn't have enough UTXOs
return None
Why smallest-first selection?
- Minimizes change output size
- Naturally consolidates small UTXOs
- Cleans up tier fragmentation over time
Note on implementation: In production, this is implemented entirely within PostgreSQL using:
-
SELECT FOR UPDATE SKIP LOCKEDfor concurrency-safe selection - Atomic transactions for reservation
- Database-level constraints for consistency
- Sub-millisecond selection performance with proper indexing
Step 4: Fee-Aware Tier Selection
At scale, transaction fees become a significant cost center. The tier selection should factor in current fee rates:
def find_utxos_with_fee_awareness(amount, fee_rate_sats_per_vb):
"""
Adjust tier selection based on current fee market conditions
"""
preferred_tier = determine_tier(amount)
tier_index = TIER_HIERARCHY.index(preferred_tier)
# During high-fee periods, start from a higher tier
# This reduces input count = lower fees
if fee_rate_sats_per_vb > 50: # High fee period
# Skip to next larger tier to minimize inputs
tier_index = min(tier_index + 1, len(TIER_HIERARCHY) - 1)
# Rest of cascading logic...
Real-world impact:
Scenario: 0.03 BTC withdrawal during 100 sat/vB fee spike
Option A: 3 UTXOs from medium tier
Fee: 3 inputs × 68 vB × 100 sat/vB = 20,400 sats ($20.40)
Option B: 1 UTXO from large tier (fee-aware selection)
Fee: 1 input × 68 vB × 100 sat/vB = 6,800 sats ($6.80)
Savings: $13.60 per withdrawal during peak hours
Why This Approach Works at Scale
Computational Complexity
| Approach | Time Complexity | Scale Limit |
|---|---|---|
| Branch and Bound | O(2^n) | < 1,000 UTXOs |
| Binary Search | O(log n) | Fails on large amounts |
| Tier Cascading | O(tiers) + O(k*log(k)) | Millions of UTXOs |
Where:
- n = total UTXOs
- tiers = number of tiers (constant: 7)
- k = UTXOs selected per transaction (typically 2-3)
Our approach: O(7) + O(3*log(3)) = O(1) effectively
Natural Consolidation Mechanics
The smallest-first selection within tiers creates automatic consolidation:
Example:
Medium tier has 1000 UTXOs
Withdrawal: 0.03 BTC
Selection: Uses 3 smallest UTXOs (0.01, 0.011, 0.012)
Result: 1000 → 997 UTXOs, naturally cleans up small values
This is self-healing - no manual consolidation jobs required during normal operations. Tiers gets replenished by new deposits.
Gaussian Distribution Alignment
Real-world withdrawal patterns follow a log-normal distribution (similar to Gaussian):
Distribution of withdrawals (from production data):
- 10th percentile: 0.002 BTC (tiny tier)
- 50th percentile: 0.02 BTC (medium tier) ← most common
- 90th percentile: 0.15 BTC (xlarge tier)
- 99th percentile: 0.5 BTC (whale tier)
By sizing tiers around this distribution, we optimize for the common case while handling edge cases efficiently.
Operational Considerations
Tier Health Monitoring
You need visibility into tier distribution to maintain system health:
def monitor_tier_health():
"""
Track tier metrics for operational alerting
"""
for tier in TIER_HIERARCHY:
metrics = {
"utxo_count": count_utxos(tier),
"total_balance": sum_balance(tier),
"avg_utxo_size": average_size(tier),
"daily_utilization": withdrawals_from_tier(tier, days=1)
}
# Alert on depletion risk
if metrics["utxo_count"] < 10:
alert(f"Tier {tier['name']} depleting: {metrics['utxo_count']} UTXOs left")
# Alert on imbalance
runway_days = metrics["total_balance"] / metrics["daily_utilization"]
if runway_days < 2:
alert(f"Tier {tier['name']} has <2 days runway")
Strategic Rebalancing
While the system is self-healing under normal operation, you need proactive rebalancing during quiet periods:
def rebalance_tiers_during_low_fees():
"""
Run during low-fee periods (typically 2-6 AM UTC, weekends)
Consolidate dust, split whales, rebalance distribution
"""
current_fee_rate = get_current_fee_rate()
# Only rebalance when fees are low
if current_fee_rate > 10: # sats/vB
return
# Strategy 1: Consolidate dust and tiny tiers
dust_utxos = get_tier_utxos("dust")
if len(dust_utxos) > 100:
consolidate_into_tier(dust_utxos, target_tier="small")
# Strategy 2: Ensure minimum UTXO count per tier
for tier in TIER_HIERARCHY:
if count_utxos(tier) < 20:
create_utxos_for_tier(tier, target_count=50, source_tier="whale")
When to rebalance:
- Daily during off-peak hours
- After major withdrawal spikes
- When tier depletion alerts trigger
- During extended low-fee periods (accumulate savings)
Database Implementation Notes
The algorithm is implemented entirely in PostgreSQL for:
-
Atomicity:
BEGIN/COMMIT/ROLLBACKblocks ensure consistency -
Concurrency:
SELECT FOR UPDATE SKIP LOCKEDhandles concurrent withdrawals -
Performance: Indexed queries on
(tier, state, amount)enable sub-millisecond selection - Reliability: ACID guarantees prevent double-spending
This enables the database to instantly find the smallest available UTXOs in any tier.
Real-World Performance
In production environments processing thousands of withdrawals daily:
- Average inputs per transaction: 2.1 UTXOs (down from 4.5 with naive approaches)
- Fee reduction: 55% vs. simple consolidation approach
- UTXO selection latency: <5ms (database-level selection)
- Tier cascade events: <2% of withdrawals (most succeed on preferred tier)
- Natural consolidation rate: thousands of UTXOs/day consolidated without manual intervention
- System uptime during rebalancing: 100% (non-blocking background process)
Cost Impact at Scale:
Platform: 10,000 withdrawals/day
Average fee savings: $2.50/withdrawal
Monthly savings: $750,000
Annual savings: $9,000,000
Limitations and Trade-offs
What This Approach Sacrifices
1. Theoretical Optimality
- Branch and Bound finds the perfect UTXO combination
- Tier cascading finds a good enough combination
- Trade-off: ~5% more change outputs for 10,000x faster selection
2. Privacy Considerations
- Selecting multiple UTXOs from same tier can link addresses
- Common input ownership heuristic applies
- Mitigation: Use mixing services or CoinJoin for privacy-critical applications
3. Minimum Withdrawal Threshold
- Dust tier prevents withdrawals < 50,000 sats (~$50)
- Necessary to avoid economically unspendable UTXOs
- Alternative: Lightning Network for micro-withdrawals
When This Approach May Not Fit
Not suitable for:
- Personal wallets (use Bitcoin Core's Branch and Bound)
- Privacy-focused applications (use Wasabi/Samourai algorithms)
- < 10,000 UTXOs (simpler approaches work fine)
Ideal for:
- Exchanges and payment processors
- Cross-chain bridges
- Custodial services at scale
- Any platform managing 1M+ UTXOs
Conclusion
Building a Bitcoin custodian at scale requires fundamentally different thinking than Ethereum-based solutions. The UTXO model, while more complex, can be efficiently managed using tier-based cascading strategies.
Key takeaways:
- Standard algorithms don't scale - Branch and Bound is O(2^n), impossible at millions of UTXOs
- Tier cascading UP prevents fragmentation while maintaining availability
- Fee-aware selection can save millions annually at scale
- Natural consolidation keeps the system healthy through smallest-first selection
- Database-level implementation provides atomicity and concurrency control
- Operational monitoring is critical - track tier health and rebalance proactively
By treating Bitcoin UTXOs like physical currency denominations and selecting them through intelligent tier traversal, custodians can process millions of transactions efficiently while minimizing fees and maintaining system health.
The real innovation isn't just the algorithm—it's designing a system that scales to millions of UTXOs while remaining operationally manageable.
Top comments (0)