snapshot
101
CONSUL_PERSISTENCE.md
Normal file
@@ -0,0 +1,101 @@
# Consul Persistence for Connection Monitor

This document describes the Consul-based state persistence implementation for the connection monitoring script.

## Overview

The connection monitor now supports state persistence using Consul's KV store. This allows the script to resume from its previous state if restarted, maintaining continuity of remediation processes and connection state tracking.

## Configuration

### Consul Server
- **URL**: `http://consul.service.dc1.consul:8500` (configurable via constructor parameter)
- **Authentication**: None required (no ACL tokens)
- **Key Structure**: All state is stored under `qbitcheck/connection_monitor/`
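
As a minimal sketch of how the URL and key prefix above might be turned into client settings: the helper names `consul_client_kwargs` and `full_key` are hypothetical and not taken from the script. The returned dictionary matches the `host`, `port`, and `scheme` keyword arguments that `consul.Consul(...)` from `python-consul` accepts.

```python
from urllib.parse import urlparse

# Prefix documented above; every persisted key lives under it.
KEY_PREFIX = "qbitcheck/connection_monitor/"


def consul_client_kwargs(consul_url="http://consul.service.dc1.consul:8500"):
    """Split a Consul URL into host/port/scheme keyword arguments (hypothetical helper)."""
    parsed = urlparse(consul_url)
    return {
        "host": parsed.hostname,
        "port": parsed.port or 8500,  # 8500 is Consul's default HTTP port
        "scheme": parsed.scheme or "http",
    }


def full_key(relative_key):
    """Build the full KV key for a path relative to the monitor's prefix."""
    return KEY_PREFIX + relative_key.lstrip("/")


kwargs = consul_client_kwargs()
example_key = full_key("state/connection_state")
```

With `python-consul` installed, the result could be passed on as `consul.Consul(**kwargs)`. The URL is assumed to include a scheme, as in the documented default.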

### State Data Persisted

The following state variables are persisted to Consul:

#### Connection State (`state/`)
- `connection_state`: Current connection state ('stable' or 'unstable')
- `last_state_change_time`: Timestamp of last state transition
- `consecutive_failures`: Count of consecutive connection failures
- `consecutive_stable_checks`: Count of consecutive stable checks

#### Remediation State (`remediation/`)
- `state`: Current remediation phase (None, 'stopping_torrents', 'restarting_nomad', 'waiting_for_stability', 'restarting_torrents')
- `start_time`: When remediation process started
- `stabilization_checks`: Count of stabilization checks during remediation

#### Stability Tracking (`stability/`)
- `start_time`: When stability timer started (for 1-hour requirement)
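
The layout above can be pictured as a flat set of KV entries. The sketch below assumes one key per variable with JSON-encoded values and Unix-epoch timestamps; the script may instead store one JSON document per group, and `state_to_kv_pairs` is a hypothetical name used only for illustration.

```python
import json

KEY_PREFIX = "qbitcheck/connection_monitor/"


def state_to_kv_pairs(snapshot):
    """Flatten {group: {name: value}} into {full_key: json_string} (assumed layout)."""
    pairs = {}
    for group, variables in snapshot.items():
        for name, value in variables.items():
            pairs[f"{KEY_PREFIX}{group}/{name}"] = json.dumps(value)
    return pairs


# Example snapshot taken mid-remediation; all values are made up for illustration.
snapshot = {
    "state": {
        "connection_state": "unstable",
        "last_state_change_time": 1700000000.0,
        "consecutive_failures": 3,
        "consecutive_stable_checks": 0,
    },
    "remediation": {
        "state": "stopping_torrents",
        "start_time": 1700000005.0,
        "stabilization_checks": 0,
    },
    "stability": {"start_time": None},
}

pairs = state_to_kv_pairs(snapshot)
```

JSON encoding keeps `None` distinguishable from an empty string (`null` versus `""`), which matters for the remediation `state` field.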

## Implementation Details

### State Persistence Points

State is automatically saved to Consul at these critical points:

1. **Connection State Transitions**:
   - When transitioning from stable to unstable (`on_stable_to_unstable`)
   - When transitioning from unstable to stable (`on_unstable_to_stable`)

2. **Remediation Process**:
   - When remediation starts (`start_remediation`)
   - After each remediation state transition:
     - Stopping torrents → Restarting Nomad
     - Restarting Nomad → Waiting for stability
     - Waiting for stability → Restarting torrents
   - When remediation completes successfully
   - When remediation fails or times out
   - On unexpected errors during remediation

3. **Stability Tracking**:
   - When 1-hour stability requirement is met
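
The remediation sequence above can be sketched as a small state machine that persists after every transition. This is an illustration, not the script's code: `next_phase`, `advance`, and the `save` callback are hypothetical, and the dictionary key `remediation_state` is assumed.

```python
# Phase order as listed above; None means no remediation is in progress.
REMEDIATION_PHASES = [
    "stopping_torrents",
    "restarting_nomad",
    "waiting_for_stability",
    "restarting_torrents",
]


def next_phase(current):
    """Return the phase that follows current, or None once the last phase is done."""
    index = REMEDIATION_PHASES.index(current)
    if index + 1 < len(REMEDIATION_PHASES):
        return REMEDIATION_PHASES[index + 1]
    return None


def advance(state, save):
    """Move to the next phase and persist immediately via the save callback."""
    state["remediation_state"] = next_phase(state["remediation_state"])
    save(state)
    return state


# Usage: record each saved snapshot in a list instead of writing to Consul.
saved_snapshots = []
state = {"remediation_state": "stopping_torrents"}
while state["remediation_state"] is not None:
    advance(state, lambda s: saved_snapshots.append(dict(s)))
```

Saving inside `advance` means a restart can pick up at the phase that was last written, rather than restarting the whole remediation.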

### Error Handling

- If Consul is unavailable, the script keeps monitoring and runs without persistence (graceful degradation)
- Consul connection errors are logged but don't interrupt monitoring
- If stored state cannot be loaded, the monitor starts from default values
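
One way to get this behaviour is to wrap every KV call so that failures are logged and swallowed. The sketch below assumes a `python-consul` style KV client, where `put` returns a boolean and `get` returns an `(index, entry)` tuple whose entry holds the raw bytes under `"Value"`. The function names, the single-key layout, and the values in `DEFAULT_STATE` are assumptions.

```python
import json
import logging

logger = logging.getLogger("connection_monitor")

# Assumed defaults; the script's real initial values are not documented here.
DEFAULT_STATE = {
    "connection_state": "stable",
    "consecutive_failures": 0,
    "consecutive_stable_checks": 0,
}


def save_state(kv, key, state):
    """Write state as JSON; log and return False if Consul cannot be reached."""
    try:
        return bool(kv.put(key, json.dumps(state)))
    except Exception as exc:  # connection errors, timeouts, server errors
        logger.warning("Consul save failed; continuing without persistence: %s", exc)
        return False


def load_state(kv, key):
    """Read state from Consul; fall back to a copy of DEFAULT_STATE on any failure."""
    try:
        _index, entry = kv.get(key)
        if entry is None or entry.get("Value") is None:
            return dict(DEFAULT_STATE)
        return json.loads(entry["Value"])
    except Exception as exc:
        logger.warning("Consul load failed; using default state: %s", exc)
        return dict(DEFAULT_STATE)


class UnreachableKV:
    """Stand-in for a KV client whose server is down (illustration only)."""

    def put(self, key, value):
        raise ConnectionError("consul unreachable")

    def get(self, key):
        raise ConnectionError("consul unreachable")


kv = UnreachableKV()
saved = save_state(kv, "qbitcheck/connection_monitor/state", {"connection_state": "unstable"})
loaded = load_state(kv, "qbitcheck/connection_monitor/state")
```

Both calls return normally even though the client raises, so the monitoring loop that calls them is never interrupted.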

## Usage

### Basic Usage
```python
monitor = ConnectionMonitor(
    qbittorrent_url='http://sp.service.dc1.consul:8080',
    nomad_url='http://192.168.4.36:4646',
    tracker_name='your_tracker_name',
    consul_url='http://consul.service.dc1.consul:8500'  # Optional; this is the default
)
```

### Without Consul
If the `python-consul` package is not installed, state persistence is automatically disabled with a warning message.
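
A common way to implement this kind of optional dependency is a guarded import. The sketch below is an assumption about how the check could look, not the script's actual code; `make_kv_client` is a hypothetical helper, while `consul.Consul(host=..., port=...)` and its `.kv` attribute come from `python-consul`.

```python
import logging

logger = logging.getLogger("connection_monitor")

try:
    import consul  # provided by the python-consul package
except ImportError:
    consul = None  # persistence will be disabled


def make_kv_client(host="consul.service.dc1.consul", port=8500):
    """Return a Consul KV client, or None when python-consul is not installed."""
    if consul is None:
        logger.warning("python-consul is not installed; state persistence is disabled")
        return None
    # Creating the client does not contact the server; errors surface on first use.
    return consul.Consul(host=host, port=port).kv


kv_client = make_kv_client()
persistence_enabled = kv_client is not None
```

Callers can then check `persistence_enabled` once and skip all save and load calls when it is false.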

## Dependencies

Add to requirements.txt:
```
python-consul>=1.1.0
```

Install with:
```bash
pip install -r requirements.txt
```

## Benefits

1. **State Continuity**: Script can be restarted without losing track of ongoing remediation processes
2. **Crash Recovery**: Persisted state survives process restarts and system reboots
3. **Monitoring**: External systems can monitor the state via Consul
4. **Debugging**: The last saved state can be inspected in Consul for troubleshooting (the KV store keeps only the most recent value of each key)

## Limitations

- Persistence requires a reachable Consul server; without one, state is kept in memory only
- State is saved only at the transition points listed above, so the stored copy can lag behind the in-memory state between saves
- No built-in state expiration or cleanup (manual Consul management required)
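
Since cleanup is manual, a small maintenance helper could cover the last point. The sketch below is one possible approach, not part of the script: `is_stale` and `clear_persisted_state` are hypothetical names, timestamps are assumed to be Unix-epoch seconds, and `kv` is assumed to be a `python-consul` style KV client whose `delete` accepts `recurse=True`.

```python
import time

KEY_PREFIX = "qbitcheck/connection_monitor/"


def is_stale(saved_at, max_age_seconds, now=None):
    """Return True if a saved Unix timestamp is missing or older than max_age_seconds."""
    if saved_at is None:
        return True
    now = time.time() if now is None else now
    return (now - saved_at) > max_age_seconds


def clear_persisted_state(kv, prefix=KEY_PREFIX):
    """Delete every key under prefix using a recursive KV delete."""
    return bool(kv.delete(prefix, recurse=True))
```

A startup routine might call `is_stale` on `last_state_change_time` and discard the stored state when it is older than some chosen limit.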