chore(conductor): Archive track 'fix_litefs_config'

2026-02-08 17:50:13 -08:00
parent 01fa06e7dc
commit 5a557145ac
4 changed files with 73 additions and 0 deletions


@@ -0,0 +1,5 @@
# Track fix_litefs_config_20260208 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)


@@ -0,0 +1,8 @@
{
"track_id": "fix_litefs_config_20260208",
"type": "feature",
"status": "new",
"created_at": "2026-02-08T18:00:00Z",
"updated_at": "2026-02-08T18:00:00Z",
"description": "Fix LiteFS configuration to use 'exec' for Navidrome and ensure it only runs on the Primary node. Also fix DB path configuration."
}


@@ -0,0 +1,22 @@
# Plan: Fix LiteFS Configuration and Process Management (`fix_litefs_config`)
## Phase 1: Configuration and Image Structure [ ]
- [x] Task: Update `litefs.yml` to include the `exec` block (396dfeb)
- [x] Task: Update `Dockerfile` to use LiteFS as the supervisor (`ENTRYPOINT ["litefs", "mount"]`) (ef91b8e)
- [x] Task: Update `navidrome-litefs-v2.nomad` with corrected storage paths (`ND_DATAFOLDER`, `ND_CACHEFOLDER`, `ND_BACKUP_PATH`) (5cbb657)
- [ ] Task: Conductor - User Manual Verification 'Phase 1: Configuration and Image Structure' (Protocol in workflow.md)
## Phase 2: Entrypoint and Registration Logic [x] [checkpoint: 9cd5455]
- [x] Task: Refactor `entrypoint.sh` to handle leadership-aware process management (9cd5455)
  - [x] Integrate Consul registration logic (from `register.sh`)
  - [x] Implement loop to start/stop Navidrome based on `/data/.primary` existence
  - [x] Ensure proper signal handling for Navidrome shutdown
- [x] Task: Clean up redundant scripts (e.g., `register.sh` if fully integrated) (9cd5455)
- [ ] Task: Conductor - User Manual Verification 'Phase 2: Entrypoint and Registration Logic' (Protocol in workflow.md)
## Phase 3: Deployment and Failover Verification [ ]
- [~] Task: Build and push the updated Docker image via Gitea Actions (if possible) or manual trigger
- [~] Task: Deploy the updated Nomad job
- [ ] Task: Verify cluster health and process distribution using `cluster_status` script
- [ ] Task: Perform a manual failover (stop primary allocation) and verify Navidrome migrates correctly
- [ ] Task: Conductor - User Manual Verification 'Phase 3: Deployment and Failover Verification' (Protocol in workflow.md)


@@ -0,0 +1,38 @@
# Specification: Fix LiteFS Configuration and Process Management (`fix_litefs_config`)
## Overview
Reconfigure Navidrome/LiteFS process management so that Navidrome runs, and registers its Consul service, only on the Primary node. This is achieved by using the LiteFS `exec` block and updating the `entrypoint.sh` logic. Additionally, correct the Navidrome database and storage paths so that they use the LiteFS replicated mount.
## Functional Requirements
- **LiteFS Configuration (`litefs.yml`):**
  - Enable the `exec` block to trigger `/usr/local/bin/entrypoint.sh`.
  - This allows LiteFS to manage the lifecycle of the application.
- **Entrypoint Logic (`entrypoint.sh`):**
  - Implement a supervision loop that monitors leadership via the `/data/.primary` file (LiteFS creates this file only on replica nodes, so its absence indicates the Primary).
  - **On Primary:**
    - Register the node as the `navidrome` (primary) service in Consul.
    - Start the Navidrome process.
  - **On Replica:**
    - Ensure Navidrome is NOT running.
    - Deregister the `navidrome` primary service if previously registered.
    - (Optional) Register as a replica service or simply wait.
  - **On Transition:** Handle graceful shutdown of Navidrome if the node loses leadership.
- **Storage and Path Configuration (`navidrome-litefs-v2.nomad`):**
  - Set `ND_DATAFOLDER` to `/data` (the LiteFS FUSE mount).
  - Set `ND_CACHEFOLDER` to `/shared_data/cache` (shared persistent storage).
  - Set `ND_BACKUP_PATH` to `/shared_data/backup` (shared persistent storage).
- **Dockerfile Updates:**
  - Update `ENTRYPOINT` to `["litefs", "mount"]` to allow LiteFS to act as the supervisor.
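The entrypoint requirements above could be sketched roughly as follows. This is a minimal illustration, not the actual `entrypoint.sh`: the binary path `/app/navidrome`, the 5-second poll interval, and the `register_service` / `deregister_service` placeholders are assumptions introduced here. It also assumes the LiteFS convention that `.primary` exists only on replicas, so a missing file means the node is Primary.

```shell
#!/usr/bin/env bash
# Sketch only: a leadership-aware supervision loop under stated assumptions.

PRIMARY_FILE="${PRIMARY_FILE:-/data/.primary}"  # marker file named in the spec
POLL_SECONDS="${POLL_SECONDS:-5}"               # assumption: polling interval
APP_PID=""

# Assumption: LiteFS keeps .primary only on replicas, so absence means Primary.
is_primary() { [ ! -e "$PRIMARY_FILE" ]; }

# Assumption: location of the Navidrome binary inside the image.
run_app() { exec /app/navidrome; }

# Hypothetical placeholders for the Consul logic taken over from register.sh.
register_service() { :; }
deregister_service() { :; }

app_running() { [ -n "$APP_PID" ] && kill -0 "$APP_PID" 2>/dev/null; }

start_app() {
  if ! app_running; then
    run_app &
    APP_PID=$!
  fi
}

stop_app() {
  if app_running; then
    kill -TERM "$APP_PID" 2>/dev/null || true
    wait "$APP_PID" 2>/dev/null || true   # reap the child before continuing
  fi
  APP_PID=""
}

# One pass of the loop: converge the node onto the state its role requires.
reconcile() {
  if is_primary; then
    register_service
    start_app
  else
    stop_app
    deregister_service
  fi
}

supervise() {
  trap 'stop_app; deregister_service; exit 0' TERM INT
  while true; do
    reconcile
    # Sleep in the background so a trapped signal interrupts the wait promptly.
    sleep "$POLL_SECONDS" &
    wait $! || true
  done
}

# Start the loop only when invoked as: entrypoint.sh run
if [ "${1:-}" = "run" ]; then supervise; fi
```

Because `reconcile` is called on every pass, the registration placeholders would need to be idempotent in a real script; a polling loop also means a leadership change is noticed only after up to `POLL_SECONDS`.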
## Non-Functional Requirements
- **Robustness:** Use a simple bash loop for process management to avoid extra dependencies.
- **Signal Handling:** Ensure signals (SIGTERM) are correctly forwarded to Navidrome for graceful shutdown.
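One way to meet the signal-handling requirement in plain bash is to run the child in the background and `wait` on it, since a shell defers trap handlers while a foreground command is running. The helper below is a hedged sketch; `run_with_forwarding` and `forward_term` are names invented for this illustration and do not come from the repository.

```shell
#!/usr/bin/env bash
# Sketch only: forward SIGTERM/SIGINT to a child and return its exit status.

CHILD_PID=""

forward_term() {
  if [ -n "$CHILD_PID" ]; then
    kill -TERM "$CHILD_PID" 2>/dev/null || true
  fi
}

run_with_forwarding() {
  "$@" &                       # background the child so traps can fire
  CHILD_PID=$!
  trap forward_term TERM INT
  status=0
  wait "$CHILD_PID" || status=$?
  # A trapped signal makes `wait` return early; keep waiting until the
  # child has actually exited so its shutdown can complete.
  while kill -0 "$CHILD_PID" 2>/dev/null; do
    status=0
    wait "$CHILD_PID" || status=$?
  done
  trap - TERM INT
  CHILD_PID=""
  return "$status"
}
```

For example, `run_with_forwarding /app/navidrome` (binary path assumed) would keep the shell responsive to SIGTERM while Navidrome runs and would return only after Navidrome has exited.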
## Acceptance Criteria
- [ ] Navidrome process runs ONLY on the Primary node.
- [ ] Consul service `navidrome` correctly points to the current Primary.
- [ ] Navidrome database (`navidrome.db`) is confirmed to be on the `/data` mount.
- [ ] Cluster failover correctly stops Navidrome on the old primary and starts it on the new one.
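The database-location criterion could be checked with a small helper like the one below. It is a sketch under assumptions: the function names are hypothetical, and it only confirms that `navidrome.db` exists in a directory listed as a mount target in `/proc/mounts`; it does not inspect the filesystem type.

```shell
#!/usr/bin/env bash
# Sketch only: confirm the Navidrome database lives on a mounted directory.

# is_mount_point DIR [MOUNTS_FILE]: succeeds if DIR is listed as a mount target.
is_mount_point() {
  dir=$1
  mounts=${2:-/proc/mounts}
  awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$mounts"
}

# db_on_mount [DATA_DIR] [MOUNTS_FILE]: succeeds if DATA_DIR/navidrome.db
# exists and DATA_DIR itself is a mount point.
db_on_mount() {
  data_dir=${1:-/data}
  mounts=${2:-/proc/mounts}
  [ -f "$data_dir/navidrome.db" ] && is_mount_point "$data_dir" "$mounts"
}
```

Run inside the Primary's allocation, `db_on_mount /data && echo ok` would print `ok` only when both conditions hold.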
## Out of Scope
- Adopting a dedicated init system such as `tini` (a bash loop was selected by the user).