chore(conductor): Archive track 'fix routing - use litefs to register the navidrome service with consul'
All checks were successful
Build and Push Docker Image / build-and-push (push) Successful in 28s

2026-02-07 18:55:39 -08:00
parent 264d518e6e
commit 0af6a3270e
5 changed files with 65 additions and 6 deletions

@@ -0,0 +1,5 @@
# Track fix_routing_20260207 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)

@@ -0,0 +1,8 @@
{
"track_id": "fix_routing_20260207",
"type": "bug",
"status": "new",
"created_at": "2026-02-07T17:36:00Z",
"updated_at": "2026-02-07T17:36:00Z",
"description": "fix routing - use litefs to register the navidrome service with consul. the service should point to the master and avoid the litefs proxy (it breaks navidrome)"
}

@@ -0,0 +1,25 @@
# Implementation Plan: Direct Primary Routing for Navidrome-LiteFS
This plan outlines the steps to reconfigure the Navidrome-LiteFS cluster to bypass the LiteFS write-forwarding proxy and use direct primary node routing for improved reliability and performance.
## Phase 1: Infrastructure Configuration Update [checkpoint: 5a57902]
In this phase, we will modify the Nomad job and LiteFS configuration to support direct port access and primary-aware health checks.
- [x] Task: Update `navidrome-litefs-v2.nomad` to point the service directly to the Navidrome port
    - [x] Modify the `service` block to use port 4533 instead of the dynamically mapped port.
    - [x] Replace the HTTP health check with a script check running `litefs is-primary`.
- [x] Task: Update `litefs.yml` to ensure consistent internal API binding (if needed)
- [x] Task: Conductor - User Manual Verification 'Infrastructure Configuration Update' (Protocol in workflow.md)
## Phase 2: Deployment and Validation
In this phase, we will deploy the changes and verify that the cluster correctly handles primary election and routing.
- [x] Task: Deploy updated Nomad job
    - [x] Execute `nomad job run navidrome-litefs-v2.nomad`.
- [x] Task: Verify Consul health status
    - [x] Confirm that only the LiteFS primary node is marked as `passing`.
    - [x] Confirm that replica nodes are marked as `critical`.
- [x] Task: Verify Ingress Routing
    - [x] Confirm Traefik correctly routes traffic only to the primary node.
    - [x] Verify that Navidrome is accessible and functional.
- [x] Task: Conductor - User Manual Verification 'Deployment and Validation' (Protocol in workflow.md)
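The Consul health verification in Phase 2 could be automated along the following lines. This is a minimal sketch, not part of the track: it assumes the input is the parsed JSON list returned by Consul's `/v1/health/service/<name>` endpoint (entries with `Node` and `Checks` keys), and the node names and check name in the sample data are hypothetical.

```python
from typing import Any, Dict, List

# Higher number means worse health; unknown statuses are treated as critical.
SEVERITY = {"passing": 0, "warning": 1, "critical": 2}


def aggregate_status(checks: List[Dict[str, Any]]) -> str:
    """Return the worst status among all checks of one service instance."""
    worst = "passing"
    for check in checks:
        status = check.get("Status", "critical")
        if status not in SEVERITY:
            status = "critical"
        if SEVERITY[status] > SEVERITY[worst]:
            worst = status
    return worst


def nodes_with_status(entries: List[Dict[str, Any]], wanted: str) -> List[str]:
    """List node names whose aggregated status equals `wanted`."""
    return sorted(
        entry["Node"]["Node"]
        for entry in entries
        if aggregate_status(entry.get("Checks", [])) == wanted
    )


def primary_only(entries: List[Dict[str, Any]]) -> bool:
    """True when exactly one node is passing and every other node is critical."""
    passing = nodes_with_status(entries, "passing")
    critical = nodes_with_status(entries, "critical")
    return len(passing) == 1 and len(passing) + len(critical) == len(entries)


# Hypothetical sample shaped like a Consul health response.
sample = [
    {"Node": {"Node": "node-a"}, "Checks": [{"Name": "litefs-primary", "Status": "passing"}]},
    {"Node": {"Node": "node-b"}, "Checks": [{"Name": "litefs-primary", "Status": "critical"}]},
    {"Node": {"Node": "node-c"}, "Checks": [{"Name": "litefs-primary", "Status": "critical"}]},
]
```

With the sample above, `primary_only(sample)` holds and `nodes_with_status(sample, "passing")` names the single primary. A replica reporting `warning` instead of `critical` would make `primary_only` return `False`.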

@@ -0,0 +1,26 @@
# Specification: Direct Primary Routing for Navidrome-LiteFS
## Overview
This track aims to fix routing issues caused by the LiteFS proxy. We will reconfigure the Nomad service registration to point directly to the Navidrome process (port 4533) on the primary node, bypassing the LiteFS write-forwarding proxy (port 8080). To ensure Traefik only routes traffic to the node capable of writes, we will implement a "Primary-only" health check.
## Functional Requirements
- **Direct Port Mapping:** Update the Nomad `service` block to use the host port `4533` directly instead of the LiteFS proxy port.
- **Primary-Aware Health Check:** Replace the standard HTTP health check with a script check.
    - **Check Logic:** The script will execute `litefs is-primary`.
        - If the node is the primary, the command exits with `0` (Passing).
        - If the node is a replica, the command exits with a non-zero code (Critical).
- **Service Tags:** Retain all existing Traefik tags so ingress routing continues to work.
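The exit-code mapping in the check logic can be sketched as follows. Consul's documented convention for script checks is that exit code `0` is passing, `1` is warning, and any other code is critical, so a replica is only reported as `critical` if the command exits with a code other than `1`. The exact exit code of `litefs is-primary` on a replica is not stated here and should be confirmed. The `force_critical` helper is a hypothetical wrapper idea, not something this track implements.

```python
def consul_status_for_exit_code(exit_code: int) -> str:
    """Map a script check's exit code to a Consul check status."""
    if exit_code == 0:
        return "passing"
    if exit_code == 1:
        return "warning"
    return "critical"


def force_critical(exit_code: int) -> int:
    """Hypothetical wrapper: remap every failure to exit code 2.

    This keeps a replica out of the `warning` state regardless of which
    non-zero code the underlying command returns.
    """
    return 0 if exit_code == 0 else 2
```

For example, `consul_status_for_exit_code(force_critical(1))` yields `critical`, whereas the unwrapped exit code `1` would yield `warning`.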
## Non-Functional Requirements
- **Failover Reliability:** In the event of a leader election, the old primary must become unhealthy and the new primary must become healthy in Consul, allowing Traefik to update its backends automatically.
- **Minimal Latency:** Bypassing the proxy eliminates the extra network hop for reads and potential compatibility issues with Navidrome's connection handling.
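The failover requirement can be illustrated with a small model of how the routable backend set changes across a leader election. This is a sketch under stated assumptions: the ingress is assumed to route only to instances whose status is `passing`, and the node names are hypothetical.

```python
from typing import Dict, List


def routable_backends(statuses: Dict[str, str]) -> List[str]:
    """Return the nodes an ingress would route to, assuming passing-only routing."""
    return sorted(node for node, status in statuses.items() if status == "passing")


def after_election(statuses: Dict[str, str], new_primary: str) -> Dict[str, str]:
    """Model the health state once the primary-only check settles on a new primary."""
    if new_primary not in statuses:
        raise KeyError(f"unknown node: {new_primary}")
    return {
        node: "passing" if node == new_primary else "critical"
        for node in statuses
    }


before = {"node-a": "passing", "node-b": "critical"}
after = after_election(before, "node-b")
```

Here `routable_backends(before)` contains only `node-a` and `routable_backends(after)` contains only `node-b`. The model ignores the window between check intervals during which neither node may be passing.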
## Acceptance Criteria
- [ ] Consul reports the service as `passing` only on the node currently holding the LiteFS primary lease.
- [ ] Consul reports the service as `critical` on all replica nodes.
- [ ] Traefik correctly routes traffic to the primary node.
- [ ] Navidrome is accessible and functions correctly without the LiteFS proxy intermediary.
## Out of Scope
- Modifying Navidrome internal logic.
- Implementing an external health-check responder.

@@ -1,8 +1,3 @@
# Project Tracks
This file tracks all major tracks for the project. Each track has its own detailed plan in its respective folder.
---
- [x] **Track: fix routing - use litefs to register the navidrome service with consul. the service should point to the master and avoid the litefs proxy (it breaks navidrome)**
*Link: [./tracks/fix_routing_20260207/](./tracks/fix_routing_20260207/)*
This file tracks all major tracks for the project. Each track has its own detailed plan in its respective folder.