working - moved to compose

2025-08-23 16:06:57 -07:00
parent 6c1fe70fa2
commit 939163806b
13 changed files with 542 additions and 312 deletions

@@ -3,50 +3,6 @@
## Overview
Extend the existing GarminSync FIT parser to calculate cycling-specific metrics, including power estimation and singlespeed gear ratio analysis, for activities without native power data.
## Phase 1: Core Infrastructure Setup
### 1.1 Database Schema Extensions
**File: `garminsync/database.py`**
- Extend existing `PowerAnalysis` table with cycling-specific fields:
```python
# Add to PowerAnalysis class:
peak_power_1s = Column(Float, nullable=True)
peak_power_5s = Column(Float, nullable=True)
peak_power_20s = Column(Float, nullable=True)
peak_power_300s = Column(Float, nullable=True)
normalized_power = Column(Float, nullable=True)
intensity_factor = Column(Float, nullable=True)
training_stress_score = Column(Float, nullable=True)
```
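The peak-power fields map onto rolling-window maxima over the per-second power series. A minimal sketch of that calculation (an illustrative helper assuming a 1 Hz series, not existing GarminSync code):
```python
import numpy as np

def peak_power(powers, window_s, hz=1):
    """Best (maximum) rolling average over a window, e.g. 1 s / 5 s / 20 s / 300 s."""
    w = int(window_s * hz)
    if w < 1 or len(powers) < w:
        return None
    rolled = np.convolve(powers, np.ones(w) / w, mode="valid")
    return float(rolled.max())
```
Calling it with `window_s` of 1, 5, 20 and 300 would populate the four peak-power columns above.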
- Extend existing `GearingAnalysis` table:
```python
# Add to GearingAnalysis class:
estimated_chainring_teeth = Column(Integer, nullable=True)
estimated_cassette_teeth = Column(Integer, nullable=True)
gear_ratio = Column(Float, nullable=True)
gear_inches = Column(Float, nullable=True)
development_meters = Column(Float, nullable=True)
confidence_score = Column(Float, nullable=True)
analysis_method = Column(String, default="singlespeed_estimation")
```
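The derived gearing fields follow from the tooth counts by simple formulas. A hypothetical helper (the 2.105 m circumference for a 700x25c wheel and the nominal 27 in wheel diameter for gear inches are assumptions):
```python
def derived_gearing(chainring_teeth, cassette_teeth, wheel_circumference_m=2.105):
    """Derive gear_ratio, gear_inches and development_meters from tooth counts."""
    ratio = chainring_teeth / cassette_teeth
    return {
        "gear_ratio": ratio,
        "gear_inches": ratio * 27.0,  # nominal 27 in wheel diameter convention
        "development_meters": ratio * wheel_circumference_m,  # meters per crank revolution
    }
```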
### 1.2 Enhanced FIT Parser
**File: `garminsync/fit_processor/parser.py`**
- Extend `FITParser` to extract cycling-specific data points:
```python
def _extract_cycling_data(self, message):
    """Extract cycling-specific metrics from a FIT record message"""
    return {
        # GPS coordinates for elevation/gradient
        "position": (message.get_value("position_lat"), message.get_value("position_long")),
        "altitude": message.get_value("altitude"),
        # Speed and cadence for gear analysis
        "speed": message.get_value("speed"),
        "cadence": message.get_value("cadence"),
        # Power data (if available) for validation
        "power": message.get_value("power"),
        # Temperature for air density calculations
        "temperature": message.get_value("temperature"),
    }
```
## Phase 2: Power Estimation Engine
### 2.1 Physics-Based Power Calculator
**New file: `garminsync/fit_processor/power_estimator.py`**
**Key Components:**
- **Environmental factors**: Air density, wind resistance, temperature
@@ -84,10 +40,9 @@ class PowerEstimator:
- Normalized Power (NP) calculation using a 30-second rolling average
- Training Stress Score (TSS) estimation based on NP and ride duration (see the sketch below)
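A minimal sketch of how these pieces could fit together; the physics constants (rider+bike mass, Crr, CdA, air density) and the FTP argument are illustrative assumptions, and only the `PowerEstimator.calculate_power` interface is named by this plan:
```python
import numpy as np

class PowerEstimator:
    """Physics-based power estimate from speed and gradient (assumed constants)."""

    def __init__(self, mass_kg=85.0, crr=0.005, cda=0.32, air_density=1.225):
        self.mass_kg = mass_kg  # rider + bike
        self.crr = crr          # rolling resistance coefficient
        self.cda = cda          # effective frontal area (m^2)
        self.rho = air_density  # kg/m^3

    def calculate_power(self, speed_ms, gradient_pct):
        g = 9.81
        slope = gradient_pct / 100.0
        rolling = self.crr * self.mass_kg * g * speed_ms
        gravity = self.mass_kg * g * slope * speed_ms
        aero = 0.5 * self.rho * self.cda * speed_ms**3
        return max(0.0, rolling + gravity + aero)

def normalized_power(powers_1hz):
    """NP: 30 s rolling mean, raised to the 4th power, averaged, 4th root."""
    if len(powers_1hz) < 30:
        return float(np.mean(powers_1hz)) if len(powers_1hz) else 0.0
    rolled = np.convolve(powers_1hz, np.ones(30) / 30, mode="valid")
    return float(np.mean(rolled**4) ** 0.25)

def training_stress_score(np_watts, duration_s, ftp_watts):
    """TSS = (duration_s * NP * IF) / (FTP * 3600) * 100, with IF = NP / FTP."""
    intensity_factor = np_watts / ftp_watts
    return (duration_s * np_watts * intensity_factor) / (ftp_watts * 3600) * 100
```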
## Phase 3: Singlespeed Gear Ratio Analysis
## Singlespeed Gear Ratio Analysis
### 3.1 Gear Ratio Calculator
**New file: `garminsync/fit_processor/gear_analyzer.py`**
### Gear Ratio Calculator
**Strategy:**
- Analyze flat terrain segments (gradient < 3%)
@@ -121,101 +76,3 @@ class SinglespeedAnalyzer:
2. **Ratio calculation**: `gear_ratio = (speed_ms × 60) ÷ (cadence_rpm × wheel_circumference_m)`
3. **Ratio matching**: Compare calculated ratios against theoretical singlespeed options (see the sketch below)
4. **Confidence scoring**: Based on data consistency and segment duration
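A compact illustration of steps 2-3; the wheel circumference (700x25c) and the tooth-count search ranges are assumptions, and the helper names are hypothetical:
```python
WHEEL_CIRCUMFERENCE_M = 2.105  # assumed 700x25c wheel

def estimate_gear_ratio(speed_ms, cadence_rpm, circumference_m=WHEEL_CIRCUMFERENCE_M):
    """Step 2: gear_ratio = (speed_ms * 60) / (cadence_rpm * circumference_m)."""
    if cadence_rpm <= 0:
        return None
    return (speed_ms * 60) / (cadence_rpm * circumference_m)

def match_singlespeed(ratio, chainrings=range(30, 55), cogs=range(11, 25)):
    """Step 3: nearest theoretical chainring/cog pair for an observed ratio."""
    chainring, cog = min(
        ((cr, c) for cr in chainrings for c in cogs),
        key=lambda pair: abs(pair[0] / pair[1] - ratio),
    )
    return {"chainring": chainring, "cog": cog, "ratio": chainring / cog}
```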
## Phase 4: Integration with Existing System
### 4.1 FIT Processing Workflow Enhancement
**File: `garminsync/fit_processor/analyzer.py`**
- Integrate power estimation and gear analysis into existing analysis workflow
- Add cycling-specific analysis triggers (detect cycling activities)
- Store results in database using existing schema
### 4.2 Database Population
**Migration strategy:**
- Extend existing migration system to handle new fields
- Process existing FIT files retroactively
- Add processing status tracking for cycling analysis
### 4.3 CLI Integration
**File: `garminsync/cli.py`**
- Add new command: `garminsync analyze --cycling --activity-id <id>`
- Add batch processing: `garminsync analyze --cycling --missing`
- Add reporting: `garminsync report --power-analysis --gear-analysis`
## Phase 5: Validation and Testing
### 5.1 Test Data Requirements
- FIT files with known power data for validation
- Various singlespeed configurations for gear ratio testing
- Different terrain types (flat, climbing, mixed)
### 5.2 Validation Methodology
- Compare estimated vs. actual power (where available)
- Validate gear ratio estimates against known bike configurations
- Test edge cases (very low/high cadence, extreme gradients)
### 5.3 Performance Optimization
- Efficient gradient calculation from GPS data
- Optimize power calculation loops for large datasets
- Cache intermediate calculations
## Phase 6: Advanced Features (Future)
### 6.1 Environmental Corrections
- Wind speed/direction integration
- Barometric pressure for accurate altitude
- Temperature-based air density adjustments
### 6.2 Machine Learning Enhancement
- Train models on validated power data
- Improve gear ratio detection accuracy
- Personalized power estimation based on rider history
### 6.3 Comparative Analysis
- Compare estimated metrics across rides
- Trend analysis for fitness progression
- Gear ratio optimization recommendations
## Implementation Priority
**High Priority:**
1. Database schema extensions
2. Basic power estimation using physics model
3. Singlespeed gear ratio analysis for flat terrain
4. Integration with existing FIT processing pipeline
**Medium Priority:**
1. Peak power analysis (1s, 5s, 20s, 5min)
2. Normalized Power and TSS calculations
3. Advanced gear analysis with confidence scoring
4. CLI commands for analysis and reporting
**Low Priority:**
1. Environmental corrections (wind, pressure)
2. Machine learning enhancements
3. Advanced comparative analysis features
4. Web UI integration for visualizing results
## Success Criteria
1. **Power Estimation**: Within ±10% of actual power data (where available for validation)
2. **Gear Ratio Detection**: Correctly identify gear ratios within ±1 tooth accuracy
3. **Processing Speed**: Analyze a typical FIT file (a 1-hour ride) in <5 seconds
4. **Data Coverage**: Successfully analyze 90%+ of cycling FIT files
5. **Integration**: Seamlessly integrate with existing GarminSync workflow
## File Structure Summary
```
garminsync/
├── fit_processor/
│ ├── parser.py (enhanced)
│ ├── analyzer.py (enhanced)
│ ├── power_estimator.py (new)
│ └── gear_analyzer.py (new)
├── database.py (enhanced)
├── cli.py (enhanced)
└── migrate_cycling_analysis.py (new)
```
This plan provides a comprehensive roadmap for implementing cycling-specific FIT analysis while building on the existing GarminSync infrastructure and maintaining compatibility with current functionality.

@@ -0,0 +1,29 @@
version: '3.8'
services:
  backend:
    extends:
      file: docker-compose.yml
      service: backend
    command: pytest -v tests/
    environment:
      - DATABASE_URL=postgresql://garmin:sync@db_test/garminsync_test
      - TESTING=True
  db_test:
    image: postgres:15-alpine
    volumes:
      - postgres_test_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=garmin
      - POSTGRES_PASSWORD=sync
      - POSTGRES_DB=garminsync_test
    networks:
      - garmin-net
volumes:
  postgres_test_data:
networks:
  garmin-net:
    external: true
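Note: `garmin-net` is declared `external`, so the network must already exist before this stack starts (for example via `docker network create garmin-net`); Compose does not create external networks itself.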

docker/docker-compose.yml Normal file
@@ -0,0 +1,12 @@
services:
  garminsync:
    build:
      context: ..
      dockerfile: Dockerfile
    # Removed entrypoint to rely on Dockerfile configuration
    volumes:
      - ../data:/app/data  # Persistent storage for SQLite database
    ports:
      - "8888:8888"
    env_file:
      - ../.env  # Use the root .env file
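Relative paths in this file (`context: ..`, `../data`, `../.env`) resolve against the compose file's own directory, so the stack can be started from anywhere with `docker compose -f docker/docker-compose.yml up -d`.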

@@ -1,31 +1,39 @@
#!/bin/bash
# Conditionally run database migrations
if [ "${RUN_MIGRATIONS:-1}" = "1" ]; then
echo "$(date) - Starting database migrations..."
echo "ALEMBIC_CONFIG: ${ALEMBIC_CONFIG:-/app/migrations/alembic.ini}"
echo "ALEMBIC_SCRIPT_LOCATION: ${ALEMBIC_SCRIPT_LOCATION:-/app/migrations/versions}"
# Run migrations with timing
# Always run database migrations with retries
echo "$(date) - Starting database migrations..."
echo "ALEMBIC_CONFIG: ${ALEMBIC_CONFIG:-/app/migrations/alembic.ini}"
echo "ALEMBIC_SCRIPT_LOCATION: ${ALEMBIC_SCRIPT_LOCATION:-/app/migrations/versions}"
max_retries=5
retry_count=0
migration_status=1
export ALEMBIC_CONFIG=${ALEMBIC_CONFIG:-/app/migrations/alembic.ini}
export ALEMBIC_SCRIPT_LOCATION=${ALEMBIC_SCRIPT_LOCATION:-/app/migrations/versions}
while [ $retry_count -lt $max_retries ] && [ $migration_status -ne 0 ]; do
echo "Attempt $((retry_count+1))/$max_retries: Running migrations..."
start_time=$(date +%s)
export ALEMBIC_CONFIG=${ALEMBIC_CONFIG:-/app/migrations/alembic.ini}
export ALEMBIC_SCRIPT_LOCATION=${ALEMBIC_SCRIPT_LOCATION:-/app/migrations/versions}
alembic upgrade head
migration_status=$?
end_time=$(date +%s)
duration=$((end_time - start_time))
if [ $migration_status -ne 0 ]; then
echo "$(date) - Migration failed after ${duration} seconds!" >&2
exit 1
echo "$(date) - Migration attempt failed after ${duration} seconds! Retrying..."
retry_count=$((retry_count+1))
sleep 2
else
echo "$(date) - Migrations completed successfully in ${duration} seconds"
fi
else
echo "$(date) - Skipping database migrations (RUN_MIGRATIONS=${RUN_MIGRATIONS})"
done
if [ $migration_status -ne 0 ]; then
echo "$(date) - Migration failed after $max_retries attempts!" >&2
exit 1
fi
# Start the application
echo "$(date) - Starting application..."
exec python -m garminsync.cli daemon --start --port 8888
sleep infinity

@@ -2,7 +2,10 @@ import os
import gzip
import fitdecode
import xml.etree.ElementTree as ET
from datetime import datetime
import numpy as np
from .fit_processor.power_estimator import PowerEstimator
from .fit_processor.gear_analyzer import SinglespeedAnalyzer
from math import radians, sin, cos, sqrt, atan2
def detect_file_type(file_path):
    """Detect file format (FIT, XML, or unknown)"""
@@ -58,9 +61,48 @@ def parse_xml_file(file_path):
except Exception:
return None
def compute_gradient(altitudes, positions, distance_m=10):
    """Compute gradient percentage for each point using elevation changes"""
    if len(altitudes) < 2:
        return [0] * len(altitudes)
    gradients = []
    for i in range(1, len(altitudes)):
        elev_change = altitudes[i] - altitudes[i - 1]
        if positions and i < len(positions):
            distance = distance_between_points(positions[i - 1], positions[i])
        else:
            distance = distance_m
        if distance <= 0:
            # Guard against coincident GPS fixes producing a zero distance
            distance = distance_m
        gradients.append((elev_change / distance) * 100)
    # Pad the first value so the output length matches the input
    return [gradients[0]] + gradients

def distance_between_points(point1, point2):
    """Haversine distance in meters between two (lat, lon) points given in degrees"""
    R = 6371000  # Earth radius in meters
    lat1, lon1 = radians(point1[0]), radians(point1[1])
    lat2, lon2 = radians(point2[0]), radians(point2[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    return R * c
def parse_fit_file(file_path):
"""Parse FIT file to extract activity metrics"""
"""Parse FIT file to extract activity metrics and detailed cycling data"""
metrics = {}
detailed_metrics = {
'speeds': [], 'cadences': [], 'altitudes': [],
'positions': [], 'gradients': [], 'powers': [], 'timestamps': []
}
power_estimator = PowerEstimator()
gear_analyzer = SinglespeedAnalyzer()
try:
with open(file_path, 'rb') as f:
magic = f.read(2)
@@ -73,7 +115,49 @@ def parse_fit_file(file_path):
with BytesIO(gz_file.read()) as fit_data:
fit = fitdecode.FitReader(fit_data)
for frame in fit:
if frame.frame_type == fitdecode.FrameType.DATA and frame.name == 'session':
if frame.frame_type == fitdecode.FrameType.DATA:
if frame.name == 'record':
if timestamp := frame.get_value('timestamp'):
detailed_metrics['timestamps'].append(timestamp)
if (lat := frame.get_value('position_lat')) and (lon := frame.get_value('position_long')):
detailed_metrics['positions'].append((lat, lon))
if altitude := frame.get_value('altitude'):
detailed_metrics['altitudes'].append(altitude)
if speed := frame.get_value('speed'):
detailed_metrics['speeds'].append(speed)
if cadence := frame.get_value('cadence'):
detailed_metrics['cadences'].append(cadence)
if power := frame.get_value('power'):
detailed_metrics['powers'].append(power)
elif frame.name == 'session':
metrics = {
"sport": frame.get_value("sport"),
"total_timer_time": frame.get_value("total_timer_time"),
"total_distance": frame.get_value("total_distance"),
"max_heart_rate": frame.get_value("max_heart_rate"),
"avg_power": frame.get_value("avg_power"),
"total_calories": frame.get_value("total_calories")
}
else:
with fitdecode.FitReader(file_path) as fit:
for frame in fit:
if frame.frame_type == fitdecode.FrameType.DATA:
if frame.name == 'record':
if timestamp := frame.get_value('timestamp'):
detailed_metrics['timestamps'].append(timestamp)
if (lat := frame.get_value('position_lat')) and (lon := frame.get_value('position_long')):
detailed_metrics['positions'].append((lat, lon))
if altitude := frame.get_value('altitude'):
detailed_metrics['altitudes'].append(altitude)
if speed := frame.get_value('speed'):
detailed_metrics['speeds'].append(speed)
if cadence := frame.get_value('cadence'):
detailed_metrics['cadences'].append(cadence)
if power := frame.get_value('power'):
detailed_metrics['powers'].append(power)
elif frame.name == 'session':
metrics = {
"sport": frame.get_value("sport"),
"total_timer_time": frame.get_value("total_timer_time"),
@@ -82,21 +166,32 @@ def parse_fit_file(file_path):
"avg_power": frame.get_value("avg_power"),
"total_calories": frame.get_value("total_calories")
}
break
else:
with fitdecode.FitReader(file_path) as fit:
for frame in fit:
if frame.frame_type == fitdecode.FrameType.DATA and frame.name == 'session':
metrics = {
"sport": frame.get_value("sport"),
"total_timer_time": frame.get_value("total_timer_time"),
"total_distance": frame.get_value("total_distance"),
"max_heart_rate": frame.get_value("max_heart_rate"),
"avg_power": frame.get_value("avg_power"),
"total_calories": frame.get_value("total_calories")
}
break
# Compute gradients if data available
if detailed_metrics['altitudes']:
detailed_metrics['gradients'] = compute_gradient(
detailed_metrics['altitudes'],
detailed_metrics['positions']
)
# Process cycling-specific metrics
if metrics.get('sport') in ['cycling', 'road_biking', 'mountain_biking']:
# Estimate power if not present
if not detailed_metrics['powers']:
for speed, gradient in zip(detailed_metrics['speeds'], detailed_metrics['gradients']):
estimated_power = power_estimator.calculate_power(speed, gradient)
detailed_metrics['powers'].append(estimated_power)
metrics['avg_power'] = np.mean(detailed_metrics['powers']) if detailed_metrics['powers'] else None
# Run gear analysis
if detailed_metrics['speeds'] and detailed_metrics['cadences']:
gear_analysis = gear_analyzer.analyze_gear_ratio(
detailed_metrics['speeds'],
detailed_metrics['cadences'],
detailed_metrics['gradients']
)
metrics['gear_analysis'] = gear_analysis or {}
return {
"activityType": {"typeKey": metrics.get("sport", "other")},
"summaryDTO": {
@@ -104,27 +199,53 @@ def parse_fit_file(file_path):
"distance": metrics.get("total_distance"),
"maxHR": metrics.get("max_heart_rate"),
"avgPower": metrics.get("avg_power"),
"calories": metrics.get("total_calories")
}
"calories": metrics.get("total_calories"),
"gearAnalysis": metrics.get("gear_analysis", {})
},
"detailedMetrics": detailed_metrics
}
except Exception:
except Exception as e:
print(f"Error parsing FIT file: {str(e)}")
return None
def get_activity_metrics(activity, client=None):
def get_activity_metrics(activity, client=None, force_reprocess=False):
"""
Get activity metrics from local file or Garmin API
Returns parsed metrics or None
:param activity: Activity object
:param client: Optional GarminClient instance
:param force_reprocess: If True, re-process file even if already parsed
:return: Activity metrics dictionary
"""
metrics = None
if activity.filename and os.path.exists(activity.filename):
# Always re-process if force_reprocess is True
if force_reprocess and activity.filename and os.path.exists(activity.filename):
file_type = detect_file_type(activity.filename)
if file_type == 'fit':
metrics = parse_fit_file(activity.filename)
elif file_type == 'xml':
metrics = parse_xml_file(activity.filename)
if not metrics and client:
try:
metrics = client.get_activity_details(activity.activity_id)
except Exception:
pass
return metrics
if file_type == 'fit':
metrics = parse_fit_file(activity.filename)
elif file_type == 'xml':
metrics = parse_xml_file(activity.filename)
except Exception as e:
print(f"Error parsing activity file: {str(e)}")
# Only parse if metrics not already obtained through force_reprocess
if not metrics:
if activity.filename and os.path.exists(activity.filename):
file_type = detect_file_type(activity.filename)
try:
if file_type == 'fit':
metrics = parse_fit_file(activity.filename)
elif file_type == 'xml':
metrics = parse_xml_file(activity.filename)
except Exception as e:
print(f"Error parsing activity file: {str(e)}")
if not metrics and client:
try:
metrics = client.get_activity_details(activity.activity_id)
except Exception as e:
print(f"Error fetching activity from API: {str(e)}")
# Return summary DTO for compatibility
return metrics.get("summaryDTO") if metrics and "summaryDTO" in metrics else metrics

@@ -212,6 +212,154 @@ def migrate_activities():
typer.echo("Database migration failed!")
raise typer.Exit(code=1)
@app.command("analyze")
def analyze_activities(
activity_id: Annotated[int, typer.Option("--activity-id", help="Activity ID to analyze")] = None,
missing: Annotated[bool, typer.Option("--missing", help="Analyze all cycling activities missing analysis")] = False,
cycling: Annotated[bool, typer.Option("--cycling", help="Run cycling-specific analysis")] = False,
):
"""Analyze activity data for cycling metrics"""
from tqdm import tqdm
from .database import Activity, get_session
from .activity_parser import get_activity_metrics
if not cycling:
typer.echo("Error: Currently only cycling analysis is supported")
raise typer.Exit(code=1)
session = get_session()
activities = []
if activity_id:
activity = session.query(Activity).get(activity_id)
if not activity:
typer.echo(f"Error: Activity with ID {activity_id} not found")
raise typer.Exit(code=1)
activities = [activity]
elif missing:
activities = session.query(Activity).filter(
Activity.activity_type == 'cycling',
Activity.analyzed == False # Only unanalyzed activities
).all()
if not activities:
typer.echo("No unanalyzed cycling activities found")
return
else:
typer.echo("Error: Please specify --activity-id or --missing")
raise typer.Exit(code=1)
typer.echo(f"Analyzing {len(activities)} cycling activities...")
for activity in tqdm(activities, desc="Processing"):
metrics = get_activity_metrics(activity)
if metrics and "gearAnalysis" in metrics:
# Update activity with analysis results
activity.analyzed = True
activity.gear_ratio = metrics["gearAnalysis"].get("gear_ratio")
activity.gear_inches = metrics["gearAnalysis"].get("gear_inches")
# Add other metrics as needed
session.commit()
typer.echo("Analysis completed successfully")
@app.command("reprocess")
def reprocess_activities(
all: Annotated[bool, typer.Option("--all", help="Reprocess all activities")] = False,
missing: Annotated[bool, typer.Option("--missing", help="Reprocess activities missing metrics")] = False,
activity_id: Annotated[int, typer.Option("--activity-id", help="Reprocess specific activity by ID")] = None,
):
"""Reprocess activities to calculate missing metrics"""
from tqdm import tqdm
from .database import Activity, get_session
from .activity_parser import get_activity_metrics
session = get_session()
activities = []
if activity_id:
activity = session.query(Activity).get(activity_id)
if not activity:
typer.echo(f"Error: Activity with ID {activity_id} not found")
raise typer.Exit(code=1)
activities = [activity]
elif missing:
activities = session.query(Activity).filter(
Activity.reprocessed == False
).all()
if not activities:
typer.echo("No activities to reprocess")
return
elif all:
activities = session.query(Activity).filter(
Activity.downloaded == True
).all()
if not activities:
typer.echo("No downloaded activities found")
return
else:
typer.echo("Error: Please specify one of: --all, --missing, --activity-id")
raise typer.Exit(code=1)
typer.echo(f"Reprocessing {len(activities)} activities...")
for activity in tqdm(activities, desc="Reprocessing"):
# Use force_reprocess=True to ensure we parse the file again
metrics = get_activity_metrics(activity, force_reprocess=True)
# Update activity metrics
if metrics:
activity.activity_type = metrics.get("activityType", {}).get("typeKey")
activity.duration = int(float(metrics.get("duration", 0))) if metrics.get("duration") else activity.duration
activity.distance = float(metrics.get("distance", 0)) if metrics.get("distance") else activity.distance
activity.max_heart_rate = int(float(metrics.get("maxHR", 0))) if metrics.get("maxHR") else activity.max_heart_rate
activity.avg_heart_rate = int(float(metrics.get("avgHR", 0))) if metrics.get("avgHR") else activity.avg_heart_rate
activity.avg_power = float(metrics.get("avgPower", 0)) if metrics.get("avgPower") else activity.avg_power
activity.calories = int(float(metrics.get("calories", 0))) if metrics.get("calories") else activity.calories
# Mark as reprocessed
activity.reprocessed = True
session.commit()
typer.echo("Reprocessing completed")
@app.command("report")
def generate_report(
power_analysis: Annotated[bool, typer.Option("--power-analysis", help="Generate power metrics report")] = False,
gear_analysis: Annotated[bool, typer.Option("--gear-analysis", help="Generate gear analysis report")] = False,
):
"""Generate performance reports for cycling activities"""
from .database import Activity, get_session
from .web import app as web_app
if not any([power_analysis, gear_analysis]):
typer.echo("Error: Please specify at least one report type")
raise typer.Exit(code=1)
session = get_session()
activities = session.query(Activity).filter(
Activity.activity_type == 'cycling',
Activity.analyzed == True
).all()
if not activities:
typer.echo("No analyzed cycling activities found")
return
# Simple CLI report - real implementation would use web UI
typer.echo("Cycling Analysis Report")
typer.echo("=======================")
for activity in activities:
typer.echo(f"\nActivity ID: {activity.activity_id}")
typer.echo(f"Date: {activity.start_time}")
if power_analysis:
typer.echo(f"- Average Power: {activity.avg_power}W")
# Add other power metrics as needed
if gear_analysis:
typer.echo(f"- Gear Ratio: {activity.gear_ratio}")
typer.echo(f"- Gear Inches: {activity.gear_inches}")
typer.echo("\nFull reports available in the web UI at http://localhost:8080")
def main():
app()

@@ -1,3 +1,4 @@
import os
import signal
import sys
import threading
@@ -34,8 +35,9 @@ class GarminSyncDaemon:
# Load configuration from database
config_data = self.load_config()
# Setup scheduled job
# Setup scheduled jobs
if config_data["enabled"]:
# Sync job
cron_str = config_data["schedule_cron"]
try:
# Validate cron string
@@ -51,7 +53,30 @@ class GarminSyncDaemon:
id="sync_job",
replace_existing=True,
)
logger.info(f"Scheduled job created with cron: '{cron_str}'")
logger.info(f"Sync job scheduled with cron: '{cron_str}'")
except Exception as e:
logger.error(f"Failed to create sync job: {str(e)}")
# Fallback to default schedule
self.scheduler.add_job(
func=self.sync_and_download,
trigger=CronTrigger.from_crontab("0 */6 * * *"),
id="sync_job",
replace_existing=True,
)
logger.info("Using default schedule for sync job: '0 */6 * * *'")
# Reprocess job - run daily at 2 AM
reprocess_cron = "0 2 * * *" # Daily at 2 AM
try:
self.scheduler.add_job(
func=self.reprocess_activities,
trigger=CronTrigger.from_crontab(reprocess_cron),
id="reprocess_job",
replace_existing=True,
)
logger.info(f"Reprocess job scheduled with cron: '{reprocess_cron}'")
except Exception as e:
logger.error(f"Failed to create reprocess job: {str(e)}")
except Exception as e:
logger.error(f"Failed to create scheduled job: {str(e)}")
# Fallback to default schedule
@@ -281,3 +306,61 @@ class GarminSyncDaemon:
return session.query(Activity).filter_by(downloaded=False).count()
finally:
session.close()
    def reprocess_activities(self):
        """Reprocess activities to calculate missing metrics"""
        from tqdm import tqdm

        from .activity_parser import get_activity_metrics
        from .database import Activity, get_session

        logger.info("Starting reprocess job")
        session = get_session()
        try:
            # Get activities that need reprocessing
            activities = session.query(Activity).filter(
                Activity.downloaded == True,
                Activity.reprocessed == False,
            ).all()
            if not activities:
                logger.info("No activities to reprocess")
                return
            logger.info(f"Reprocessing {len(activities)} activities")
            success_count = 0
            # Reprocess each activity
            for activity in tqdm(activities, desc="Reprocessing"):
                try:
                    # Use force_reprocess=True to ensure we parse the file again
                    metrics = get_activity_metrics(activity, client=None, force_reprocess=True)
                    # Update activity metrics if we got new data
                    if metrics:
                        activity.activity_type = metrics.get("activityType", {}).get("typeKey")
                        activity.duration = int(float(metrics.get("duration", 0))) if metrics.get("duration") else activity.duration
                        activity.distance = float(metrics.get("distance", 0)) if metrics.get("distance") else activity.distance
                        activity.max_heart_rate = int(float(metrics.get("maxHR", 0))) if metrics.get("maxHR") else activity.max_heart_rate
                        activity.avg_heart_rate = int(float(metrics.get("avgHR", 0))) if metrics.get("avgHR") else activity.avg_heart_rate
                        activity.avg_power = float(metrics.get("avgPower", 0)) if metrics.get("avgPower") else activity.avg_power
                        activity.calories = int(float(metrics.get("calories", 0))) if metrics.get("calories") else activity.calories
                    # Mark as reprocessed regardless of success
                    activity.reprocessed = True
                    session.commit()
                    success_count += 1
                except Exception as e:
                    logger.error(f"Error reprocessing activity {activity.activity_id}: {str(e)}")
                    session.rollback()
            logger.info(f"Reprocessed {success_count}/{len(activities)} activities successfully")
            self.log_operation("reprocess", "success", f"Reprocessed {success_count} activities")
            self.update_daemon_last_run()
        except Exception as e:
            logger.error(f"Reprocess job failed: {str(e)}")
            self.log_operation("reprocess", "error", str(e))
        finally:
            session.close()

@@ -26,6 +26,7 @@ class Activity(Base):
    calories = Column(Integer, nullable=True)
    filename = Column(String, unique=True, nullable=True)
    downloaded = Column(Boolean, default=False, nullable=False)
    reprocessed = Column(Boolean, default=False, nullable=False)
    created_at = Column(String, nullable=False)
    last_sync = Column(String, nullable=True)
@@ -103,7 +104,7 @@ def init_db():
    Returns:
        SQLAlchemy engine instance
    """
    db_path = os.path.join(os.getenv("DATA_DIR", "data"), "garmin.db")
    db_path = os.getenv("DB_PATH", "data/garmin.db")
    engine = create_engine(f"sqlite:///{db_path}")
    Base.metadata.create_all(engine)
    return engine

@@ -1,112 +1,44 @@
#!/usr/bin/env python3
"""
Migration script to populate new activity fields from FIT files or Garmin API
Migration script to populate activity fields from FIT files or Garmin API
"""
import os
import sys
from datetime import datetime
import logging
from sqlalchemy import MetaData, Table, create_engine, text
from sqlalchemy.exc import OperationalError
from sqlalchemy.orm import sessionmaker
# Add the parent directory to the path to import garminsync modules
# Configure logging
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)
# Add parent directory to path to import garminsync modules
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from garminsync.database import Activity, get_session, init_db
from garminsync.garmin import GarminClient
from garminsync.activity_parser import get_activity_metrics
def add_columns_to_database():
"""Add new columns to the activities table if they don't exist"""
# Add the parent directory to the path to import garminsync modules
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from garminsync.database import Activity, get_session, init_db
from garminsync.garmin import GarminClient
def add_columns_to_database():
"""Add new columns to the activities table if they don't exist"""
print("Adding new columns to database...", flush=True)
# Get database engine
db_path = os.path.join(os.getenv("DATA_DIR", "data"), "garmin.db")
engine = create_engine(f"sqlite:///{db_path}")
try:
# Reflect the existing database schema
metadata = MetaData()
metadata.reflect(bind=engine)
# Get the activities table
activities_table = metadata.tables["activities"]
# Check if columns already exist
existing_columns = [col.name for col in activities_table.columns]
new_columns = [
"activity_type",
"duration",
"distance",
"max_heart_rate",
"avg_power",
"calories",
]
# Add missing columns
with engine.connect() as conn:
for column_name in new_columns:
if column_name not in existing_columns:
print(f"Adding column {column_name}...", flush=True)
if column_name in ["distance", "avg_power"]:
conn.execute(
text(
f"ALTER TABLE activities ADD COLUMN {column_name} REAL"
)
)
elif column_name in ["duration", "max_heart_rate", "calories"]:
conn.execute(
text(
f"ALTER TABLE activities ADD COLUMN {column_name} INTEGER"
)
)
else:
conn.execute(
text(
f"ALTER TABLE activities ADD COLUMN {column_name} TEXT"
)
)
conn.commit()
print(f"Column {column_name} added successfully", flush=True)
else:
print(f"Column {column_name} already exists", flush=True)
print("Database schema updated successfully", flush=True)
return True
except Exception as e:
print(f"Failed to update database schema: {e}", flush=True)
return False
def migrate_activities():
"""Migrate activities to populate new fields from FIT files or Garmin API"""
print("Starting activity migration...", flush=True)
# First, add columns to database
if not add_columns_to_database():
return False
"""Migrate activities to populate fields from FIT files or Garmin API"""
logger.info("Starting activity migration...")
# We assume database schema has been updated via Alembic migrations
# during container startup. Columns should already exist.
# Initialize Garmin client
try:
client = GarminClient()
print("Garmin client initialized successfully", flush=True)
logger.info("Garmin client initialized successfully")
except Exception as e:
print(f"Failed to initialize Garmin client: {e}", flush=True)
logger.error(f"Failed to initialize Garmin client: {e}")
# Continue with migration but without Garmin data
client = None
@@ -115,67 +47,59 @@ def migrate_activities():
try:
# Get all activities that need to be updated (those with NULL activity_type)
activities = (
session.query(Activity).filter(Activity.activity_type.is_(None)).all()
)
print(f"Found {len(activities)} activities to migrate", flush=True)
activities = session.query(Activity).filter(Activity.activity_type.is_(None)).all()
logger.info(f"Found {len(activities)} activities to migrate")
# If no activities found, try to get all activities (in case activity_type column was just added)
if len(activities) == 0:
activities = session.query(Activity).all()
print(f"Found {len(activities)} total activities", flush=True)
# If no activities found, exit early
if not activities:
logger.info("No activities found for migration")
return True
updated_count = 0
error_count = 0
for i, activity in enumerate(activities):
try:
print(
f"Processing activity {i+1}/{len(activities)} (ID: {activity.activity_id})",
flush=True
)
logger.info(f"Processing activity {i+1}/{len(activities)} (ID: {activity.activity_id})")
# Use shared parser to get activity metrics
activity_details = get_activity_metrics(activity, client)
if activity_details is not None:
print(f" Successfully parsed metrics for activity {activity.activity_id}", flush=True)
else:
print(f" Could not retrieve metrics for activity {activity.activity_id}", flush=True)
# Update activity fields if we have details
if activity_details:
logger.info(f"Successfully parsed metrics for activity {activity.activity_id}")
# Update activity fields
activity.activity_type = activity_details.get(
"activityType", {}
).get("typeKey")
activity.activity_type = activity_details.get("activityType", {}).get("typeKey", "Unknown")
# Extract duration in seconds
duration = activity_details.get("summaryDTO", {}).get("duration")
if duration is not None:
activity.duration = int(float(duration))
# Extract distance in meters
distance = activity_details.get("summaryDTO", {}).get("distance")
if distance is not None:
activity.distance = float(distance)
# Extract max heart rate
max_hr = activity_details.get("summaryDTO", {}).get("maxHR")
if max_hr is not None:
activity.max_heart_rate = int(float(max_hr))
# Extract average power
avg_power = activity_details.get("summaryDTO", {}).get("avgPower")
if avg_power is not None:
activity.avg_power = float(avg_power)
# Extract calories
calories = activity_details.get("summaryDTO", {}).get("calories")
if calories is not None:
activity.calories = int(float(calories))
else:
# Set default values for activity type if we can't get details
# Set default values if we can't get details
activity.activity_type = "Unknown"
logger.warning(f"Could not retrieve metrics for activity {activity.activity_id}")
# Update last sync timestamp
activity.last_sync = datetime.now().isoformat()
@@ -183,26 +107,25 @@ def migrate_activities():
session.commit()
updated_count += 1
# Print progress every 10 activities
# Log progress every 10 activities
if (i + 1) % 10 == 0:
print(f" Progress: {i+1}/{len(activities)} activities processed", flush=True)
logger.info(f"Progress: {i+1}/{len(activities)} activities processed")
except Exception as e:
print(f" Error processing activity {activity.activity_id}: {e}", flush=True)
logger.error(f"Error processing activity {activity.activity_id}: {e}")
session.rollback()
error_count += 1
continue
print(f"Migration completed. Updated: {updated_count}, Errors: {error_count}", flush=True)
return True # Allow partial success
logger.info(f"Migration completed. Updated: {updated_count}, Errors: {error_count}")
return updated_count > 0 or error_count == 0 # Success if we updated any or had no errors
except Exception as e:
print(f"Migration failed: {e}", flush=True)
logger.error(f"Migration failed: {e}")
return False
finally:
session.close()
if __name__ == "__main__":
success = migrate_activities()
sys.exit(0 if success else 1)

@@ -261,6 +261,51 @@ async def clear_logs():
finally:
session.close()
@router.post("/activities/{activity_id}/reprocess")
async def reprocess_activity(activity_id: int):
    """Reprocess a single activity to update metrics"""
    from garminsync.activity_parser import get_activity_metrics
    from garminsync.database import Activity, get_session

    session = get_session()
    try:
        activity = session.query(Activity).get(activity_id)
        if not activity:
            raise HTTPException(status_code=404, detail="Activity not found")
        metrics = get_activity_metrics(activity, force_reprocess=True)
        if metrics:
            # Update activity metrics
            activity.activity_type = metrics.get("activityType", {}).get("typeKey")
            activity.duration = int(float(metrics.get("duration", 0))) if metrics.get("duration") else activity.duration
            activity.distance = float(metrics.get("distance", 0)) if metrics.get("distance") else activity.distance
            activity.max_heart_rate = int(float(metrics.get("maxHR", 0))) if metrics.get("maxHR") else activity.max_heart_rate
            activity.avg_heart_rate = int(float(metrics.get("avgHR", 0))) if metrics.get("avgHR") else activity.avg_heart_rate
            activity.avg_power = float(metrics.get("avgPower", 0)) if metrics.get("avgPower") else activity.avg_power
            activity.calories = int(float(metrics.get("calories", 0))) if metrics.get("calories") else activity.calories
            # Mark as reprocessed
            activity.reprocessed = True
            session.commit()
        return {"message": f"Activity {activity_id} reprocessed successfully"}
    except HTTPException:
        raise  # Preserve the deliberate 404 instead of converting it to a 500
    except Exception as e:
        session.rollback()
        raise HTTPException(status_code=500, detail=f"Reprocessing failed: {str(e)}")
    finally:
        session.close()

@router.post("/reprocess")
async def reprocess_activities(all: bool = False):
    """Reprocess all activities or just missing ones"""
    from garminsync.daemon import daemon_instance

    try:
        # Trigger reprocess job in daemon ("all" is currently unused)
        daemon_instance.reprocess_activities()
        return {"message": "Reprocess job started in background"}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to start reprocess job: {str(e)}")
@router.get("/activities")
async def get_activities(

@@ -6,4 +6,5 @@
- always rebuild the container before running tests
- if you need clarification return to PLAN mode
- force rereading of the mandates on each cycle
- always track progress of plans in todo.md
</Mandates>

@@ -18,3 +18,5 @@ sqlalchemy==2.0.30
pylint==3.1.0
pygments==2.18.0
fitdecode
numpy==1.26.0
scipy==1.11.1