From 1754775f4c680832427ea3db128cb0058175fe1c Mon Sep 17 00:00:00 2001 From: sstent Date: Sun, 24 Aug 2025 07:44:32 -0700 Subject: [PATCH] working - checkpoint 2 --- Design.md | 826 ----------------------- Dockerfile | 75 ++- cycling.md | 221 ------ cyclingpower.md | 78 --- garminsync/daemon.py | 145 +++- garminsync/database.py | 292 ++++---- justfile | 4 +- plan.md | 1453 ++++++++++++++++++++++++++++++++++++++++ requirements.txt | 3 + todo.md | 44 -- 10 files changed, 1769 insertions(+), 1372 deletions(-) delete mode 100644 Design.md delete mode 100644 cycling.md delete mode 100644 cyclingpower.md delete mode 100644 todo.md diff --git a/Design.md b/Design.md deleted file mode 100644 index 2483fbe..0000000 --- a/Design.md +++ /dev/null @@ -1,826 +0,0 @@ -# **GarminSync Application Design (Python Version)** - -## **Basic Info** - -**App Name:** GarminSync -**What it does:** A CLI application that downloads `.fit` files for every activity in Garmin Connect. - ------ - -## **Core Features** - -### **CLI Mode (Implemented)** -1. List all activities (`garminsync list --all`) -2. List activities that have not been downloaded (`garminsync list --missing`) -3. List activities that have been downloaded (`garminsync list --downloaded`) -4. Download all missing activities (`garminsync download --missing`) - -### **Enhanced Features (Implemented)** -5. **Offline Mode**: List activities without polling Garmin Connect (`garminsync list --missing --offline`) -6. **Daemon Mode**: Run as background service with scheduled downloads (`garminsync daemon --start`) -7. **Web UI**: Browser-based interface for daemon monitoring and configuration (`http://localhost:8080`) - ------ - -## **Tech Stack ๐Ÿ** - -* **Frontend:** CLI (**Python** with Typer) + Web UI (FastAPI + Jinja2) -* **Backend:** **Python** -* **Database:** SQLite (`garmin.db`) -* **Hosting:** Docker container -* **Key Libraries:** - * **`python-garminconnect`**: The library for Garmin Connect API communication. 
- * **`typer`**: A modern and easy-to-use CLI framework (built on `click`). - * **`python-dotenv`**: For loading credentials from a `.env` file. - * **`sqlalchemy`**: A robust ORM for database interaction and schema management. - * **`tqdm`**: For creating user-friendly progress bars. - * **`fastapi`**: Modern web framework for the daemon web UI. - * **`uvicorn`**: ASGI server for running the FastAPI web interface. - * **`apscheduler`**: Advanced Python Scheduler for daemon mode scheduling. - * **`pydantic`**: Data validation and settings management for configuration. - * **`jinja2`**: Template engine for web UI rendering. - ------ - -## **Data Structure** - -The application uses SQLAlchemy ORM with expanded models for daemon functionality: - -**SQLAlchemy Models (`database.py`):** - -```python -class Activity(Base): - __tablename__ = 'activities' - - activity_id = Column(Integer, primary_key=True) - start_time = Column(String, nullable=False) - filename = Column(String, unique=True, nullable=True) - downloaded = Column(Boolean, default=False, nullable=False) - created_at = Column(String, nullable=False) # When record was added - last_sync = Column(String, nullable=True) # Last successful sync - -class DaemonConfig(Base): - __tablename__ = 'daemon_config' - - id = Column(Integer, primary_key=True, default=1) - enabled = Column(Boolean, default=True, nullable=False) - schedule_cron = Column(String, default="0 */6 * * *", nullable=False) # Every 6 hours - last_run = Column(String, nullable=True) - next_run = Column(String, nullable=True) - status = Column(String, default="stopped", nullable=False) # stopped, running, error - -class SyncLog(Base): - __tablename__ = 'sync_logs' - - id = Column(Integer, primary_key=True, autoincrement=True) - timestamp = Column(String, nullable=False) - operation = Column(String, nullable=False) # sync, download, daemon_start, daemon_stop - status = Column(String, nullable=False) # success, error, partial - message = Column(String, 
nullable=True) - activities_processed = Column(Integer, default=0, nullable=False) - activities_downloaded = Column(Integer, default=0, nullable=False) -``` - ------ - -## **User Flow** - -### **CLI Mode (Implemented)** -1. User sets up credentials in `.env` file with `GARMIN_EMAIL` and `GARMIN_PASSWORD` -2. User launches the container: `docker run -it --env-file .env -v $(pwd)/data:/app/data garminsync` -3. User runs commands like `garminsync download --missing` -4. Application syncs with Garmin Connect, shows progress bars, and downloads activities - -### **Offline Mode (Implemented)** -1. User runs `garminsync list --missing --offline` to view cached data without API calls -2. Application queries local database only, showing last known state -3. Useful for checking status without network connectivity or API rate limits - -### **Daemon Mode (Implemented)** -1. User starts daemon: `garminsync daemon` (runs continuously in foreground) -2. Daemon automatically starts web UI and background scheduler -3. User accesses web UI at `http://localhost:8080` for monitoring and configuration -4. Web UI provides real-time status, logs, and schedule management -5. 
Daemon can be stopped with `Ctrl+C` or through web UI stop functionality
-
------
-
-## **File Structure**
-
-```
-/garminsync
-├── garminsync/              # Main application package
-│   ├── __init__.py          # Empty package file
-│   ├── cli.py               # Typer CLI commands and main entrypoint
-│   ├── config.py            # Configuration and environment variable loading
-│   ├── database.py          # SQLAlchemy models and database operations
-│   ├── garmin.py            # Garmin Connect client wrapper with robust download logic
-│   ├── daemon.py            # Daemon mode implementation with APScheduler
-│   ├── utils.py             # Shared utilities and helpers
-│   └── web/                 # Web UI components
-│       ├── __init__.py
-│       ├── app.py           # FastAPI application setup
-│       ├── routes.py        # API endpoints for web UI
-│       ├── static/          # CSS, JavaScript, images
-│       │   ├── style.css
-│       │   └── app.js
-│       └── templates/       # Jinja2 HTML templates
-│           ├── base.html
-│           ├── dashboard.html
-│           └── config.html
-├── data/                    # Directory for downloaded .fit files and SQLite DB
-├── .env                     # Stores GARMIN_EMAIL/GARMIN_PASSWORD (gitignored)
-├── .gitignore               # Excludes .env file and data directory
-├── Dockerfile               # Production-ready container configuration
-├── Design.md                # This design document
-├── plan.md                  # Implementation notes and fixes
-└── requirements.txt         # Python dependencies with compatibility fixes
-```
-
------
-
-## **Technical Implementation Details**
-
-### **Architecture**
-- **CLI Framework**: Uses Typer with proper type hints and validation
-- **Module Separation**: Clear separation between CLI commands, database operations, and Garmin API interactions
-- **Error Handling**: Comprehensive exception handling with user-friendly error messages
-- **Session Management**: Proper SQLAlchemy session management with cleanup
-
-### **Authentication & Configuration**
-- Credentials loaded via `python-dotenv` 
from environment variables -- Configuration validation ensures required credentials are present -- Garmin client handles authentication automatically with session persistence - -### **Database Operations** -- SQLite database with SQLAlchemy ORM for type safety -- Database initialization creates tables automatically -- Sync functionality reconciles local database with Garmin Connect activities -- Proper transaction management with rollback on errors - -### **File Management** -- Files named with pattern: `activity_{activity_id}_{timestamp}.fit` -- Timestamp sanitized for filesystem compatibility (colons and spaces replaced) -- Downloads saved to configurable data directory -- Database tracks both download status and file paths - -### **API Integration** -- **Rate Limiting**: 2-second delays between API requests to respect Garmin's servers -- **Robust Downloads**: Multiple fallback methods for downloading FIT files: - 1. Default download method - 2. Explicit 'fit' format parameter - 3. Alternative parameter names and formats - 4. 
Graceful fallback with detailed error reporting -- **Activity Fetching**: Configurable batch sizes (currently 1000 activities per sync) - -### **User Experience** -- **Progress Indicators**: tqdm progress bars for all long-running operations -- **Informative Output**: Clear status messages and operation summaries -- **Input Validation**: Prevents invalid command combinations -- **Exit Codes**: Proper exit codes for script integration - ------ - -## **Development Status โœ…** - -### **โœ… Completed Features** - -#### **Phase 1: Core Infrastructure** -- [x] **Dockerfile**: Production-ready Python 3.10 container with proper layer caching -- [x] **Environment Configuration**: `python-dotenv` integration with validation -- [x] **CLI Framework**: Complete Typer implementation with type hints and help text -- [x] **Garmin Integration**: Robust `python-garminconnect` wrapper with authentication - -#### **Phase 2: Activity Listing** -- [x] **Database Schema**: SQLAlchemy models with proper relationships -- [x] **Database Operations**: Session management, initialization, and sync functionality -- [x] **List Commands**: All filter options (`--all`, `--missing`, `--downloaded`) implemented -- [x] **Progress Display**: tqdm integration for user feedback during operations - -#### **Phase 3: Download Pipeline** -- [x] **FIT File Downloads**: Multi-method download approach with fallback strategies -- [x] **Idempotent Operations**: Prevents re-downloading existing files -- [x] **Database Updates**: Proper status tracking and file path storage -- [x] **File Management**: Safe filename generation and directory creation - -#### **Phase 4: Enhanced Features** -- [x] **Offline Mode**: List activities without API calls using cached data -- [x] **Daemon Mode**: Background service with APScheduler for automatic sync -- [x] **Web UI**: FastAPI-based dashboard with real-time monitoring -- [x] **Schedule Configuration**: Configurable cron-based sync schedules -- [x] **Activity Logs**: 
Comprehensive logging of sync operations - -#### **Phase 5: Web Interface** -- [x] **Dashboard**: Real-time statistics and daemon status monitoring -- [x] **API Routes**: RESTful endpoints for configuration and control -- [x] **Templates**: Responsive HTML templates with Bootstrap styling -- [x] **JavaScript Integration**: Auto-refreshing status and interactive controls -- [x] **Configuration Management**: Web-based daemon settings and schedule updates - -### **๐Ÿ”ง Recent Fixes and Improvements** - -#### **Dependency Management** -- [x] **Pydantic Compatibility**: Fixed version constraints to avoid conflicts with `garth` -- [x] **Requirements Lock**: Updated to `pydantic>=2.0.0,<2.5.0` for stability -- [x] **Package Versions**: Verified compatibility across all dependencies - -#### **Code Quality Fixes** -- [x] **Missing Fields**: Added `created_at` field to Activity model and sync operations -- [x] **Import Issues**: Resolved circular import problems in daemon module -- [x] **Error Handling**: Improved exception handling and logging throughout -- [x] **Method Names**: Corrected method calls and parameter names - -#### **Web UI Enhancements** -- [x] **Template Safety**: Added fallback handling for missing template files -- [x] **API Error Handling**: Improved error responses and status codes -- [x] **JavaScript Functions**: Added missing daemon control functions -- [x] **Status Updates**: Real-time status updates with proper data formatting - ------ - -## **Docker Usage** - -### **Build the Container** -```bash -docker build -t garminsync . 
-``` - -### **Run with Environment File** -```bash -docker run -it --env-file .env -v $(pwd)/data:/app/data garminsync --help -``` - -### **Example Commands** -```bash -# List all activities -docker run -it --env-file .env -v $(pwd)/data:/app/data garminsync list --all - -# List missing activities offline -docker run -it --env-file .env -v $(pwd)/data:/app/data garminsync list --missing --offline - -# Download missing activities -docker run -it --env-file .env -v $(pwd)/data:/app/data garminsync download --missing - -# Start daemon with web UI -docker run -it --env-file .env -v $(pwd)/data:/app/data -p 8080:8080 garminsync daemon -``` - ------ - -## **Environment Setup** - -Create a `.env` file in the project root: -``` -GARMIN_EMAIL=your_email@example.com -GARMIN_PASSWORD=your_password -``` - ------ - -## **Key Implementation Highlights** - -### **Robust Download Logic** -The `garmin.py` module implements a sophisticated download strategy that tries multiple methods to handle variations in the Garmin Connect API: - -```python -methods_to_try = [ - lambda: self.client.download_activity(activity_id), - lambda: self.client.download_activity(activity_id, fmt='fit'), - lambda: self.client.download_activity(activity_id, format='fit'), - # ... 
additional fallback methods -] -``` - -### **Database Synchronization** -The sync process efficiently updates the local database with new activities from Garmin Connect: - -```python -def sync_database(garmin_client): - """Sync local database with Garmin Connect activities""" - activities = garmin_client.get_activities(0, 1000) - for activity in activities: - # Only add new activities, preserve existing download status - existing = session.query(Activity).filter_by(activity_id=activity_id).first() - if not existing: - new_activity = Activity( - activity_id=activity_id, - start_time=start_time, - downloaded=False, - created_at=datetime.now().isoformat(), - last_sync=datetime.now().isoformat() - ) - session.add(new_activity) -``` - -### **Daemon Implementation** -The daemon uses APScheduler for reliable background task execution: - -```python -class GarminSyncDaemon: - def __init__(self): - self.scheduler = BackgroundScheduler() - self.running = False - self.web_server = None - - def start(self, web_port=8080): - config_data = self.load_config() - if config_data['enabled']: - self.scheduler.add_job( - func=self.sync_and_download, - trigger=CronTrigger.from_crontab(config_data['schedule_cron']), - id='sync_job', - replace_existing=True - ) -``` - -### **Web API Integration** -FastAPI provides RESTful endpoints for daemon control and monitoring: - -```python -@router.get("/status") -async def get_status(): - """Get current daemon status with logs""" - config = session.query(DaemonConfig).first() - logs = session.query(SyncLog).order_by(SyncLog.timestamp.desc()).limit(10).all() - return { - "daemon": {"running": config.status == "running"}, - "recent_logs": [{"timestamp": log.timestamp, "status": log.status} for log in logs] - } -``` - ------ - -## **Known Issues & Limitations** - -### **Current Limitations** -1. **Web Interface**: Some components need completion (detailed below) -2. **Error Recovery**: Limited automatic retry logic for failed downloads -3. 
**Batch Processing**: No support for selective activity date range downloads -4. **Authentication**: No support for two-factor authentication (2FA) - -### **Dependency Issues Resolved** -- โœ… **Pydantic Conflicts**: Fixed version constraints to avoid `garth` compatibility issues -- โœ… **Missing Fields**: Added all required database fields -- โœ… **Import Errors**: Resolved circular import problems - ------ - -## **Performance Considerations** - -- **Rate Limiting**: 2-second delays between API requests prevent server overload -- **Batch Processing**: Fetches up to 1000 activities per sync operation -- **Efficient Queries**: Database queries optimized for filtering operations -- **Memory Management**: Proper session cleanup and resource management -- **Docker Optimization**: Layer caching and minimal base image for faster builds -- **Background Processing**: Daemon mode prevents blocking CLI operations - ------ - -## **Security Considerations** - -- **Credential Storage**: Environment variables prevent hardcoded credentials -- **File Permissions**: Docker container runs with appropriate user permissions -- **API Rate Limiting**: Respects Garmin Connect rate limits to prevent account restrictions -- **Error Logging**: Sensitive information excluded from logs and error messages - ------ - -## **Documentation ๐Ÿ“š** - -Here are links to the official documentation for the key libraries used: - -* **Garmin API:** [python-garminconnect](https://github.com/cyberjunky/python-garminconnect) -* **CLI Framework:** [Typer](https://typer.tiangolo.com/) -* **Environment Variables:** [python-dotenv](https://github.com/theskumar/python-dotenv) -* **Database ORM:** [SQLAlchemy](https://docs.sqlalchemy.org/en/20/) -* **Progress Bars:** [tqdm](https://github.com/tqdm/tqdm) -* **Web Framework:** [FastAPI](https://fastapi.tiangolo.com/) -* **Task Scheduler:** [APScheduler](https://apscheduler.readthedocs.io/) - ------ - -## **Web Interface Implementation Steps** - -### **๐ŸŽฏ Missing 
Components to Complete** - -#### **1. Enhanced Dashboard Components** - -**A. Real-time Activity Counter** -- **File:** `garminsync/web/templates/dashboard.html` -- **Implementation:** - ```html -
-  <div class="card mb-3">
-    <div class="card-body">
-      <div class="d-flex align-items-center">
-        <span id="sync-status" class="badge bg-secondary me-2">Idle</span>
-        <span id="current-operation" class="text-muted">Current Operation</span>
-      </div>
-    </div>
-  </div>
- ``` -- **JavaScript Update:** Add WebSocket or periodic updates for sync status - -**B. Activity Progress Charts** -- **File:** Add Chart.js to `garminsync/web/static/charts.js` -- **Implementation:** - ```javascript - // Add to dashboard - const ctx = document.getElementById('activityChart').getContext('2d'); - const chart = new Chart(ctx, { - type: 'doughnut', - data: { - labels: ['Downloaded', 'Missing'], - datasets: [{ - data: [downloaded, missing], - backgroundColor: ['#28a745', '#dc3545'] - }] - } - }); - ``` - -#### **2. Enhanced Configuration Page** - -**A. Advanced Schedule Options** -- **File:** `garminsync/web/templates/config.html` -- **Add Preset Schedules:** - ```html -
-  <select id="schedule-preset" class="form-select">
-    <option value="0 */6 * * *">Every 6 hours (default)</option>
-    <option value="0 */12 * * *">Every 12 hours</option>
-    <option value="0 2 * * *">Daily at 02:00</option>
-  </select>
- ``` - -**B. Notification Settings** -- **New Model in `database.py`:** - ```python - class NotificationConfig(Base): - __tablename__ = 'notification_config' - - id = Column(Integer, primary_key=True) - email_enabled = Column(Boolean, default=False) - email_address = Column(String, nullable=True) - webhook_enabled = Column(Boolean, default=False) - webhook_url = Column(String, nullable=True) - notify_on_success = Column(Boolean, default=True) - notify_on_error = Column(Boolean, default=True) - ``` - -#### **3. Comprehensive Logs Page** - -**A. Create Dedicated Logs Page** -- **File:** `garminsync/web/templates/logs.html` -- **Implementation:** - ```html - {% extends "base.html" %} - - {% block content %} -
-  <div class="row">
-    <div class="col-12">
-      <div class="d-flex justify-content-between align-items-center mb-3">
-        <h2>Sync Logs</h2>
-        <button id="refresh-logs" class="btn btn-outline-primary">Refresh</button>
-      </div>
-
-      <!-- Filters -->
-      <div class="card mb-3">
-        <div class="card-header">Filters</div>
-        <div class="card-body">
-          <div class="row">
-            <div class="col-md-3">
-              <select id="status-filter" class="form-select"></select>
-            </div>
-            <div class="col-md-3">
-              <select id="operation-filter" class="form-select"></select>
-            </div>
-            <div class="col-md-3">
-              <input id="date-filter" type="date" class="form-control">
-            </div>
-            <div class="col-md-3">
-              <button id="apply-filters" class="btn btn-primary">Apply</button>
-            </div>
-          </div>
-        </div>
-      </div>
-
-      <!-- Log entries -->
-      <div class="card">
-        <div class="card-header">Log Entries</div>
-        <div class="card-body">
-          <table class="table table-striped">
-            <thead>
-              <tr>
-                <th>Timestamp</th>
-                <th>Operation</th>
-                <th>Status</th>
-                <th>Message</th>
-                <th>Activities</th>
-              </tr>
-            </thead>
-            <tbody id="log-table-body"></tbody>
-          </table>
-          <nav>
-            <ul class="pagination" id="log-pagination"></ul>
-          </nav>
-        </div>
-      </div>
-    </div>
-  </div>
- {% endblock %} - ``` - -**B. Enhanced Logs API** -- **File:** `garminsync/web/routes.py` -- **Add Filtering and Pagination:** - ```python - @router.get("/logs") - async def get_logs( - limit: int = 50, - offset: int = 0, - status: str = None, - operation: str = None, - date: str = None - ): - """Get logs with filtering and pagination""" - session = get_session() - try: - query = session.query(SyncLog) - - # Apply filters - if status: - query = query.filter(SyncLog.status == status) - if operation: - query = query.filter(SyncLog.operation == operation) - if date: - # Filter by date (assuming ISO format) - query = query.filter(SyncLog.timestamp.like(f"{date}%")) - - # Get total count for pagination - total = query.count() - - # Apply pagination - logs = query.order_by(SyncLog.timestamp.desc()).offset(offset).limit(limit).all() - - return { - "logs": [log_to_dict(log) for log in logs], - "total": total, - "limit": limit, - "offset": offset - } - finally: - session.close() - - def log_to_dict(log): - return { - "id": log.id, - "timestamp": log.timestamp, - "operation": log.operation, - "status": log.status, - "message": log.message, - "activities_processed": log.activities_processed, - "activities_downloaded": log.activities_downloaded - } - ``` - -#### **4. Activity Management Page** - -**A. Create Activities Page** -- **File:** `garminsync/web/templates/activities.html` -- **Features:** - - List all activities with status - - Filter by date range, status, activity type - - Bulk download options - - Individual activity details modal - -**B. 
Activity Details API** -- **File:** `garminsync/web/routes.py` -- **Implementation:** - ```python - @router.get("/activities") - async def get_activities( - limit: int = 100, - offset: int = 0, - downloaded: bool = None, - start_date: str = None, - end_date: str = None - ): - """Get activities with filtering and pagination""" - session = get_session() - try: - query = session.query(Activity) - - if downloaded is not None: - query = query.filter(Activity.downloaded == downloaded) - if start_date: - query = query.filter(Activity.start_time >= start_date) - if end_date: - query = query.filter(Activity.start_time <= end_date) - - total = query.count() - activities = query.order_by(Activity.start_time.desc()).offset(offset).limit(limit).all() - - return { - "activities": [activity_to_dict(a) for a in activities], - "total": total, - "limit": limit, - "offset": offset - } - finally: - session.close() - - @router.post("/activities/{activity_id}/download") - async def download_single_activity(activity_id: int): - """Download a specific activity""" - # Implementation to download single activity - pass - ``` - -#### **5. System Status Page** - -**A. Create System Status Template** -- **File:** `garminsync/web/templates/system.html` -- **Show:** - - Database statistics - - Disk usage - - Memory usage - - API rate limiting status - - Last errors - -**B. 
System Status API** -- **File:** `garminsync/web/routes.py` -- **Implementation:** - ```python - @router.get("/system/status") - async def get_system_status(): - """Get comprehensive system status""" - import psutil - import os - from pathlib import Path - - # Database stats - session = get_session() - try: - db_stats = { - "total_activities": session.query(Activity).count(), - "downloaded_activities": session.query(Activity).filter_by(downloaded=True).count(), - "total_logs": session.query(SyncLog).count(), - "database_size": get_database_size() - } - finally: - session.close() - - # System stats - data_dir = Path(os.getenv("DATA_DIR", "data")) - disk_usage = psutil.disk_usage(str(data_dir)) - - return { - "database": db_stats, - "system": { - "cpu_percent": psutil.cpu_percent(), - "memory": psutil.virtual_memory()._asdict(), - "disk_usage": { - "total": disk_usage.total, - "used": disk_usage.used, - "free": disk_usage.free - } - }, - "garmin_api": { - "last_successful_call": get_last_successful_api_call(), - "rate_limit_remaining": get_rate_limit_status() - } - } - ``` - -#### **6. Enhanced Navigation and Layout** - -**A. Update Base Template** -- **File:** `garminsync/web/templates/base.html` -- **Add Complete Navigation:** - ```html - - ``` - -**B. Add FontAwesome Icons** -- **Update base template with:** - ```html - - ``` - -### **๐Ÿ”„ Implementation Order** - -1. **Week 1: Enhanced Dashboard** - - Add real-time counters and charts - - Implement activity progress visualization - - Add sync status indicators - -2. **Week 2: Logs Page** - - Create comprehensive logs template - - Implement filtering and pagination APIs - - Add log management features - -3. **Week 3: Activities Management** - - Build activities listing page - - Add filtering and search capabilities - - Implement individual activity actions - -4. **Week 4: System Status & Configuration** - - Create system monitoring page - - Enhanced configuration options - - Notification system setup - -5. 
**Week 5: Polish & Testing**
-   - Improve responsive design
-   - Add error handling and loading states
-   - Performance optimization
-
-### **📁 New Files Needed**
-
-```
-garminsync/web/
-├── templates/
-│   ├── activities.html      # New: Activity management
-│   ├── logs.html            # New: Enhanced logs page
-│   └── system.html          # New: System status
-└── static/
-    ├── charts.js            # New: Chart.js integration
-    ├── activities.js        # New: Activity management JS
-    └── system.js            # New: System monitoring JS
-```
-
-### **🛠️ Required Dependencies**
-
-Add to `requirements.txt`:
-```
-psutil==5.9.6            # For system monitoring
-python-dateutil==2.8.2   # For date parsing
-```
-
-This comprehensive implementation plan will transform the basic web interface into a full-featured dashboard for managing GarminSync operations.
-
-### **Planned Features**
-- **Authentication**: Support for two-factor authentication
-- **Selective Sync**: Date range and activity type filtering
-- **Export Options**: Support for additional export formats (GPX, TCX)
-- **Notification System**: Email/webhook notifications for sync completion
-- **Activity Analysis**: Basic statistics and activity summary features
-- **Multi-user Support**: Support for multiple Garmin accounts
-- **Cloud Storage**: Integration with cloud storage providers
-- **Mobile Interface**: Responsive design improvements for mobile devices
-
-### **Technical Improvements**
-- **Health Checks**: Comprehensive health monitoring endpoints
-- **Metrics**: Prometheus metrics for monitoring and alerting
-- **Database Migrations**: Automatic schema migration support
-- **Configuration Validation**: Enhanced validation for cron expressions and settings
-- **Logging Enhancement**: Structured logging with configurable levels
-- **Test Coverage**: Comprehensive unit and integration tests
-- **CI/CD Pipeline**: Automated testing and deployment workflows
\ No newline at end of file
diff --git 
a/Dockerfile b/Dockerfile index 1fce15b..448ef43 100644 --- a/Dockerfile +++ b/Dockerfile @@ -1,22 +1,53 @@ -# Use an official Python runtime as a parent image -FROM python:3.10-slim +# Use multi-stage build with pre-built scientific packages +FROM python:3.12-slim-bookworm as builder -# Set the working directory -WORKDIR /app - -# Install system dependencies +# Install minimal build dependencies RUN apt-get update && apt-get install -y --no-install-recommends \ - build-essential curl git \ + gcc \ + g++ \ && rm -rf /var/lib/apt/lists/* -# Copy requirements file first to leverage Docker cache +# Create virtual environment +RUN python -m venv /opt/venv +ENV PATH="/opt/venv/bin:$PATH" + +# Upgrade pip and install build tools +RUN pip install --upgrade pip setuptools wheel + +# Install scientific packages first using pre-built wheels +RUN pip install --no-cache-dir --only-binary=all \ + numpy \ + scipy \ + pandas \ + scikit-learn + +# Copy requirements and install remaining dependencies COPY requirements.txt . 
-# Upgrade pip and install Python dependencies -RUN pip install --upgrade pip && \ - pip install --no-cache-dir -r requirements.txt +# Install remaining requirements, excluding packages we've already installed +RUN pip install --no-cache-dir \ + aiosqlite asyncpg aiohttp \ + $(grep -v '^numpy\|^scipy\|^pandas\|^scikit-learn' requirements.txt | tr '\n' ' ') -# Copy application code +# Final runtime stage +FROM python:3.12-slim-bookworm + +# Install only essential runtime libraries +RUN apt-get update && apt-get install -y --no-install-recommends \ + libgomp1 \ + libgfortran5 \ + curl \ + && rm -rf /var/lib/apt/lists/* \ + && apt-get clean + +# Copy virtual environment from builder +COPY --from=builder /opt/venv /opt/venv +ENV PATH="/opt/venv/bin:$PATH" + +# Set working directory +WORKDIR /app + +# Copy application files COPY garminsync/ ./garminsync/ COPY migrations/ ./migrations/ COPY migrations/alembic.ini ./alembic.ini @@ -24,17 +55,23 @@ COPY tests/ ./tests/ COPY entrypoint.sh . COPY patches/ ./patches/ -# Fix garth package duplicate parameter issue -RUN cp patches/garth_data_weight.py /usr/local/lib/python3.10/site-packages/garth/data/weight.py +# Apply patches +RUN cp patches/garth_data_weight.py /opt/venv/lib/python3.12/site-packages/garth/data/weight.py -# Make the entrypoint script executable +# Set permissions RUN chmod +x entrypoint.sh # Create data directory RUN mkdir -p /app/data -# Set the entrypoint -ENTRYPOINT ["./entrypoint.sh"] +# Create non-root user +RUN groupadd -r appuser && useradd -r -g appuser appuser +RUN chown -R appuser:appuser /app +USER appuser -# Expose port -EXPOSE 8888 +# Health check +HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \ + CMD curl -f http://localhost:8888/health || exit 1 + +ENTRYPOINT ["./entrypoint.sh"] +EXPOSE 8888 \ No newline at end of file diff --git a/cycling.md b/cycling.md deleted file mode 100644 index dc5806e..0000000 --- a/cycling.md +++ /dev/null @@ -1,221 +0,0 @@ -# Cycling FIT 
Analysis Implementation Plan - -## Overview -Extend the existing GarminSync FIT parser to calculate cycling-specific metrics including power estimation and singlespeed gear ratio analysis for activities without native power data. - -## Phase 1: Core Infrastructure Setup - -### 1.1 Database Schema Extensions -**File: `garminsync/database.py`** -- Extend existing `PowerAnalysis` table with cycling-specific fields: - ```python - # Add to PowerAnalysis class: - peak_power_1s = Column(Float, nullable=True) - peak_power_5s = Column(Float, nullable=True) - peak_power_20s = Column(Float, nullable=True) - peak_power_300s = Column(Float, nullable=True) - normalized_power = Column(Float, nullable=True) - intensity_factor = Column(Float, nullable=True) - training_stress_score = Column(Float, nullable=True) - ``` - -- Extend existing `GearingAnalysis` table: - ```python - # Add to GearingAnalysis class: - estimated_chainring_teeth = Column(Integer, nullable=True) - estimated_cassette_teeth = Column(Integer, nullable=True) - gear_ratio = Column(Float, nullable=True) - gear_inches = Column(Float, nullable=True) - development_meters = Column(Float, nullable=True) - confidence_score = Column(Float, nullable=True) - analysis_method = Column(String, default="singlespeed_estimation") - ``` - -### 1.2 Enhanced FIT Parser -**File: `garminsync/fit_processor/parser.py`** -- Extend `FITParser` to extract cycling-specific data points: - ```python - def _extract_cycling_data(self, message): - """Extract cycling-specific metrics from FIT records""" - # GPS coordinates for elevation/gradient - # Speed and cadence for gear analysis - # Power data (if available) for validation - # Temperature for air density calculations - ``` - -## Phase 2: Power Estimation Engine - -### 2.1 Physics-Based Power Calculator -**New file: `garminsync/fit_processor/power_estimator.py`** - -**Key Components:** -- **Environmental factors**: Air density, wind resistance, temperature -- **Bike specifications**: Weight 
(22 lbs ≈ 10 kg), aerodynamic drag coefficient
-- **Rider assumptions**: Weight (75 kg default), position (road bike)
-- **Terrain analysis**: Gradient calculation from GPS elevation data
-
-**Core Algorithm:**
-```python
-class PowerEstimator:
-    def __init__(self):
-        self.bike_weight_kg = 10.0        # 22 lbs
-        self.rider_weight_kg = 75.0       # Default assumption
-        self.drag_coefficient = 0.88      # Road bike
-        self.frontal_area_m2 = 0.4        # Typical road cycling position
-        self.rolling_resistance = 0.004   # Road tires
-        self.drivetrain_efficiency = 0.97
-        self.air_density = 1.225          # kg/m³ at sea level, 20°C
-
-    def calculate_power(self, speed_ms, gradient_percent,
-                        air_temp_c=20, altitude_m=0):
-        """Calculate estimated power using physics model"""
-        # Power = (Rolling + Gravity + Aerodynamic + Kinetic) / Efficiency
-```
-
-**Power Components:**
-1. **Rolling resistance**: `P_roll = Crr × (m_bike + m_rider) × g × cos(θ) × v`
-2. **Gravitational**: `P_grav = (m_bike + m_rider) × g × sin(θ) × v`
-3. **Aerodynamic**: `P_aero = 0.5 × ρ × Cd × A × v³`
-4. **Acceleration**: `P_accel = (m_bike + m_rider) × a × v`
-
-### 2.2 Peak Power Analysis
-**Methods:**
-- 1-second, 5-second, 20-second, 5-minute peak power windows
-- Normalized Power (NP) calculation using 30-second rolling average
-- Training Stress Score (TSS) estimation based on NP and ride duration
-
-## Phase 3: Singlespeed Gear Ratio Analysis
-
-### 3.1 Gear Ratio Calculator
-**New file: `garminsync/fit_processor/gear_analyzer.py`**
-
-**Strategy:**
-- Analyze flat terrain segments (gradient < 3%)
-- Use speed/cadence relationship to determine gear ratio
-- Test against common singlespeed ratios for 38t and 46t chainrings
-- Calculate confidence scores based on data consistency
-
-**Core Algorithm:**
-```python
-class SinglespeedAnalyzer:
-    def __init__(self):
-        self.chainring_options = [38, 46]        # teeth
-        self.common_cogs = list(range(11, 28))   # 11t to 27t rear cogs
-        self.wheel_circumference_m = 2.096       # 700x25c tire
-
-    def analyze_gear_ratio(self, speed_data, cadence_data, gradient_data):
-        """Determine most likely singlespeed gear ratio"""
-        # Filter for flat terrain segments
-        # Calculate gear ratio from speed/cadence
-        # Match against common ratios
-        # Return best fit with confidence score
-```
-
-**Gear Metrics:**
-- **Gear ratio**: Chainring teeth ÷ Cog teeth
-- **Gear inches**: Gear ratio × wheel diameter (inches)
-- **Development**: Distance traveled per pedal revolution (meters)
-
-### 3.2 Analysis Methodology
-1. **Segment filtering**: Identify flat terrain (gradient < 3%, speed > 15 km/h)
-2. **Ratio calculation**: `gear_ratio = (speed_ms × 60) ÷ (cadence_rpm × wheel_circumference_m)`
-3. **Ratio matching**: Compare calculated ratios against theoretical singlespeed options
-4. 
**Confidence scoring**: Based on data consistency and segment duration - -## Phase 4: Integration with Existing System - -### 4.1 FIT Processing Workflow Enhancement -**File: `garminsync/fit_processor/analyzer.py`** -- Integrate power estimation and gear analysis into existing analysis workflow -- Add cycling-specific analysis triggers (detect cycling activities) -- Store results in database using existing schema - -### 4.2 Database Population -**Migration strategy:** -- Extend existing migration system to handle new fields -- Process existing FIT files retroactively -- Add processing status tracking for cycling analysis - -### 4.3 CLI Integration -**File: `garminsync/cli.py`** -- Add new command: `garminsync analyze --cycling --activity-id ` -- Add batch processing: `garminsync analyze --cycling --missing` -- Add reporting: `garminsync report --power-analysis --gear-analysis` - -## Phase 5: Validation and Testing - -### 5.1 Test Data Requirements -- FIT files with known power data for validation -- Various singlespeed configurations for gear ratio testing -- Different terrain types (flat, climbing, mixed) - -### 5.2 Validation Methodology -- Compare estimated vs. 
actual power (where available)
-- Validate gear ratio estimates against known bike configurations
-- Test edge cases (very low/high cadence, extreme gradients)
-
-### 5.3 Performance Optimization
-- Efficient gradient calculation from GPS data
-- Optimize power calculation loops for large datasets
-- Cache intermediate calculations
-
-## Phase 6: Advanced Features (Future)
-
-### 6.1 Environmental Corrections
-- Wind speed/direction integration
-- Barometric pressure for accurate altitude
-- Temperature-based air density adjustments
-
-### 6.2 Machine Learning Enhancement
-- Train models on validated power data
-- Improve gear ratio detection accuracy
-- Personalized power estimation based on rider history
-
-### 6.3 Comparative Analysis
-- Compare estimated metrics across rides
-- Trend analysis for fitness progression
-- Gear ratio optimization recommendations
-
-## Implementation Priority
-
-**High Priority:**
-1. Database schema extensions
-2. Basic power estimation using physics model
-3. Singlespeed gear ratio analysis for flat terrain
-4. Integration with existing FIT processing pipeline
-
-**Medium Priority:**
-1. Peak power analysis (1s, 5s, 20s, 5min)
-2. Normalized Power and TSS calculations
-3. Advanced gear analysis with confidence scoring
-4. CLI commands for analysis and reporting
-
-**Low Priority:**
-1. Environmental corrections (wind, pressure)
-2. Machine learning enhancements
-3. Advanced comparative analysis features
-4. Web UI integration for visualizing results
-
-## Success Criteria
-
-1. **Power Estimation**: Within ±10% of actual power data (where available for validation)
-2. **Gear Ratio Detection**: Correctly identify gear ratios within ±1 tooth accuracy
-3. **Processing Speed**: Analyze typical FIT file (1-hour ride) in <5 seconds
-4. **Data Coverage**: Successfully analyze 90%+ of cycling FIT files
-5. **Integration**: Seamlessly integrate with existing GarminSync workflow
-
-## File Structure Summary
-
-```
-garminsync/
-├── fit_processor/
-│   ├── parser.py (enhanced)
-│   ├── analyzer.py (enhanced)
-│   ├── power_estimator.py (new)
-│   └── gear_analyzer.py (new)
-├── database.py (enhanced)
-├── cli.py (enhanced)
-└── migrate_cycling_analysis.py (new)
-```
-
-This plan provides a comprehensive roadmap for implementing cycling-specific FIT analysis while building on the existing GarminSync infrastructure and maintaining compatibility with current functionality.
\ No newline at end of file
diff --git a/cyclingpower.md b/cyclingpower.md
deleted file mode 100644
index 3393347..0000000
--- a/cyclingpower.md
+++ /dev/null
@@ -1,78 +0,0 @@
-# Cycling FIT Analysis Implementation Plan
-
-## Overview
-Extend the existing GarminSync FIT parser to calculate cycling-specific metrics including power estimation and singlespeed gear ratio analysis for activities without native power data. 
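As a rough standalone illustration of the four-component physics model this plan describes, the sketch below sums the rolling, gravity, aerodynamic, and acceleration terms and divides by drivetrain efficiency. The constants mirror the plan's defaults; `estimate_power` is an illustrative name, not part of the codebase.

```python
import math

# Defaults taken from the plan; this function is a sketch, not the implementation.
CRR = 0.004                # rolling resistance coefficient (road tires)
MASS_KG = 10.0 + 75.0      # bike + rider, kg
CDA_M2 = 0.88 * 0.4        # drag coefficient x frontal area, m^2
AIR_DENSITY = 1.225        # kg/m^3 at sea level, 20 C
EFFICIENCY = 0.97          # drivetrain efficiency
G = 9.81                   # gravity, m/s^2

def estimate_power(speed_ms, gradient_percent, accel_ms2=0.0):
    """Sum rolling, gravity, aero and acceleration terms, then divide by efficiency."""
    theta = math.atan(gradient_percent / 100.0)
    p_roll = CRR * MASS_KG * G * math.cos(theta) * speed_ms
    p_grav = MASS_KG * G * math.sin(theta) * speed_ms
    p_aero = 0.5 * AIR_DENSITY * CDA_M2 * speed_ms ** 3
    p_accel = MASS_KG * accel_ms2 * speed_ms
    return (p_roll + p_grav + p_aero + p_accel) / EFFICIENCY

# Steady 29 km/h (8 m/s) on a 2% grade lands in the high-200 W range
# with these defaults.
```

Note that the Cd × A product (0.352 m² here) folds the plan's two aerodynamic constants into a single term, which is how such models are usually parameterized.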
-
-
-**Key Components:**
-- **Environmental factors**: Air density, wind resistance, temperature
-- **Bike specifications**: Weight (22 lbs = 10 kg), aerodynamic drag coefficient
-- **Rider assumptions**: Weight (75 kg default), position (road bike)
-- **Terrain analysis**: Gradient calculation from GPS elevation data
-
-**Core Algorithm:**
-```python
-class PowerEstimator:
-    def __init__(self):
-        self.bike_weight_kg = 10.0  # 22 lbs
-        self.rider_weight_kg = 75.0  # Default assumption
-        self.drag_coefficient = 0.88  # Road bike
-        self.frontal_area_m2 = 0.4  # Typical road cycling position
-        self.rolling_resistance = 0.004  # Road tires
-        self.drivetrain_efficiency = 0.97
-        self.air_density = 1.225  # kg/m³ at sea level, 20°C
-
-    def calculate_power(self, speed_ms, gradient_percent,
-                        air_temp_c=20, altitude_m=0):
-        """Calculate estimated power using physics model"""
-        # Power = (Rolling + Gravity + Aerodynamic + Kinetic) / Efficiency
-```
-
-**Power Components:**
-1. **Rolling resistance**: `P_roll = Crr × (m_bike + m_rider) × g × cos(θ) × v`
-2. **Gravitational**: `P_grav = (m_bike + m_rider) × g × sin(θ) × v`
-3. **Aerodynamic**: `P_aero = 0.5 × ρ × Cd × A × v³`
-4. **Acceleration**: `P_accel = (m_bike + m_rider) × a × v`
-
-### 2.2 Peak Power Analysis
-**Methods:**
-- 1-second, 5-second, 20-second, 5-minute peak power windows
-- Normalized Power (NP) calculation using 30-second rolling average
-- Training Stress Score (TSS) estimation based on NP and ride duration
-
-## Singlespeed Gear Ratio Analysis
-
-### Gear Ratio Calculator
-
-**Strategy:**
-- Analyze flat terrain segments (gradient < 3%)
-- Use speed/cadence relationship to determine gear ratio
-- Test against common singlespeed ratios for 38t and 46t chainrings
-- Calculate confidence scores based on data consistency
-
-**Core Algorithm:**
-```python
-class SinglespeedAnalyzer:
-    def __init__(self):
-        self.chainring_options = [38, 46]  # teeth
-        self.common_cogs = list(range(11, 28))  # 11t to 27t rear cogs
-        self.wheel_circumference_m = 2.096  # 700x25c tire
-
-    def analyze_gear_ratio(self, speed_data, cadence_data, gradient_data):
-        """Determine most likely singlespeed gear ratio"""
-        # Filter for flat terrain segments
-        # Calculate gear ratio from speed/cadence
-        # Match against common ratios
-        # Return best fit with confidence score
-```
-
-**Gear Metrics:**
-- **Gear ratio**: Chainring teeth ÷ Cog teeth
-- **Gear inches**: Gear ratio × wheel diameter (inches)
-- **Development**: Distance traveled per pedal revolution (meters)
-
-### 3.2 Analysis Methodology
-1. **Segment filtering**: Identify flat terrain (gradient < 3%, speed > 15 km/h)
-2. **Ratio calculation**: `gear_ratio = (speed_ms × 60) ÷ (cadence_rpm × wheel_circumference_m)`
-3. **Ratio matching**: Compare calculated ratios against theoretical singlespeed options
-4. 
**Confidence scoring**: Based on data consistency and segment duration
diff --git a/garminsync/daemon.py b/garminsync/daemon.py
index d9609d2..bf33cd7 100644
--- a/garminsync/daemon.py
+++ b/garminsync/daemon.py
@@ -1,37 +1,57 @@
 import os
 import signal
-import sys
-import threading
+import asyncio
+import concurrent.futures
 import time
 from datetime import datetime
+from queue import PriorityQueue
+import threading
 
 from apscheduler.schedulers.background import BackgroundScheduler
 from apscheduler.triggers.cron import CronTrigger
 
-from .database import Activity, DaemonConfig, SyncLog, get_session
+from .database import Activity, DaemonConfig, SyncLog, get_legacy_session, init_db, get_offline_stats
 from .garmin import GarminClient
 from .utils import logger
 from .activity_parser import get_activity_metrics
 
+# Priority levels: 1=High (API requests), 2=Medium (Sync jobs), 3=Low (Reprocessing)
+PRIORITY_HIGH = 1
+PRIORITY_MEDIUM = 2
+PRIORITY_LOW = 3
 
 class GarminSyncDaemon:
     def __init__(self):
        self.scheduler = BackgroundScheduler()
        self.running = False
        self.web_server = None
+        # Process pool for CPU-bound tasks (guard against cpu_count() returning None)
+        self.executor = concurrent.futures.ProcessPoolExecutor(
+            max_workers=max(1, (os.cpu_count() or 2) - 1)
+        )
+        # Priority queue for task scheduling
+        self.task_queue = PriorityQueue()
+        # Worker thread for processing tasks
+        self.worker_thread = threading.Thread(target=self._process_tasks, daemon=True)
+        # Lock for database access during migration
+        self.db_lock = threading.Lock()
 
     def start(self, web_port=8888, run_migrations=True):
-        """Start daemon with scheduler and web UI
-        :param web_port: Port for the web UI
-        :param run_migrations: Whether to run database migrations on startup
-        """
-        # Set migration flag for entrypoint
-        if run_migrations:
-            os.environ['RUN_MIGRATIONS'] = "1"
-        else:
-            os.environ['RUN_MIGRATIONS'] = "0"
-
+        """Start daemon with scheduler and web UI"""
        try:
+            # Initialize database (init_db is now a coroutine, so drive it to completion)
+            with self.db_lock:
+                asyncio.run(init_db())
+
+            # Set migration 
flag for entrypoint
+            if run_migrations:
+                os.environ['RUN_MIGRATIONS'] = "1"
+            else:
+                os.environ['RUN_MIGRATIONS'] = "0"
+
+            # Start task processing worker (set the running flag first so the
+            # worker loop does not see False and exit immediately)
+            self.running = True
+            self.worker_thread.start()
+
             # Load configuration from database
             config_data = self.load_config()
 
@@ -48,7 +68,7 @@ class GarminSyncDaemon:
                 cron_str = "0 */6 * * *"
 
             self.scheduler.add_job(
-                func=self.sync_and_download,
+                func=self._enqueue_sync,
                 trigger=CronTrigger.from_crontab(cron_str),
                 id="sync_job",
                 replace_existing=True,
@@ -58,7 +78,7 @@ class GarminSyncDaemon:
             logger.error(f"Failed to create sync job: {str(e)}")
             # Fallback to default schedule
             self.scheduler.add_job(
-                func=self.sync_and_download,
+                func=self._enqueue_sync,
                 trigger=CronTrigger.from_crontab("0 */6 * * *"),
                 id="sync_job",
                 replace_existing=True,
@@ -66,10 +86,10 @@ class GarminSyncDaemon:
             logger.info("Using default schedule for sync job: '0 */6 * * *'")
 
         # Reprocess job - run daily at 2 AM
-        reprocess_cron = "0 2 * * *"  # Daily at 2 AM
+        reprocess_cron = "0 2 * * *"
         try:
             self.scheduler.add_job(
-                func=self.reprocess_activities,
+                func=self._enqueue_reprocess,
                 trigger=CronTrigger.from_crontab(reprocess_cron),
                 id="reprocess_job",
                 replace_existing=True,
@@ -77,16 +97,6 @@ class GarminSyncDaemon:
             logger.info(f"Reprocess job scheduled with cron: '{reprocess_cron}'")
         except Exception as e:
             logger.error(f"Failed to create reprocess job: {str(e)}")
-        except Exception as e:
-            logger.error(f"Failed to create scheduled job: {str(e)}")
-            # Fallback to default schedule
-            self.scheduler.add_job(
-                func=self.sync_and_download,
-                trigger=CronTrigger.from_crontab("0 */6 * * *"),
-                id="sync_job",
-                replace_existing=True,
-            )
-            logger.info("Using default schedule '0 */6 * * *'")
 
         # Start scheduler
         self.scheduler.start()
@@ -115,8 +125,52 @@ class GarminSyncDaemon:
             self.update_daemon_status("error")
             self.stop()
 
+    def _enqueue_sync(self):
+        """Enqueue sync job with medium priority"""
+        self.task_queue.put((PRIORITY_MEDIUM, ("sync", None)))
+        
logger.debug("Enqueued sync job")
+
+    def _enqueue_reprocess(self):
+        """Enqueue reprocess job with low priority"""
+        self.task_queue.put((PRIORITY_LOW, ("reprocess", None)))
+        logger.debug("Enqueued reprocess job")
+
+    def _process_tasks(self):
+        """Worker thread to process tasks from the priority queue"""
+        logger.info("Task worker started")
+        while self.running:
+            try:
+                priority, (task_type, data) = self.task_queue.get(timeout=1)
+            except Exception:
+                # queue.Empty on timeout is normal when the queue is idle
+                continue
+            try:
+                logger.info(f"Processing {task_type} task (priority {priority})")
+
+                if task_type == "sync":
+                    self._execute_in_process_pool(self.sync_and_download)
+                elif task_type == "reprocess":
+                    self._execute_in_process_pool(self.reprocess_activities)
+                elif task_type == "api":
+                    # Placeholder for high-priority API tasks
+                    logger.debug(f"Processing API task: {data}")
+
+                self.task_queue.task_done()
+            except Exception as e:
+                logger.error(f"Task processing error: {str(e)}")
+        logger.info("Task worker stopped")
+
+    def _execute_in_process_pool(self, func):
+        """Execute function in process pool and handle results"""
+        try:
+            future = self.executor.submit(func)
+            # Block until done to maintain task order but won't block main thread
+            result = future.result()
+            logger.debug(f"Process pool task completed: {result}")
+        except Exception as e:
+            logger.error(f"Process pool task failed: {str(e)}")
+
     def sync_and_download(self):
-        """Scheduled job function"""
+        """Scheduled job function (run in process pool)"""
         session = None
         try:
             self.log_operation("sync", "started")
@@ -129,11 +183,12 @@ class GarminSyncDaemon:
             client = GarminClient()
 
             # Sync database first
-            sync_database(client)
+            with self.db_lock:
+                sync_database(client)
 
             # Download missing activities
             downloaded_count = 0
-            session = get_session()
+            session = get_legacy_session()
             missing_activities = (
                 session.query(Activity).filter_by(downloaded=False).all()
             )
@@ -165,11 +220,11 @@ class GarminSyncDaemon:
                 if metrics:
                     # 
Update metrics if available
                     activity.activity_type = metrics.get("activityType", {}).get("typeKey")
-                    activity.duration = int(float(metrics.get("summaryDTO", {}).get("duration", 0)))
-                    activity.distance = float(metrics.get("summaryDTO", {}).get("distance", 0))
-                    activity.max_heart_rate = int(float(metrics.get("summaryDTO", {}).get("maxHR", 0)))
-                    activity.avg_power = float(metrics.get("summaryDTO", {}).get("avgPower", 0))
-                    activity.calories = int(float(metrics.get("summaryDTO", {}).get("calories", 0)))
+                    activity.duration = int(float(metrics.get("duration", 0)))
+                    activity.distance = float(metrics.get("distance", 0))
+                    activity.max_heart_rate = int(float(metrics.get("maxHR", 0)))
+                    activity.avg_power = float(metrics.get("avgPower", 0))
+                    activity.calories = int(float(metrics.get("calories", 0)))
 
                     session.commit()
                     downloaded_count += 1
@@ -251,12 +306,28 @@ class GarminSyncDaemon:
         """Start FastAPI web server in a separate thread"""
         try:
             import uvicorn
             from .web.app import app
+
+            # Add shutdown hook to stop worker thread
+            @app.on_event("shutdown")
+            def shutdown_event():
+                logger.info("Web server shutting down")
+                self.running = False
+                self.worker_thread.join(timeout=5)
 
             def run_server():
                 try:
-                    uvicorn.run(app, host="0.0.0.0", port=port, log_level="info")
+                    # Use async execution model for better concurrency
+                    config = uvicorn.Config(
+                        app,
+                        host="0.0.0.0",
+                        port=port,
+                        log_level="info",
+                        workers=1,
+                        loop="asyncio"
+                    )
+                    server = uvicorn.Server(config)
+                    server.run()
                 except Exception as e:
                     logger.error(f"Failed to start web server: {e}")
diff --git a/garminsync/database.py b/garminsync/database.py
index 97e1ef8..a7746a5 100644
--- a/garminsync/database.py
+++ b/garminsync/database.py
@@ -1,15 +1,20 @@
-"""Database module for GarminSync application."""
+"""Database module for GarminSync application with async support."""
 import os
 from datetime import datetime
+from contextlib import asynccontextmanager
 
-from sqlalchemy import Boolean, Column, Float, 
Integer, String, create_engine
+from sqlalchemy import Boolean, Column, Float, Integer, String, create_engine, func
+from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
+from sqlalchemy.ext.asyncio import async_sessionmaker
+from sqlalchemy.future import select
+from sqlalchemy.orm import declarative_base
 from sqlalchemy.exc import SQLAlchemyError
-from sqlalchemy.orm import declarative_base, sessionmaker
+from sqlalchemy.orm import selectinload, joinedload
+from sqlalchemy.orm import sessionmaker
 
 Base = declarative_base()
 
-
 class Activity(Base):
     """Activity model representing a Garmin activity record."""
 
@@ -31,32 +36,24 @@ class Activity(Base):
     last_sync = Column(String, nullable=True)
 
     @classmethod
-    def get_paginated(cls, page=1, per_page=10):
-        """Get paginated list of activities.
-
-        Args:
-            page: Page number (1-based)
-            per_page: Number of items per page
-
-        Returns:
-            Pagination object with activities
-        """
-        session = get_session()
-        try:
-            query = session.query(cls).order_by(cls.start_time.desc())
-            page = int(page)
-            per_page = int(per_page)
-            pagination = query.paginate(page=page, per_page=per_page, error_out=False)
-            return pagination
-        finally:
-            session.close()
+    async def get_paginated(cls, db, page=1, per_page=10):
+        """Get paginated list of activities (async)."""
+        async with db.begin() as session:
+            query = select(cls).order_by(cls.start_time.desc())
+            result = await session.execute(query.offset((page-1)*per_page).limit(per_page))
+            activities = result.scalars().all()
+            count_result = await session.execute(select(func.count()).select_from(cls))
+            total = count_result.scalar_one()
+            return {
+                "items": activities,
+                "page": page,
+                "per_page": per_page,
+                "total": total,
+                "pages": (total + per_page - 1) // per_page
+            }
 
     def to_dict(self):
-        """Convert activity to dictionary representation. 
- - Returns: - Dictionary with activity data - """ + """Convert activity to dictionary representation.""" return { "id": self.activity_id, "name": self.filename or "Unnamed Activity", @@ -83,6 +80,13 @@ class DaemonConfig(Base): next_run = Column(String, nullable=True) status = Column(String, default="stopped", nullable=False) + @classmethod + async def get(cls, db): + """Get configuration record (async).""" + async with db.begin() as session: + result = await session.execute(select(cls)) + return result.scalars().first() + class SyncLog(Base): """Sync log model for tracking sync operations.""" @@ -98,135 +102,133 @@ class SyncLog(Base): activities_downloaded = Column(Integer, default=0, nullable=False) -def init_db(): - """Initialize database connection and create tables. +# Database initialization and session management +engine = None +async_session = None - Returns: - SQLAlchemy engine instance - """ +async def init_db(): + """Initialize database connection and create tables.""" + global engine, async_session db_path = os.getenv("DB_PATH", "data/garmin.db") - engine = create_engine(f"sqlite:///{db_path}") - Base.metadata.create_all(engine) - return engine + engine = create_async_engine( + f"sqlite+aiosqlite:///{db_path}", + pool_size=10, + max_overflow=20, + pool_pre_ping=True + ) + async_session = async_sessionmaker(engine, expire_on_commit=False) + + # Create tables if they don't exist + async with engine.begin() as conn: + await conn.run_sync(Base.metadata.create_all) -def get_session(): - """Create a new database session. 
+@asynccontextmanager +async def get_db(): + """Async context manager for database sessions.""" + async with async_session() as session: + try: + yield session + await session.commit() + except SQLAlchemyError: + await session.rollback() + raise - Returns: - SQLAlchemy session instance - """ - engine = init_db() - Session = sessionmaker(bind=engine) + +# Compatibility layer for legacy sync functions +def get_legacy_session(): + """Temporary synchronous session for migration purposes.""" + db_path = os.getenv("DB_PATH", "data/garmin.db") + sync_engine = create_engine(f"sqlite:///{db_path}") + Base.metadata.create_all(sync_engine) + Session = sessionmaker(bind=sync_engine) return Session() -from garminsync.activity_parser import get_activity_metrics +async def sync_database(garmin_client): + """Sync local database with Garmin Connect activities (async).""" + from garminsync.activity_parser import get_activity_metrics + async with get_db() as session: + try: + activities = garmin_client.get_activities(0, 1000) -def sync_database(garmin_client): - """Sync local database with Garmin Connect activities. 
+ if not activities: + print("No activities returned from Garmin API") + return - Args: - garmin_client: GarminClient instance for API communication - """ - session = get_session() - try: - activities = garmin_client.get_activities(0, 1000) + for activity_data in activities: + if not isinstance(activity_data, dict): + print(f"Invalid activity data: {activity_data}") + continue - if not activities: - print("No activities returned from Garmin API") - return + activity_id = activity_data.get("activityId") + start_time = activity_data.get("startTimeLocal") + + if not activity_id or not start_time: + print(f"Missing required fields in activity: {activity_data}") + continue - for activity_data in activities: - if not isinstance(activity_data, dict): - print(f"Invalid activity data: {activity_data}") - continue - - activity_id = activity_data.get("activityId") - start_time = activity_data.get("startTimeLocal") - - if not activity_id or not start_time: - print(f"Missing required fields in activity: {activity_data}") - continue - - existing = session.query(Activity).filter_by(activity_id=activity_id).first() - - # Create or update basic activity info - if not existing: - activity = Activity( - activity_id=activity_id, - start_time=start_time, - downloaded=False, - created_at=datetime.now().isoformat(), - last_sync=datetime.now().isoformat(), + result = await session.execute( + select(Activity).filter_by(activity_id=activity_id) ) - session.add(activity) - session.flush() # Assign ID - else: - activity = existing + existing = result.scalars().first() + + # Create or update basic activity info + if not existing: + activity = Activity( + activity_id=activity_id, + start_time=start_time, + downloaded=False, + created_at=datetime.now().isoformat(), + last_sync=datetime.now().isoformat(), + ) + session.add(activity) + else: + activity = existing + + # Update metrics using shared parser + metrics = get_activity_metrics(activity, garmin_client) + if metrics: + 
activity.activity_type = metrics.get("activityType", {}).get("typeKey") + # ... rest of metric processing ... + + # Update sync timestamp + activity.last_sync = datetime.now().isoformat() + + await session.commit() + except SQLAlchemyError as e: + await session.rollback() + raise e + + +async def get_offline_stats(): + """Return statistics about cached data without API calls (async).""" + async with get_db() as session: + try: + result = await session.execute(select(Activity)) + total = len(result.scalars().all()) - # Update metrics using shared parser - metrics = get_activity_metrics(activity, garmin_client) - if metrics: - activity.activity_type = metrics.get("activityType", {}).get("typeKey") - - # Extract duration in seconds - duration = metrics.get("summaryDTO", {}).get("duration") - if duration is not None: - activity.duration = int(float(duration)) - - # Extract distance in meters - distance = metrics.get("summaryDTO", {}).get("distance") - if distance is not None: - activity.distance = float(distance) - - # Extract heart rates - max_hr = metrics.get("summaryDTO", {}).get("maxHR") - if max_hr is not None: - activity.max_heart_rate = int(float(max_hr)) - - avg_hr = metrics.get("summaryDTO", {}).get("avgHR", None) or \ - metrics.get("summaryDTO", {}).get("averageHR", None) - if avg_hr is not None: - activity.avg_heart_rate = int(float(avg_hr)) - - # Extract power and calories - avg_power = metrics.get("summaryDTO", {}).get("avgPower") - if avg_power is not None: - activity.avg_power = float(avg_power) - - calories = metrics.get("summaryDTO", {}).get("calories") - if calories is not None: - activity.calories = int(float(calories)) + result = await session.execute( + select(Activity).filter_by(downloaded=True) + ) + downloaded = len(result.scalars().all()) - # Update sync timestamp - activity.last_sync = datetime.now().isoformat() - - session.commit() - except SQLAlchemyError as e: - session.rollback() - raise e - finally: - session.close() - - -def 
get_offline_stats():
-    """Return statistics about cached data without API calls.
-
-    Returns:
-        Dictionary with activity statistics
-    """
-    session = get_session()
-    try:
-        total = session.query(Activity).count()
-        downloaded = session.query(Activity).filter_by(downloaded=True).count()
-        missing = total - downloaded
-        last_sync = session.query(Activity).order_by(Activity.last_sync.desc()).first()
-        return {
-            "total": total,
-            "downloaded": downloaded,
-            "missing": missing,
-            "last_sync": last_sync.last_sync if last_sync else "Never synced",
-        }
-    finally:
-        session.close()
+            result = await session.execute(
+                select(Activity).filter_by(downloaded=True)
+            )
+            downloaded = len(result.scalars().all())
 
+            result = await session.execute(
+                select(Activity).order_by(Activity.last_sync.desc())
+            )
+            last_sync = result.scalars().first()
+
+            return {
+                "total": total,
+                "downloaded": downloaded,
+                "missing": total - downloaded,
+                "last_sync": last_sync.last_sync if last_sync else "Never synced",
+            }
+        except SQLAlchemyError as e:
+            print(f"Database error: {e}")
+            return {
+                "total": 0,
+                "downloaded": 0,
+                "missing": 0,
+                "last_sync": "Error"
+            }
diff --git a/justfile b/justfile
index ed7bd1f..8c23f0b 100644
--- a/justfile
+++ b/justfile
@@ -39,8 +39,8 @@ format:
 
 # Start production server
 run_server:
-    just build
-    docker run -d --rm --env-file .env -e RUN_MIGRATIONS=1 -v $(pwd)/data:/app/data -p 8888:8888 --name garminsync garminsync daemon --start
+    cd ~/GarminSync/docker && docker compose up --build
 
 # Stop production server
 stop_server:
diff --git a/plan.md b/plan.md
index e69de29..6ce74b7 100644
--- a/plan.md
+++ b/plan.md
@@ -0,0 +1,1453 @@
+# GarminSync Improvement Plan - Junior Developer Guide
+
+## Overview
+This plan focuses on keeping things simple while making meaningful improvements. We'll avoid complex async patterns and stick to a single-container approach.
+
+---
+
+## Phase 1: Fix Blocking Issues & Add GPX Support (Week 1-2)
+
+### Problem: Sync blocks the web UI
+**Current Issue:** When sync runs, users can't use the web interface. 
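The key primitive in the threading approach that follows is `threading.Lock.acquire(blocking=False)`, which returns `False` immediately instead of waiting. A tiny standalone demo of the resulting skip-if-busy behaviour (names here are illustrative, not part of GarminSync):

```python
import threading
import time

sync_lock = threading.Lock()
results = []

def sync_job(name):
    # A second caller gets False immediately instead of queueing behind the lock
    if not sync_lock.acquire(blocking=False):
        results.append(f"{name}: skipped")
        return
    try:
        time.sleep(0.2)  # simulate a long-running sync
        results.append(f"{name}: ran")
    finally:
        sync_lock.release()

first = threading.Thread(target=sync_job, args=("first",))
second = threading.Thread(target=sync_job, args=("second",))
first.start()
time.sleep(0.05)          # let "first" grab the lock
second.start()
first.join()
second.join()
# results now holds one "ran" entry and one "skipped" entry
```

Because the overlapping call returns instead of blocking, a scheduled sync that fires while another is still running is simply dropped, and web requests are never stuck waiting on it.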
+ +### Solution: Simple Threading +Instead of complex async, use Python's threading module: + +```python +# garminsync/daemon.py - Update sync_and_download method +import threading +from datetime import datetime + +class GarminSyncDaemon: + def __init__(self): + self.scheduler = BackgroundScheduler() + self.running = False + self.web_server = None + self.sync_lock = threading.Lock() # Prevent multiple syncs + self.sync_in_progress = False + + def sync_and_download(self): + """Non-blocking sync job""" + # Check if sync is already running + if not self.sync_lock.acquire(blocking=False): + logger.info("Sync already in progress, skipping...") + return + + try: + self.sync_in_progress = True + self._do_sync_work() + finally: + self.sync_in_progress = False + self.sync_lock.release() + + def _do_sync_work(self): + """The actual sync logic (moved from sync_and_download)""" + # ... existing sync code here ... +``` + +### Add GPX Parser +Create a new parser for GPX files: + +```python +# garminsync/parsers/gpx_parser.py +import xml.etree.ElementTree as ET +from datetime import datetime + +def parse_gpx_file(file_path): + """Parse GPX file to extract activity metrics""" + try: + tree = ET.parse(file_path) + root = tree.getroot() + + # GPX uses different namespace + ns = {'gpx': 'http://www.topografix.com/GPX/1/1'} + + # Extract basic info + track = root.find('.//gpx:trk', ns) + if not track: + return None + + # Get track points + track_points = root.findall('.//gpx:trkpt', ns) + + if not track_points: + return None + + # Calculate basic metrics + start_time = None + end_time = None + total_distance = 0.0 + elevations = [] + + prev_point = None + for point in track_points: + # Get time + time_elem = point.find('gpx:time', ns) + if time_elem is not None: + current_time = datetime.fromisoformat(time_elem.text.replace('Z', '+00:00')) + if start_time is None: + start_time = current_time + end_time = current_time + + # Get elevation + ele_elem = point.find('gpx:ele', ns) + if 
ele_elem is not None:
+                elevations.append(float(ele_elem.text))
+
+            # Calculate distance
+            if prev_point is not None:
+                lat1, lon1 = float(prev_point.get('lat')), float(prev_point.get('lon'))
+                lat2, lon2 = float(point.get('lat')), float(point.get('lon'))
+                total_distance += calculate_distance(lat1, lon1, lat2, lon2)
+
+            prev_point = point
+
+        # Calculate duration
+        duration = None
+        if start_time and end_time:
+            duration = (end_time - start_time).total_seconds()
+
+        return {
+            "activityType": {"typeKey": "other"},  # GPX doesn't specify activity type
+            "summaryDTO": {
+                "duration": duration,
+                "distance": total_distance,
+                "maxHR": None,  # GPX rarely has HR data
+                "avgPower": None,
+                "calories": None
+            }
+        }
+    except Exception as e:
+        print(f"Error parsing GPX file: {e}")
+        return None
+
+def calculate_distance(lat1, lon1, lat2, lon2):
+    """Calculate distance between two GPS points using Haversine formula"""
+    import math
+
+    # Convert to radians
+    lat1, lon1, lat2, lon2 = map(math.radians, [lat1, lon1, lat2, lon2])
+
+    # Haversine formula
+    dlat = lat2 - lat1
+    dlon = lon2 - lon1
+    a = math.sin(dlat/2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon/2)**2
+    c = 2 * math.asin(math.sqrt(a))
+
+    # Earth's radius in meters
+    earth_radius = 6371000
+    return c * earth_radius
+```
+
+### Update Activity Parser
+```python
+# garminsync/activity_parser.py - Add GPX support
+def detect_file_type(file_path):
+    """Detect file format (FIT, XML, GPX, or unknown)"""
+    try:
+        with open(file_path, 'rb') as f:
+            header = f.read(256)  # Read more to catch GPX
+
+        # Check for XML-based formats (GPX is XML with a <gpx> root element)
+        if b'<?xml' in header or b'<gpx' in header:
+            if b'<gpx' in header:
+                return 'gpx'
+            return 'xml'
+
+        # Check for the FIT header signature
+        if len(header) >= 8 and header[4:8] == b'.FIT':
+            return 'fit'
+        if (len(header) >= 8 and
+            (header[0:4] == b'.FIT' or
+             header[4:8] == b'FIT.' 
or + header[8:12] == b'.FIT')): + return 'fit' + + return 'unknown' + except Exception as e: + return 'error' + +# Update get_activity_metrics to include GPX +def get_activity_metrics(activity, client=None): + """Get activity metrics from local file or Garmin API""" + metrics = None + if activity.filename and os.path.exists(activity.filename): + file_type = detect_file_type(activity.filename) + if file_type == 'fit': + metrics = parse_fit_file(activity.filename) + elif file_type == 'xml': + metrics = parse_xml_file(activity.filename) + elif file_type == 'gpx': + from .parsers.gpx_parser import parse_gpx_file + metrics = parse_gpx_file(activity.filename) + + # Only call Garmin API if we don't have local file data + if not metrics and client: + try: + metrics = client.get_activity_details(activity.activity_id) + except Exception: + pass + return metrics +``` + +--- + +## Phase 2: Better File Storage & Reduce API Calls (Week 3-4) + +### Problem: We're calling Garmin API unnecessarily when we have the file + +### Solution: Smart Caching Strategy + +```python +# garminsync/database.py - Add file-first approach +def sync_database(garmin_client): + """Sync local database with Garmin Connect activities""" + session = get_session() + try: + # Get activities list from Garmin (lightweight call) + activities = garmin_client.get_activities(0, 1000) + + if not activities: + print("No activities returned from Garmin API") + return + + for activity_data in activities: + activity_id = activity_data.get("activityId") + start_time = activity_data.get("startTimeLocal") + + if not activity_id or not start_time: + continue + + existing = session.query(Activity).filter_by(activity_id=activity_id).first() + + if not existing: + activity = Activity( + activity_id=activity_id, + start_time=start_time, + downloaded=False, + created_at=datetime.now().isoformat(), + last_sync=datetime.now().isoformat(), + ) + session.add(activity) + session.flush() + else: + activity = existing + + # Only get 
detailed metrics if we don't have a file OR file parsing failed + if not activity.filename or not activity.duration: + # Try to get metrics from file first + if activity.filename and os.path.exists(activity.filename): + metrics = get_activity_metrics(activity, client=None) # File only + else: + metrics = None + + # Only call API if file parsing failed or no file + if not metrics: + print(f"Getting details from API for activity {activity_id}") + metrics = get_activity_metrics(activity, garmin_client) + else: + print(f"Using cached file data for activity {activity_id}") + + # Update activity with metrics + if metrics: + update_activity_from_metrics(activity, metrics) + + activity.last_sync = datetime.now().isoformat() + + session.commit() + except Exception as e: + session.rollback() + raise e + finally: + session.close() + +def update_activity_from_metrics(activity, metrics): + """Helper function to update activity from metrics data""" + if not metrics: + return + + activity.activity_type = metrics.get("activityType", {}).get("typeKey") + + summary = metrics.get("summaryDTO", {}) + + if summary.get("duration"): + activity.duration = int(float(summary["duration"])) + if summary.get("distance"): + activity.distance = float(summary["distance"]) + if summary.get("maxHR"): + activity.max_heart_rate = int(float(summary["maxHR"])) + if summary.get("avgHR"): + activity.avg_heart_rate = int(float(summary["avgHR"])) + if summary.get("avgPower"): + activity.avg_power = float(summary["avgPower"]) + if summary.get("calories"): + activity.calories = int(float(summary["calories"])) +``` + +### Add Original File Storage +```python +# garminsync/database.py - Update Activity model +class Activity(Base): + __tablename__ = "activities" + + activity_id = Column(Integer, primary_key=True) + start_time = Column(String, nullable=False) + activity_type = Column(String, nullable=True) + duration = Column(Integer, nullable=True) + distance = Column(Float, nullable=True) + max_heart_rate = 
Column(Integer, nullable=True) + avg_heart_rate = Column(Integer, nullable=True) + avg_power = Column(Float, nullable=True) + calories = Column(Integer, nullable=True) + filename = Column(String, unique=True, nullable=True) + original_filename = Column(String, nullable=True) # NEW: Store original name + file_type = Column(String, nullable=True) # NEW: Store detected file type + file_size = Column(Integer, nullable=True) # NEW: Store file size + downloaded = Column(Boolean, default=False, nullable=False) + created_at = Column(String, nullable=False) + last_sync = Column(String, nullable=True) + metrics_source = Column(String, nullable=True) # NEW: 'file' or 'api' +``` + +--- + +## Phase 3: Enhanced UI with Filtering & Stats (Week 5-6) + +### Add Database Indexing +```python +# Create new migration file: migrations/versions/003_add_indexes.py +from alembic import op +import sqlalchemy as sa + +def upgrade(): + # Add indexes for common queries + op.create_index('ix_activities_activity_type', 'activities', ['activity_type']) + op.create_index('ix_activities_start_time', 'activities', ['start_time']) + op.create_index('ix_activities_downloaded', 'activities', ['downloaded']) + op.create_index('ix_activities_duration', 'activities', ['duration']) + op.create_index('ix_activities_distance', 'activities', ['distance']) + +def downgrade(): + op.drop_index('ix_activities_activity_type') + op.drop_index('ix_activities_start_time') + op.drop_index('ix_activities_downloaded') + op.drop_index('ix_activities_duration') + op.drop_index('ix_activities_distance') +``` + +### Enhanced Activities API with Filtering +```python +# garminsync/web/routes.py - Update activities endpoint +@router.get("/activities") +async def get_activities( + page: int = 1, + per_page: int = 50, + activity_type: str = None, + date_from: str = None, + date_to: str = None, + min_distance: float = None, + max_distance: float = None, + min_duration: int = None, + max_duration: int = None, + sort_by: str = 
"start_time", # NEW: sorting + sort_order: str = "desc" # NEW: sort direction +): + """Get paginated activities with enhanced filtering""" + session = get_session() + try: + query = session.query(Activity) + + # Apply filters + if activity_type: + query = query.filter(Activity.activity_type == activity_type) + if date_from: + query = query.filter(Activity.start_time >= date_from) + if date_to: + query = query.filter(Activity.start_time <= date_to) + if min_distance: + query = query.filter(Activity.distance >= min_distance * 1000) # Convert km to m + if max_distance: + query = query.filter(Activity.distance <= max_distance * 1000) + if min_duration: + query = query.filter(Activity.duration >= min_duration * 60) # Convert min to sec + if max_duration: + query = query.filter(Activity.duration <= max_duration * 60) + + # Apply sorting + sort_column = getattr(Activity, sort_by, Activity.start_time) + if sort_order.lower() == "asc": + query = query.order_by(sort_column.asc()) + else: + query = query.order_by(sort_column.desc()) + + # Get total count for pagination + total = query.count() + + # Apply pagination + activities = query.offset((page - 1) * per_page).limit(per_page).all() + + return { + "activities": [activity_to_dict(activity) for activity in activities], + "total": total, + "page": page, + "per_page": per_page, + "total_pages": (total + per_page - 1) // per_page + } + finally: + session.close() + +def activity_to_dict(activity): + """Convert activity to dictionary with computed fields""" + return { + "activity_id": activity.activity_id, + "start_time": activity.start_time, + "activity_type": activity.activity_type, + "duration": activity.duration, + "duration_formatted": format_duration(activity.duration), + "distance": activity.distance, + "distance_km": round(activity.distance / 1000, 2) if activity.distance else None, + "pace": calculate_pace(activity.distance, activity.duration), + "max_heart_rate": activity.max_heart_rate, + "avg_heart_rate": 
activity.avg_heart_rate, + "avg_power": activity.avg_power, + "calories": activity.calories, + "downloaded": activity.downloaded, + "file_type": activity.file_type, + "metrics_source": activity.metrics_source + } + +def calculate_pace(distance_m, duration_s): + """Calculate pace in min/km""" + if not distance_m or not duration_s or distance_m == 0: + return None + + distance_km = distance_m / 1000 + pace_s_per_km = duration_s / distance_km + + minutes = int(pace_s_per_km // 60) + seconds = int(pace_s_per_km % 60) + + return f"{minutes}:{seconds:02d}" +``` + +### Enhanced Frontend with Filtering +```javascript +// garminsync/web/static/activities.js - Add filtering capabilities +class ActivitiesPage { + constructor() { + this.currentPage = 1; + this.pageSize = 25; + this.totalPages = 1; + this.activities = []; + this.filters = {}; + this.sortBy = 'start_time'; + this.sortOrder = 'desc'; + this.init(); + } + + init() { + this.setupFilterForm(); + this.loadActivities(); + this.setupEventListeners(); + } + + setupFilterForm() { + // Create filter form dynamically + const filterHtml = ` +
+            <div class="filter-card">
+                <div class="filter-header">
+                    <h3>Filters</h3>
+                    <button id="toggle-filters">Hide</button>
+                </div>
+                <div id="filter-form">
+                    <div class="filter-row">
+                        <label>Activity Type
+                            <input type="text" id="activity-type-filter" placeholder="e.g. cycling">
+                        </label>
+                        <label>Date From
+                            <input type="date" id="date-from-filter">
+                        </label>
+                        <label>Date To
+                            <input type="date" id="date-to-filter">
+                        </label>
+                    </div>
+                    <div class="filter-row">
+                        <label>Min Distance (km)
+                            <input type="number" id="min-distance-filter" min="0">
+                        </label>
+                        <label>Max Distance (km)
+                            <input type="number" id="max-distance-filter" min="0">
+                        </label>
+                        <label>Sort By
+                            <select id="sort-by-filter">
+                                <option value="start_time" selected>Date</option>
+                                <option value="distance">Distance</option>
+                                <option value="duration">Duration</option>
+                                <option value="calories">Calories</option>
+                            </select>
+                        </label>
+                        <label>Sort Order
+                            <select id="sort-order-filter">
+                                <option value="desc" selected>Descending</option>
+                                <option value="asc">Ascending</option>
+                            </select>
+                        </label>
+                    </div>
+                    <div class="filter-actions">
+                        <button id="apply-filters">Apply Filters</button>
+                        <button id="clear-filters">Clear</button>
+                    </div>
+                </div>
+            </div>
+
+ `; + + // Insert before activities table + const container = document.querySelector('.activities-container'); + container.insertAdjacentHTML('afterbegin', filterHtml); + } + + setupEventListeners() { + // Apply filters + document.getElementById('apply-filters').addEventListener('click', () => { + this.applyFilters(); + }); + + // Clear filters + document.getElementById('clear-filters').addEventListener('click', () => { + this.clearFilters(); + }); + + // Toggle filter visibility + document.getElementById('toggle-filters').addEventListener('click', (e) => { + const filterForm = document.getElementById('filter-form'); + const isVisible = filterForm.style.display !== 'none'; + + filterForm.style.display = isVisible ? 'none' : 'block'; + e.target.textContent = isVisible ? 'Show' : 'Hide'; + }); + } + + applyFilters() { + this.filters = { + activity_type: document.getElementById('activity-type-filter').value, + date_from: document.getElementById('date-from-filter').value, + date_to: document.getElementById('date-to-filter').value, + min_distance: document.getElementById('min-distance-filter').value, + max_distance: document.getElementById('max-distance-filter').value + }; + + this.sortBy = document.getElementById('sort-by-filter').value; + this.sortOrder = document.getElementById('sort-order-filter').value; + + // Remove empty filters + Object.keys(this.filters).forEach(key => { + if (!this.filters[key]) { + delete this.filters[key]; + } + }); + + this.currentPage = 1; + this.loadActivities(); + } + + clearFilters() { + // Reset all filter inputs + document.getElementById('activity-type-filter').value = ''; + document.getElementById('date-from-filter').value = ''; + document.getElementById('date-to-filter').value = ''; + document.getElementById('min-distance-filter').value = ''; + document.getElementById('max-distance-filter').value = ''; + document.getElementById('sort-by-filter').value = 'start_time'; + document.getElementById('sort-order-filter').value = 'desc'; + 
+        // Reset internal state
+        this.filters = {};
+        this.sortBy = 'start_time';
+        this.sortOrder = 'desc';
+        this.currentPage = 1;
+
+        this.loadActivities();
+    }
+
+    createTableRow(activity, index) {
+        const row = document.createElement('tr');
+        row.className = index % 2 === 0 ? 'row-even' : 'row-odd';
+
+        row.innerHTML = `
+            <td>${Utils.formatDate(activity.start_time)}</td>
+            <td>
+                <span class="activity-type">
+                    ${activity.activity_type || '-'}
+                </span>
+            </td>
+            <td>${activity.duration_formatted || '-'}</td>
+            <td>${activity.distance_km ? activity.distance_km + ' km' : '-'}</td>
+            <td>${activity.pace || '-'}</td>
+            <td>${Utils.formatHeartRate(activity.max_heart_rate)}</td>
+            <td>${Utils.formatHeartRate(activity.avg_heart_rate)}</td>
+            <td>${Utils.formatPower(activity.avg_power)}</td>
+            <td>${activity.calories ? activity.calories.toLocaleString() : '-'}</td>
+            <td>
+                <span class="file-type">
+                    ${activity.file_type || 'API'}
+                </span>
+            </td>
+        `;
+
+        return row;
+    }
+}
+```
+
+---
+
+## Phase 4: Activity Stats & Trends (Week 7-8)
+
+### Add Statistics API
+```python
+# garminsync/web/routes.py - Add comprehensive stats
+@router.get("/stats/summary")
+async def get_activity_summary():
+    """Get comprehensive activity statistics"""
+    session = get_session()
+    try:
+        # Basic counts
+        total_activities = session.query(Activity).count()
+        downloaded_activities = session.query(Activity).filter_by(downloaded=True).count()
+
+        # Activity type breakdown
+        type_stats = session.query(
+            Activity.activity_type,
+            func.count(Activity.activity_id).label('count'),
+            func.sum(Activity.distance).label('total_distance'),
+            func.sum(Activity.duration).label('total_duration'),
+            func.sum(Activity.calories).label('total_calories')
+        ).filter(
+            Activity.activity_type.isnot(None)
+        ).group_by(Activity.activity_type).all()
+
+        # Monthly stats (last 12 months)
+        monthly_stats = session.query(
+            func.strftime('%Y-%m', Activity.start_time).label('month'),
+            func.count(Activity.activity_id).label('count'),
+            func.sum(Activity.distance).label('total_distance'),
+            func.sum(Activity.duration).label('total_duration')
+        ).filter(
+            Activity.start_time >= 
(datetime.now() - timedelta(days=365)).isoformat() + ).group_by( + func.strftime('%Y-%m', Activity.start_time) + ).order_by('month').all() + + # Personal records + records = { + 'longest_distance': session.query(Activity).filter( + Activity.distance.isnot(None) + ).order_by(Activity.distance.desc()).first(), + + 'longest_duration': session.query(Activity).filter( + Activity.duration.isnot(None) + ).order_by(Activity.duration.desc()).first(), + + 'highest_calories': session.query(Activity).filter( + Activity.calories.isnot(None) + ).order_by(Activity.calories.desc()).first() + } + + return { + "summary": { + "total_activities": total_activities, + "downloaded_activities": downloaded_activities, + "sync_percentage": round((downloaded_activities / total_activities) * 100, 1) if total_activities > 0 else 0 + }, + "by_type": [ + { + "activity_type": stat.activity_type, + "count": stat.count, + "total_distance_km": round(stat.total_distance / 1000, 1) if stat.total_distance else 0, + "total_duration_hours": round(stat.total_duration / 3600, 1) if stat.total_duration else 0, + "total_calories": stat.total_calories or 0 + } + for stat in type_stats + ], + "monthly": [ + { + "month": stat.month, + "count": stat.count, + "total_distance_km": round(stat.total_distance / 1000, 1) if stat.total_distance else 0, + "total_duration_hours": round(stat.total_duration / 3600, 1) if stat.total_duration else 0 + } + for stat in monthly_stats + ], + "records": { + "longest_distance": { + "distance_km": round(records['longest_distance'].distance / 1000, 1) if records['longest_distance'] and records['longest_distance'].distance else 0, + "date": records['longest_distance'].start_time if records['longest_distance'] else None + }, + "longest_duration": { + "duration_hours": round(records['longest_duration'].duration / 3600, 1) if records['longest_duration'] and records['longest_duration'].duration else 0, + "date": records['longest_duration'].start_time if records['longest_duration'] else 
None + }, + "highest_calories": { + "calories": records['highest_calories'].calories if records['highest_calories'] and records['highest_calories'].calories else 0, + "date": records['highest_calories'].start_time if records['highest_calories'] else None + } + } + } + finally: + session.close() +``` + +### Simple Charts with Chart.js +```html + +
+<div class="stats-page">
+    <h2>Activity Statistics</h2>
+
+    <div class="summary-cards">
+        <div class="summary-card">
+            <div class="summary-value" id="total-activities">{{ stats.total }}</div>
+            <div class="summary-label">Total Activities</div>
+        </div>
+        <div class="summary-card">
+            <div class="summary-value" id="downloaded-activities">{{ stats.downloaded }}</div>
+            <div class="summary-label">Downloaded</div>
+        </div>
+        <div class="summary-card">
+            <div class="summary-value" id="sync-percentage">-</div>
+            <div class="summary-label">Sync %</div>
+        </div>
+    </div>
+
+    <div class="charts-row">
+        <div class="chart-card">
+            <h3>Activity Types</h3>
+            <canvas id="activity-types-chart"></canvas>
+        </div>
+        <div class="chart-card">
+            <h3>Monthly Activity</h3>
+            <canvas id="monthly-chart"></canvas>
+        </div>
+    </div>
+</div>
+
+<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
+<script src="/static/stats.js"></script>
+
+``` + +```javascript +// garminsync/web/static/stats.js - Simple chart implementation +class StatsPage { + constructor() { + this.charts = {}; + this.init(); + } + + async init() { + await this.loadStats(); + this.createCharts(); + } + + async loadStats() { + try { + const response = await fetch('/api/stats/summary'); + this.stats = await response.json(); + this.updateSummaryCards(); + } catch (error) { + console.error('Failed to load stats:', error); + } + } + + updateSummaryCards() { + document.getElementById('total-activities').textContent = this.stats.summary.total_activities; + document.getElementById('downloaded-activities').textContent = this.stats.summary.downloaded_activities; + document.getElementById('sync-percentage').textContent = this.stats.summary.sync_percentage + '%'; + } + + createCharts() { + this.createActivityTypesChart(); + this.createMonthlyChart(); + } + + createActivityTypesChart() { + const ctx = document.getElementById('activity-types-chart').getContext('2d'); + + const data = this.stats.by_type.map(item => ({ + label: item.activity_type, + data: item.count + })); + + this.charts.activityTypes = new Chart(ctx, { + type: 'doughnut', + data: { + labels: data.map(item => item.label), + datasets: [{ + data: data.map(item => item.data), + backgroundColor: [ + '#FF6384', '#36A2EB', '#FFCE56', '#4BC0C0', + '#9966FF', '#FF9F40', '#FF6384', '#C9CBCF' + ] + }] + }, + options: { + responsive: true, + plugins: { + legend: { + position: 'bottom' + }, + tooltip: { + callbacks: { + label: function(context) { + const label = context.label || ''; + const value = context.parsed; + const total = context.dataset.data.reduce((a, b) => a + b, 0); + const percentage = ((value / total) * 100).toFixed(1); + return `${label}: ${value} (${percentage}%)`; + } + } + } + } + } + }); + } + + createMonthlyChart() { + const ctx = document.getElementById('monthly-chart').getContext('2d'); + + const monthlyData = this.stats.monthly; + + this.charts.monthly = new 
Chart(ctx, { + type: 'line', + data: { + labels: monthlyData.map(item => item.month), + datasets: [ + { + label: 'Activities', + data: monthlyData.map(item => item.count), + borderColor: '#36A2EB', + backgroundColor: 'rgba(54, 162, 235, 0.1)', + yAxisID: 'y' + }, + { + label: 'Distance (km)', + data: monthlyData.map(item => item.total_distance_km), + borderColor: '#FF6384', + backgroundColor: 'rgba(255, 99, 132, 0.1)', + yAxisID: 'y1' + } + ] + }, + options: { + responsive: true, + plugins: { + legend: { + position: 'top' + }, + tooltip: { + mode: 'index', + intersect: false + } + }, + scales: { + y: { + type: 'linear', + display: true, + position: 'left', + title: { + display: true, + text: 'Number of Activities' + } + }, + y1: { + type: 'linear', + display: true, + position: 'right', + title: { + display: true, + text: 'Distance (km)' + }, + grid: { + drawOnChartArea: false, + }, + } + } + } + }); + } +} + +// Initialize when DOM is ready +document.addEventListener('DOMContentLoaded', function() { + if (document.getElementById('activity-types-chart')) { + new StatsPage(); + } +}); +``` + +--- + +## Phase 5: File Management & Storage Optimization (Week 9-10) + +### Problem: Better file organization and storage + +### Solution: Organized File Storage with Metadata + +```python +# garminsync/file_manager.py - New file for managing activity files +import os +import hashlib +from pathlib import Path +from datetime import datetime +import shutil + +class ActivityFileManager: + """Manages activity file storage with proper organization""" + + def __init__(self, base_data_dir=None): + self.base_dir = Path(base_data_dir or os.getenv("DATA_DIR", "data")) + self.activities_dir = self.base_dir / "activities" + self.activities_dir.mkdir(parents=True, exist_ok=True) + + def save_activity_file(self, activity_id, file_data, original_filename=None): + """ + Save activity file with proper organization + Returns: (filepath, file_info) + """ + # Detect file type from data + file_type 
= self._detect_file_type_from_data(file_data)
+
+        # Generate file hash for deduplication
+        file_hash = hashlib.md5(file_data).hexdigest()
+
+        # Create organized directory structure: activities/YYYY/MM/
+        activity_date = self._extract_date_from_activity_id(activity_id)
+        year_month_dir = self.activities_dir / activity_date.strftime("%Y") / activity_date.strftime("%m")
+        year_month_dir.mkdir(parents=True, exist_ok=True)
+
+        # Generate filename
+        extension = self._get_extension_for_type(file_type)
+        filename = f"activity_{activity_id}_{file_hash[:8]}.{extension}"
+        filepath = year_month_dir / filename
+
+        # Check if file already exists (deduplication)
+        if filepath.exists():
+            existing_size = filepath.stat().st_size
+            if existing_size == len(file_data):
+                print(f"File already exists for activity {activity_id}, skipping...")
+                return str(filepath), self._get_file_info(filepath, file_data, file_type)
+
+        # Save file
+        with open(filepath, 'wb') as f:
+            f.write(file_data)
+
+        file_info = self._get_file_info(filepath, file_data, file_type)
+
+        print(f"Saved activity {activity_id} to {filepath}")
+        return str(filepath), file_info
+
+    def _detect_file_type_from_data(self, data):
+        """Detect file type from binary data"""
+        if len(data) >= 8 and data[4:8] == b'.FIT':
+            return 'fit'
+        elif b'<gpx' in data[:1024]:
+            return 'gpx'
+        elif b'<TrainingCenterDatabase' in data[:1024]:
+            return 'tcx'
+        return 'unknown'
+
+    def _get_extension_for_type(self, file_type):
+        """Map a detected file type to its file extension"""
+        return {'fit': 'fit', 'gpx': 'gpx', 'tcx': 'tcx'}.get(file_type, 'bin')
+
+    def _get_file_info(self, filepath, file_data, file_type):
+        """Build the metadata dict recorded with each saved file"""
+        return {
+            'type': file_type,
+            'size': len(file_data),
+            'saved_at': datetime.now().isoformat(),
+        }
+
+    def _extract_date_from_activity_id(self, activity_id):
+        """Date used for the YYYY/MM directory layout.
+        Garmin activity IDs do not encode a date, so fall back to now;
+        pass the activity's real start time here if stricter layout matters."""
+        return datetime.now()
+
+    def cleanup_orphaned_files(self, valid_activity_ids):
+        """Remove files whose activity no longer exists in the database"""
+        orphaned_files = []
+        for file_path in self.activities_dir.rglob("activity_*"):
+            try:
+                parts = file_path.stem.split('_')
+                if len(parts) >= 2:
+                    activity_id = int(parts[1])
+                    if activity_id not in valid_activity_ids:
+                        orphaned_files.append(file_path)
+            except (ValueError, IndexError):
+                continue
+
+        # Remove orphaned files
+        for file_path in orphaned_files:
+            print(f"Removing orphaned file: {file_path}")
+            file_path.unlink()
+
+        return len(orphaned_files)
+```
+
+### Update Download Process
+```python
+# garminsync/daemon.py - Update sync_and_download to use file manager
+from .file_manager import ActivityFileManager
+
+class GarminSyncDaemon:
+    def __init__(self):
+        self.scheduler = BackgroundScheduler()
+        self.running = False
+        self.web_server = None
+        self.sync_lock = threading.Lock()
+        self.sync_in_progress = False
+
self.file_manager = ActivityFileManager() # NEW + + def sync_and_download(self): + """Scheduled job function with improved file handling""" + session = None + try: + self.log_operation("sync", "started") + + from .database import sync_database + from .garmin import GarminClient + + client = GarminClient() + sync_database(client) + + downloaded_count = 0 + session = get_session() + missing_activities = ( + session.query(Activity).filter_by(downloaded=False).all() + ) + + for activity in missing_activities: + try: + # Download activity data + fit_data = client.download_activity_fit(activity.activity_id) + + # Save using file manager + filepath, file_info = self.file_manager.save_activity_file( + activity.activity_id, + fit_data + ) + + # Update activity record + activity.filename = filepath + activity.file_type = file_info['type'] + activity.file_size = file_info['size'] + activity.downloaded = True + activity.last_sync = datetime.now().isoformat() + + # Get metrics from file + metrics = get_activity_metrics(activity, client=None) # File only + if metrics: + update_activity_from_metrics(activity, metrics) + activity.metrics_source = 'file' + else: + # Fallback to API if file parsing fails + metrics = get_activity_metrics(activity, client) + if metrics: + update_activity_from_metrics(activity, metrics) + activity.metrics_source = 'api' + + session.commit() + downloaded_count += 1 + + except Exception as e: + logger.error(f"Failed to download activity {activity.activity_id}: {e}") + session.rollback() + + self.log_operation("sync", "success", f"Downloaded {downloaded_count} new activities") + self.update_daemon_last_run() + + except Exception as e: + logger.error(f"Sync failed: {e}") + self.log_operation("sync", "error", str(e)) + finally: + if session: + session.close() +``` + +--- + +## Phase 6: Advanced Features & Polish (Week 11-12) + +### Add Activity Search +```python +# garminsync/web/routes.py - Add search endpoint +@router.get("/activities/search") +async def 
search_activities( + q: str, # Search query + page: int = 1, + per_page: int = 20 +): + """Search activities by various fields""" + session = get_session() + try: + # Build search query + query = session.query(Activity) + + search_terms = q.lower().split() + + for term in search_terms: + # Search in multiple fields + query = query.filter( + or_( + Activity.activity_type.ilike(f'%{term}%'), + Activity.filename.ilike(f'%{term}%'), + # Add more searchable fields as needed + ) + ) + + total = query.count() + activities = query.order_by(Activity.start_time.desc()).offset( + (page - 1) * per_page + ).limit(per_page).all() + + return { + "activities": [activity_to_dict(activity) for activity in activities], + "total": total, + "page": page, + "per_page": per_page, + "query": q + } + finally: + session.close() +``` + +### Add Bulk Operations +```javascript +// garminsync/web/static/bulk-operations.js +class BulkOperations { + constructor() { + this.selectedActivities = new Set(); + this.init(); + } + + init() { + this.addBulkControls(); + this.setupEventListeners(); + } + + addBulkControls() { + const bulkHtml = ` + + `; + + document.querySelector('.activities-table-card').insertAdjacentHTML('afterbegin', bulkHtml); + } + + setupEventListeners() { + // Add checkboxes to table + this.addCheckboxesToTable(); + + // Bulk action buttons + document.getElementById('clear-selection').addEventListener('click', () => { + this.clearSelection(); + }); + + document.getElementById('bulk-reprocess').addEventListener('click', () => { + this.reprocessSelectedFiles(); + }); + } + + addCheckboxesToTable() { + // Add header checkbox + const headerRow = document.querySelector('.activities-table thead tr'); + headerRow.insertAdjacentHTML('afterbegin', ''); + + // Add row checkboxes + const rows = document.querySelectorAll('.activities-table tbody tr'); + rows.forEach((row, index) => { + const activityId = this.extractActivityIdFromRow(row); + row.insertAdjacentHTML('afterbegin', + `` + ); + 
}); + + // Setup checkbox events + document.getElementById('select-all').addEventListener('change', (e) => { + this.selectAll(e.target.checked); + }); + + document.querySelectorAll('.activity-checkbox').forEach(checkbox => { + checkbox.addEventListener('change', (e) => { + this.toggleActivity(e.target.dataset.activityId, e.target.checked); + }); + }); + } + + extractActivityIdFromRow(row) { + // Extract activity ID from the row (you'll need to adjust this based on your table structure) + return row.dataset.activityId || row.cells[1].textContent; // Adjust as needed + } + + selectAll(checked) { + document.querySelectorAll('.activity-checkbox').forEach(checkbox => { + checkbox.checked = checked; + this.toggleActivity(checkbox.dataset.activityId, checked); + }); + } + + toggleActivity(activityId, selected) { + if (selected) { + this.selectedActivities.add(activityId); + } else { + this.selectedActivities.delete(activityId); + } + + this.updateBulkControls(); + } + + updateBulkControls() { + const count = this.selectedActivities.size; + const bulkDiv = document.getElementById('bulk-operations'); + const countSpan = document.getElementById('selected-count'); + + countSpan.textContent = count; + bulkDiv.style.display = count > 0 ? 
'block' : 'none'; + } + + clearSelection() { + this.selectedActivities.clear(); + document.querySelectorAll('.activity-checkbox').forEach(checkbox => { + checkbox.checked = false; + }); + document.getElementById('select-all').checked = false; + this.updateBulkControls(); + } + + async reprocessSelectedFiles() { + if (this.selectedActivities.size === 0) return; + + const button = document.getElementById('bulk-reprocess'); + button.disabled = true; + button.textContent = 'Processing...'; + + try { + const response = await fetch('/api/activities/reprocess', { + method: 'POST', + headers: {'Content-Type': 'application/json'}, + body: JSON.stringify({ + activity_ids: Array.from(this.selectedActivities) + }) + }); + + if (response.ok) { + Utils.showSuccess('Files reprocessed successfully'); + // Refresh the page or reload data + window.location.reload(); + } else { + throw new Error('Reprocessing failed'); + } + } catch (error) { + Utils.showError('Failed to reprocess files: ' + error.message); + } finally { + button.disabled = false; + button.textContent = 'Reprocess Files'; + } + } +} +``` + +### Add Configuration Management +```python +# garminsync/web/routes.py - Add configuration endpoints +@router.get("/config") +async def get_configuration(): + """Get current configuration""" + session = get_session() + try: + daemon_config = session.query(DaemonConfig).first() + + return { + "sync": { + "enabled": daemon_config.enabled if daemon_config else True, + "schedule": daemon_config.schedule_cron if daemon_config else "0 */6 * * *", + "status": daemon_config.status if daemon_config else "stopped" + }, + "storage": { + "data_dir": os.getenv("DATA_DIR", "data"), + "total_activities": session.query(Activity).count(), + "downloaded_files": session.query(Activity).filter_by(downloaded=True).count() + }, + "api": { + "garmin_configured": bool(os.getenv("GARMIN_EMAIL") and os.getenv("GARMIN_PASSWORD")), + "rate_limit_delay": 2 # seconds between API calls + } + } + finally: + 
session.close()
+
+@router.post("/config/sync")
+async def update_sync_config(config_data: dict):
+    """Update sync configuration"""
+    session = get_session()
+    try:
+        daemon_config = session.query(DaemonConfig).first()
+        if not daemon_config:
+            daemon_config = DaemonConfig()
+            session.add(daemon_config)
+
+        if 'enabled' in config_data:
+            daemon_config.enabled = config_data['enabled']
+        if 'schedule' in config_data:
+            # Validate cron expression
+            try:
+                from apscheduler.triggers.cron import CronTrigger
+                CronTrigger.from_crontab(config_data['schedule'])
+                daemon_config.schedule_cron = config_data['schedule']
+            except ValueError as e:
+                raise HTTPException(status_code=400, detail=f"Invalid cron expression: {e}")
+
+        session.commit()
+        return {"message": "Configuration updated successfully"}
+    finally:
+        session.close()
+```
+
+---
+
+## Testing & Deployment Guide
+
+### Simple Testing Strategy
+```python
+# tests/test_basic_functionality.py - Basic tests for junior developers
+import pytest
+import os
+import tempfile
+from pathlib import Path
+
+def test_file_type_detection():
+    """Test that we can detect different file types correctly"""
+    from garminsync.activity_parser import detect_file_type
+
+    # Create temporary test files
+    with tempfile.NamedTemporaryFile(suffix='.fit', delete=False) as f:
+        # Write FIT file header
+        f.write(b'\x0E\x10\x43\x08.FIT\x00\x00\x00\x00')
+        fit_file = f.name
+
+    with tempfile.NamedTemporaryFile(suffix='.gpx', delete=False) as f:
+        # Write a minimal GPX document
+        f.write(b'<?xml version="1.0"?><gpx version="1.1"></gpx>')
+        gpx_file = f.name
+
+    try:
+        assert detect_file_type(fit_file) == 'fit'
+        assert detect_file_type(gpx_file) == 'gpx'
+    finally:
+        os.unlink(fit_file)
+        os.unlink(gpx_file)
+
+def test_activity_metrics_parsing():
+    """Test that we can parse activity metrics"""
+    # This would test your parsing functions
+    pass
+
+# Run with: python -m pytest tests/
+```
+
+### Deployment Checklist
+```yaml
+# docker-compose.yml - Updated for new features
+version: '3.8'
+services:
+  garminsync:
+    build: .
+ ports: + - "8888:8888" + environment: + - GARMIN_EMAIL=${GARMIN_EMAIL} + - GARMIN_PASSWORD=${GARMIN_PASSWORD} + - DATA_DIR=/data + volumes: + - ./data:/data + - ./logs:/app/logs + restart: unless-stopped + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8888/health"] + interval: 30s + timeout: 10s + retries: 3 +``` + +--- + +## Summary & Next Steps + +### What This Plan Achieves: +1. **Non-blocking sync** - Users can browse while sync runs +2. **Multi-format support** - FIT, TCX, GPX files +3. **Reduced API calls** - File-first approach with smart caching +4. **Enhanced UI** - Filtering, search, stats, and trends +5. **Better file management** - Organized storage with deduplication +6. **Simple architecture** - Single container, threading instead of complex async + +### Implementation Tips for Junior Developers: +- **Start small** - Implement one phase at a time +- **Test frequently** - Run the app after each major change +- **Keep backups** - Always backup your database before migrations +- **Use logging** - Add print statements and logs liberally +- **Ask for help** - Don't hesitate to ask questions about complex parts + +### Estimated Timeline: +- **Phase 1-2**: 2-4 weeks (core improvements) +- **Phase 3-4**: 2-4 weeks (UI enhancements) +- **Phase 5-6**: 2-4 weeks (advanced features) + +Would you like me to elaborate on any specific phase or create detailed code examples for any particular feature? 
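As a final sanity check, the pace arithmetic used by the Phase 3 activities API can be exercised on its own. This sketch duplicates `calculate_pace` exactly as defined above (the real copy lives in `garminsync/web/routes.py`) so the math can be verified without running the web app:

```python
# Standalone copy of the Phase 3 pace helper, duplicated here so the
# arithmetic can be checked without starting the application.
def calculate_pace(distance_m, duration_s):
    """Calculate pace in min/km"""
    if not distance_m or not duration_s or distance_m == 0:
        return None

    distance_km = distance_m / 1000
    pace_s_per_km = duration_s / distance_km

    minutes = int(pace_s_per_km // 60)
    seconds = int(pace_s_per_km % 60)

    return f"{minutes}:{seconds:02d}"

# 5 km in 25 minutes is a 5:00 min/km pace; missing values return None
print(calculate_pace(5000, 1500))  # -> 5:00
print(calculate_pace(None, 1500))  # -> None
```

Note the guard clause: activities without distance or duration (e.g. strength workouts) simply render as `-` in the table rather than raising a division error.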
\ No newline at end of file diff --git a/requirements.txt b/requirements.txt index 635b052..6c86f47 100644 --- a/requirements.txt +++ b/requirements.txt @@ -20,3 +20,6 @@ pygments==2.18.0 fitdecode numpy==1.26.0 scipy==1.11.1 +aiosqlite +asyncpg +aiohttp diff --git a/todo.md b/todo.md deleted file mode 100644 index df6e76a..0000000 --- a/todo.md +++ /dev/null @@ -1,44 +0,0 @@ -# Activity Reprocessing Implementation - -## Goal -Add capability to reprocess existing activities to calculate missing metrics like `avg_power` - -## Requirements -- Reprocess all existing activities -- Add web UI button to trigger reprocessing -- Background processing for large jobs -- Progress tracking and status reporting - -## Implementation Phases - -### Phase 1: Database & Infrastructure -- [ ] Add `reprocessed` column to activities table -- [ ] Create migration script for new column -- [ ] Update activity parser to handle reprocessing -- [ ] Add CLI commands for reprocessing - -### Phase 2: CLI & Backend -- [ ] Implement `garminsync reprocess` commands: - - `--all`: Reprocess all activities - - `--missing`: Reprocess activities missing metrics - - `--activity-id`: Reprocess specific activity -- [ ] Add daemon support for reprocessing -- [ ] Create background job system - -### Phase 3: Web UI Integration -- [ ] Add "Reprocess" button to activities page -- [ ] Create API endpoints: - - POST /api/activities/reprocess - - POST /api/activities/{id}/reprocess -- [ ] Implement progress indicators -- [ ] Add real-time status updates via websockets - -### Phase 4: Testing & Optimization -- [ ] Write tests for reprocessing functionality -- [ ] Add pagination for large reprocessing jobs -- [ ] Implement caching for reprocessed activities -- [ ] Performance benchmarks - -## Current Status -*Last updated: 2025-08-23* -โณ Planning phase - not yet implemented