# Feature Specification: Initial Spec

**Feature Branch**: `001-create-a-new`
**Created**: 2025-10-09
**Status**: Draft
**Input**: User description: "create a new feature called "Initial Spec" based on the content of @initialspec.md"

## Clarifications

### Session 2025-10-09

- Q: How should the API endpoints be secured? → A: No authentication required (public API).
- Q: When the system encounters a corrupted or invalid workout file, how should it report the error? → A: A JSON object in the API response.
- Q: The original spec mentioned `_calculate_hr_recovery` was not implemented. Should heart rate recovery analysis be included in the scope for this feature? → A: No, defer it to a future feature.
- Q: The underlying analysis functions use a default FTP (Functional Threshold Power). Should the FTP value be configurable for TSS/IF calculations? → A: Yes, make it configurable per user account.
- Q: What is the target service availability for the analysis API? → A: Best effort (development/internal tool).

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Analyze a single workout file (Priority: P1)

As a data analyst, I want to upload a single workout file (FIT, TCX, or GPX) and receive a comprehensive analysis report so that I can understand the performance characteristics of that workout.

**Why this priority**: This is the core functionality of the application and provides the primary value to the user.

**Independent Test**: Can be fully tested by providing a single workout file and verifying that a complete report is generated with all expected metrics and charts.

**Acceptance Scenarios**:

1. **Given** a valid FIT file containing power data, **When** the user submits it for analysis, **Then** the system generates a report including NP, IF, TSS, and power zone distribution.
2. **Given** a valid GPX file without power data, **When** the user submits it for analysis, **Then** the system estimates power and generates a report with the estimated power metrics.

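The power metrics named in acceptance scenario 1 can be sketched as follows. This is a minimal illustration, not the project's implementation: it uses the commonly published definitions of NP, IF, and TSS, assumes power is sampled at 1 Hz, and takes FTP as an explicit argument. The function names `normalized_power` and `power_metrics` are hypothetical and do not come from this spec.

```python
from collections import deque


def normalized_power(power_watts, window_s=30):
    """Fourth root of the mean of (30 s rolling average power) ** 4.

    Assumes one sample per second, so window_s samples span window_s seconds.
    """
    if len(power_watts) < window_s:
        raise ValueError("need at least one full rolling window of samples")
    window = deque(maxlen=window_s)
    running_sum = 0.0
    fourth_powers = []
    for p in power_watts:
        if len(window) == window_s:
            # The oldest sample is evicted by the append below.
            running_sum -= window[0]
        window.append(p)
        running_sum += p
        if len(window) == window_s:
            fourth_powers.append((running_sum / window_s) ** 4)
    return (sum(fourth_powers) / len(fourth_powers)) ** 0.25


def power_metrics(power_watts, ftp_watts):
    """Return NP, IF, and TSS for a 1 Hz power stream and a given FTP."""
    if ftp_watts <= 0:
        raise ValueError("FTP must be positive")
    duration_s = len(power_watts)  # 1 Hz assumption: one sample per second
    np_watts = normalized_power(power_watts)
    intensity_factor = np_watts / ftp_watts
    tss = (duration_s * np_watts * intensity_factor) / (ftp_watts * 3600) * 100
    return {"np": np_watts, "if": intensity_factor, "tss": tss}


if __name__ == "__main__":
    # One hour ridden exactly at FTP; by definition IF is 1 and TSS is about 100.
    print(power_metrics([250] * 3600, ftp_watts=250))
```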

---

### User Story 2 - Generate specific charts for a workout (Priority: P2)

As a developer, I want to request specific visualizations (e.g., power curve, elevation profile) for an analyzed workout via an API so that I can integrate these charts into a custom dashboard.

**Why this priority**: Enables integration with other systems and provides flexibility for developers.

**Independent Test**: Can be tested by calling an API endpoint with a workout ID and chart type, and verifying that the correct chart image is returned.

**Acceptance Scenarios**:

1. **Given** a workout has been successfully analyzed, **When** a developer requests the `power_curve` chart via the API, **Then** the system returns a PNG image of the power curve.
2. **Given** an invalid workout ID, **When** a developer requests a chart, **Then** the system returns a 404 Not Found error.

---

### User Story 3 - Batch analyze multiple workout files (Priority: P3)

As a coach, I want to upload a directory of workout files and receive a summary report comparing the key metrics across all workouts so that I can track my athletes' progress over time.

**Why this priority**: Provides a powerful feature for users who need to analyze large amounts of data.

**Independent Test**: Can be tested by providing a directory of workout files and verifying that a batch analysis is performed and a summary report is generated.

**Acceptance Scenarios**:

1. **Given** a directory containing 10 valid workout files, **When** the user submits the directory for batch analysis, **Then** the system processes all 10 files and generates a summary CSV file with key metrics for each workout.

---

### Edge Cases

- **Corrupted Files**: How does the system handle a workout file that is corrupted or malformed? It should fail gracefully and report the error for the specific file as a JSON object in the API response.
- **Missing Data**: What happens if a workout file is missing a critical data stream (e.g., timestamp, distance)?
  The system should report the missing data and calculate metrics that are possible with the available data.
- **Data Spikes**: How does the system handle data with significant spikes or anomalies (e.g., a power reading of 5000W)? The system should detect and flag these spikes in the analysis.

## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: System MUST parse FIT, TCX, and GPX workout files.
- **FR-002**: System MUST calculate summary metrics including duration, distance, speed, power, and heart rate.
- **FR-003**: System MUST perform detailed power analysis, including Normalized Power (NP), Intensity Factor (IF), Training Stress Score (TSS), power zones, and power distribution.
- **FR-004**: System MUST perform detailed heart rate analysis, including time in zones and distribution.
- **FR-005**: System MUST perform detailed speed analysis, including time in speed zones and distribution.
- **FR-006**: System MUST perform elevation analysis, including total ascent/descent and gradient.
- **FR-007**: System MUST detect high-intensity intervals based on power output.
- **FR-008**: System MUST calculate efficiency metrics, such as the power-to-heart-rate ratio.
- **FR-009**: System MUST estimate power data when it is not present in the source file.
- **FR-010**: System MUST be able to analyze gear usage for single-speed bicycles.
- **FR-011**: System MUST generate analysis reports in HTML, PDF, and Markdown formats.
- **FR-012**: System MUST generate charts, including power curves, elevation profiles, and zone distribution charts.
- **FR-013**: System MUST provide a `POST /api/analyze/workout` endpoint to analyze a single workout file.
- **FR-014**: System MUST provide a `POST /api/analyze/batch` endpoint to analyze a directory of workout files.
- **FR-015**: System MUST provide a `GET /api/analysis/{id}/charts` endpoint to retrieve generated charts for a specific analysis.
- **FR-016**: System MUST provide a `GET /api/analysis/{id}/summary` endpoint to retrieve a summary of a specific analysis.
- **FR-017**: System MUST expose the API endpoints publicly with no authentication mechanism.
- **FR-018**: System MUST allow users to configure their FTP value.
- **FR-019**: The system MUST use the user's configured FTP for all relevant calculations (TSS, IF). If not configured, a default value will be used.

### Key Entities

- **User**: Represents a user of the system. Key attributes include a unique identifier and their configured FTP value.
- **WorkoutData**: Represents a complete workout session, containing all raw and processed data. Key attributes include metadata, a dataframe of the raw time-series data, and dedicated objects for power, heart rate, speed, and elevation data.
- **WorkoutMetadata**: Contains metadata about the workout, such as start time, duration, and the device used.
- **PowerData**: Holds all power-related information, including the raw power stream, average power, normalized power, and zone distribution.
- **HeartRateData**: Holds all heart-rate-related information, including the raw HR stream, average HR, max HR, and zone distribution.
- **ZoneCalculator**: A utility entity used to define and calculate training zones for different metrics like power and heart rate.

### Out of Scope

- **Heart Rate Recovery Analysis**: The calculation and analysis of heart rate recovery is explicitly out of scope for this feature and will be considered for a future release.

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: The system MUST successfully parse over 99% of valid FIT, TCX, and GPX files from major device manufacturers (Garmin, Wahoo).
- **SC-002**: The analysis of a typical 2-hour workout file MUST complete in under 30 seconds on standard hardware.
- **SC-003**: The variance between key calculated metrics (NP, IF, TSS) and the values produced by industry-standard software like GoldenCheetah or TrainingPeaks MUST be less than 5%.
- **SC-004**: The system MUST be capable of processing a batch of 100 workout files concurrently without generating errors or significant performance degradation.
- **SC-005**: The service will be provided on a "best effort" basis, suitable for development and internal use, with no strict uptime guarantee.
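
One way SC-003 could be checked in a test suite is sketched below. It assumes that "variance" means the relative difference between a calculated metric and the reference tool's value, which this spec does not state explicitly. The helper names `relative_difference` and `within_tolerance` are hypothetical, and the numbers in the usage example are made up for illustration, not measured results.

```python
def relative_difference(calculated, reference):
    """Absolute difference expressed as a fraction of the reference value."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return abs(calculated - reference) / abs(reference)


def within_tolerance(calculated_metrics, reference_metrics, tolerance=0.05):
    """Map each reference metric name to True if it is within the tolerance."""
    return {
        name: relative_difference(calculated_metrics[name], reference) < tolerance
        for name, reference in reference_metrics.items()
    }


if __name__ == "__main__":
    # Illustrative values only: "ours" vs. a reference tool's output.
    ours = {"NP": 204.0, "IF": 0.82, "TSS": 90.0}
    reference = {"NP": 200.0, "IF": 0.80, "TSS": 80.0}
    print(within_tolerance(ours, reference))
```

In this made-up example NP and IF differ from the reference by 2% and 2.5% and pass, while TSS differs by 12.5% and fails the 5% threshold.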