Hello All,
Greetings from Starkflow.
We are currently hiring for a "Computer Vision Engineer" for a part-time remote role for one of our hiring partner in India.
Job Type: Part-Time 20 hours/week
Job Title: Computer Vision Engineer/Developer
Job Location: India 100% Remote
Job Duration: 4 weeks
Job description:
Project Overview
The client is developing a proof-of-concept (POC) AI fitness coaching application that uses smartphone cameras to provide real-time form correction and feedback during workouts. This document outlines the specific technical requirements and development approach for the computer vision component, which is the core innovation of our product.
Primary Objectives
1. Create a computer vision system that can accurately analyze human movement during exercise
2. Implement real-time form detection and correction feedback
3. Develop a solution that works reliably on standard smartphone hardware
4. Deliver a demonstrable POC within 4-6 weeks suitable for presentations
Technical Requirements
1. Pose Estimation Implementation
- Primary Framework: Implement MediaPipe Pose (or equivalent library if you can justify superior performance)
- Optimization Requirements:
- Achieve minimum 15 FPS processing on mid-range smartphones
- Maintain accurate joint tracking during rapid movement
- Implement confidence thresholds to filter low-quality detections
- Camera Handling:
- Support portrait and landscape orientation
- Function in varying lighting conditions (within reasonable parameters)
- Provide guidance system for optimal camera placement
2. Exercise Analysis Algorithms
- Exercise Focus: Squats (primary), with a secondary and tertiary variations based on available equipment scenarios i.e. traditional straight bar, dumbbells/kettle bells, body weight only.
- Key Metrics to Track:
- Knee tracking (relative to toes and hip alignment)
- Back angle and straightness throughout movement
- Depth of squat (hip position relative to knees)
- Balance and weight distribution (center of gravity)
- Range of motion completion
- Analysis Approach:
- Define biomechanically correct form parameters for each exercise
- Create tolerance ranges for acceptable form variation
- Implement error detection for common form problems:
Knees caving inward
Insufficient depth
Back rounding
Forward knee projection
Weight shifting to toes/heels improperly
More to follow, use the above as representative, but not exhaustive
3. Real-Time Feedback System
- Processing Pipeline :
- Capture video frames at optimal resolution/framerate balance
- Process frames through pose estimation
- Extract relevant joint angles and positions
- Compare against ideal form templates
- Generate feedback based on deviations
- Deliver feedback with minimal latency (
- Feedback Categorization :
- Critical form errors (immediate correction needed)
- Form optimization suggestions (minor adjustments)
- Positive reinforcement for correct form
- Progressive feedback (acknowledging improvements)
4. Integration Requirements
- API Development :
- Create WebSocket-based service for real-time communication
- Design clean interfaces for frontend integration
- Implement structured data format for pose analysis results
- Mobile Integration :
- Work with frontend developer to optimize camera access
- Implement efficient data transfer between vision system and UI
- Develop fallback modes for processing limitations
5. Performance Optimization
- Processing Strategy :
- Implement hybrid processing approach:
Essential pose detection on device
More complex analysis optionally offloaded to server
- Balance accuracy vs. performance based on device capabilities
- Resource Management :
- Minimize battery consumption during workout sessions
- Implement dynamic quality adjustments based on device performance
- Optimize memory usage for extended workout sessions
Development Milestones
Week 1: Foundation
- Set up development environment with MediaPipe integration
- Implement basic pose detection from smartphone camera
- Create testing framework for vision accuracy
- Define ideal form parameters for squat exercise
- Document baseline performance metrics
Week 2: Core Analysis
- Develop squat-specific form analysis algorithms
- Implement joint angle calculation and tracking
- Create comparison logic against ideal form templates
- Build basic feedback mechanism based on form deviations
- Begin integration with frontend via API
Week 3: Optimization & Feedback
- Refine form detection accuracy across different body types
- Implement confidence scoring and error handling
- Optimize performance for target devices
- Enhance feedback system with progressive coaching elements
- Develop server-side processing option for complex analysis
Week 4: Integration & Testing
- Complete frontend integration
- Implement comprehensive testing across devices and conditions
- Optimize latency and performance
- Create demo configuration for investor presentations
- Document system architecture and technical achievements
Deliverables
1. Fully functional computer vision engine for exercise form analysis
2. API documentation for frontend integration
3. Performance optimization guide for different device capabilities
4. Technical documentation explaining the system architecture
5. Testing results demonstrating accuracy across conditions
6. Recommendations for scaling beyond POC
Technical Skills Required
- Strong experience with MediaPipe, OpenPose, or equivalent pose estimation libraries
- Deep understanding of computer vision principles
- Experience with real-time video processing
- Knowledge of WebSocket implementation
- Proficiency in optimizing performance for mobile devices
- Understanding of human biomechanics or movement analysis
- Ability to implement efficient algorithms for real-time analysis
Collaboration Requirements
- Regular communication with frontend developer
- Daily status updates during critical integration phases
- Clear documentation of APIs and interfaces
- Proactive identification of technical limitations and solutions
- Availability for troubleshooting during testing phases
Success Criteria
- System accurately identifies correct/incorrect squat form in >90% of attempts
- Feedback is delivered within 500ms of form issue detection
- Solution works reliably on mid-range Android and iOS devices
- Battery consumption allows for at least 30 minutes of continuous use
- Code is well-documented and structured for future expansion