Senior Data/ML Engineering Position

Canada

eDNA Explorer
Job Expired - Click here to search for similar jobs

eDNA Explorer is expanding through a partnership with iTrackDNA in Canada to create eDNA Explorer Canada. We are building a cutting-edge software platform for processing and analyzing environmental DNA (eDNA) data. Our system processes biological samples to identify species based on their genetic material, integrates environmental data, and provides insights into biodiversity and ecological patterns. We're using modern cloud-native data engineering principles and AI to build robust, scalable pipelines for scientific data analysis.


Position Overview


We are seeking a Senior Data/ML Engineer to join our team working on the eDNA Explorer data processing pipelines. The ideal candidate will have strong Python development skills, experience with data orchestration frameworks, and a background in building cloud-based data processing systems. Knowledge of bioinformatics or genomics is a plus but not required.

This is a grant-funded position, with the possibility of being hired as an employee when the grant ends.


Technology Stack


Our platform leverages the following technologies:


Core Technologies

  • Python: Primary development language (version 3.12)
  • Dagster: Data orchestration framework
  • Docker/Kubernetes: Containerization and orchestration
  • Google Cloud Platform: Primary cloud provider
      • Google Cloud Storage
      • BigQuery
      • Secret Manager
  • PostgreSQL: Relational database
  • SQLAlchemy: ORM for database interactions
  • Polars: High-performance data processing library

Data Science & Bioinformatics

  • scikit-learn: Machine learning library
  • plotly: Data visualization
  • Earth Engine API: Environmental data collection
  • Bioinformatics tools: DADA2, custom taxonomic classification tools

DevOps & Infrastructure

  • Poetry: Dependency management
  • Ruff: Linting and code quality
  • Helm/ArgoCD: Kubernetes deployment and CD
  • GitHub Actions: CI pipelines

Key Responsibilities


As a Senior Data/ML Engineer, you will:


  • Design, develop, and maintain data processing pipelines using Dagster
  • Implement high-performance data transformation operations using Python and Polars
  • Optimize cloud resource usage and cost-efficiency in GCP
  • Collaborate with bioinformaticians to implement scientific algorithms
  • Build and improve our ML features for taxonomic classification and feature importance
  • Develop robust error handling and logging systems for pipeline monitoring
  • Create comprehensive tests for pipeline components
  • Contribute to deployment and CI/CD processes
  • Participate in code reviews and technical design discussions

Current Projects


You will have the opportunity to work on several exciting initiatives:


  1. Improved Sequence Analysis Pipeline: Enhance our taxonomic classification system to handle complex eDNA samples with greater accuracy
  2. Feature Importance Framework: Further develop our machine learning approach to identify key environmental factors influencing species distribution and ecosystem health, deepening our understanding of system biodiversity to support better restoration and management efforts
  3. Terradactyl Integration: Expand our environmental data collection capabilities with additional data sources
  4. Pipeline Performance Optimization: Improve processing speed and resource efficiency for large-scale sequence data
  5. Developer Experience Improvements: Enhance testing, monitoring, and deployment systems

Required Qualifications

  • 5+ years of professional software development experience, with at least 3 years focused on data engineering
  • Strong proficiency in Python development, including testing and performance optimization
  • Experience with data orchestration frameworks (Dagster, Airflow, Prefect, etc.)
  • Demonstrated experience with cloud platforms, preferably GCP
  • Knowledge of SQL and relational database design
  • Experience with containerization technologies (Docker, Kubernetes)
  • Comfort working in a collaborative, fast-paced environment
  • Ability to understand and implement complex data workflows

Preferred Qualifications

  • Experience with scientific or bioinformatics data processing
  • Background in machine learning, particularly scikit-learn
  • Knowledge of genomics or related biological fields
  • Experience with Polars or other high-performance data processing libraries
  • Familiarity with geographic information systems (GIS) or Earth Engine
  • Experience with CI/CD systems and automated testing

Education

  • Bachelor's degree in Computer Science, Data Science, Bioinformatics, or a related field
  • Advanced degree (MS/PhD) in a relevant field is a plus

Skills That Will Help You Succeed

  • Problem-solving: The ability to tackle complex data processing challenges
  • Adaptability: Comfort with learning new technologies and scientific concepts
  • Attention to detail: Precision is critical when working with scientific data
  • Communication: The ability to explain technical concepts to team members with diverse backgrounds
  • Initiative: Self-direction to identify improvements and implement solutions

Our Development Environment


You'll be working in a modern development environment with:


  • Git-based workflow with pull requests and code reviews
  • Cloud-based development environments
  • Containerized testing and deployment
  • Comprehensive CI/CD pipelines
  • Collaborative team with both engineers and scientists

Why Join Our Team?


Working at eDNA Explorer offers the opportunity to:


  • Apply cutting-edge data engineering to solve real environmental challenges
  • Work with a diverse team of engineers, data scientists, and biologists
  • Develop skills across the full stack of modern data technologies
  • Build systems that directly contribute to environmental research and biodiversity monitoring
  • Grow your career in an expanding field at the intersection of technology and biology

Location


This position is remote within Canada, with some preference for candidates who can occasionally visit our offices at the University of Victoria on Vancouver Island in beautiful British Columbia. Applicants must be Canadian citizens or hold a valid permit to work in Canada.


The Helbing lab is situated in the Department of Biochemistry & Microbiology at the University of Victoria. The rest of the eDNA Explorer team is in the United States. Check out the lab website here: The eDNA Explorer platform can be viewed here: .


How to Apply


Please submit your resume and a brief cover letter explaining your interest in eDNA Explorer and this role. Include examples of relevant projects you've worked on, particularly those involving data pipelines, cloud infrastructure, or scientific computing.


Submit your application by email with the subject line "eDNA Explorer Canada SDML position" to Dr. Caren Helbing at . Applications will be evaluated on a rolling basis until the position is filled.

Date Posted: 23 May 2025