Lead Data Scientist

The CHI Software team never stands still. We love what we do and give it one hundred percent! Every new project is a challenge that we take on and overcome. The only thing that can stop us is... wait, nothing! The number of projects is growing, and our team is growing with it. That is why we are now looking for a Lead Data Scientist.

About the project:

We are seeking a Lead Data Scientist to join a multidisciplinary team working on complex real-world challenges. This role involves developing full-service, end-to-end solutions where machine learning and advanced algorithms play a critical role. You will collaborate with software, hardware, and design experts to build systems that operate across cloud and edge environments. Projects typically include greenfield IoT data pipelines and full MLOps lifecycles, with deep integration into sensors, networks, cloud infrastructure, and interfaces.

Requirements:

Master’s degree in a Data Science-related field;
7+ years of industry experience in applied data science;
Proven experience deploying statistical or machine learning models to production;
Demonstrated leadership in cross-disciplinary engineering teams;
Strong expertise in real-time signal processing and analytics;
Expert-level proficiency in Python for data science and ML;
Production-level experience with AWS (especially SageMaker, IoT Core);
Experience in embedded software development (C/C++, Embedded C);
Competency in Linux-based development and basic system administration;
Strong written and verbal communication skills in English;
Experience working directly with clients and supporting presales;
Successful track record in estimating, planning, and executing ML projects;
Hands-on experience with testing, model evaluation, and CI/CD pipelines.

Nice to have:

10+ years of experience in the industry, especially with connected/IoT devices;
AWS Professional certification;
Experience in embedded signal processing;
Familiarity with Elixir, Phoenix, Nerves;
Practical work with Raspberry Pi and other SBCs.

Responsibilities:

Lead the development and deployment of machine learning workflows in the cloud and on the edge;
Design and implement solutions for time series forecasting, anomaly detection, state classification, computer vision, and speech recognition;
Apply transformer models, composite AI, and agentic AI systems to real-world problems;
Plan and execute field testing of algorithms;
Develop MLOps infrastructure and ensure best practices;
Collaborate with design teams to deliver effective data visualizations;
Mentor team members, support hiring processes, and improve technical interviews;
Contribute to presales efforts and shape architecture for new solutions;
Ensure robust communication across technical and business domains.

Technologies You’ll Use:
Core tools: Git, GitHub, GitHub Actions (CI/CD), Docker, Terraform
Data science stack: Python, NumPy, SciPy, pandas, scikit-learn, Matplotlib, SQL/Postgres
ML & orchestration: PyTorch, TensorFlow, PySpark, MLflow, LangChain, Jupyter

Cloud platforms:
AWS: Lambda, ECS, RDS, DynamoDB, IoT Core, Greengrass, SageMaker, Bedrock
Azure: Functions, Container Registry, SQL Database, Azure ML, IoT Hub
Third-party: TimescaleDB, Ultralytics, Datadog, Peridio

Edge Computing & Embedded Systems:
Model deployment on edge devices (e.g., Raspberry Pi, SBCs)
Development on Linux and macOS, and in embedded environments built with Yocto
Languages: Embedded C, C++, Rust, Elixir (Phoenix/Nerves)
Hardware-in-the-loop testing, CI/CD for embedded systems