MLOps Engineer
Yearly B2B contract (Freelance)
Location: Haren, Belgium (onsite once a week or 3 days per month)
Position Summary
We are looking for an experienced MLOps Engineer to architect and operationalize scalable machine learning infrastructure on Azure within a decentralized data platform environment. You will own the complete ML lifecycle, from development through production, in a hybrid Azure ML and Databricks ecosystem, using infrastructure-as-code practices and MLflow to deliver automated, reliable, and cost-effective ML operations. The role requires building MLOps capabilities that align with data mesh principles: data and models are treated as products with clear ownership and a domain-driven architecture.
Core Responsibilities
Infrastructure & Automation
* Collaborate with cross-functional infrastructure and platform teams to design and deploy production-grade MLOps infrastructure on Azure using Terraform, adhering to data mesh principles of decentralized domain ownership
* Work alongside DevOps and platform engineers to build reusable Infrastructure as Code (IaC) templates for ML environments, covering compute resources, storage, networking, and security
* Partner with team members to ensure infrastructure is reproducible, version-controlled, and optimized for scalability across multiple domain-oriented data products
* Contribute to team efforts in establishing infrastructure standards and best practices for ML workloads
* Provision and manage Azure ML workspaces, compute clusters, and related resources alongside Databricks infrastructure
ML Lifecycle Management
* Develop automated end-to-end ML pipelines covering training, validation, deployment, and monitoring within a federated data architecture
* Implement ML workflows using both Azure ML and Databricks, selecting the appropriate platform based on use case requirements
* Implement experiment tracking, model versioning, and artifact management using MLflow integrated with both Azure ML and Databricks environments
* Leverage Azure ML's model registry and Databricks MLflow Model Registry for unified model governance across platforms
* Manage model promotion workflows across development, staging, and production environments
* Design and implement feature store solutions for centralized feature engineering, versioning, and serving across ML workloads
* Enable feature reusability and discoverability to support consistent model development across domain teams
Data Mesh & Product Thinking
* Build MLOps functionality within a development data platform, following data mesh architecture principles
* Apply a data-as-a-product mindset to ML models and features, ensuring they meet quality, discoverability, and usability standards
* Establish domain-agnostic MLOps capabilities that can be consumed by autonomous domain teams
* Implement self-serve ML infrastructure enabling domain teams to independently develop, deploy, and manage models
* Define and enforce data product standards including SLAs, data contracts, and quality metrics for ML features and models
Platform Engineering
* Configure and optimize both Azure ML compute instances and Azure Databricks clusters for performance and cost efficiency across federated domains
* Integrate Azure ML pipelines and Databricks workflows with CI/CD systems to enable seamless, automated model deployments
* Establish interoperability between the Azure ML and Databricks ecosystems, enabling data scientists to leverage the strengths of both platforms
* Establish best practices for platform usage and ML workflow orchestration in a decentralized environment
* Build feature store infrastructure (Azure ML Feature Store, Databricks Feature Store) that supports cross-domain feature sharing while maintaining domain autonomy
Monitoring & Operations
* Build comprehensive monitoring systems to track model performance, data drift, feature quality, and infrastructure health
* Implement monitoring solutions that span both Azure ML and Databricks deployments, providing unified observability
* Design automated alerting and incident response processes for pipeline failures and degradation
* Maintain operational visibility across the full ML stack using observability tools
* Implement governance and observability frameworks that provide transparency across domain-owned ML products
Required Qualifications
Cloud & Infrastructure
* Hands-on expertise with Azure services including compute, storage, networking, and security tailored for ML workloads
* Advanced proficiency in Terraform with proven experience managing complex, multi-environment infrastructure
* Demonstrated ability to collaborate effectively with infrastructure and DevOps teams on shared platform initiatives
ML Platform & Tools
* Deep knowledge of Azure ML including workspace management, compute resources, pipeline orchestration, model deployment (managed endpoints, AKS), and MLOps capabilities
* Deep knowledge of Azure Databricks, including cluster management, job orchestration, and Azure integrations
* Experience integrating Azure ML and Databricks ecosystems to create unified ML workflows
* Extensive experience with MLflow for experiment tracking, model registry, model serving, and production lifecycle management across both platforms
* Proven experience designing and implementing feature stores (Azure ML Feature Store, Databricks Feature Store, or Feast) for online and offline feature serving
Data Mesh & Platform Architecture
* Understanding of data mesh principles including domain ownership, data as a product, self-serve data infrastructure, and federated computational governance
* Experience building platform capabilities that enable autonomous domain teams while maintaining organizational standards
* Ability to design ML systems that support decentralized ownership with centralized governance
Development & Automation
* Strong Python programming skills and familiarity with ML frameworks (scikit-learn, TensorFlow, PyTorch) and data processing libraries
* Demonstrated ability to build CI/CD pipelines for ML systems using Azure DevOps, GitHub Actions, or similar platforms, including automated testing and deployment strategies
* Experience with Azure ML SDK/CLI and Databricks APIs for workflow automation
Deployment & Monitoring
* Solid understanding of containerization (Docker, Kubernetes) for ML model deployment and scaling
* Experience with Azure ML model deployment options including managed endpoints, AKS, and Azure Container Instances
* Experience with monitoring and observability platforms such as Azure Monitor, Application Insights, or equivalent tools for tracking model and infrastructure metrics
* Experience implementing data quality monitoring and feature drift detection in production environments