We are looking for an Azure Site Reliability Engineer for our customer based in Brussels.
Responsibilities:
* Automation and CI/CD: Design, create, and maintain automation frameworks for deploying, scaling, and managing production environments.
* System Monitoring and Maintenance: Implement and manage monitoring tools to ensure system health and performance. Proactively identify and fix issues before they impact users.
* Incident Management: Respond to and resolve incidents in a timely manner, perform root cause analysis, and implement measures to prevent recurrence.
* Performance Optimization: Analyze system performance and implement improvements to ensure scalability and efficiency.
* Capacity Planning: Conduct capacity planning assessments to predict system needs and ensure resources are in place to handle growth.
* Collaboration: Work closely with development teams to build system reliability into the development lifecycle through continuous integration and deployment practices.
* Documentation: Create and maintain comprehensive documentation related to systems architecture, configuration, and operational procedures.
* Tool Development: Develop and maintain internal tools to streamline processes and improve system reliability.
* Security: Ensure that security controls are implemented, monitored, and maintained across all systems.
* Service Level Objectives (SLOs): Define and track SLOs to ensure reliability metrics meet business requirements.
* On-call Support: Participate in on-call rotations to provide 24/7 support for critical systems and infrastructure.
Qualifications:
* Experience: Minimum of 5 years in a Site Reliability Engineer or DevOps role with extensive experience in Microsoft Azure.
* Advantageous: Microsoft Azure certifications, such as Azure Solutions Architect Expert or Azure DevOps Engineer Expert.
Technical Skills:
1. Proficiency in scripting and command-line tooling (Python, PowerShell, Azure CLI).
2. Experience with containerization technologies (Docker, Kubernetes).
3. Proficiency in Azure Cloud services (VMs, Storage, Networking, etc.).
4. Experience in Infrastructure as Code (IaC) tools such as Terraform, ARM templates, or Bicep to automate secure provisioning and configuration of Azure resources.
5. Strong experience with monitoring, logging, and alerting tools such as Azure Monitor, Application Insights, or Log Analytics, and third-party solutions like Grafana, Splunk, or Elastic Stack.
6. Strong understanding of cloud networking, hybrid cloud, and virtual networking concepts (e.g., VPNs, subnets, NSGs, load balancing, hub-and-spoke).
7. Experience in Azure governance and cost management using Azure Cost Management, Azure Policy, and managemen