Key Responsibilities
Infrastructure & Reliability
Design and implement multi-region AWS disaster recovery solutions, including fallback infrastructure for us-east-1 outages
Architect and maintain highly available, scalable cloud infrastructure across multiple AWS regions
Ensure infrastructure resilience through chaos engineering and disaster recovery testing
Development
Develop and deploy new features using Python and the Open Source Serverless Framework
Build and maintain serverless applications (Lambda, API Gateway, DynamoDB, etc.)
Write clean, maintainable, and well-tested code following best practices
Contribute to architectural decisions and technical design reviews
Platform Observability
Design and implement comprehensive observability solutions for production platforms
Set up monitoring, logging, and alerting using tools such as CloudWatch, Datadog, Grafana, or similar
Establish SLIs, SLOs, and error budgets to measure platform health
Create dashboards and on-call runbooks for incident response
CI/CD & Automation
Design, implement, and maintain CI/CD pipelines for automated testing and deployment
Automate infrastructure provisioning using Infrastructure as Code (Terraform, CloudFormation, CDK)
Implement security scanning, testing, and compliance checks in deployment pipelines
Optimize build and deployment processes for speed and reliability
Leadership & Collaboration
Mentor and manage development teams, fostering a culture of technical excellence
Conduct code reviews and provide constructive feedback
Facilitate technical discussions and help unblock team members
Collaborate with product and engineering teams to deliver on roadmap priorities
Required Qualifications
5+ years of software engineering experience with strong Python development skills
3+ years of hands-on experience with AWS services (EC2, Lambda, S3, RDS, VPC, IAM, CloudFormation, etc.)
Proven experience building and deploying serverless applications (AWS Lambda, API Gateway, Step Functions)
Strong understanding of multi-region architecture and disaster recovery patterns
Experience designing and implementing CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, or similar)
Demonstrated experience setting up observability and monitoring solutions
Experience managing or mentoring development teams
Strong understanding of networking, security, and cloud best practices
Excellent problem-solving skills and ability to debug complex distributed systems
Preferred Qualifications
Experience with the Serverless Framework
AWS certifications (Solutions Architect, DevOps Engineer, or similar)
Experience with Infrastructure as Code tools (Terraform, AWS CDK, CloudFormation)
Knowledge of containerization and orchestration (Docker, ECS, Kubernetes)
Experience with observability platforms (Datadog, New Relic, Grafana, Prometheus)
Familiarity with event-driven architectures and message queuing systems (SQS, SNS, EventBridge)
Experience with testing frameworks and test automation
Background in Agile/Scrum methodologies
Strong communication skills and experience working with cross-functional teams