Cloud SRE
1 week ago
Primary Responsibilities:
Responsible for deployments to cloud infrastructure. Develop and set up Datadog monitors and tracking for microservices and micro-frontend applications.
Create custom metrics to track page and API performance.
Required Skillset:
Proven experience (10 years) working as an SRE with a specific focus on Microsoft Azure Cloud services and OCI.
Deep understanding of cloud services, including Docker and Kubernetes Service API and tooling, in Azure and OCI.
Proficiency in scripting and programming languages (e.g. PowerShell, Python) for automation, infrastructure management, and tool development.
Experience with scalable networking technologies, including Linux, software-defined networking, network virtualization, open protocols, app acceleration, load balancers, DNS, and virtual private networks, and their application in PaaS and IaaS technologies.
Strong incident management skills with a data-driven and analytical approach to diagnosing complex issues.
Familiarity with Infrastructure as Code (IaC) tools (e.g. Terraform, ARM templates) and configuration management tools.
Excellent problem-solving skills, attention to detail, and a proactive attitude toward addressing operational challenges.
Effective communication and collaboration skills with the ability to work across teams and influence technical decisions.
Experience with CI/CD pipelines and version control systems (e.g. Git).
Develop and implement a comprehensive monitoring and analytics solution using Datadog for a cloud-based microservices architecture.
Develop dashboards using modern monitoring tools (e.g. Dynatrace, AppDynamics, Splunk).
Analyze monitoring data to identify trends and root causes of incidents, leading to continuous improvement of system health.
A strong understanding of DevOps principles and automation practices.