Systems Engineer/SRE
1 month ago
JOB TITLE: Systems Engineer/SRE
LOCATION: 100% remote, working EST time zone hours; travel is required for PI planning
DURATION: One-year contract, potential extension
START DATE: 2 weeks
DUE DATE: 24-48 hours
Interview Process/# of Rounds:
2 rounds
Top 3 Requirements:
- Experience with Data Platforms: Data platform engineering and infrastructure provisioning.
- Automation: Experience with Terraform, CI/CD, and DevOps tools
- Microservices & Kubernetes: Deploying microservices in Kubernetes
- Automation testing
- Experience with SageMaker for ML tasks
- Snowflake for data sorting
Reason for Openings:
- Backfill for two positions
Primary Responsibilities:
- Provisioning infrastructure for MDP (Data Platform) and MGP (Growth Platform)
- Managing applications and infrastructure for Marriott
Team Structure:
- Director of Cloud Solutions and Infrastructure Team
- MGP BAU (Business As Usual): Maintaining existing infrastructure but shifting to new infrastructure
Previous Roles of Candidates:
- Leading MGP BAU with one lead and two offshore resources
- Provisioning and designing clusters, deploying applications
Collaboration with Other Teams:
- MGP Applications Team
- Enterprise Architect
- Application Team
Snowflake Specifics:
- Focus on connectivity, ensuring data security and compliance with standards
Technology Stack:
- Infrastructure Provisioning: Terraform
- DevOps Tools: Jenkins, Harness
- Role Focus: DevOps (not specifically SRE)
Job Description:
JOB SUMMARY
The Sr. Systems Engineer/SRE, Cloud Solutions Engineering, ensures the reliability, scalability, and efficient operation of information systems and technologies that support Marriott International's Infrastructure and Cloud Delivery Strategy. This role leads the technical delivery of infrastructure and Cloud Delivery projects and serves as a subject matter expert in a complex array of full-stack solutions, especially data engineering and data science platforms. The discipline combines software and systems engineering to build and operate large-scale, distributed, fault-tolerant systems. Performs research, analysis, design, creation, and implementation of systems to meet current and future requirements while enabling customer-facing and enterprise products/platforms. Partners with business and technology leaders, architects, and other engineers across disciplines to engineer systems as well as the systems that deliver those systems. Engineers systems to ensure that Marriott's services have the resiliency and uptime appropriate to user needs while delivering a fast rate of change in functional and non-functional improvements. Provides solutions that serve our business, leveraging current and leading-edge technologies in an innovative and impactful manner.
CANDIDATE PROFILE
Required:
Undergraduate degree in an engineering or computer science discipline and/or equivalent experience/certification
7 years of progressive IT engineering experience that includes:
  - 7 years building highly available, distributed systems
  - 3 years of production-level experience with reliable and secure cloud-scale infrastructure (preferably AWS)
  - 3 years of strong programming experience in one or more of the following languages: C, C++, Java, Python, Go, Perl
  - Experience building the infrastructure required for optimal extraction, transformation, and loading (ETL) of data from a wide variety of data sources using SQL and AWS big data technologies
  - 3 years of experience with Data Lake/Hadoop/Data Science platforms
  - Production-level expertise with container orchestration engines such as Kubernetes and data science technologies such as EMR, SageMaker, and Snowflake
  - Experience working within software development or Internet-related industries, particularly in the context of a SaaS offering
  - Experience with modern continuous development techniques and pipelines (Agile, Kanban, CI/CD, Jenkins, Git, Artifactory)
Familiarity with security frameworks such as ISO 27001, SOC 2, PCI DSS, and/or HIPAA
Curious and self-driven, willing to ask very difficult questions, and capable of leading change in a diverse organizational landscape
Strong written and oral communication skills, with a high degree of comfort speaking with engineering management, developers, and leadership
Demonstrated ability to adapt to new technologies and learn quickly
Preferred:
Graduate Degree in Computer Science or Computer Engineering
2 years of experience working with Kubernetes Container-as-a-Service platforms (Docker Enterprise, Red Hat OpenShift Enterprise, Amazon EKS (Elastic Kubernetes Service), Mesosphere)
Experience with data processing frameworks and technologies such as Apache Spark, Kafka Streams, Astronomer, and Airflow
Experience working with core AWS services such as EKS, EC2, EFS, EMR, Aurora, EBS, S3, DocumentDB, ElastiCache, Lambda, etc.
Experience with security protocols (SSL, SAML, SAMP, LDAP, etc.) and controls (container scanning, log aggregation, network scanning, CVE)
Familiarity with relational (Oracle, DB2, MySQL/MariaDB, PostgreSQL, MS SQL Server) and non-relational (Cassandra, Couchbase, MongoDB) database technologies, as well as database caching technologies (Memcached, Redis)
Excellent understanding of change management, testing requirements, techniques, and tools to ensure high availability of systems
Strong attention to detail with an ability to operate effectively across multiple priorities
Infrastructure operations experience, including self-healing and autonomy
Experience in researching emerging technologies and trends, standards, and products
Experience in developing technology roadmaps and strategies
Demonstrated experience learning and applying modern technologies to solve business needs
Excellent problem-solving skills, both working independently and leading outcomes for cross-functional teams
Experience operating in the Scaled Agile Framework (SAFe)
CORE WORK ACTIVITIES
Ensure the highest level of uptime and Quality of Service (QoS) to Marriott's customers through operational excellence
Define service level objectives (SLOs) and service level indicators (SLIs) to represent and measure service quality
Embed with product teams (physically and/or virtually) to foster strong collaboration/partnership
Identify areas and drive initiatives to improve service resiliency through techniques such as chaos engineering, performance/load testing, observability, AIOps, etc.
Support and maintain globally distributed, multi-cloud (public and/or private), cloud-scale environments
Automate common repeatable tasks at large scale to streamline operational procedures
Design and maintain production monitoring systems
Troubleshoot performance and stability issues using a wide variety of tools
Follow change management processes during implementations
Work in a diverse and global team environment
Participate in an on-call rotation as required
Determine the root cause of all production-level incidents and write corresponding high-quality RCA reports
Embrace the Site Reliability Engineering (SRE) mindset
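For candidates less familiar with the SLO/SLI terminology used in the activities above, the following is a minimal Python sketch of one common way to express them. It assumes an availability SLI measured as the fraction of successful requests; the `RequestWindow` shape, the function names, and the sample numbers are illustrative assumptions and do not describe Marriott's actual tooling.

```python
from dataclasses import dataclass


@dataclass
class RequestWindow:
    """Request counts over one measurement window (hypothetical data shape)."""
    total: int
    successful: int


def availability_sli(window: RequestWindow) -> float:
    """SLI: fraction of requests that succeeded in the window."""
    if window.total == 0:
        return 1.0  # assumption: no traffic counts as fully available
    return window.successful / window.total


def error_budget_remaining(window: RequestWindow, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_failures = (1.0 - slo_target) * window.total
    if allowed_failures == 0:
        return 0.0  # a 100% target (or empty window) leaves no budget
    actual_failures = window.total - window.successful
    return 1.0 - actual_failures / allowed_failures


# Illustrative numbers only: 50 failed requests out of one million.
window = RequestWindow(total=1_000_000, successful=999_950)
sli = availability_sli(window)
remaining = error_budget_remaining(window, slo_target=0.9999)
print(f"SLI: {sli:.5f}, error budget remaining: {remaining:.0%}")
```

With a 99.99% target, one million requests allow roughly 100 failures, so 50 failures leave about half of the error budget unspent.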
Leadership
Providing leadership, oversight, governance, and strategic direction related to Cloud Services solution delivery
Providing technical expertise and technical leadership within own and other teams
Investigating and resolving complex and critical incidents and problems
Participating in architectural discussions and providing expert advice
Designing and implementing changes that require deep technical understanding and expertise
Collaborating with engineering teams to develop and deploy new cloud services or enhancements
Ensuring four nines (99.99%) or better reliability for all critical services
Coordinating between onshore/offshore engineering and operations teams; handover and acceptance of L1/L2 shared services Ops
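To put the four-nines reliability target listed above in concrete terms, the short Python sketch below converts an availability percentage into an allowed downtime budget. The function name is a hypothetical helper, and the calculation assumes availability is measured as simple wall-clock uptime over the period.

```python
def allowed_downtime_minutes(availability: float, period_days: float) -> float:
    """Minutes of downtime permitted in a period at a given availability."""
    total_minutes = period_days * 24 * 60
    return (1.0 - availability) * total_minutes


# Four nines (99.99%) over a 365-day year and over a 30-day month.
per_year = allowed_downtime_minutes(0.9999, 365)
per_30_days = allowed_downtime_minutes(0.9999, 30)
print(f"{per_year:.2f} min/year, {per_30_days:.2f} min/30 days")
```

Under these assumptions, 99.99% availability allows about 52.6 minutes of downtime per year, or about 4.3 minutes in a 30-day month.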
Managing Projects and Priorities
Functions as a strategic senior technical expert within the department.
Develops specific goals and plans to prioritize, organize, and accomplish work.
Champions leaders' vision for product and service delivery.
Makes and executes the necessary decisions to keep moving forward toward achievement of goals.
Provides direction and assistance to other teams regarding projects.
Determines priorities, schedules, plans, and necessary resources to promote completion of any projects on schedule.
Analyzes information and evaluates results to choose the best solution and solve problems.
Reviews vendor proposals and selects appropriate vendor for services/technologies/hardware.
Thinks creatively and practically to develop, execute, and implement new project plans.
Generates and provides accurate and timely results in the form of reports, presentations, etc.
Plans, develops, implements, and evaluates the quality of operations.
Delivering on the Needs of Key Stakeholders
Understands and meets the needs of key stakeholders.
Communicates concepts in a clear and persuasive manner that is easy to understand.
Demonstrates an understanding of business priorities.
Supports achievement of performance goals, budget goals, team goals, etc.
Providing Technical Support and Consultation
Provides recommendations to improve the effectiveness of processes and programs.
Demonstrates advanced knowledge of job-relevant issues, products, systems, and processes.
Demonstrates advanced knowledge of function-specific procedures.
Applies knowledge/judgment to achieve business goals.
Foresees, identifies, and resolves problems.
Keeps up-to-date technically and applies new knowledge to job.
Performs other reasonable duties as required for this position.