Remote Cloud Site Reliability Engineer Manager

Company

For more than 30 years, we have been providing industry-specific, cloud-based business management software and services to small and medium-sized businesses. With divisions focused on manufacturing, wholesale/retail distribution, building and construction, and field service, we integrate into every aspect of a customer’s business to help them level the playing field, run day-to-day operations more efficiently, and free them up to focus on what matters most. It’s how business gets done. Privately held with more than 16,000 customers, we are headquartered in Fort Worth, Texas, USA, with offices and companies throughout the U.S., Australia, New Zealand, England and the Netherlands.

Overview

**Must have software deployment cloud experience.

**Must have experience managing a small team.

The Cloud Site Reliability Engineer Manager is critical to the success of the company and its customers. The SRE will fill the gap between Cloud Ops and the Product/Dev teams, focusing on improving the delivery of our products to customers. This role will work closely with the product dev teams, participating in weekly sprint planning to provide support and consulting and to advocate for the improvements needed to provide a world-class hosting experience. The SRE will be responsible for building systems and tooling that enable the dev teams to work more efficiently while keeping a cloud-first mentality. This is an internal, product-facing role that will collaborate closely with development teams, support teams, architects and peer engineers on the planning, development, and implementation of solutions for various systems.

Responsibilities

* Provide thought leadership that drives the team toward the SRE Mission Statement

* Manage the Team’s weekly project work through status updates against deadlines

* Drive monthly reports from each SRE on their BU’s uptime and reliability metrics

* Handle HR duties: review and approve time off, coordinate schedules, perform annual reviews

* Design processes for improving operational stability of the Cloud

* Identify, document and help improve performance and operational efficiency challenges

* Create tooling with documentation to scale the Cloud

* Validate and enforce best application security practices

* Participate in incident management on-call rotation and drive root cause analysis

* Collaborate with engineering teams, product owners and other stakeholders to develop tooling and CI/CD procedures

* Continual development of monitoring tools and best practices

* Help drive capacity requirements and planning

* Ability to function in a DevOps atmosphere

* Support and manage cloud infrastructure and environments (AWS, Azure, IBM, Private Cloud)

* Ability to drive standardization across the team by building repeatable processes to ensure Cloud Stability

* Comply with security standards and technical design

* Comply with ITSM standards and practices

Qualities and Skills Required

* Bachelor’s degree in Computer Science, Engineering, or IS

* 5+ years’ experience in a 24×7 high-availability production Cloud environment

* Experience with configuration management and automation tools such as Ansible, Terraform, vRA, etc.

* Experience with CI/CD tools and implementing best practices

* Strongly prefer prior experience in Microsoft Windows (Server and Guest OS)

* Experience with virtualization technologies such as VMware

* Experience with Active Directory, PowerShell, SQL, Microsoft Remote Desktop Services

* Experience with configuring and extending monitoring tools

* A background in automating the management of a data center environment

* Experience with cloud-based IaaS (AWS, IBM Cloud)

* Good understanding of the Software Development Lifecycle

* Excellent analytical and problem-solving skills

* A passion for system stability, performance, scalability and customer success

* Ability to work with minimal supervision, making decisions based upon priorities, schedules and an understanding of business initiatives

* Strong interpersonal and team building skills

* The desire to take advantage of training and learning opportunities

Key qualifications

* Skills: Shell Scripting, SQL, Amazon Web Services, VMware, Month End Close, ITSM, SDLC, Microsoft Windows, Azure, Terraform
* Minimum education: Bachelor’s
* Years of experience: 5+ years
* Schedule details: 5 days/week
