Overview
What you’ll bring
- BS in Computer Science or related field, or equivalent work experience
- 5+ years of related experience, including at least 2 years either supporting Splunk or creating content as a Splunk user (dashboards, searches, alerts, etc.), at least 1 year supporting applications in AWS, and at least 1 year of experience with Python. Java or Go is a plus.
- Monitoring tools setup and configuration
- Troubleshooting production issues
- Performance tuning of App/Web tier
- System administration exposure
- Expertise in managing the middle tier and app tier
- Scripting expertise
How you will lead
- Drive operational excellence for the connected services a business offers to its customers, delivering an “always on” operation year-round at the right cost
- Use your knowledge of technology and operational best practices to drive the design, development and implementation of operational standards and capabilities for connected services that enable highly available, scalable & reliable customer experiences
- Analyze and synthesize a variety of inputs to drive the end-to-end incident management process for multiple offerings
- Creating, developing & managing the deployment architecture for the application
- Developing the monitoring architecture and implementing monitoring agents, dashboards, escalations and alerts
- Developing and driving incident management processes, playbooks and stakeholder communication mechanisms
- Overseeing change management & configuration management operating mechanisms
- Driving root cause analysis (RCA) and risk management processes
- Driving ongoing improvements and efficiencies in operational practices, tools & processes, both within the business unit and Intuit-wide