IT Operation Manager with J2EE stack-based applications (FULL TIME only)

Position: IT Operation Manager
Location: Northridge, CA
Primary Skills: ITIL Service Management practices (Monitoring and Event Management, Incident Management, Problem Management, Release Management), ServiceNow, Jira, Project Management
Nice to have: New Relic, ELK stack
Experience Level: 12+ years

12+ years of experience, with a minimum of 4 years in an Operations Manager / ProdOps Manager role for J2EE stack-based applications/systems.
Solid understanding of ITIL Service Management / Modern Service Management practices: Monitoring and Event Management, Incident Management, Problem Management, Release and Change Management, etc.
Experience in managing critical incidents, leading effective postmortems / Root-Cause Analyses (RCAs), and driving continuous improvements to boost service reliability.
Experience in defining and implementing solutions for proactive incident detection, response, and remediation. Examples: proactive monitoring, log analytics dashboards, self-healing scripts.
Experience in defining, tracking, and reporting on OLAs (operational-level agreements) and SLAs (service-level agreements).
Ability to track and monitor daily workload and request fulfillment.
Ability to prepare and publish regular and on-demand reports.
Solid understanding of application lifecycle management and DevOps practices.
Well-versed in ServiceNow, Jira, and Confluence.
Experience driving automation maturity specific to IT Service Operations for large and complex system environments.
Strong people management, cross-functional collaboration, and organizational skills.
Ability to achieve results without direct reporting relationships.
Self-motivated and proactive, with demonstrated problem-solving and critical-thinking skills.
Strong verbal and written communication skills, including status reports, project plans, and presentations.
Strong desire to find ways to improve solutions, systems, and processes, and capable of enforcing processes and procedures.
Nice to have: experience defining monitoring requirements using Failure Modes and Effects Analysis (FMEA), and working experience implementing Site Reliability Engineering (SRE) practices.
Nice to have: experience with New Relic (Monitoring and Event Management).
