Company: Government Recruitment Service

Address: FY4 5ES
Form of work: Full time
Salary: £69,869 to £89,995 per year
Category: IT

Job description

Are you someone who has excellent stakeholder management and problem-solving skills?

Do you like finding the root cause of a problem and building automated solutions to make sure it doesn’t happen again?

If so, we’d love to hear from you.

The Health Team manages multiple applications which support some of DWP’s most vulnerable customers.

Key digital applications include Personal Independence Payment; Employment Support Allowance; Access to Work.

In addition, the team manages a number of digital projects that are transforming services for citizens and DWP agents.

Health and Disability has expanded significantly and there is a need to support and maintain our applications 24/7. You will work closely with the engineering community and the product teams which cover both AWS and Azure services.

Your primary task will be to lead a small team of SREs who act as the ‘gatekeepers’ of Production. You will also actively manage the work backlog, develop reliability improvements, and lead initiatives to automate low-value tasks, balanced against project delivery demands.

You will provide technical leadership to wider operational teams, along with oversight of the products and services they support.

The successful candidate will act as a Lead SRE, solidifying the service wrap for Health applications, analysing error budgets, enhancing the current integrated monitoring solution, creating service improvements, and leading on incident analysis and resolution, working alongside multiple teams.

  • Work across multiple teams as an engineering specialist, defining organisational engineering standards.
  • Promote a mindset change within the organisation to foster engineering ownership and to emphasise the importance of the integrity and maintenance of the Live Service.
  • Design and develop the techniques for improving application reliability, run books, knowledge transfer, and ongoing SRE strategy within Health and the wider engineering community.
  • Manage the error budget agreed with the product owner for the application and ensure that work is balanced in alignment with it.
  • Work collaboratively with Health teams throughout the investigation and resolution of major or complex incidents for the service, ensuring people with the right skills and expertise are proactively available to respond effectively.
  • Assess the impact of change requests in consultation with stakeholders, providing technical expertise and advice.
  • Provide on-call 24/7 support, approximately 1 week in every 4-week period (an on-call allowance is payable for this).

In addition to the essential criteria below, experience of working with any of the following technologies is also desirable: Terraform, Ansible, Packer, Java, JavaScript, MongoDB, Docker, JSON, YAML, SQL, Git, GitLab CI.

Reference code: 2606101. Government Recruitment Service - 2024-01-24 16:13


Lead Site Reliability Engineer

Chase

London, Greater London

5 months ago - seen

Lead Site Reliability Engineer

Department of Work & Pensions

£69,869 - £89,995/annum

Sheffield, South Yorkshire

5 months ago - seen

Lead Site Reliability Engineer

Lloyds Banking Group

Competitive

Bristol, Bristol

5 months ago - seen

Lead Site Reliability Engineer

Lloyds Banking Group

£85,255 to £102,310

Bristol, England

5 months ago - seen

Lead Site Reliability Engineer

BCT Resourcing

£90,000 - £100,000 per annum

South East

6 months ago - seen