Head Of Site Reliability Engineering

Company	Efinancialcareers
Address	South East
Form of work	Permanent, full-time
Salary	Competitive salary
Category	Energy

Job description

Overview:

This role will be responsible for ensuring the availability, latency, performance, efficiency, and stability of our client’s critical infrastructure. You will also collaborate with development teams to implement and maintain reliable and scalable systems.

Key Responsibilities:

Monitor and identify potential issues that could impact the availability of our systems.
Implement and maintain automated alerting mechanisms to notify the appropriate parties of potential outages or performance degradation.
Analyse performance metrics to identify and resolve latency bottlenecks in our infrastructure.
Implement performance optimization techniques and tools to improve the overall responsiveness of our systems.
Work with development teams to ensure that new features and code changes do not introduce performance regressions.
Develop and maintain metrics dashboards to track key performance indicators (KPIs) for our critical systems.
Identify performance trends and anomalies that may indicate potential issues or areas for improvement.
Optimize resource utilization and minimize unnecessary expenditure on IT infrastructure.
Identify and implement cost-effective solutions to improve the efficiency of our IT operations.

Release Management:

Design and implement automated deployment and rollback procedures to mitigate risks associated with software updates.
Monitor the performance of new releases and address any issues that arise promptly.
Lead the team responsible for release management.
Design, implement, and maintain a comprehensive monitoring infrastructure to track the health and performance of our systems.
Analyse monitoring data to identify potential issues and proactively troubleshoot problems before they impact users.
Develop and implement alerts and notifications for critical events to ensure timely intervention.
Build and lead the team that responds promptly to incidents and works collaboratively to resolve them.
Analyse root causes of incidents to identify and implement preventive measures to minimize their recurrence.
Document incident responses and communicate lessons learned to enhance our incident handling processes.
Collaborate with your peers on the leadership team to define a multi-year technical roadmap. Stay up to date with industry developments and enterprise infrastructure, and anticipate significant risks.

Required Experience:

10+ years of experience as a Site Reliability Engineer or in an equivalent role.
Proven experience in monitoring, analysing, and optimizing the performance of large-scale distributed systems.
Expertise in Linux systems administration, including managing servers, operating systems, and network configurations.
Strong scripting and automation skills, preferably with experience in Bash, Python, or similar languages.
Troubleshooting and problem-solving skills with a knack for identifying and resolving complex technical issues.

Desired Experience:

Bachelor's degree in Computer Science, Information Technology, or a related field.
Familiarity with AWS.
Experience with DevOps tools and practices, such as GitLab CI/CD and Docker.

Refer code: 3285832. Efinancialcareers - The previous day - 2024-05-06 02:08
