Senior Site Reliability Engineer

PickMe
4 months ago
Expired on: Apr 23 2024

Ref.No 00004987

Description

Responsibilities:

  • Install, configure, and maintain Linux/Windows instances in the Google Cloud environment.
  • Maintain and configure cloud infrastructure components in Google Cloud environments.
  • Ensure high availability and reliability of systems through effective monitoring, incident management, and proactive troubleshooting.
  • Build CI/CD pipelines and perform deployments on GKE clusters.
  • Identify performance bottlenecks and optimize for scalability and cost-effectiveness.
  • Maintain and update technical documentation and internal processes.
  • Collaborate with different engineering teams at PickMe (Development, L2, Data, Operations, QA, etc.).
  • Work with minimal supervision.
  • Mentor team members.



Requirements:

  • Bachelor's Degree in Computing/Computer Science/IT from a recognised University or equivalent.
  • 3+ years of experience in:
      • Managing a GCP, Azure, or AWS cloud environment.
      • Working with RHEL/Ubuntu operating systems.
      • Working with Jenkins/Ansible/Terraform.
      • Working with one of the Go/PHP/Bash/Python languages.
      • Managing Kubernetes environments.
      • Working in Agile environments.
  • Experience with observability and monitoring tools (Prometheus, Grafana, OpenTelemetry, etc.).
  • Experience with incident response strategies.
  • Ability to identify and resolve performance bottlenecks in high-traffic environments.
  • Strong analytical and problem-solving abilities to address technical challenges.
  • Experience with security best practices.
  • Excellent communication skills.
  • A CKA, CKS, or GCP Associate- or Professional-level certification will be a definite advantage.



Skills
Java
PHP
C++
Analytical
Problem Solving
Industry Sector