Reliability Engineer – Automation & Observability

  • Job Reference: 21382
  • Published Date: 21/05/2026
  • Salary: 100,000,000 VND - Negotiable
  • Industry: IT - Software

Key Responsibilities:

  • Design and implement automation solutions (Ansible, Terraform, Rundeck, Python/Bash/PowerShell) to reduce manual work and improve efficiency
  • Enhance system reliability, availability, and performance across managed infrastructure services
  • Build and optimize monitoring, logging, and alerting systems (e.g., Splunk) for proactive issue detection
  • Develop and manage automated workflows integrating APIs and microservices to streamline operations
  • Drive continuous improvement initiatives with an automation-first approach
  • Collaborate with global teams and stakeholders to embed automation and improve service reliability

Must-have:

  • Strong knowledge of Linux systems
  • Proficiency in scripting (Python, Bash, PowerShell)
  • Experience with cloud platforms (AWS, Azure, or GCP)
  • Hands-on experience with monitoring tools such as Splunk
  • Familiarity with automation tools (Ansible, Terraform, Rundeck)
  • Experience with Git-based version control
  • Strong problem-solving skills with an automation-driven mindset