Site Reliability Engineer

Experian is an Equal Opportunity Employer Including Disability/Veterans

Job Number: 188774

Cyberjaya, Malaysia

Description

Site Reliability Engineering (SRE) is an approach to running and evolving production systems, using methods, concepts and practices from the disciplines of Software Engineering and Systems Engineering.

The focus is to achieve reliable and fault-tolerant systems, through heavy use of automation and smart monitoring, while ensuring performance and evolving capacity.

As an Experian Decision Analytics SRE, you will be responsible for running and evolving the flagship SaaS platform that delivers complex financial service solutions - customer acquisition workflows, data enrichment, strategy - to our Enterprise clients.

Responsibilities

  • Maintain SaaS platform services by monitoring capacity, availability, latency and overall systems health

  • Own incident management process and response

  • Optimize existing systems and infrastructure through automation and engineering practices

  • Engage the Development organization during the inception and design phases to ensure that new services/systems meet the non-functional acceptance criteria

  • Evolve systems capacity and scalability

  • Continuously improve monitoring strategy focusing on incident prevention/early detection

Knowledge, Experience & Qualifications

Qualifications
  • MUST be willing to work shifts, including night shifts
  • Computer Science degree or equivalent practical experience
  • Software Development experience
  • Experience in one or more of the following: Ruby, Python, Perl, Lua
  • Knowledge of Linux, being comfortable on the Linux command line
  • Experience with cloud infrastructure platforms: AWS or Google Cloud
  • Familiarity with container technology: Docker, Kubernetes, OpenShift
  • Experience with configuration management tools: Ansible, Puppet, Chef, Terraform
Preferred
  • Experience with Java
  • Experience with Java tooling: Maven, Gradle, Groovy, Artifactory
  • Linux System Administration experience
  • Continuous Integration/Continuous Delivery concepts (e.g. Jenkins)
  • Analytical problem-solving and debugging skills
  • Strong communication skills