Site Reliability Engineer - Medellin

Required profile:
Position: Systems Administrator
Language: N/A
Database: N/A
Operating System: N/A
Applications: N/A
Hardware: N/A
Gender: Any

Country: Colombia
Province: Medellin
City/Area: Medellin

Openings: 3
Type: Full-time
Reference: SRE

Other details:
Company: Major leading company
Published: September 11, 2020
Closes in: 48 days, 18 hours and 52 minutes

Site Reliability Engineer

Job Description

We are a software services company serving clients in Europe and the United States. We focus on providing managed delivery services, adding value by designing and building effective software delivery and operations teams based on Agile principles. These teams help our clients' IT and product organizations develop their products incrementally and support production operations reliably.

Job content
We’re looking for a Site Reliability Engineer to join our client's team to help modernize the cloud infrastructure for a software product and ensure reliable operations in production environments. The product is a high-load web platform for preparing and submitting data to the US SEC online. Our client's customers (the users of the platform) are world-class, well-known companies, financial institutions, and top banks, so operational excellence is essential.

You will apply a software engineering mindset to maintain services hosted in the Azure cloud. Many of the challenges will be solved by improvements you develop in the performance, efficiency, and transparency of the infrastructure as code. You will partner with the development and product engineering teams, and with a team of experienced DBA engineers, to design and implement secure, performant solutions and to carry out optimization tasks and ongoing planned project work.

In this role you will be a valuable member of an international team, collaborating with engineers located in Ukraine, Belarus, and North America.

• Set up monitoring and alerting, and provide 24/7/365 on-call support for production environments
• Respond to production incidents according to the agreed SLA
• Analyze, solve, communicate, and correct issues in real time per the agreed SLA
• Provide on-demand support for non-production environments
• Perform scheduled maintenance and deployment activities
• Champion and implement a culture of DevOps to maintain a frictionless, high-quality platform infrastructure
• Champion and implement application and infrastructure monitoring and alerting to prevent client-impacting issues, ensuring system availability, performance, and scalability in order to maintain SLIs, SLOs, and SLAs
• Optimize application performance at scale
• Automate everything including system operational runbooks
• Define and support continuous integration and deployment pipelines (CI/CD) aligned to branching and quality assurance strategies
• Dive deep into technology and stay on the forefront of the latest tools, technologies, and strategies; help evaluate, prototype, and integrate them into work processes
• Perform with broad independence and deliver on project milestones and tasks on schedule while communicating progress regularly
• Build strong relationships with team members and hold each other accountable for quality expectations
• Learn continuously and apply lessons learned
• Evangelize best practices, eliminate bottlenecks, and improve processes

Required Experience and Skills
• Experience developing infrastructure as code for software in Java, C# / .NET, or Node.js to optimize application performance at scale
• Experience writing scripts in Ansible or Jenkins to manage infrastructure / software build and deployment in a continuous integration (CI) environment
• Experience writing scripts in PowerShell (in Windows environment) or Python/Bash (in Linux environment) to automate system operations as runbooks
• Experience implementing production performance, availability, and scalability monitoring and alerting best practices using a tool such as DataDog, New Relic, Dynatrace, or AppDynamics
• Experience as a global admin of an Azure or AWS cloud-utility stack application
• Strong research and problem-solving skills to resolve complex programming issues and implement longer-term solutions to frequently occurring issues
• Readiness to work with minimal direction and supervision
• Strong communication skills (written and verbal)
• Upper-intermediate written and spoken English

Nice to have
• Experience building and deploying Infrastructure as Code with Terraform or similar technology
• Experience planning, coordinating, developing and executing all stages of test scripts
• Experience securing Windows or Linux systems in a 24x7 production environment
• Experience with containerization and managing Kubernetes clusters
• Experience with common networking and load balancing protocols

If you believe your profile matches the requirements of the position, apply to empleo@itechCareer.com and/or empleoitc@gmail.com with Ref: SRE, indicating your expected salary.

We guarantee complete confidentiality.

Thank you very much!!

