Azim Masoudinejad

Email, LinkedIn, GitHub

Site Reliability Engineer with 6+ years of experience automating, scaling, and securing large-scale infrastructure. Skilled in Kubernetes, Terraform, CI/CD, and cloud platforms (GCP, AWS, Azure), with a track record of zero-downtime migrations, cost optimization, and observability improvements.

Professional Experience

DevOps Engineer at Rabobank

December 2024 - Present | Utrecht, Netherlands | Observability team

A leading international cooperative bank headquartered in the Netherlands, specializing in food & agriculture financing and sustainable banking solutions.

  • Evaluated and benchmarked incident management solutions with multiple vendors; advised stakeholders and leadership on strategic decision-making.
  • Designed and delivered Kubernetes, Terraform, and Docker training sessions for 20 engineers, boosting technical expertise across the team.
  • Redesigned and streamlined EKS cluster upgrade processes to ensure application reliability and minimize downtime.
  • Developed secure REST APIs based on cross-team requirements; collaborated with stakeholders to validate and align on functionality.
  • Collaborated cross-functionally to onboard ~30 teams onto the incident management platform, driving adoption and operational alignment.

Site Reliability Engineer at Catawiki

May 2022 - December 2024 | Amsterdam, Netherlands | Platform team

Catawiki is a curated European marketplace for special objects, with 10+ million unique monthly visitors.

  • Led the migration of 30+ on-premise MySQL database machines to Google Cloud.
  • Led zero-downtime upgrades of MySQL databases, including all associated backup and data import/export components, from Ubuntu 16 to Ubuntu 20. This involved updating related Ansible roles, Terraform resources, and Python scripts to ensure a smooth and reliable transition.
  • Provided a detailed Kafka solution analysis and cost estimation as part of a plan to upgrade Kafka components, enabling the tech department to make a transparent and well-informed decision.
  • Implemented and configured OpenTelemetry and Lightstep (ServiceNow) components, delivering dashboards that made application error detection faster, easier, and more visual.

Site Reliability Engineer at Sheypoor

Jul 2021 - May 2022 (11 months) | Tehran, Iran | Platform team

Sheypoor is the leading ads and secondhand marketplace, with 2+ million daily active users.

  • Tuned Kubernetes clusters and implemented a GitOps model using Argo CD.
  • Implemented Jenkins configurations and CI/CD pipelines as code (Infrastructure as Code), improving scalability, stability, and reliability.
  • Implemented Vault and Consul as code with Ansible and Terraform to strengthen security.
  • Held multiple knowledge-sharing sessions with 5 development teams.

Site Reliability Engineer at Myket

Jan 2021 - Jul 2021 (7 months) | Tehran, Iran | Platform team

Myket is a leading Iranian Android store, with 25+ million active installations, 10+ million monthly active users, and 7,000+ TB of network traffic.

  • Implemented and maintained Loki to centralize logs and customers’ data.
  • Maintained and improved monitoring by implementing Prometheus Alertmanager with Grafana integration.
  • Tuned Nginx web servers to decrease request time by 5%.

Technical Author at Linux Professional Institute (LPI) (Freelance)

Jul 2020 - Feb 2021 (8 months)

Linux Professional Institute is the global certification standard organization for open source professionals.

  • Wrote the lesson on Docker Machine and Docker security.
  • Wrote the lesson on cluster monitoring using Prometheus.

Network Engineer at “Peyvand Azad”

Sep 2019 - Jan 2021 (1 year 5 months) | Tehran, Iran

Peyvand Azad was established to provide Internet services to enterprises and small businesses.

Network Engineer at “Fanava Group” (Internship)

June 2016 - Sep 2016 (4 months) | Tehran, Iran

Fanava Group provided data center services to customers.

Certifications

  • Cisco Certified Network Professional | Cisco | Aug. 2018 - Feb. 2022
  • Cisco Certified Network Associate | Cisco | Aug. 2018 - Feb. 2022
  • Microsoft Certified Solutions Associate (MCSA 2012) | Microsoft | Nov. 2017

Education

Bachelor’s Degree | Computer Hardware | K. N. Toosi University of Technology | Tehran, Iran | 2013-2018