Experience

DevOps and Site Reliability Engineer

CraftSchoolship · San Jose, Remote · Oct 2023 - Present

Key Highlights
  • Reduced AWS costs by 45% through FinOps practices
  • Automated 70% of operational tasks
  • Reduced MTTR by 50% with monitoring solutions
Responsibilities
  • Designed and built cloud architecture on AWS, including ALBs and multi-region EKS clusters, by developing Terraform and CloudFormation templates, ensuring infrastructure consistency and reducing cluster provisioning time to 10 minutes.
  • Applied FinOps practices using Kubecost and load testing to analyze cloud spend and optimize resources, reducing AWS costs by 45%.
  • Automated 70% of operational tasks using Python and Bash scripts with cron scheduling, reducing manual effort and operational overhead.
  • Managed and troubleshot production Kubernetes clusters, ensuring reliability, availability, and security.
  • Built a deployment automation platform using GitHub Actions, Helm, and Argo CD, providing a centralized way to manage application deployments and configurations.
  • Implemented CI/CD pipelines with GitHub Actions covering linting, testing, build, and deployment, improving code quality and reducing deployment time to under 5 minutes.
  • Managed and optimized PostgreSQL clusters, improving performance and reducing resource usage by 35%.
  • Automated backup and restore workflows for Kubernetes resources and PVs (EBS, EFS), improving disaster recovery readiness.
  • Created Grafana dashboards for analyzing Loki logs, improving debugging and reducing troubleshooting time.
  • Built monitoring and alerting using Prometheus and Grafana, helping reduce MTTR by 50%.
  • Participated in on-call rotations, handling incidents and contributing to postmortems, documentation, and SOPs.
  • Implemented centralized authentication using Keycloak and managed the company workspace, reducing employee onboarding time by 70%.
  • Developed backend systems for payments, user management, and analytics, including AI-driven features, by building and maintaining REST APIs using Python and Go.
  • Implemented a service mesh using Istio and configured Jaeger distributed tracing to improve service observability and debug latency issues.

Software Developer Intern

Box2Home · Sousse, Hybrid · Feb 2023 - Jun 2023

Key Highlights
  • Built custom internal platform replacing expensive third-party system
  • Managed 100+ MySQL tables with NestJS, React, and Prisma
Responsibilities
  • Developed a custom internal platform using NestJS, React, and Prisma to replace an expensive third-party system, allowing developers to easily view and manage 100+ MySQL tables.
  • Implemented operations logging and role-based user management to ensure data integrity and security.
  • Deployed the platform with Docker and AWS ECS, enabling faster and more reliable environment provisioning.