Senior DevOps Engineer

Timehop

Software Engineering
Framingham, MA, USA
Posted on Friday, May 17, 2024

Sincere is looking for a Senior DevOps Engineer to join our growing team. In this role, you will support our family of brands – Punchbowl, Timehop, and Memento. You will help manage and improve Sincere’s AWS web infrastructure. You will be responsible for site reliability, monitoring and observability of key systems, security, reliable deployment of services, disaster recovery planning, and CI. Additionally, you will have the opportunity to work closely with the development team to improve our processes and our Ruby on Rails applications and services.

Our team values uptime, stability, scalability, and reliability – and as a key member of our team you will have the opportunity to learn and apply new technologies to meet these needs as we grow.

You are:

  • Experienced with AWS infrastructure management: EC2, VPC, S3, ELB, ECS, IAM, Lambda, CloudFront, RDS, ElastiCache
  • Knowledgeable in infrastructure configuration management tools (Terraform, Chef)
  • An expert in Linux systems administration (Ubuntu), covering security, maintenance, backups, disaster recovery, storage management, monitoring, etc.
  • An expert in the command line and Bash scripting
  • Experienced with database administration (MySQL, PostgreSQL, MongoDB, Redis)
  • Experienced in using and maintaining monitoring, logging, and observability tools (Datadog, New Relic, ELK stack, Grafana, Prometheus, Nagios)
  • Knowledgeable about systems and networking: DNS, SSL, SMTP, SSH, VPN
  • Well-versed in web infrastructure management: HAProxy, NGINX, understanding of HTTP
  • Familiar with Docker, containers, CI/CD, automation

You will:

  • Manage, plan, and execute system and software updates and upgrades when needed
  • Manage application deployments in coordination with the engineering team
  • Maintain and improve our monitoring systems to pre-empt issues that may affect our live environments
  • Prioritize and resolve live issues appropriately
  • Investigate and implement system improvements
  • Maintain and improve system documentation and runbooks
  • Design and implement disaster recovery and backup plans on AWS