Lead Cloud Infrastructure Engineer
Certa
About Certa
At Certa, we’re revolutionizing process automation for top-tier companies, including Fortune 500 and Fortune 1000 leaders, from the heart of Silicon Valley. Our mission? Simplifying complexity through cutting-edge SaaS solutions. Join our thriving, global team and become a key player in a startup environment that champions innovation, continuous learning, and unlimited growth. We offer a fully remote, flexible workspace that empowers you to excel.
Position Overview
We are seeking an experienced DevOps Lead to own and optimize our AWS cloud infrastructure while leading our DevOps team to deliver exceptional product outcomes. This role combines hands-on technical excellence with strategic leadership, requiring someone who can architect robust cloud solutions, mentor engineers, and drive continuous improvement across our entire software delivery lifecycle. The ideal candidate brings a product mindset, balancing business objectives with technical innovation to create scalable, secure, and cost-effective cloud infrastructure.
Key Responsibilities
Cloud Architecture & Infrastructure Management
- Design, implement, and optimize comprehensive AWS cloud infrastructure leveraging services including RDS, ECS, EC2, S3, SQS, CloudFront, Route53, VPC, and WAF.
- Develop scalable architecture solutions that align with product roadmaps and business requirements while maintaining high availability, performance, and reliability.
- Lead cloud adoption strategies and migration initiatives, ensuring smooth transitions and minimal disruption to business operations.
- Own infrastructure capacity planning and resource optimization to support growth while controlling costs.
Product-Focused Solution Development
- Partner closely with product managers and engineering teams to translate business requirements into effective cloud solutions.
- Design technical architectures that directly support customer needs, business goals, and cost optimization objectives.
- Evaluate and recommend cloud services and configurations based on performance requirements, security standards, and budget constraints.
- Act as technical advisor to leadership, providing guidance on cloud strategy, infrastructure investments, and technical decision-making.
Team Leadership & Mentorship
- Lead, mentor, and develop a team of DevOps engineers, fostering a culture of ownership, innovation, and continuous improvement.
- Provide technical guidance, code reviews, and knowledge transfer to ensure team members follow best practices and grow their skillsets.
- Set clear goals, priorities, and expectations while enabling team autonomy and distributed authority.
- Conduct regular one-on-ones, performance reviews, and career development discussions to support team growth.
- Build cross-functional collaboration between development, operations, security, and QA teams.
Security & Compliance Implementation
- Implement and enforce security best practices across all infrastructure components, including IAM policies, WAF configurations, encryption standards, and secrets management.
- Ensure compliance with industry standards and regulatory requirements through proper controls, auditing, and documentation.
- Integrate security into the CI/CD pipeline, implementing DevSecOps practices and automated security scanning.
- Manage incident response procedures, conducting root cause analysis and implementing corrective measures to prevent recurrence.
Platform Engineering & Self-Service Infrastructure
- Build reusable, self-service infrastructure components and frameworks that empower development teams and reduce operational friction.
- Design internal developer platforms that provide standardized, approved templates for rapid resource provisioning.
- Create modularized infrastructure code following Infrastructure-as-Code principles to promote consistency and maintainability.
- Implement guardrails and governance policies that enable developer autonomy while maintaining security and compliance standards.
CI/CD & Automation
- Lead continuous integration and deployment pipeline development using AWS CodePipeline, Jenkins, GitLab, or similar tools.
- Implement Infrastructure-as-Code practices using Terraform, AWS CloudFormation, and Ansible to automate provisioning and configuration management.
- Develop automation scripts and solutions to streamline software development, testing, and release processes.
- Automate manual operational tasks to improve efficiency and reduce human error.
Monitoring, Observability & Reliability
- Establish comprehensive observability solutions using CloudWatch, custom dashboards, alerts, and centralized logging.
- Define and track key performance indicators (KPIs) and service level objectives (SLOs) to measure system health and performance.
- Lead incident response efforts, conducting thorough root cause analysis and implementing system hardening measures.
- Implement proactive monitoring and anomaly detection to identify potential issues before they impact users.
- Ensure high availability and disaster recovery capabilities across all critical systems.
Continuous Improvement & Innovation
- Stay current with emerging cloud technologies, DevOps tools, and industry best practices.
- Research, evaluate, and implement new techniques and technologies that can improve team productivity and system performance.
- Drive DevOps culture transformation across the organization, promoting collaboration, automation, and data-driven decision-making.
- Conduct regular retrospectives and improvement initiatives based on metrics, incidents, and team feedback.
Required Qualifications
Education & Experience:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- 6+ years of hands-on DevOps or cloud infrastructure experience, with at least 2 years in a lead or senior role
- Proven track record of designing and managing large-scale AWS cloud environments
- Experience leading DevOps teams and mentoring junior engineers
Technical Skills:
- AWS Services: Deep expertise in RDS, ECS, EC2, S3, SQS, CloudFront, Route53, VPC, WAF, IAM, and CloudWatch
- Infrastructure-as-Code: Proficiency with Terraform, AWS CloudFormation, and/or Ansible
- CI/CD Tools: Strong experience with AWS CodePipeline, Jenkins, GitLab CI, CircleCI, or similar platforms
- Containerization: Hands-on experience with Docker and Kubernetes/ECS orchestration
- Scripting & Programming: Proficiency in Python, Bash, Ruby, or similar languages
- Version Control: Expert-level knowledge of Git and version control workflows
- Monitoring & Logging: Experience implementing observability solutions using CloudWatch, ELK stack, or similar tools
- Security: Strong understanding of cloud security, IAM, encryption, compliance frameworks, and DevSecOps practices
Leadership & Soft Skills:
- Excellent communication skills, with the ability to explain technical concepts to non-technical stakeholders
- Strategic thinking and ability to align technical solutions with business objectives
- Strong problem-solving abilities and analytical mindset
- Experience with Agile methodologies and collaborative team environments
- Ability to manage multiple priorities and projects simultaneously
- Demonstrated ability to drive cultural change and foster innovation
Why Join Us
- Compensation: Top-tier salary and exceptional benefits.
- Work-Life Flexibility: Fully remote, flexible scheduling.
- Growth Opportunities: Accelerate your career in a company poised for significant growth.
- Innovative Culture: Engineering-centric, innovation-driven work environment.
- Team Events: Annual offsites and quarterly Hackerhouse.
- Wellness & Family: Comprehensive healthcare and parental leave.
- Workspace: Premium workstation setup, providing the tech you need to succeed.
Ready to revolutionize SaaS infrastructure and accelerate your career trajectory? We’re excited to meet you—apply today!