Principal AI/ML Engineer
Hackajob
As a Principal AI/ML Engineer, you will be responsible for the design, implementation, and scaling of AI systems for health applications at Optum. You will develop and execute the vision for enabling Agentic orchestration and deployment solutions on United AI Studio (UAIS), the enterprise ML platform. As a Principal Engineer, you will ensure technical tasks are properly detailed, designed, and executed per existing standards. You are expected to play a key role in design activities and scoping, and to mentor junior engineers in execution.
You’ll enjoy the flexibility to work remotely* from anywhere within the U.S. as you take on some tough challenges. For all hires in the Minneapolis or Washington, D.C. area, you will be required to work in the office a minimum of four days per week.
Primary Responsibilities
- Design and Development: Lead the design, development, and deployment of observability solutions and Agentic orchestration features on the UAIS platform, ensuring scalability, reliability, and fault tolerance
- Agentic Framework Integration: Implement and optimize Agentic orchestration frameworks, ensuring seamless integration with UAIS platform features and observability systems
- Observability Frameworks: Build and maintain observability tools and infrastructure (e.g., logging, metrics, distributed tracing) to monitor the health, performance, and behavior of deployed systems and features
- Platform Engineering: Develop and enhance platform capabilities, including Kubernetes-based orchestration, CI/CD pipelines, and infrastructure automation
- Technical Leadership: Provide technical guidance and mentorship to offshore and onshore engineering teams, fostering collaboration and continuous improvement
- Code Quality and Standards: Write robust, scalable, and maintainable code while enforcing software development standards and best practices
- Documentation: Create and maintain detailed technical documentation, supporting team alignment and knowledge sharing
- Innovation and Continuous Improvement: Stay up-to-date with emerging technologies and methodologies, introducing innovative solutions for Agentic solution challenges
Required Qualifications
- Master’s degree or Ph.D. in computer science or related field; or bachelor’s degree and at least 7 years of relevant experience
- 7+ years of industry experience across AI/ML Engineering and Platform Engineering, with at least 3 years in a lead role
- 5+ years of experience with demonstrated proficiency in Python
- 3+ years of experience and demonstrable ability to lead and mentor a technical team with a collaborative and results-driven approach
- Hands-on experience with Agentic / AI orchestration frameworks (e.g., LangGraph, ADK) and AI observability tools (e.g., Langfuse, Arize, AppInsights, Prometheus, OpenTelemetry) in any cloud environment
- Experience working with multiple cloud platforms such as Azure, AWS, or GCP
- Experience with other programming languages such as Java and C++
- Familiarity with DevOps practices, such as containerization (Docker, Kubernetes) and infrastructure-as-code tools
- Experience influencing in a heavily matrixed organization