Autonomy Data Infrastructure Engineer
About You and The Role
Zipline is at the forefront of a logistics revolution, using autonomous aircraft to deliver just-in-time, life-saving medical supplies on multiple continents, 7 days a week. We have completed over 200,000 deliveries: the first 100,000 took over 4 years, while the last 100,000 took just 8 months. Do you want to be a part of reaching 1 million life-saving deliveries in the next 2 years?
We believe access to medical care should not depend on your GPS coordinates. In service of our mission to operate at global scale, we’re growing our perception capabilities to expand quickly and safely into new products and locations, with the ultimate goal of delivering essential packages right to your doorstep. Our Autonomy team is looking for an Autonomy Data Infrastructure Engineer to build pipelines that curate datasets for Autonomy development and metrics that ensure the right data powers our Autonomy system.
What You'll Do
- You will be responsible for the architecture, design, and development of novel perception data pipelines and annotation tooling to deliver high-quality datasets aimed at solving Zipline’s critical autonomy problems
- You will interface with third-party vendors to support annotation and auditing of our data
- You will partner with software, machine learning, and operations teams to define ground-truth specifications and track annotation throughput and quality
- You will build pipelines to curate high-quality datasets, including automating data subsampling, annotation, auditing, visualization, and evaluation
- You will raise the bar on data quality by developing methods and processes to find and correct annotation errors
- You will build dashboards to track metrics such as data downselection rate, annotation throughput, audit correction rate, and cost
- You will contribute to faster annotation with novel pre-labeling pipelines and annotation tools
- You will work with autonomy teams to identify shortcomings in our data and help improve model accuracy by delivering better datasets
What You'll Bring
- Production-level experience with data ETL and curation for ML-based solutions
- Deep expertise in one or more of the following areas: Data ETL, Big Data Pipelines, UI/UX for Data Annotation and Audit
- AWS experience with S3, IAM, Lambda, VPC, Redshift
- Strong background in computer science: algorithms and data structures
- Proficiency with databases and SQL
- 3+ years of experience with large-scale production services and/or web applications
- Experience with AWS services such as Amazon S3, EC2, and EKS/Kubernetes
- Strong technical communication (both written and verbal), prioritization, and time management skills
What Else You Need to Know
The starting cash range for this role is $145,000 - $185,000. Please note that this is a target starting cash range for a candidate who meets the minimum qualifications for this role. The final cash pay for this role will depend on a variety of factors, including a specific candidate's experience, qualifications, skills, working location, and projected impact. The total compensation package for this role may also include: equity compensation; discretionary annual or performance bonuses; sales incentives; benefits such as medical, dental and vision insurance; paid time off; and more.