Data Engineer / Data Architect
- Collaborate with cross-functional teams to understand their data requirements, define the data architecture, data models, and data flows for our data warehouse, and provide technical support as needed.
- Design, develop, and maintain scalable data pipelines using PySpark and AWS.
- Ensure data accuracy, reliability, and integrity by implementing data validation, cleansing, and transformation processes.
- Create and maintain documentation of the data architecture, data models, and data pipelines to support shared understanding and efficient collaboration among team members.
- Maximise the performance and scalability of the data warehouse in AWS per unit of compute and storage cost.
- Optimise and tune data pipelines for performance and scalability.
- Implement data governance policies and data security and privacy best practices to protect sensitive information and comply with relevant regulations.
- Work closely with other team members to troubleshoot and resolve data-related issues.
- Continuously monitor the performance of the data infrastructure and make improvements as needed to ensure optimal efficiency and reliability.
- Stay up-to-date with the latest industry trends and advancements in data architecture and technologies to drive innovation and maintain a competitive edge.
- Bachelor's or Master's degree in Computer Science, Data Science, Statistics, or a related field.
- 3+ years of experience in data architecture and data engineering roles, preferably on AWS.
- Strong proficiency in PySpark, Python, Pandas, SQL, or other data-intensive technologies.
- Strong experience with data warehouses.
- Experience with DevOps and version control practices and tools such as Jenkins or GitLab.
- Experience in MLOps.