Data Operations Analyst (AI-Assisted Data Transformation & Structuring)
Careerflow
Software Engineering, IT, Operations, Data Science
India
Summary
We are looking for a detail-oriented Data Operations Analyst to independently handle data extraction, transformation, and validation across structured (CSV) and unstructured (PDF) datasets. The role involves converting complex raw datasets into structured JSON that conforms to defined schemas, with zero data loss and strict quality standards. AI tools may be used to improve speed and efficiency, but 100% accuracy and manual validation are mandatory.
Key Responsibilities
1. Data Extraction
Extract and organize data from:
CSV datasets (NASA materials data)
PDF documents (BGS mining directory)
Use tools where needed, while ensuring data accuracy
2. Data Transformation
Convert raw data into structured JSON formats
Map all fields according to strict schema rules
Handle:
Complex property grouping (NASA dataset)
Supplier + site grouping (BGS dataset)
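As a rough illustration of the property grouping described above, the sketch below turns flat CSV rows into one nested JSON record per material. The column names (material, category, property, value), the sample rows, and the function name are hypothetical placeholders; the actual schema and field names are defined by the project, not by this example.

```python
import csv
import io
import json
from collections import defaultdict

# Made-up sample rows; real column names and values come from the project schema.
SAMPLE_CSV = """material,category,property,value
Material A,thermal,conductivity,1.0
Material A,mechanical,tensile_strength,2.0
Material B,thermal,conductivity,3.0
"""


def rows_to_material_records(csv_text):
    """Group flat CSV rows into one record per material, nested by category."""
    grouped = defaultdict(lambda: defaultdict(dict))
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Values are kept as the original strings so nothing is lost or rounded.
        grouped[row["material"]][row["category"]][row["property"]] = row["value"]
    return [
        {
            "material": name,
            "properties": {cat: dict(props) for cat, props in categories.items()},
        }
        for name, categories in grouped.items()
    ]


records = rows_to_material_records(SAMPLE_CSV)
print(json.dumps(records, indent=2))
```

Keeping values as strings at this stage is a deliberate choice in the sketch: type conversion can then happen in a separate, reviewable step against the schema.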
3. Data Structuring
Build:
Material-level JSON records
Supplier-level JSON objects
Ensure:
Correct categorization
Proper grouping of related data
Consistent naming conventions
4. Quality Assurance (Critical)
Validate every record before submission
Ensure:
No missing data
No incorrect mappings
No duplicates
Follow strict SOP and validation checklist
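The checks listed above (missing data and duplicates) could be partly automated before the manual review. The following is a minimal sketch only: the required fields, the key used to detect duplicates, and the sample records are assumptions made for illustration, and it does not replace the SOP or the validation checklist.

```python
# Hypothetical required fields; the real list comes from the project schema.
REQUIRED_FIELDS = ("supplier", "site", "commodity")


def validate_records(records, required=REQUIRED_FIELDS, key_fields=("supplier", "site")):
    """Return a list of (record_index, message) tuples describing problems found."""
    issues = []
    seen = {}  # maps a duplicate-detection key to the index where it first appeared
    for index, record in enumerate(records):
        # Flag fields that are absent, None, or blank after stripping whitespace.
        for field in required:
            value = record.get(field)
            if value is None or str(value).strip() == "":
                issues.append((index, f"missing {field}"))
        # Flag records whose key fields repeat an earlier record.
        key = tuple(record.get(field) for field in key_fields)
        if key in seen:
            issues.append((index, f"duplicate of record {seen[key]}"))
        else:
            seen[key] = index
    return issues


# Made-up records for demonstration only.
sample = [
    {"supplier": "Supplier A", "site": "Site 1", "commodity": "sand"},
    {"supplier": "Supplier A", "site": "Site 1", "commodity": "sand"},
    {"supplier": "Supplier B", "site": "", "commodity": "gravel"},
]
issues = validate_records(sample)
for index, message in issues:
    print(f"record {index}: {message}")
```

A script like this can only catch structural problems; incorrect mappings still need to be checked by hand against the source CSV or PDF.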
5. AI-Assisted Workflow
Use AI tools to:
Speed up the transformation
Assist in structuring data
Independently verify all AI-generated outputs
Final accountability for accuracy remains with the analyst
Candidates must be comfortable using AI tools responsibly to improve productivity, without relying on them blindly:
ChatGPT – for structuring, reasoning, and transformation assistance
Claude – for long document parsing and summarization
Microsoft Excel (with AI features like Copilot) – for data handling
Adobe Acrobat or similar – for PDF extraction
Optional:
Python (basic scripting)
Google Sheets
Key expectation: the ability to use AI as an assistant, not as a replacement for thinking
Qualifications
Bachelor’s degree in:
Computer Science / Engineering
Mathematics / Statistics
Or any relevant field with strong data handling experience
Operations / Material Science background is a plus.