Senior Python Developer – Data Cleaning & Processing
Company: 3Wheels AI
Location: Remote / Hybrid
Experience: 4+ Years
Employment Type: Full-time / Contract
About the Role
We are looking for a Senior Python Developer to lead data cleaning, preprocessing, transformation, and quality assurance initiatives across large-scale datasets. The ideal candidate should have strong experience handling structured, semi-structured, and unstructured data and building scalable data processing pipelines.
Responsibilities
- Design and develop robust data cleaning and preprocessing pipelines.
- Handle large-scale datasets from multiple sources (CSV, JSON, XML, databases, APIs, documents, images, etc.).
- Identify and resolve data quality issues such as inconsistencies, duplicates, and missing values.
- Develop automated ETL workflows and data validation systems.
- Build scalable data transformation and enrichment processes.
- Optimize Python scripts for performance and efficiency.
- Collaborate with AI/ML teams to prepare high-quality training datasets.
- Create documentation and maintain data processing standards.
- Implement quality control and monitoring frameworks.
Required Skills
- Strong proficiency in Python (4+ years).
- Expert knowledge of Pandas, NumPy, Polars, PySpark (preferred), and SQL.
- Experience with ETL pipelines and workflow orchestration.
- Experience processing large datasets (millions of records).
- Knowledge of data validation and quality assurance techniques.
- Familiarity with cloud platforms (AWS, GCP, or Azure).
- Experience with Git and collaborative development workflows.
- Strong debugging and performance optimization skills.
Preferred Qualifications
- Experience working with AI/ML datasets.
- Experience handling multimodal datasets (text, image, audio, video).
- Knowledge of data annotation and labeling workflows.
- Experience with Apache Airflow, Spark, or distributed processing systems.
- Understanding of data privacy and compliance requirements.
Nice to Have
- Experience with LLM training datasets.
- Experience building data pipelines for AI companies.
- Familiarity with vector databases and data indexing.
What We're Looking For
- Strong ownership mindset.
- Ability to work independently.
- Excellent problem-solving skills.
- Attention to detail and data quality.
How to Apply
Please share:
- Resume/CV
- GitHub profile
- LinkedIn profile
- Examples of data processing projects you've worked on
- Current and expected compensation
Pay: ₹25,000.00 - ₹40,000.00 per month
Work Location: In person