Data Engineer (Python/AI) - Assistant Vice President

Citi - Pune, Maharashtra

Job details

Full-time

Qualifications

TensorFlow
CI/CD
Software troubleshooting
Kubernetes
PyTorch
Software deployment
Git
Master's degree
Pandas
OOP
AWS
Docker
Continuous integration
Redis
REST
Natural language processing
Scripting
GitHub
APIs
Linux
JSON
Flask
AI
Python
Shell Scripting
Design patterns

Full job description

Discover your future at Citi

Working at Citi is far more than just a job. A career with us means joining a team of more than 230,000 dedicated people from around the globe. At Citi, you’ll have the opportunity to grow your career, give back to your community and make a real impact.

Job Overview

Role Summary
We are looking for a mid-level Python Developer with combined experience in Data Engineering and AI/NLP engineering. The candidate will build NLP pipelines using tools such as Flair, BERT, and LLM frameworks, and will also work on large-scale data processing using PySpark, Pandas, and related data tools. The role includes developing APIs, integrating with platform services, and supporting CI/CD deployments using GitHub and LightSpeed Enterprise.

Key Responsibilities

Develop and optimize ETL/data processing jobs using PySpark, Pandas, PyArrow, and related libraries.
Work with Parquet files using FastParquet or pyarrow.parquet for efficient data processing.
Implement data parsing and serialization using json, ujson, or orjson for high-performance JSON handling.
Build and maintain NLP pipelines using Flair, BERT, and LLM-based models.
Develop scalable ingestion and data transformation pipelines for AI and analytics use cases.
Build and maintain Flask-based APIs for model inference and service integrations.
Use regular expressions for text cleaning, parsing, and NLP preprocessing.
Integrate caching and fast lookups using Redis.
Manage and deploy ML models using MLflow for tracking and versioning.
Support CI/CD workflows using GitHub, LightSpeed Enterprise, and deployment pipelines.
Create and maintain Autosys JILs for job scheduling and automation.
Use basic Linux commands for troubleshooting, operations, and deployment tasks.
Monitor application and system health using ITRS Geneos.
Write unit tests and improve automated test coverage (pytest/unittest).
Work with REST APIs, microservices, and basic shell scripting.
Work with cloud services (ECS), including boto3.

Required Skills

8+ years of hands-on Python programming experience.
Strong fundamentals in Python, OOP, and design patterns.
Experience with NLP libraries such as Flair, BERT, HuggingFace Transformers, or similar.
Solid experience with PySpark, Pandas, PyArrow, and distributed data pipelines.
Proficient in working with Parquet using FastParquet or pyarrow.parquet.
Familiarity with fast JSON parsing libraries (json, ujson, orjson).
Experience building APIs using Flask (FastAPI is a plus).
Experience with MLflow for model tracking and deployment.
Good understanding of CI/CD practices and Git workflows.
Experience working with Redis or similar in-memory stores.
Experience with Autosys JILs for job scheduling.
Comfortable with Linux command line and shell scripting.
Strong debugging, problem-solving, and teamwork skills.
Exposure to cloud services; AWS boto3 experience is an asset.

Nice-to-Have

Experience with Polars or Dask for high-performance data processing.
Experience with PyTorch or TensorFlow for model training.
Experience with Docker, Kubernetes, or containerized deployments.
Experience with monitoring tools such as ITRS Geneos.
Experience with FastAPI, Airflow, or Prefect.

Job Family Group:

Technology

Job Family:

Applications Development

Time Type:

Full time

Most Relevant Skills

Please see the requirements listed above.

Other Relevant Skills

For complementary skills, please see above and/or contact the recruiter.

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
