Senior AI Data Engineer

Luxoft - Remote

Job details

Qualifications

CI/CD
Elasticsearch
Oracle
Law
NoSQL
Git
English
Cost control
Master's degree
SQL
AWS
Machine learning
PostgreSQL
Version control systems
Continuous integration
Computer networking
APIs
ETL
Agile
Data science
Metadata
AI
Communication skills
Python

Full job description

Project description

Join the Data Engineering team to contribute to the ongoing maintenance and improvement of an internal LLM-powered assistant that uses hosted LLM APIs and internal knowledge sources, with a focus on reliability, retrieval quality, and operational excellence.

Responsibilities

Maintain and enhance ingestion/enrichment pipelines for internal content (parsing/extraction, normalization, metadata enrichment, deduplication, and quality monitoring)

Improve indexing and retrieval performance and quality (chunking/segmentation refinements, embedding/index update workflows, metadata filtering, caching) and support hybrid retrieval capabilities (vector + keyword/BM25 + metadata)

Implement and maintain access-aware retrieval by propagating/enforcing document permissions through indexing and query-time filters, including audit logs and validation tests

Improve source attribution so responses reliably point to the correct documents and sections in a consistent format

Extend and harden tool/workflow execution and automations (scheduled/trigger-based), including retries, timeouts, idempotency, concurrency controls, and run history

Develop and maintain evaluation and regression testing (golden sets, automated scoring) and support structured comparisons across LLM providers/models as required

Operate the platform in production: observability (logs/metrics/tracing), alerting, incident support, performance tuning, and cost controls, plus runbooks and handover documentation

Skills

Must have

8+ years of hands-on experience in Data Science and 5+ years in Machine Learning, with a proven track record demonstrated through a robust portfolio of projects.

Strong programming skills in languages such as Python and experience building ETL pipelines.

Expertise in SQL and experience with both relational databases (preferably PostgreSQL) and NoSQL databases (OpenSearch or Elasticsearch).

Familiarity with the AWS cloud platform and its services.

Experience with version control systems (e.g., Git) and CI/CD pipelines.

Ability to build scalable infrastructure to embed and search a very large number of documents.

Ability to move fast in an environment where requirements are sometimes loosely defined and priorities or deadlines may compete.

Expertise in ML inference optimization.

Solid experience with hybrid RAG (retrieval-augmented generation), including chunking/segmentation refinements, embedding/index update workflows, metadata filtering, and caching.

Knowledge of network optimization for distributed ML training and inference.

Understanding of distributed training patterns and checkpointing strategies.

Strong English skills (B2 or higher).

Strong verbal and written communication skills.

Ability to work independently and collaborate in a group.

Nice to have

Agile certification

Oracle/Microsoft attestations and certifications

Domain knowledge

Trading and Capital Markets

Other

Languages

English: C1 Advanced

Seniority

Senior

Remote, India

Req. VR-122155

AI/ML

BCM Industry

03/06/2026
