СберСпасибо

AI/ML Engineer

6.0/10
Not specified
Office / on-site
senior
about 4 hours ago
AI Summary (verified by Aipplify AI)

The vacancy clearly defines its tasks and requirements but omits compensation details, which lowers its overall quality score.

AI quality score: 6.5 / 10

Overview

Join SberSpasibo as an AI/ML Engineer to improve the quality of RAG agents that search technical, regulatory, and accounting documentation, and to implement LLM-as-a-judge automatic regression checks.

Responsibilities

  • Improve the quality of RAG agents in three areas: search over technical documentation, the regulatory framework, and accounting documentation;
  • Handle prompt engineering and context engineering for all of the team's agents;
  • Prepare benchmark sets for quality evaluation;
  • Implement the LLM-as-a-judge methodology for automatic regression checks;
  • Configure guardrails for production agents: prompt-injection filters, validation of output structure and content, protection against personal data leaks, and anti-hallucination mechanisms;
  • Run A/B tests of prompts and models to find the best configurations;
  • Tune quality based on user feedback signals and traces from Langfuse.

Requirements

  • At least 2 years of commercial development experience in Python;
  • Practical experience with RAG systems in production: embeddings, vector databases (Qdrant, FAISS or pgvector), reranking, chunking;
  • Experience building and supporting at least one RAG solution end to end (a production system, not a prototype);
  • Practical experience in evaluating LLM systems: preparing benchmark sets, offline metrics, LLM-as-a-judge, regression checks;
  • Experience with evaluation frameworks (Ragas, DeepEval or analogs);
  • Practical experience in configuring guardrails for LLM applications: protection against prompt injections, validation of output structure and content, protection against personal data leaks;
  • Experience in prompt engineering and context engineering in real projects: iterative prompt tuning, structured output, function calling;
  • Understanding of RAG architecture: document chunking strategies, metadata, embedding model selection, reranking, accuracy of source citations;
  • Practical experience with at least one LLM framework: LangChain, LangGraph, PydanticAI, OpenAI API or analogs;
  • Practical experience in A/B testing of prompts and models in production;
  • Experience with agent protocols (MCP) or a custom tool layer for agents;
  • Basic SQL and experience working with relational databases.