Overview
Join SberSpasibo as an AI/ML Engineer to improve the quality of RAG agents across three document-search domains (technical, regulatory, and accounting documentation) and to implement LLM-as-a-judge automatic regression checks.
Responsibilities
- Improve the quality of RAG agents in three areas: search over technical documentation, the regulatory framework, and accounting documentation;
- Do prompt engineering and context engineering for all of the team's agents;
- Prepare benchmark sets for quality assessment;
- Implement the LLM-as-a-judge methodology for automatic regression checks;
- Configure guards for production agents: prompt-injection filters, validation of output structure and content, protection against personal data leaks, and anti-hallucination mechanisms;
- Run A/B tests of prompts and models to find the best configurations;
- Tune quality based on user feedback signals and traces from Langfuse.
Requirements
- At least 2 years of commercial development experience with Python;
- Practical experience with RAG systems in production: embeddings, vector databases (Qdrant, FAISS, or pgvector), reranking, chunking;
- Experience building and supporting at least one RAG solution from start to finish (a production system, not a prototype);
- Practical experience evaluating LLM systems: preparing benchmark sets, offline metrics, LLM-as-a-judge, regression checks;
- Experience with evaluation frameworks (Ragas, DeepEval, or equivalents);
- Practical experience configuring protections for LLM applications: protection against prompt injections, validation of output structure and content, protection against personal data leaks;
- Experience with prompt engineering and context engineering in real projects: iterative prompt tuning, structured output, function calling;
- Understanding of RAG architecture: document chunking strategies, metadata, embedding model selection, reranking, accurate source citation;
- Practical experience with at least one LLM framework or API: LangChain, LangGraph, PydanticAI, OpenAI API, or equivalents;
- Practical experience with A/B testing of prompts and models in production;
- Experience with agent protocols (MCP) or a custom tool layer for agents;
- Basic SQL and experience working with relational databases.