About the Role
AI is at the centre of what we're building. You'll design and implement AI systems that go beyond demos: production RAG pipelines, multi-step agents, and LLM integrations that reliably handle thousands of requests per day.
What You'll Do
- Design and build RAG systems using LlamaIndex, LangChain, or custom implementations
- Integrate OpenAI GPT-4o, Anthropic Claude, Google Gemini, and open-source LLMs
- Build autonomous agents with tool use, memory, and multi-step planning
- Implement vector search with Pinecone, Qdrant, pgvector, or Weaviate
- Build document processing pipelines — PDFs, OCR, structured extraction
- Deploy AI systems as FastAPI services with streaming, batching, and error handling
- Evaluate model performance — implement evals, track accuracy, optimise prompts
What We're Looking For
- 2+ years in AI/ML engineering, or an equivalent self-taught background with a portfolio
- Python proficiency — asyncio, type hints, packaging
- Real LLM integration experience — production systems, not just API demos
- RAG pipeline experience — chunking strategies, embedding models, retrieval tuning
- Vector database experience — Pinecone, Qdrant, or pgvector
- Experience serving AI inference endpoints with FastAPI or Flask
- Understanding of token budgets, cost management, and model selection
Nice to Have
- Fine-tuning experience with LoRA or QLoRA
- LangGraph or multi-agent orchestration frameworks
- Prompt engineering and evaluation frameworks (RAGAS, DeepEval)
- Computer vision or multimodal model experience
Perks & Benefits
- 100% Remote
- Salary: 8 LPA fixed — our highest engineering band
- Cutting-edge AI work
- Direct client impact
- Research-friendly culture