Machine Learning Engineer
Build production ML systems at Engini. Work on LLMs, AI agents, and intelligent automation for global enterprise clients. Tel Aviv.
The Role
Engini is building AI Workers that run real enterprise processes - not chatbots that *talk about* them. We need an ML Engineer who can take us from "LLMs with prompts" to production-grade, evaluated, observable, self-improving agents.
What you'll do
- Design and ship the ML backbone of Gini AI Workers - routing, tool selection, reasoning, memory, evaluation.
- Build evaluation and feedback loops - offline evals, online A/B, regression harnesses, human-in-the-loop labeling pipelines.
- Optimize cost and latency across the agent stack: prompt engineering, model routing (frontier ↔ small ↔ fine-tuned), caching, speculative decoding, distillation.
- Fine-tune and/or RAG-tune models for vertical enterprise tasks (invoice extraction, PO matching, ticket triage, forecasting).
- Own the ML infra - training pipelines, experiment tracking, model registry, deployment, monitoring, drift detection.
- Partner with backend + product to turn research into shipped features on a weekly cadence.
What you'll bring
- 4+ years of ML engineering in production (not just research or notebooks).
- Hands-on LLM experience in 2025-2026: agentic systems, tool-use, function-calling, RAG, structured output, eval design.
- Strong Python. Comfortable with PyTorch/JAX and one serving stack (vLLM, TGI, TensorRT-LLM, SageMaker, or similar).
- You've built an eval pipeline that actually caught a regression in prod.
- You read the papers and know which ones to ignore.
Nice to have
- Experience with MCP, LangGraph, DSPy, or custom agent frameworks.
- Fine-tuning (LoRA/QLoRA, DPO/ORPO, RLAIF) on open-weight models (Llama, Qwen, Mistral, DeepSeek).
- Vector DBs (pgvector, Pinecone, Weaviate, Qdrant), reranking, hybrid retrieval.
- Prior work on multi-agent systems or enterprise copilots.
Why this role matters
