Machine Learning Engineer
The Role
What You'll Do
- Design and ship the ML backbone of Gini AI Workers - routing, tool selection, reasoning, memory, evaluation.
- Build evaluation and feedback loops - offline evals, online A/B, regression harnesses, human-in-the-loop labeling pipelines.
- Optimize cost and latency across the agent stack: prompt engineering, model routing (frontier ↔ small ↔ fine-tuned), caching, speculative decoding, distillation.
- Fine-tune models and/or tune retrieval (RAG) pipelines for vertical enterprise tasks (invoice extraction, PO matching, ticket triage, forecasting).
- Own the ML infra - training pipelines, experiment tracking, model registry, deployment, monitoring, drift detection.
- Partner with backend and product teams to turn research into shipped features on a weekly cadence.
What You Bring
- 4+ years of ML engineering in production (not just research or notebooks).
- Recent hands-on LLM experience (2025-2026): agentic systems, tool-use, function-calling, RAG, structured output, eval design.
- Strong Python. Comfortable with PyTorch/JAX and at least one serving stack (vLLM, TGI, TensorRT-LLM, SageMaker, or similar).
- You've built an eval pipeline that actually caught a regression in prod.
- You read the papers and know which ones to ignore.
Nice to Have
- Experience with MCP, LangGraph, DSPy, or custom agent frameworks.
- Fine-tuning (LoRA/QLoRA, DPO/ORPO, RLAIF) on open-weight models (Llama, Qwen, Mistral, DeepSeek).
- Vector DBs (pgvector, Pinecone, Weaviate, Qdrant), reranking, hybrid retrieval.
- Prior work on multi-agent systems or enterprise copilots.
