The Role:
You'll own the LLM pipeline from retrieval to fine-tuning to deployment. This is a 0→1 role — you'll work directly with founders and domain experts to drive Interexy's AI strategy.
What You'll Be Doing:
- Own the full LLM lifecycle: retrieval → evaluation → fine-tuning.
- Build and optimize RAG systems with hybrid search (BM25 + vector).
- Make smart architecture calls for speed, reliability, and scale.
- Monitor and improve accuracy and grounding, tracking retrieval metrics such as Recall@k and Precision@k.
- Collaborate with founders in a flat structure — turning business needs into AI roadmaps.
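For candidates less familiar with the retrieval metrics named above, here is a minimal sketch of Recall@k and Precision@k, assuming binary relevance labels; the function names and document IDs are illustrative, not part of Interexy's codebase.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved docs that are actually relevant."""
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k

def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant docs that appear in the top-k results."""
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant) if relevant else 0.0

# Example: 2 of the top-3 results are relevant, out of 4 relevant docs total.
retrieved = ["d1", "d7", "d3", "d9", "d4"]
relevant = {"d1", "d3", "d4", "d8"}
print(precision_at_k(retrieved, relevant, 3))  # 2/3 ≈ 0.667
print(recall_at_k(retrieved, relevant, 3))     # 2/4 = 0.5
```

In a RAG setting, `retrieved` would be the chunk IDs returned by the retriever and `relevant` the human- or LLM-labeled ground truth for a given query.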
Technical Must-Haves:
- Deep experience with LLMs (Llama, Mistral, GPT-4, Claude) and LangChain/LlamaIndex.
- Proficiency in hybrid search, reranking, and vector DBs (Pinecone, Weaviate).
- Hands-on fine-tuning with LoRA/QLoRA and Hugging Face Transformers.
- Expert-level Python, plus FastAPI, PyTorch, Redis, and Postgres.
- Docker & Kubernetes for production AI deployment.
- Automated eval loops (groundedness, hallucination checks).
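To make the hybrid-search requirement concrete, here is a minimal sketch of one common way to merge a BM25 ranking with a vector ranking: reciprocal rank fusion (RRF). The function name and sample IDs are illustrative, and managed vector DBs like Pinecone and Weaviate ship hybrid search natively, so this is a conceptual sketch rather than production code.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse multiple ranked lists (e.g., BM25 and vector search) into one.

    Each doc scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is the damping constant commonly used with RRF.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["d2", "d5", "d1"]    # keyword (BM25) ranking
vector_hits = ["d1", "d2", "d9"]  # dense (embedding) ranking
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
# → ['d2', 'd1', 'd5', 'd9']
```

Docs ranked highly by both retrievers (here `d2` and `d1`) float to the top, which is why fusion tends to beat either keyword or vector search alone; a cross-encoder reranker is then often applied to the fused top-k.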
Nice to Have:
- IoT/hardware integration or real-time systems.
- Startup or defense-related AI projects.
- CS or Applied ML degree from a top university.
How We Work (Interexy DNA):
- Fast execution – ship code, not meeting minutes.
- High ownership – no tickets? Find the problem and fix it.
- Resilience – you actually enjoy the messy 0→1 phase.
What You Get:
- Real ownership of the AI roadmap at an international company.
- A tight-knit, high-output engineering team.
- Flat structure, direct access to leadership, and equity opportunities.
- $25–49 / hour based on experience.
- Assigned admin team – we handle scheduling, procurement, travel, and other logistics so you can focus 100% on engineering.
- Home office stipend – $500 one-time to set up your workspace.
- Flexible hours – work when you're most effective.