Role Description
**Build the next generation of Agentic AI with us**
Our platform combines **conversation intelligence, multimodal understanding, and agentic AI systems** to power both human agents and autonomous AI agents across the entire customer experience lifecycle.
A core part of this vision is our investment in **custom Small Language Models (SLMs)**—purpose-built for CX workflows—paired with **reinforcement learning systems** that continuously improve decision-making in real-world environments.
We’re looking for a **Research Intern (Reinforcement Learning)** to join us in shaping this future.
**What you’ll do**
------------------
* **Design and build reinforcement learning environments** that model real-world customer interaction workflows
* **Design RL agents** that learn from these environments using real-world interaction data, rewards, and feedback loops
* **Define reward models and feedback loops** using real-world signals (outcomes and human feedback)
* **Enable learning from production data** by structuring interaction traces into training-ready datasets for offline and online learning
* **Experiment with multi-agent systems and simulation frameworks** for complex coordination and decision-making
* **Collaborate with engineering and product teams** to deploy, evaluate, and iterate on learning systems in production at scale
**What we’re looking for**
--------------------------
* Currently pursuing (or recently completed) a degree in **Computer Science, AI, Machine Learning, or a related field**
* Strong understanding of **reinforcement learning fundamentals**
* Familiarity with **RL environments and training libraries such as Verl and Tinker**
* Strong foundation in **probability, math, and optimization**
* Passion for building **real-world AI systems**
**Nice to have**
----------------
* Experience with **RLHF, LLM/SLM fine-tuning, or model alignment**
* Exposure to **agent-based systems or multi-agent RL**
* Prior research, projects, or publications in **RL or applied ML**
* Experience working with **large-scale or production datasets**
**Why Level AI**
----------------
* Work on **production-grade Agentic AI systems** used by leading enterprises
* Build alongside a team with deep expertise from **Amazon, Google, and Meta**
* Be part of a fast-growing **Series C AI company**
* Direct exposure to **0-to-1 AI innovation in CX and decisioning systems**
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.