Open Source Cowork for Desktop

Eigent is an open source product, which means you can self-host it for free using your own API keys or local models.

New version of Eigent 1.0 released!

RL Environment Data Engineer / Researcher

We are looking for an RL Environment Data Engineer / Researcher to design, build, and refine reinforcement learning training environments across different domains. This role will focus on data collection, task definition, reward design, evaluation criteria, anti-reward-hacking mechanisms, and post-training validation of environment data effectiveness.

Responsibilities

- Design and improve RL training environments across various task domains.
- Collect, clean, structure, and evaluate data used for RL environment construction and model post-training.
- Define task objectives, reward functions, and evaluation standards to ensure reliable and reproducible training signals.
- Develop technical approaches to prevent reward hacking and identify loopholes in reward design.
- Build validation environments to assess the effectiveness of post-training data and RL environment design.
- Collaborate with research, engineering, and data teams to improve environment coverage, task difficulty, and evaluation reliability.
- Follow research progress in RL environments, data evaluation, AI agents, and post-training methods, and apply relevant findings to production workflows.

Requirements

- Strong coding skills, especially in Python, with the ability to independently build data pipelines, environments, and evaluation tools.
- Proficiency with AI coding tools for code generation, debugging, refactoring, and rapid experimentation.
- Solid understanding of reinforcement learning, post-training, reward function design, environment design, and data evaluation.
- Ability to translate real-world tasks into trainable and measurable RL environments.
- Experience with data scraping, data cleaning, annotation, or data quality assessment is preferred.
- Experience with LLM agents, RLHF/RLAIF, coding agents, automated evaluation, or benchmark construction is a strong plus.
- Strong experimental mindset and engineering execution, with the ability to continuously improve systems based on data and evaluation results.