Open Source Cowork for desktop

Eigent is an open-source product, which means you can self-host it for free using your own API keys or local models.

All rights reserved © 2026 EIGENT UK LTD

RL Environment Data Engineer / Researcher

Location

London / Bay Area / Remote

Employment Type

Full-time

We are looking for an RL Environment Data Engineer / Researcher to design, build, and refine reinforcement learning training environments across different domains. This role will focus on data collection, task definition, reward design, evaluation criteria, anti-reward-hacking mechanisms, and post-training validation of environment data effectiveness.

Responsibilities

  • Design and improve RL training environments across various task domains.
  • Collect, clean, structure, and evaluate data used for RL environment construction and model post-training.
  • Define task objectives, reward functions, and evaluation standards to ensure reliable and reproducible training signals.
  • Develop technical approaches to prevent reward hacking and identify loopholes in reward design.
  • Build validation environments to assess the effectiveness of post-training data and RL environment design.
  • Collaborate with research, engineering, and data teams to improve environment coverage, task difficulty, and evaluation reliability.
  • Follow research progress in RL environments, data evaluation, AI agents, and post-training methods, and apply relevant findings to production workflows.

Requirements

  • Strong coding skills, especially in Python, with the ability to independently build data pipelines, environments, and evaluation tools.
  • Proficiency with AI coding tools for code generation, debugging, refactoring, and rapid experimentation.
  • Solid understanding of reinforcement learning, post-training, reward function design, environment design, and data evaluation.
  • Ability to translate real-world tasks into trainable and measurable RL environments.
  • Experience with data scraping, data cleaning, annotation, or data quality assessment is preferred.
  • Experience with LLM agents, RLHF/RLAIF, coding agents, automated evaluation, or benchmark construction is a strong plus.
  • Strong experimental mindset and engineering execution, with the ability to continuously improve systems based on data and evaluation results.