We are looking for an Agent Data Product Manager to lead the definition, development, and quality management of training data for general-purpose AI agents. This role sits at the intersection of product, data, and AI research, focusing on how high-quality agent data can improve model capabilities and create a sustainable product-data feedback loop.
Responsibilities
- Define product requirements for Agent training data, including task scenarios, data formats, quality standards, and evaluation criteria.
- Lead the design and optimization of data production workflows, from requirement definition and annotation guidelines to delivery and quality review.
- Work closely with research, engineering, data, and operations teams to ensure data supports Agent training, evaluation, and product iteration.
- Build quality control mechanisms for Agent data, including sampling, review, error analysis, and feedback loops.
- Translate product use cases into scalable data tasks that improve the capabilities of general-purpose Agents.
- Monitor data performance in training and product environments, and use insights to refine data strategy.
- Help build a product-data flywheel where user needs, product behavior, training data, and model improvement reinforce each other.
Requirements
- Basic understanding of LLMs, AI Agents, reinforcement learning, and model training or post-training workflows.
- Strong interest in general-purpose Agent products and enthusiasm for building product-data flywheels.
- Experience in product management, data product design, AI product operations, or related roles.
- Ability to define clear requirements, structure ambiguous problems, and coordinate cross-functional execution.
- Strong data sense, with the ability to evaluate data quality and connect data decisions to product outcomes.
- Excellent communication and collaboration skills across research, engineering, data, and business teams.
- Experience with AI Agents, LLM evaluation, RLHF/RLAIF, data annotation platforms, or model training data pipelines is a strong plus.