About the team
The Applied AI Research team develops solutions to practical, real-world
problems that directly affect users of the OpenAI API. We work on open-ended
research projects that enable successful products. We care about solving
real-world problems while holding our products to a high safety standard. We
value applications driven by user feedback as well as long-term research with
significant impact on the product.
About the role
To help API users monitor and prevent unwanted use cases, we developed a tool
for checking whether content complies with OpenAI's content policy.
Developers can thus identify content that our content policy prohibits and
take action (e.g., block it). We seek a Machine Learning Engineer to help
design and build a robust pipeline for data management, model training, and
deployment to enable consistent improvement of the Moderation model.
In this role you will:
- Design, develop and maintain a robust and scalable data management pipeline and set up standards for versioning and data quality control. The pipeline should be able to handle data relabeling requests due to content policy changes.
- Build a pipeline for automated model training, evaluation, and deployment, including an active learning process and routines for calibration and validation data refresh.
- Work closely with stakeholders from product, engineering, and content policy on long-term improvement of the moderation models, for both external release and internal use cases across a variety of model safety projects.
- Research the latest techniques and methods in deep learning and natural language processing to improve the moderation model across a collection of unwanted content categories.
- Experiment with data augmentation and data generation methods to enhance the diversity and quality of training data.
- Design and experiment with an effective red-teaming pipeline to examine the robustness of the model and identify areas for future improvement.
- Conduct open-ended research to improve the quality of collected data, using methods including, but not limited to, semi-supervised learning and human-in-the-loop machine learning.
You might thrive in this role if you:
- Have 3+ years of industry experience as a Machine Learning Engineer or Software Engineer, building data pipelines and training and deploying machine learning models in production on a daily basis.
- Care deeply about AI safety and are passionate about building the best deep-learning-powered moderation model to effectively detect unwanted content.
- Have a strong belief in the criticality of high-quality data and are highly motivated to work with the associated challenges.
- Have experience working with large distributed systems; experience in deep learning and/or natural language processing is a big plus.
- Love working with a team.