Machine Learning Engineer, Data & Machine Learning Innovation
Beijing, Beijing, China
Machine Learning and AI
As part of Apple’s AI and Machine Learning org, we inspire and create groundbreaking technology for multi-modal models with strong agent and reasoning capabilities. The Data and Machine Learning Innovation (DMLI) team is seeking a passionate Machine Learning Engineer to explore new methods, challenge existing metrics and protocols, and develop new insightful practices that will change how we understand data and overcome real-world ML challenges. As a team member, you will work on some of the most ambitious technical challenges in the field. Your role will involve collaborating closely with our team of machine learning researchers, engineers, and data scientists. Together, you will spearhead groundbreaking research initiatives and develop transformative products designed to create a significant impact for billions of users worldwide.
Description
As a Machine Learning (ML) Engineer, you will be entrusted with the critical role of innovating and applying state-of-the-art research in foundation models to tackle complex data problems. The solutions you develop will significantly impact future Apple software and hardware products and the broader ML development ecosystem.
You will work with a multidisciplinary team to actively participate in the data-model co-design and co-development practice. Your responsibilities will extend to designing and developing a comprehensive data generation and curation framework for foundation models at Apple. You will also be responsible for creating robust model evaluation pipelines, integral to the continuous improvement and assessment of foundation models. Additionally, your role will entail an in-depth analysis of multi-modal data to understand its influence on model performance.
Furthermore, you will have the opportunity to showcase your groundbreaking research work by publishing and presenting at premier academic venues.
Your work may span various applications, including:
- Enhancing current products and future hardware platforms with multi-modal perception data.
- Designing and implementing semi-supervised and self-supervised representation learning techniques to make the most of both limited labeled data and large-scale unlabeled data.
- Developing on-device intelligence and learning with strong privacy protections.
- Employing data selection techniques such as novelty detection, active learning, and core-set selection for diverse data types like images, 3D models, natural language, and audio.
- Uncovering patterns in data, setting performance targets, and leveraging modern statistical and ML-based methods to model data distributions. This will aid in reducing redundancy and addressing out-of-distribution samples.
- Learning new skills rapidly and applying them as needed, e.g., learning a new machine learning algorithm from a research paper and implementing it, or mastering the basics of a new domain in a short amount of time.
- Providing technical guidance to product teams on choosing appropriate machine learning approaches for tasks.
Minimum Qualifications
- Deep technical skills in one or more machine learning areas, such as computer vision, combinatorial optimization, causality analysis, natural language processing, or deep learning.
- Strong software development skills with proficiency in Python; hands-on experience working with deep learning toolkits like PyTorch, TensorFlow, or JAX.
- 5+ years of experience developing and evaluating ML applications, demonstrating a passion for understanding and improving model/data quality.
Preferred Qualifications
- Deep understanding of multi-modal foundation models.
- Up-to-date knowledge of emerging trends in generative AI and multi-modal LLMs.
- The ability to formulate machine learning problems and to design, experiment with, implement, and communicate solutions effectively.
- A hands-on mentality for owning engineering projects from inception to shipped product, and the ability to work both independently and as part of a cross-functional team.
- A demonstrated publication record at relevant conferences (e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR).
- A track record of applying ML to solve cross-disciplinary problems.