Job Description
The Senior Software Engineer develops PyTorch framework support for deployment on Cloud TPUs and GPUs and tunes it for performance on AI workloads. This work gives organizations access to massive-scale computing for machine learning models, accelerating tasks such as vision or translation. The role includes model evaluation, data processing, and debugging with compilers.
Google Cloud provides access to Tensor Processing Units (TPUs) for PyTorch and JAX users, supporting these frameworks at hyperscale. The AI Infrastructure team works to make AI models run more capably and efficiently. Its services power innovations in computing worldwide.
Engineers enable generative models, collaborate with researchers on ML capabilities, and maintain open-source code. They contribute to training and inference on TPUs, and they design, architect, test, and launch products. The role involves leadership on cross-functional projects. One challenge is integrating with frameworks for advanced models.
The US base salary range is $166,000-$244,000, plus bonus and equity. Full-time positions offer benefits, including health coverage and professional development. Remote work is available in the US.
Responsibilities
- Work on AI framework development to enable PyTorch models to run on Google Cloud's TPUs and GPUs, and tune them for peak performance
- Provide comprehensive support for ML frameworks and compilers on Cloud TPUs and Graphics Processing Units (GPUs), enabling the training and deployment of the most advanced machine learning models
- Enable PyTorch models across domains such as generative models, computer vision (image recognition, object detection, image generation), machine translation, language modeling, rankings and recommendations, and speech recognition
- Collaborate with other Google teams and leading researchers across the industry to continuously bring ML capabilities to our PyTorch in Cloud offering
- Design, develop, test, deploy, maintain, and improve software while contributing to open-source software development
Requirements
- Bachelor’s degree or equivalent practical experience
- 5 years of experience with ML design and ML infrastructure (e.g., model deployment, model evaluation, data processing, debugging, fine-tuning)
- 5 years of experience in software development
- 5 years of experience testing and launching software products, and 3 years of experience with software design and architecture
- Master’s degree or PhD in Engineering, Computer Science, or a related technical field
- 8 years of experience with data structures/algorithms
- 3 years of experience in a technical leadership role leading project teams and setting technical direction
- 3 years of experience working in an organization involving cross-functional or cross-business projects
- Experience with compilers or ML frameworks