Publicis Sapient is a digital transformation partner helping established organizations get to their future, digitally-enabled state, both in the way they work and the way they serve their customers. We help unlock value through a start-up mindset and modern methods, fusing strategy, consulting and customer experience with agile engineering and problem-solving creativity. United by our core values and our purpose of helping people thrive in the brave pursuit of next, our 20,000+ people in 53 offices around the world combine experience across technology, data sciences, consulting and customer obsession to accelerate our clients’ businesses through designing the products and services their customers truly value.
The Platform Engineer – AI & GPU Services will be responsible for implementing and maintaining AI/ML platforms and GPU resource management across cloud (GCP) and on-premises infrastructure. This role combines expertise in cloud services, AI/ML technologies, and infrastructure automation to support both product engineering and platform engineering functions. The ideal candidate will have experience working with generative AI services, GPU management, and container orchestration platforms.
Responsibilities:
• Architect, build, and maintain AI/ML platforms using Google Cloud Platform (GCP) services like Compute, Storage, IAM, and VPC.
• Manage NVIDIA GPU resources across projects using Run.ai or similar tools.
• Develop and maintain MLOps pipelines on platforms like Vertex AI, supporting AI/ML model training and deployment.
• Write Python scripts for model development, automation, and infrastructure management.
• Use Terraform for Infrastructure as Code (IaC) to automate provisioning and deployment of cloud resources.
• Deploy and manage AI/ML models on container orchestration platforms such as OpenShift and GKE.
• Collaborate with AI teams to facilitate LLM deployment (e.g., Llama, Mistral) and GPU utilization.
• Automate and enhance CI/CD pipelines for seamless integration and deployment of services.
• Monitor performance and capacity with Prometheus, Grafana, and other observability tools to ensure system stability.
• Engage in DevOps practices, including containerization, orchestration, and infrastructure management.
Qualifications:
• Strong experience with Google Cloud Platform (GCP) and its core services (Compute, Storage, IAM, VPC).
• Experience with GPU resource management tools (e.g., Run.ai).
• Proficiency with Python for AI/ML workflows and automation.
• Hands-on experience with MLOps platforms like Vertex AI.
• Experience with Terraform for managing cloud infrastructure using Infrastructure as Code (IaC) practices.
• Knowledge of Kubernetes and container orchestration platforms such as OpenShift and GKE.
• Familiarity with monitoring and logging tools like Prometheus, Grafana, and the ELK Stack.
• Proven track record of working with CI/CD pipelines and DevOps automation tools.
Pay Range: $75,000 - $146,000
The range shown represents a grouping of relevant ranges currently in use at Publicis Sapient. Actual range for this position may differ, depending on location and specific skillset required for the work itself.
Benefits of Working Here:
As part of our dedication to an inclusive and diverse workforce, Publicis Sapient is committed to Equal Employment Opportunity without regard for race, color, national origin, ethnicity, gender, protected veteran status, disability, sexual orientation, gender identity, or religion. We are also committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, you may contact us at [email protected] or you may call us at +1-617-621-0200.