AI Engineer Intern

Singapore, Singapore
2026-05-25
Role Description

Job Responsibilities:

 Centered on the deployment needs of Tencent's overseas gaming business in large language model (LLM) and reinforcement learning scenarios, this role is responsible for the development, performance optimization, and engineering implementation of high-quality AI computing infrastructure. Specific responsibilities include:

 1\. Distributed Training Engineering: Participate in the implementation of large-scale distributed training solutions; own the engineering delivery of data parallelism, model parallelism (Tensor Parallelism / Pipeline Parallelism), and ZeRO techniques; continuously tune GPU utilization and ensure the stability of ultra-large-scale training jobs.

 2\. Compute Scheduling Optimization: Contribute in depth to developing and optimizing AI job scheduling logic; address compute bottlenecks in complex gaming scenarios through fine-grained resource management, fault self-healing mechanisms, and efficient checkpointing strategies.

 3\. End-to-End Model Engineering: Own the full engineering pipeline from model training to inference serving; participate in operator profiling, model quantization, and the construction of high-performance inference pipelines to support rapid AI iteration within gaming products.

 4\. AI-Driven Engineering Evolution: Actively embrace AI Coding tools to boost development efficiency; drive Harness Engineering practices — including automated testing and engineering governance — to ensure extreme reliability of the underlying infrastructure.

 Qualifications

 1\. Bachelor's degree or above; majors in Computer Science, Computer Architecture, High-Performance Computing, or related fields preferred.

 2\. Core Tech Stack: Proficient in at least one of Python / C++ / Go; deep understanding of the PyTorch framework; hands-on experience engineering distributed training with DeepSpeed, Megatron-LM, or equivalent frameworks.

 3\. Solid understanding of distributed systems principles; familiarity with NCCL, RDMA networking, or high-performance storage is a plus; working knowledge of containerized infrastructure (Docker / Kubernetes).

 4\. Demonstrable experience with AI Coding tools (e.g., GitHub Copilot, Cursor) is a strong plus; prior work in Harness Engineering — engineering governance, automated benchmarking, or system stress testing — is highly valued.

 5\. Exceptional learning agility, clear logical thinking, and the ability to collaborate effectively with cross-functional teams on complex systems engineering challenges; fluency in English.

 6\. Bonus: Background in high-performance backend architecture, or real project experience in LLM training / inference engineering.

Tags & Skills
DevOps Machine Learning Systems
Tencent