Your new company
A Chinese hedge fund / high-frequency trading firm.
Your new role
- Advancing Mixture of Experts (MoE) through dynamic routing, sparse activation, and scalable training architectures.
- Innovating Transformer-MoE integration to enhance model efficiency and performance in large-scale deployments.
- Optimizing Multi-Head Attention (MHA) via sparse and linear mechanisms, with theoretical and multimodal enhancements.
- Pioneering tokenization research, including subword/byte-level algorithms and multilingual framework development.
- Enhancing embedding techniques through compression and semantic space optimization for improved model representation.
What you'll need to succeed
- A PhD in Computer Science, AI, Mathematics, or a closely related discipline.
- Strong research experience in MoE, MHA, or tokenization, demonstrated through publications or substantial project work.
- Proficiency in PyTorch/JAX and experience training and fine-tuning large-scale models.
- Deep familiarity with Transformer architectures and frameworks such as Megatron and DeepSpeed.
- Academic curiosity combined with an engineering mindset and a self-driven approach to practical implementation.
What you'll get in return
You'll be part of an organization that values its employees, and you'll be rewarded with:
- Structured career growth and plenty of development opportunities
- A stable working environment
What you need to do now
If you're interested in this role, click 'apply now'. For more information, a confidential discussion about this role, or to find out about other opportunities in Technology, contact Yuki Cheung at yuki.cheung@hays.com.sg. Referrals are welcome.
At Hays, we value diversity and are passionate about placing people in a role where they can flourish and succeed. We actively encourage people from diverse backgrounds to apply.
EA Reg Number: R22110258 | EA License Number: 07C3924 | Company Registration No: 200609504D