Morph builds the fastest LLM code-editing inference engine in the world and is seeking a Machine Learning Engineering Intern to help push the limits of performance, safety, and scalability across its inference, retrieval, and diffing pipelines.
Responsibilities
- Apply ML frameworks such as PyTorch, TensorFlow, or JAX in day-to-day engineering work
- Work across low-latency inference, containerized deployment, and CI/CD tooling
- Work with CUDA kernels and bleeding-edge inference optimization research
- Turn the latest ML research into production-quality systems
Skills
- Strong understanding of PyTorch/TF/JAX
- Know your way around real infra: Docker, Kubernetes, Linux, observability
- Prior experience with low-level inference optimizations (e.g., custom kernels)
- Have experience with LLM apps, devtools, compilers, game development, or code intelligence
- Prefer ownership and agency over bureaucracy
Company Overview
Fast Apply Edits + Retrieval for AI Coding Agents. Morph was founded in 2025 and is headquartered in San Francisco, California, USA, with a workforce of 2-10 employees. Its website is https://www.morphllm.com.
Company H1B Sponsorship
Morph has a track record of offering H1B sponsorship, with one each in 2023, 2024, and 2025. Please note that this does not guarantee sponsorship for this specific role.