About Me

I'm a software engineer with 4+ years of experience building AI infrastructure at AWS. My focus is the capacity layer: the systems that manage how GPU resources are reserved, scheduled, and delivered to ML workloads at cloud scale.

I build distributed workflow orchestration for products like Capacity Blocks for ML (reserved GPU scheduling for training jobs) and UltraServers (multi-instance GPU supercomputers connected via high-bandwidth accelerator interconnects for trillion-parameter model training). The problems I work on daily involve state machines, idempotency in long-running workflows, capacity reservation lifecycle management, and the interface between capacity planning and workload scheduling.

Before AWS, I interned at Apple, where I worked on strategic data infrastructure.

I studied Computer Science and Mathematics at Boston University (BA, 2020), then Entertainment Technology at Carnegie Mellon University (MS, 2022).

This Blog

I write about the infrastructure that makes large-scale AI possible:

  • GPU scheduling in Kubernetes (DRA, topology-aware placement, Karpenter)
  • Capacity planning and reservation systems for AI workloads
  • Distributed systems patterns (idempotency, workflow orchestration, state machines)
  • The scheduling–capacity interface: how cluster schedulers and capacity planners should talk to each other

Posts are bilingual (Chinese/English). Deep technical dives tend to be in English; industry commentary and career reflections are often in Chinese.

Beyond Work

I host a Chinese-language podcast (杨思特的半熟电台) interviewing ordinary people with extraordinary stories. Topics range from LGBTQ+ identity to overseas Chinese work culture to the entertainment industry.

Contact

Email: mincany0708@gmail.com