ElasticMM is an efficient and scalable serving system for large multimodal models (LMMs). It introduces Elastic Multimodal Parallelism (EMP), a new parallelization strategy that optimizes resource ...