Abstract: Large language models (LLMs) can perform a wide range of tasks; however, they often require cloud servers for deployment because of their computational cost and size. Meanwhile, small and ...