WiP: Towards Light Adaptation of Large Language Models for Personal Hardware

Liangyu Wang, Junxiao Wang, Di Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The large language models (LLMs) in widespread use today are not deployed locally: users must send relatively private and important data to the LLM provider. Handing such data over raises concerns, especially now that many people use LLMs to handle personal and work affairs, and these concerns cannot be easily dispelled by guarantees and agreements. However, LLMs are resource-intensive and computationally demanding, which makes the transition from server-side to device-side deployment difficult: the LLM's self-attention module contains a large number of tensor multiplications that are heavy and inefficient on hardware. Although previous work proposed approximate neural operators that enable hardware-efficient, multiplication-less neural networks, these operators introduce significant accuracy loss, making them impractical. In this paper, we examine the problem of light adaptation of LLMs. We propose a new neural operator that allows the adapted LLM to recover its original accuracy without fine-tuning, or with only a few fine-tuning steps, while retaining high hardware inference efficiency.
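The abstract does not specify the proposed operator. As an illustration of the general idea of replacing attention's multiply-accumulate scoring with cheaper arithmetic, the following is a minimal sketch; the negative-L1-distance similarity here is an assumed AdderNet-style stand-in, not the authors' method:

```python
import numpy as np

def dot_product_scores(Q, K):
    """Standard attention scores: one multiply-accumulate per element pair."""
    return Q @ K.T

def additive_scores(Q, K):
    """Assumed multiplication-less alternative (AdderNet-style):
    similarity = negative L1 distance, computed with only
    subtractions, absolute values, and additions."""
    # Broadcast to (num_queries, num_keys, dim), then reduce over dim.
    return -np.abs(Q[:, None, :] - K[None, :, :]).sum(axis=-1)

def softmax(x):
    """Row-wise softmax with max-subtraction for numerical stability."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 query vectors of dimension 8
K = rng.normal(size=(6, 8))   # 6 key vectors of dimension 8

attn = softmax(additive_scores(Q, K))
assert attn.shape == (4, 6)
assert np.allclose(attn.sum(axis=-1), 1.0)
```

Swapping the score function this way preserves the shape and normalization of the attention map, which is why such operators can in principle be dropped into a pretrained model; the accuracy gap the paper targets comes from the changed similarity geometry.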

Original language: English (US)
Title of host publication: EdgeFM 2024 - Proceedings of the 2024 Workshop on Edge and Mobile Foundation Models
Publisher: Association for Computing Machinery, Inc
Pages: 30-32
Number of pages: 3
ISBN (Electronic): 9798400706639
DOIs
State: Published - Jun 3 2024
Event: 2024 Workshop on Edge and Mobile Foundation Models, EdgeFM 2024 - Minato-ku, Japan
Duration: Jun 3 2024 - Jun 7 2024

Publication series

Name: EdgeFM 2024 - Proceedings of the 2024 Workshop on Edge and Mobile Foundation Models

Conference

Conference: 2024 Workshop on Edge and Mobile Foundation Models, EdgeFM 2024
Country/Territory: Japan
City: Minato-ku
Period: 06/3/24 - 06/7/24

Keywords

  • large language model
  • transformer

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
