Efficient MoE inference and training
We introduce WorkForceAgent-R1, an LLM-based web agent trained using a rule-based R1-style reinforcement learning framework designed explicitly to enhance single-step reasoning and planning for business-oriented web navigation tasks.
We explore developing a Heterogeneous-aware EXpert Allocation framework, HEXA-MoE, with significantly enhanced computing efficiency.
We propose Lightening-Transformer, the first light-empowered, high-performance, and energy-efficient photonic Transformer accelerator.
We design and tape out the SpAtten architecture as a digital chip in TSMC 28nm technology.
This paper provides an overview of efficient deep learning methods, systems and applications.