Shirin Tahmasebi
  • MoE-Pruner

    Pruning MoE LLMs

    8 min read   ·   October 15, 2024

    2024   ·   compression   pruning  

  • AWQ

    Activation-aware Weight Quantization

    5 min read   ·   July 17, 2024

    2024   ·   compression   quantization  

  • Wanda

    Pruning by Weights and Activations

    7 min read   ·   May 06, 2024

    2024   ·   compression   pruning  

  • Distilling Step-by-Step!

    Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

    4 min read   ·   May 07, 2023

    2023   ·   compression   distillation  

  • SparseGPT

    Making Smaller LLMs in One-shot

    4 min read   ·   March 22, 2023

    2023   ·   compression   pruning  

  • GPTQ

    Accurate Post-Training Quantization for Generative Pre-Trained Transformers

    6 min read   ·   March 22, 2023

    2023   ·   compression   quantization  

  • Attention Head Pruning

    Pruning Attention Heads Layer by Layer to Make LLMs Smaller

    4 min read   ·   October 07, 2021

    2021   ·   compression   pruning  

© Copyright 2025 Shirin Tahmasebi.