• Journal of Electronic Science and Technology
  • Vol. 22, Issue 3, 100278 (2024)
Yan Li1,*, Tai-Kang Tian2, Meng-Yu Zhuang2, and Yu-Ting Sun3,*
Author Affiliations
  • 1School of Economics and Management, University of Electronic Science and Technology of China, Chengdu, 611731, China
  • 2School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing, 100876, China
  • 3School of Electrical Engineering and Computer Science, The University of Queensland, Brisbane, 4072, Australia
    DOI: 10.1016/j.jnlest.2024.100278
    Yan Li, Tai-Kang Tian, Meng-Yu Zhuang, Yu-Ting Sun. De-biased knowledge distillation framework based on knowledge infusion and label de-biasing techniques[J]. Journal of Electronic Science and Technology, 2024, 22(3): 100278

    Abstract

    Knowledge distillation, a pivotal model compression technique, has been widely applied across various domains. However, inherent biases in the teacher model still limit the performance of the student model during distillation. To address these biases, we propose a de-biased knowledge distillation framework tailored for binary classification tasks. For the pre-trained teacher model, biases in the soft labels are mitigated through knowledge infusion and label de-biasing techniques. A de-biased distillation loss is then introduced, in which the de-biased labels replace the soft labels as the fitting target for the student model. This approach lets the student model learn from the corrected model information, so that a lightweight student model can be deployed with high performance. Experiments on multiple real-world datasets show that deep learning models compressed under the de-biased knowledge distillation framework significantly outperform traditional response-based and feature-based knowledge distillation models across various evaluation metrics, demonstrating the effectiveness of the framework for model compression.
    $ q_C^T = \frac{e^{Z_C/T}}{e^{Z_C/T} + e^{Z_F/T}} $(1a)

    $ q_F^T = 1 - q_C^T $(1b)
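
Eqs. (1a) and (1b) can be sketched in a few lines of Python. This is only an illustration, assuming that Z_C and Z_F are the two class logits of a binary classifier and T is a positive temperature; the function name `soft_labels` is hypothetical and does not come from the article. The two-class softmax is rewritten as a logistic function of the scaled logit gap, which is algebraically the same quantity but avoids overflow for large logits.

```python
import math
from typing import Tuple


def soft_labels(z_c: float, z_f: float, temperature: float = 1.0) -> Tuple[float, float]:
    """Return (q_C, q_F) as in Eqs. (1a) and (1b).

    Assumption: z_c and z_f are the two class logits, temperature is T > 0.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # e^{Z_C/T} / (e^{Z_C/T} + e^{Z_F/T}) equals 1 / (1 + e^{-(Z_C - Z_F)/T}).
    gap = (z_c - z_f) / temperature
    if gap >= 0:
        q_c = 1.0 / (1.0 + math.exp(-gap))
    else:
        # Use the equivalent form so that exp() never sees a large positive argument.
        e = math.exp(gap)
        q_c = e / (1.0 + e)
    return q_c, 1.0 - q_c  # Eq. (1b): q_F = 1 - q_C
```

Calling `soft_labels(2.0, 0.0, temperature=4.0)` returns a value of q_C closer to 0.5 than `soft_labels(2.0, 0.0, temperature=1.0)` does, which is the usual softening effect of a higher temperature.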

    $ {L_{{\text{KD}}}} = \alpha {L_{{\text{Soft}}}} + \left( {1 - \alpha } \right){L_{{\text{Hard}}}} $(2)

    $ L_{\text{Soft}} = -\left( q_C^T \log\left( p_C^{T=1} \right) + q_F^T \log\left( p_F^{T=1} \right) \right) $(3)

    $ L_{\text{Hard}} = -\log\left( p_C^{T=1} \right) $(4)
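
The conventional distillation loss of Eqs. (2) to (4) can be sketched for a single sample as follows. The sketch assumes that `q_c` is the teacher's soft label q_C^T, that `p_c` is the student's probability p_C^{T=1} for class C, and that p_F^{T=1} = 1 - p_C^{T=1}, by analogy with Eq. (1b). The function name `kd_loss` and the clipping constant `eps` are illustrative assumptions, not part of the article.

```python
import math


def kd_loss(q_c: float, p_c: float, alpha: float, eps: float = 1e-12) -> float:
    """Single-sample L_KD = alpha * L_Soft + (1 - alpha) * L_Hard, Eqs. (2)-(4).

    Assumptions: q_c is the teacher soft label for class C, p_c is the student
    probability for class C at T = 1, and the class-F values are 1 - q_c and 1 - p_c.
    """
    # Clip the student probability so both logarithms stay finite
    # (eps is an assumption added for numerical safety).
    p_c = min(max(p_c, eps), 1.0 - eps)
    q_f = 1.0 - q_c
    p_f = 1.0 - p_c
    soft = -(q_c * math.log(p_c) + q_f * math.log(p_f))  # Eq. (3)
    hard = -math.log(p_c)                                # Eq. (4)
    return alpha * soft + (1.0 - alpha) * hard           # Eq. (2)
```

With `alpha = 0` the function reduces to the hard loss alone, and with `alpha = 1` it reduces to the cross-entropy between the teacher's soft labels and the student's predictions.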

    $ L_{\text{DeB}} = \begin{cases} 1, & \text{if } p_C^{T} \notin I_j \\ 0, & \text{if } p_C^{T} \in I_j \end{cases} $(5)

    $ {L_{{\text{DeBKD}}}} = \beta {L_{{\text{DeB}}}} + {L_{{\text{Hard}}}} $(6)

    $ {\text{Acc}} = \frac{{{\text{TP}} + {\text{TN}}}}{{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}}} $(7)

    $ {\text{Pre}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FP}}}} $(8)

    $ {\text{Rec}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}} $(9)
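
The three evaluation metrics in Eqs. (7) to (9) can be computed directly from confusion-matrix counts. The sketch below uses a hypothetical helper name, `classification_metrics`, and returns 0.0 when a denominator is zero; that fallback is an assumption made here for convenience, since the equations leave the case undefined.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Accuracy, precision and recall as in Eqs. (7)-(9)."""

    def ratio(numerator: int, denominator: int) -> float:
        # Assumption: report 0.0 instead of raising when the denominator is zero.
        return numerator / denominator if denominator else 0.0

    return {
        "Acc": ratio(tp + tn, tp + tn + fp + fn),  # Eq. (7)
        "Pre": ratio(tp, tp + fp),                 # Eq. (8)
        "Rec": ratio(tp, tp + fn),                 # Eq. (9)
    }
```

For example, with 40 true positives, 30 true negatives, 10 false positives, and 20 false negatives, accuracy is 70/100, precision is 40/50, and recall is 40/60.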
