SCUNLP-2 at the NTCIR-18 FigArg-2 Task: Apply Repeat-Error-Correction Learning on Text Classification
https://doi.org/10.20736/0002002041
| Item type | Default Item Type (Full)(1) | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Publication date | 2025-06-06 | |||||||||
| Title | ||||||||||
| Title | SCUNLP-2 at the NTCIR-18 FigArg-2 Task: Apply Repeat-Error-Correction Learning on Text Classification | |||||||||
| Language | en | |||||||||
| Creator | Tong-Ru Wu; Jheng-Long Wu | |||||||||
| Description | ||||||||||
| Description type | Abstract | |||||||||
| Description | Large Language Models (LLMs) have shown promising capabilities for zero-shot text classification, yet they often do not outperform fine-tuned traditional models like BERT when trained on sufficient labeled data. However, acquiring large-scale human-labeled datasets can be challenging, particularly in specialized domains. To address this gap, we propose Repeat-Error-Correction Learning, a framework that iteratively identifies and rewrites misclassified samples to augment the training set. First, we train a base BERT model using available text–label pairs. Next, the trained model infers labels on the same dataset, and we collect the misclassified samples. An LLM, such as GPT-4o-mini, then rewrites these erroneous texts while preserving their original labels. The rewritten texts are reintroduced into the training set, and the model is fine-tuned on this expanded corpus. By iteratively refining the training data through error correction and text rewriting, the proposed method aims to achieve robust classification performance despite limited initial annotations. Our results indicate that fine-tuning the base model by adding rewritten misclassified text achieved the highest validation set Micro-F1 score (77.33%). These findings contribute to a deeper understanding of a cost-friendly and efficient way to generate data for augmenting text classification models. | |||||||||
| Language | en | |||||||||
| Publisher | ||||||||||
| Publisher | NII Institutional Repository | |||||||||
| Language | en | |||||||||
| Date | ||||||||||
| Date | 2025-06-06 | |||||||||
| Date type | Issued | |||||||||
| Language | ||||||||||
| Language | eng | |||||||||
| Resource type | ||||||||||
| Resource type identifier | http://purl.org/coar/resource_type/c_5794 | |||||||||
| Resource type | conference paper | |||||||||
| Identifier registration | ||||||||||
| Identifier registration | 10.20736/0002002041 | |||||||||
| Identifier registration type | JaLC | |||||||||
| Related information | ||||||||||
| Relation type | isReferencedBy | |||||||||
| Identifier type | URI | |||||||||
| Related identifier | https://research.nii.ac.jp/ntcir/ntcir-18/index.html | |||||||||
| Language | en | |||||||||
| Related title | NTCIR-18 Conference | |||||||||
| Start page | ||||||||||
| Start page | none | |||||||||
| Conference | ||||||||||
| Conference name | NTCIR-18 Conference | |||||||||
| Language | en | |||||||||
| Conference number | 18 | |||||||||
| Organizing institution | National Institute of Informatics | |||||||||
| Language | en | |||||||||
| Start year | 2025 | |||||||||
| Start month | 6 | |||||||||
| Start day | 10 | |||||||||
| End year | 2025 | |||||||||
| End month | 6 | |||||||||
| End day | 13 | |||||||||
| Conference dates | June 10-13, 2025 | |||||||||
| Language | en | |||||||||
| Venue | National Institute of Informatics | |||||||||
| Language | en | |||||||||
| Country | JPN | |||||||||
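
The loop described in the abstract (train, infer on the same data, collect misclassified samples, rewrite them with their labels kept, retrain) can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: `train` and `predict` are a toy token-count classifier standing in for BERT fine-tuning, `rewrite` is a hypothetical placeholder for the LLM rewriting call, and the `rounds` parameter is an assumption, since the record does not state how many iterations were run.

```python
from collections import Counter, defaultdict


def train(samples):
    """Toy stand-in for BERT fine-tuning (assumption): per-label token counts."""
    counts = defaultdict(Counter)
    for text, label in samples:
        counts[label].update(text.lower().split())
    return counts


def predict(model, text):
    """Pick the label whose training texts contained the tokens most often."""
    tokens = text.lower().split()
    # sorted() makes tie-breaking deterministic (first label alphabetically wins).
    return max(sorted(model), key=lambda label: sum(model[label][t] for t in tokens))


def rewrite(text):
    """Hypothetical placeholder for the LLM rewriting step (e.g. GPT-4o-mini)."""
    return "in other words, " + text


def repeat_error_correction(samples, rounds=3):
    """Iteratively add rewritten misclassified samples to the training set."""
    train_set = list(samples)
    model = train(train_set)
    for _ in range(rounds):
        # Infer on the original labeled data and collect the errors.
        errors = [(text, label) for text, label in samples
                  if predict(model, text) != label]
        if not errors:
            break
        # Rewritten texts keep their original (gold) labels.
        train_set += [(rewrite(text), label) for text, label in errors]
        model = train(train_set)
    return model, train_set


if __name__ == "__main__":
    data = [("great film", "pos"), ("awful film", "neg"),
            ("film film film awful", "neg"), ("film", "pos")]
    model, augmented = repeat_error_correction(data)
    print(len(data), "original samples,", len(augmented), "after augmentation")
```

In a real setting the rewriting step would call an LLM and the classifier would be a fine-tuned BERT model; only the control flow here is meant to mirror the description above.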