UPxSocio at NTCIR-18 MedNLP-CHAT Task: Similarity-Based Few-Shot Example Selection for Prompt-Based Detection

https://doi.org/10.20736/0002002055
File 05-NTCIR18-MEDNLP-SupranesM.pdf (1.1 MB)
Item type Default item type (full)(1)
Publication date 2025-06-06
Title
Title UPxSocio at NTCIR-18 MedNLP-CHAT Task: Similarity-Based Few-Shot Example Selection for Prompt-Based Detection
Language en
Creators
Michael Van Supranes
Martin Augustine Borlongan
Joseph Ryan Lansangan
Genelyn Ma. Sarte
Shaowen Peng
Shoko Wakamiya
Eiji Aramaki
Description
Description type Abstract
Description This paper presents our submission to the MedNLP-CHAT Task at NTCIR-18, which focuses on detecting medical, ethical, and legal risks in chatbot-generated responses. We propose a two-step prompt-based classification framework using the Gemini-1.5-flash model. The method first generates support statements to guide reasoning, which are then integrated into a few-shot prompt for final classification. We evaluated our approach on the English versions of the Japanese and German subtasks, submitting two systems per subtask that varied in example selection strategy and label distribution. Our systems achieved strong performance in detecting medical risks, particularly in the German subtask, while ethical and legal risks were more challenging. To better understand the design factors influencing performance, we conducted ablation studies across 24 prompt variants. Logistic regression and CHAID analyses revealed that accuracy depends on complex interactions between subtask language, example similarity, actual label, and selection method. Higher similarity improves classification of risk-present cases but harms performance on risk-absent cases, indicating a trade-off between recall and false positives. The $k$-nearest method was more effective under high similarity, while $k$-spread offered balanced results across classes. Although the two-step prompting strategy did not show a statistically significant advantage overall, the best-performing configuration used five support statements, with diminishing gains beyond that. Our findings suggest that optimized prompt design, particularly with controlled support and example selection, can improve risk detection without requiring large-scale training or high computational resources.
Language en
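
The abstract names a $k$-nearest example selection method but gives no implementation details. The following is a minimal sketch of what similarity-based $k$-nearest selection for a few-shot prompt could look like, not the authors' method. Cosine similarity, the `(text, label, vector)` pool layout, the toy vectors, and the prompt template are all assumptions made for illustration. The $k$-spread variant is not defined in this record, so it is not sketched.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def k_nearest_examples(query_vec, pool, k):
    """Return the k pool entries most similar to the query.

    pool is a list of (text, label, vector) tuples (hypothetical layout).
    """
    ranked = sorted(pool, key=lambda ex: cosine(query_vec, ex[2]), reverse=True)
    return ranked[:k]


def build_few_shot_prompt(query_text, examples):
    """Format the selected examples and the query as a plain-text prompt."""
    shots = [f"Response: {text}\nLabel: {label}" for text, label, _ in examples]
    return "\n\n".join(shots + [f"Response: {query_text}\nLabel:"])


# Toy labeled pool with made-up 2-d vectors standing in for real embeddings.
pool = [
    ("example A", "risk", [1.0, 0.0]),
    ("example B", "no risk", [0.0, 1.0]),
    ("example C", "risk", [0.9, 0.1]),
]
selected = k_nearest_examples([1.0, 0.0], pool, k=2)
prompt = build_few_shot_prompt("new chatbot response", selected)
```

In this toy pool both nearest neighbors carry the same label, which illustrates how a purely similarity-driven selection can skew the label mix of the prompt. The abstract reports that higher similarity helped on risk-present cases and hurt on risk-absent ones.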
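Publisher
Publisher NII Institutional Repository
Language en
Date
Date 2025-06-06
Date type Issued
Language
Language eng
Resource type
Resource type identifier http://purl.org/coar/resource_type/c_5794
Resource type conference paper
Identifier registration
Identifier registration 10.20736/0002002055
Identifier registration type JaLC
Relation
Relation type isReferencedBy
Identifier type URI
Related identifier https://research.nii.ac.jp/ntcir/ntcir-18/index.html
Language en
Related title NTCIR-18 Conference
Start page
Start page none
Conference
Conference name NTCIR-18 Conference
Language en
Conference sequence 18
Organizer National Institute of Informatics
Language en
Start year 2025
Start month 6
Start day 10
End year 2025
End month 6
End day 13
Conference dates June 10-13, 2025
Language en
Venue National Institute of Informatics
Language en
Country JPN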
Versions

Ver.1 2025-06-04 08:01:38.973975