Publications

You can also find my articles on my Google Scholar profile.

Papers


None of the Above, Less of the Right: Parallel Patterns between Humans and LLMs on Multi-Choice Questions Answering

Published in arXiv, 2025

This study examines how “None of the Above” (NA) options affect LLM performance on multiple-choice questions. Results reveal a consistent 30-50% performance drop when NA is the correct answer. The effect is domain dependent: it is minimal on math reasoning but severe on uncertainty-handling tasks such as business ethics.

Recommended citation: Tam, Z.R., Wu, C.K., & Chen, Y.N. (2025). "None of the Above, Less of the Right: Parallel Patterns between Humans and LLMs on Multi-Choice Questions Answering." arXiv preprint arXiv:2503.01550.
Download Paper

Answer, Refuse, or Guess? Investigating Risk-Aware Decision Making in Language Models

Published in arXiv, 2025

This study formalizes the task of risk-aware decision making in LLMs, explores how models adapt their decisions to different risk levels, and proposes skill decomposition solutions to improve performance. The findings show that even advanced LMs require explicit prompt chaining to handle risk-aware decision making effectively.

Recommended citation: Wu, C.K., Tam, Z.R., Lin, C.Y., Chen, Y.N., & Lee, H. (2025). "Answer, Refuse, or Guess? Investigating Risk-Aware Decision Making in Language Models." arXiv preprint arXiv:2503.01332.
Download Paper

Clear Minds Think Alike: What Makes LLM Fine-tuning Robust? A Study of Token Perplexity

Published in arXiv, 2025

A systematic analysis revealing that fine-tuning with LLM-generated data not only improves target-task performance but also reduces out-of-domain degradation compared with fine-tuning on ground-truth data. The study also explores ways to mitigate this degradation.

Recommended citation: Wu, C.C., Tam, Z.R., Lin, C.Y., Lee, H.Y., & Chen, Y.N. (2025). "Clear Minds Think Alike: What Makes LLM Fine-tuning Robust? A Study of Token Perplexity." arXiv preprint arXiv:2501.14315.
Download Paper

Let Me Speak Freely? A Study On The Impact Of Format Restrictions On Large Language Model Performance

Published in EMNLP 2024 Industry Track, 2024

Structured generation, the process of producing content in standardized formats like JSON and XML, is widely utilized in real-world applications to extract key output information from large language models (LLMs).

Recommended citation: Tam, Z.R., Wu, C.K., Tsai, Y.L., Lin, C.Y., Lee, H., & Chen, Y.N. (2024). "Let Me Speak Freely? A Study On The Impact Of Format Restrictions On Large Language Model Performance." EMNLP Industry Track, 1218-1236.
Download Paper

I Need Help! Evaluating LLM’s Ability to Ask for Users’ Support: A Case Study on Text-to-SQL Generation

Published in EMNLP 2024 Main Track, 2024

This study explores the proactive ability of LLMs to seek user support. We propose metrics to evaluate the trade-off between performance improvements and user burden, and investigate whether LLMs can determine when to request help under varying information availability.

Recommended citation: Wu, C.K., Tam, Z.R., Wu, C.C., Lin, C.Y., Lee, H., & Chen, Y.N. (2024). "I Need Help! Evaluating LLM's Ability to Ask for Users' Support: A Case Study on Text-to-SQL Generation." arXiv preprint arXiv:2407.14767.
Download Paper

Personalized EDM Subject Generation via Co-factored User-Subject Embedding

Published in PAKDD, 2024

This paper introduces the Co-Factored User-Subject Embedding based Personalized EDM Subject Generation Framework (COUPES), a model for creating personalized Electronic Direct Mail (EDM) subjects.

Recommended citation: Chen, Y.H., Tam, Z.R., & Shuai, H.H. (2024). "Personalized EDM Subject Generation via Co-factored User-Subject Embedding." Pacific-Asia Conference on Knowledge Discovery and Data Mining, 55-67.
Download Paper

An improved Traditional Chinese evaluation suite for foundation model

Published in arXiv, 2024

We present TMMLU+, a new benchmark designed for Traditional Chinese language understanding. TMMLU+ is a multi-choice question-answering dataset with 66 subjects from elementary to professional level. It is six times larger and boasts a more balanced subject distribution than its predecessor, Taiwan Massive Multitask Language Understanding (TMMLU).

Recommended citation: Tam, Z.R., Pai, Y.T., Lee, Y.W., Chen, J.D., Chu, W.M., Cheng, S., & Shuai, H.H. (2024). "An improved Traditional Chinese evaluation suite for foundation model." arXiv preprint arXiv:2403.01858.
Download Paper

OpenAssistant Conversations - democratizing large language model alignment

Published in NeurIPS, 2024

Aligning large language models (LLMs) with human preferences has proven to drastically improve usability and has driven rapid adoption as demonstrated by ChatGPT.

Recommended citation: Köpf, A., Kilcher, Y., von Rütte, D., Anagnostidis, S., Tam, Z.R., et al. (2024). "OpenAssistant Conversations - democratizing large language model alignment." NeurIPS, 36.
Download Paper

Improving entity disambiguation using knowledge graph regularization

Published in PAKDD, 2022

Entity disambiguation serves as the bridge between words of interest in an input text document and unique entities in a target Knowledge Base (KB).

Recommended citation: Tam, Z.R., Wu, Y.L., & Shuai, H.H. (2022). "Improving entity disambiguation using knowledge graph regularization." Pacific-Asia Conference on Knowledge Discovery and Data Mining, 341-353.
Download Paper

Gradient normalization for generative adversarial networks

Published in ICCV, 2021

In this paper, we propose a novel normalization method called gradient normalization (GN) to tackle the training instability of Generative Adversarial Networks (GANs) caused by the sharp gradient space.

Recommended citation: Wu, Y.L., Shuai, H.H., Tam, Z.R., & Chiu, H.Y. (2021). "Gradient normalization for generative adversarial networks." ICCV, 6373-6382.
Download Paper

Character-preserving coherent story visualization

Published in ECCV, 2020

Story visualization aims at generating a sequence of images to narrate each sentence in a multi-sentence story.

Recommended citation: Song, Y.Z., Tam, Z.R., Chen, H.J., Lu, H.H., & Shuai, H.H. (2020). "Character-preserving coherent story visualization." European Conference on Computer Vision, 18-33.
Download Paper