HCM City students' research on LLM accuracy published in top-tier AI journal

April 13, 2026 - 11:11
Members of the research team from the HCM City University of Technology. The team’s innovative Single-Token Logit (STL) technique to improve Large Language Model accuracy was recently published in a prestigious Q1 academic journal. — Photo Courtesy of HCMUT

HCM CITY — A research paper originating from a university graduation thesis at the HCM City University of Technology (HCMUT) under Việt Nam National University-HCM City has been published in a world-leading journal on artificial intelligence in education.

The study, titled "Enhancing Large Language Model Performance for Automatic Zero-Shot Multiple-Choice Question Answering via Single-Token Logit Prompting", was featured in Computers and Education: Artificial Intelligence.

The journal is ranked Q1, standing at the top of the Education field and fifth in the Artificial Intelligence category, according to SCImago.

The research team, led by Đặng Phú Quốc (former student) and Trần Trương Tuấn Phát (lecturer), included several third-year students under the guidance of Associate Professor Quản Thành Thơ, Dean of the Faculty of Computer Science and Engineering at HCMUT.

Solving the "bias" in AI

The research addresses a common flaw in open-source Large Language Models (LLMs) such as LLaMA, DeepSeek, and Mistral when handling multiple-choice questions (MCQs) in a "zero-shot" setting – one in which the model must perform tasks it has never encountered during training.

While these models are powerful, they often lack a true understanding of multiple-choice structures.

Simply shuffling the order of answers (A, B, C, or D) can cause the model to change its result, even if the question content remains the same.

This phenomenon, known as Multiple-Choice Symbol Binding (MCSB) limitation, reduces the reliability of AI in educational applications such as automated grading or question bank generation.

To overcome this, the HCMUT team proposed the Single-Token Logit (STL) technique. Instead of presenting all options at once and asking the AI to choose one, the STL technique isolates each answer.

The model is asked a simple "Yes/No" question: "Is this specific answer correct?"

The final answer is the option with the highest probability of the "Yes" token across these independent evaluations.

This mechanism allows the model to verify each choice independently, eliminating the bias caused by the position or label of the answers.
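The mechanism described above can be sketched in a few lines. The following is an illustrative reconstruction, not the team's published code: `score_fn` is a hypothetical wrapper that returns the model's first-token logits for "Yes" and "No", and `toy_score_fn` is a stand-in for a real LLM.

```python
import math

def stl_select(question, options, score_fn):
    """Single-Token Logit selection (sketch): verify each candidate
    answer independently with a Yes/No prompt and return the index of
    the option whose 'Yes' token is most probable."""
    best_i, best_p = 0, -1.0
    for i, opt in enumerate(options):
        prompt = (f"Question: {question}\n"
                  f"Candidate answer: {opt}\n"
                  "Is this answer correct? Answer Yes or No: ")
        yes_logit, no_logit = score_fn(prompt)
        # Two-way softmax over the 'Yes'/'No' single-token continuations.
        p_yes = math.exp(yes_logit) / (math.exp(yes_logit) + math.exp(no_logit))
        if p_yes > best_p:
            best_i, best_p = i, p_yes
    return best_i

# Toy stand-in for a real LLM scorer, for illustration only:
def toy_score_fn(prompt):
    return (2.0, 0.0) if "Paris" in prompt else (0.0, 2.0)

best = stl_select("What is the capital of France?",
                  ["London", "Paris", "Berlin", "Madrid"],
                  toy_score_fn)
# best == 1, and stays on the same answer however the options are shuffled
```

Because each option is scored in isolation, shuffling the option order changes only the index returned, never which answer wins – the position bias the article describes cannot arise.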

The team also integrated Retrieval-Augmented Generation (RAG), allowing the model to access external knowledge sources.
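A minimal sketch of how retrieved passages might be combined with the Yes/No verification prompt – the template and function name here are assumptions for illustration, not the team's actual implementation:

```python
def build_rag_prompt(question, option, passages):
    """Prepend retrieved passages so the model can check the candidate
    answer against external knowledge before answering Yes or No (sketch)."""
    context = "\n".join(f"- {p}" for p in passages)
    return (f"Context:\n{context}\n"
            f"Question: {question}\n"
            f"Candidate answer: {option}\n"
            "Is this answer correct? Answer Yes or No: ")

prompt = build_rag_prompt(
    "What is the capital of France?",
    "Paris",
    ["Paris is the capital and largest city of France."],
)
```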

Testing across three standard scientific datasets (ARC, OpenBookQA, and SciQ) showed that the STL method performed on a par with or better than popular prompting methods such as Chain-of-Thought (CoT).

In some configurations, it improved accuracy by up to 11 percentage points while significantly reducing computational costs.

The STL technique offers immediate benefits for the education sector. It can assist teachers in reviewing exam quality, automatically suggesting answers for new questions, and supporting automated grading systems.

The success of the project is particularly noteworthy as it spanned two years and four months of persistent refinement following several initial rejections.

“Improving the reliability of AI models does not always require larger models or more complex architectures,” the research team said.

Sometimes, a small change in prompt design can make a major difference in practical deployment, they added. — VNS
