PERTANIKA JOURNAL OF SCIENCE AND TECHNOLOGY

 

e-ISSN 2231-8526
ISSN 0128-7680


 

Automated Generation of Calculation-based Physics Questions Using Text-to-Text Transfer Transformer (T5) for Malaysian Upper Secondary Education

Tuong Kiet Ngang and Chih How Bong

Pertanika Journal of Science & Technology, Pre-Press

DOI: https://doi.org/10.47836/pjst.34.3.10

Keywords: Artificial intelligence, automatic question generation, deep learning, natural language processing, question generation, Text-to-Text Transfer Transformer (T5), transfer learning

Published: 2026-06-19

Creating assessment questions to evaluate student achievement is time-intensive, particularly for structured, calculation-based problems in Science, Technology, Engineering, and Mathematics (STEM) subjects such as Physics. This study presents a web-based Automatic Question Generation (AQG) system that generates Physics questions for Malaysian upper secondary levels (Form 4 and 5). Covering topics such as Force and Motion, Heat, and Light and Optics (Form 4), and Pressure and Electricity (Form 5), the system applies transfer learning by fine-tuning the Text-to-Text Transfer Transformer (T5), a state-of-the-art natural language processing (NLP) model. The methodology encompasses data construction, pre-processing, dataset generation, fine-tuning of the T5 model, evaluation, and inference. Performance was assessed through system experiments, ROUGE-L automatic evaluations, and human evaluations by expert educators, focusing on relevance, correctness, usefulness, and variety. The high ROUGE-L scores (0.82–0.85) indicate strong alignment with reference questions, while human evaluations demonstrate that the system generates contextually relevant, high-quality questions. The results show that the AQG system matches the template-based approach in quality while being far more flexible and saving teachers considerable manual work. It can also be scaled easily should more questions be needed. A comparative analysis with ChatGPT-4 revealed the edge that a purpose-built, structured generator has over a broad, open-ended one. In short, deep-learning NLP can automate domain-specific question writing and make large-scale assessment design much simpler. These findings should interest researchers in computational linguistics, AI, and test automation.
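To give a sense of what a ROUGE-L score measures, the following is a minimal sketch of the metric as an F-measure over the longest common subsequence (LCS) of tokens shared by a generated question and its reference. The abstract does not state which ROUGE-L implementation, tokenisation, or weighting the authors used, so the whitespace tokenisation, lowercasing, `beta` value, and the function names `lcs_length` and `rouge_l` are assumptions made for illustration only.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """ROUGE-L F-measure between a reference and a candidate string.

    Assumption: tokens are lowercased whitespace-separated words; the
    study's own pre-processing may differ.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    # Hypothetical question pair, not taken from the study's dataset.
    reference = "calculate the final velocity of the car"
    candidate = "calculate the velocity of the car"
    print(round(rouge_l(reference, candidate), 3))
```

Under these assumptions, a score near 1 means the generated question preserves almost all of the reference wording in the same order, which is how scores in the 0.82–0.85 range can be read as strong alignment with the reference questions.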


Article ID

JST-6120-2025
