Generating Multiple-Choice Knowledge Questions with Interpretable Difficulty Estimation using Knowledge Graphs and Large Language Models
Mehmet Can Şakiroğlu
Master's Student
(Supervisor: Prof. Dr. H. Altay Güvenir) Computer Engineering Department
Bilkent University
Abstract: Generating multiple-choice questions (MCQs) with difficulty estimates remains challenging for automated MCQ-generation systems. This thesis proposes a novel methodology for generating MCQs with difficulty estimates from a given set of text documents using knowledge graphs (KGs) and large language models (LLMs). An LLM first constructs a KG from the documents, and MCQs are then systematically generated from that KG. Each MCQ is generated by selecting a node from the KG as the key, sampling a related triple or quintuple (optionally augmented with an extra triple), and prompting an LLM to generate a corresponding stem from these graph components. Distractors, the incorrect choices, are then selected from the KG. For each MCQ, nine difficulty signals are computed and combined into a unified difficulty score using a data-driven approach. Experimental results demonstrate that the method generates high-quality MCQs whose difficulty estimates are interpretable and align with human perceptions. The approach improves MCQ generation by integrating structured knowledge representations with LLMs and a data-driven difficulty estimation model.
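As a rough illustration of the key, triple, and distractor steps described in the abstract, the following minimal Python sketch builds one MCQ from a toy KG. Everything in it is an assumption made for illustration and not the thesis implementation: the toy triples, the `build_mcq` function, the rule that distractors are other nodes sharing the sampled relation, and the template stem that stands in for the LLM-generated stem. Quintuples, the optional extra triple, and the nine difficulty signals are not modeled.

```python
import random

# Toy knowledge graph as (head, relation, tail) triples; illustrative data only.
TRIPLES = [
    ("Ankara", "capital_of", "Turkey"),
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
    ("Rome", "capital_of", "Italy"),
    ("Bilkent University", "located_in", "Ankara"),
]


def build_mcq(triples, key, n_distractors=3, seed=0):
    """Build one MCQ whose correct answer (the key) is a KG node."""
    rng = random.Random(seed)

    # Sample one triple in which the key node appears as the head.
    candidates = [t for t in triples if t[0] == key]
    if not candidates:
        raise ValueError(f"no triple has {key!r} as its head")
    _, relation, tail = rng.choice(candidates)

    # Hypothetical distractor rule: other head nodes that share the relation.
    pool = sorted({h for h, r, _ in triples if r == relation and h != key})
    distractors = rng.sample(pool, min(n_distractors, len(pool)))

    # A fixed template stands in for the stem an LLM would generate.
    stem = f"Which entity has the relation '{relation}' to '{tail}'?"

    choices = distractors + [key]
    rng.shuffle(choices)
    return {"stem": stem, "key": key, "choices": choices}


mcq = build_mcq(TRIPLES, "Ankara")
print(mcq["stem"])
print(mcq["choices"])
```

Drawing distractors from nodes that share the sampled relation keeps them plausible (all are capitals in this toy graph); the selection strategy actually used in the thesis may differ.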
Date: October 21, Tuesday @ 10:00
Place: EA 516