On General and Biomedical Text-to-Graph Large Language Models

11 Dec 2025
Lorenzo Bertolini*, Roel Hulsman*, Sergio Consoli, Antonio Puertas Gallardo, Mario Ceresa
* Equal contribution
Abstract
Knowledge graphs and ontologies represent symbolic and factual information that can offer structured and interpretable knowledge. Extracting and manipulating this type of information is a crucial step in many complex processing pipelines. While large language models (LLMs) are known to be useful for extracting and enriching knowledge graphs and ontologies, previous work has largely focused on comparing architecture-specific models (e.g. encoder-decoder only) across benchmarks from similar domains. In this work, we provide a large-scale comparison of the effect of LLM features (e.g. model architecture and size) and task learning methods (fine-tuning vs. in-context learning (iCL)) on text-to-graph benchmarks in two domains, namely the general and biomedical domains. Experiments suggest that, in the general domain, small fine-tuned encoder-decoder models and mid-sized decoder-only models used with iCL reach comparable overall performance, with high entity and relation recognition and moderate yet encouraging graph completion. Our results also suggest that, regardless of other factors, biomedical knowledge graphs are notably harder to learn and are better modelled by small fine-tuned encoder-decoder architectures. Pertaining to iCL, we analyse hallucination behaviour linked to sub-optimal prompt design, and suggest an efficient alternative to prompt engineering and prompt tuning for tasks with structured model output.
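As a rough illustration of the iCL setup studied in the paper, the sketch below builds a few-shot prompt for text-to-triple extraction and parses the model output into (head, relation, tail) triples. Note that the exemplars, the `llm_generate` placeholder, and the "(head | relation | tail)" output format are illustrative assumptions, not the paper's actual prompt design or evaluation protocol.

```python
# Minimal sketch of in-context learning (iCL) for text-to-graph extraction.
# Assumption: `llm_generate` is a placeholder for any text-completion call
# (an API client or a local model); the exemplars and triple format below
# are illustrative, not the paper's actual prompts.

FEW_SHOT_EXEMPLARS = [
    ("Aspirin is used to treat pain.",
     "(Aspirin | treats | pain)"),
    ("Paris is the capital of France.",
     "(Paris | capital_of | France)"),
]

def build_prompt(text: str) -> str:
    """Assemble a few-shot prompt asking for one triple per line."""
    lines = ["Extract (head | relation | tail) triples from the text.", ""]
    for src, triples in FEW_SHOT_EXEMPLARS:
        lines += [f"Text: {src}", f"Triples: {triples}", ""]
    lines += [f"Text: {text}", "Triples:"]
    return "\n".join(lines)

def parse_triples(output: str) -> list[tuple[str, str, str]]:
    """Parse lines like '(h | r | t)' into 3-tuples, skipping malformed lines."""
    triples = []
    for line in output.splitlines():
        parts = [p.strip() for p in line.strip("() ").split("|")]
        if len(parts) == 3 and all(parts):
            triples.append(tuple(parts))
    return triples

def extract_graph(text: str, llm_generate) -> list[tuple[str, str, str]]:
    """End-to-end: prompt the model and return the parsed triples."""
    return parse_triples(llm_generate(build_prompt(text)))
```

Parsing defensively, i.e. skipping malformed lines rather than raising, matters here: the hallucination analysis in the paper concerns exactly the cases where the model departs from the requested structured output format.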
Type
Publication
Semantic Web Journal, 17(1):1-27

Authors
Roel Hulsman
PhD Candidate in Causal Machine Learning

I am a second-year PhD candidate in causal machine learning at the Amsterdam Machine Learning Lab (AMLab), supervised by Sara Magliacane and Herke van Hoof. My PhD is funded by Adyen, a global financial technology company, where I spend a minor portion of my time. My research primarily focuses on causal methods for (nonstationary) time series, although I am broadly interested in the intersection of machine learning, statistics and econometrics, with a hint of philosophy.

I graduated with distinction from the University of Oxford with an MSc in Statistical Science. While at Oxford, I was fortunate to be supervised by Rob Cornish and Arnaud Doucet for my dissertation on the mathematical guarantees of conformal prediction. I also graduated from the University of Groningen with a BSc in Econometrics and Operations Research and a BA in Philosophy of a Specific Discipline (in my case the social sciences), both cum laude.

Before starting my PhD, I spent a short period at ASML as a data analyst for business intelligence, where I optimised business processes for the manufacturing of lithography systems. Afterwards, I moved to a role in AI for healthcare at the Joint Research Centre (JRC) in Italy, an independent research institute of the European Commission. There, I mainly worked on conformal risk control for pulmonary nodule detection, and on knowledge graph construction using large language models (LLMs).