An Evaluation of the Subject Matter Expertise of a Large Language Model Trained Using a Curated Pipeline Integrity Management Dataset
Presenter: Nigel Curson
Authors: Nigel Curson, Daniel Foster-Smith, Louise O'Sullivan
Abstract

As artificial intelligence tools, particularly large language models (LLMs), continue to evolve, their potential for supporting technically complex domains such as pipeline integrity management is rapidly gaining attention. This paper presents a structured initiative to train a state-of-the-art LLM on the domain-specific knowledge of pipeline integrity, aligning with the technical depth and strategic relevance of Penspen’s core business.
The research has involved fine-tuning and contextual reinforcement of the LLM using curated datasets, technical standards, failure case studies, material on inspection technologies such as in-line inspection (ILI), close interval potential survey (CIPS), and direct current voltage gradient (DCVG) survey, and asset lifecycle data. The model's responses will then be evaluated through a progressive series of technical challenges set by independent subject matter experts. The challenges will increase in complexity and cover key areas such as corrosion threat assessment, crack management, Fitness-for-Service analysis, risk modelling, and hydrogen repurposing.
Scoring will be conducted using a quantitative framework designed to assess three core criteria:
1. Technical Accuracy – The degree to which responses are correct and compliant with applicable standards.
2. Actionability – The practical reliability of the recommendations for decision support.
3. Request Dependency – The extent to which response quality is affected by the specificity or structure of the user query.
Each output will be rated for its usefulness and risk of misdirection, providing objective metrics on the trustworthiness of LLM-generated guidance in safety-critical applications.
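As an illustration only, the three criteria above could be recorded per response and averaged across a challenge set along the following lines. This is a minimal sketch under stated assumptions, not the scoring framework used in the paper: the 0 to 5 scale, the `ResponseScore` and `summarise` names, and the use of a simple mean are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical criterion names, mirroring the three criteria listed above.
CRITERIA = ("technical_accuracy", "actionability", "request_dependency")


@dataclass(frozen=True)
class ResponseScore:
    """Expert ratings for one model response (0-5 scale is an assumption)."""
    question_id: str
    technical_accuracy: int
    actionability: int
    request_dependency: int

    def __post_init__(self):
        # Reject ratings outside the assumed 0-5 range.
        for name in CRITERIA:
            value = getattr(self, name)
            if not 0 <= value <= 5:
                raise ValueError(f"{name} must be between 0 and 5, got {value}")


def summarise(scores):
    """Return the mean rating per criterion across all rated responses."""
    if not scores:
        raise ValueError("no scores to summarise")
    return {name: mean(getattr(s, name) for s in scores) for name in CRITERIA}


if __name__ == "__main__":
    # Illustrative ratings only; these are not results from the paper.
    rated = [
        ResponseScore("q1", 4, 3, 2),
        ResponseScore("q2", 2, 5, 4),
    ]
    print(summarise(rated))
```

Keeping one record per response makes it straightforward to trace an aggregate figure back to the individual expert ratings behind it.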
The outcomes of this evaluation will inform the development of LLM-driven tools that can safely augment engineering workflows, improve consistency in integrity assessments, and support the training of early-career engineers. They will also identify the current limitations of, and governance requirements for, AI in high-stakes infrastructure contexts.
