Semantic Layering

Semantic layering: the technology reshaping pharma data architecture

How is semantic layering technology addressing significant efficiency gaps and creating the infrastructure for more efficient manufacturing operations?

Patrick Hyett at Plvs Ultra

Modern pharmaceutical manufacturing lines can contain many pieces of equipment, each generating continuous data streams. Yet when a batch deviates from specification, investigation teams must navigate a wide range of disconnected systems to determine the root cause. The industry faces an operational paradox: substantial data volumes exist across systems, yet synthesising them remains time-intensive.

Deviation investigations represent a significant operational challenge. The cost of poor quality can reach 25-40% of sales revenue for companies facing quality challenges, while supply chain disruptions compound the operational burden.1 The disconnect between data availability and operational decision-making capability is a significant efficiency gap.


Investigation process and resource allocation

When a pharmaceutical manufacturing deviation occurs, investigation teams must access data across multiple systems to determine root cause and scope. Much of the investigation effort is consumed by data gathering, data integration and engineering activities. While investigations run, medicines remain in quarantine, regulatory timelines extend and operational costs accumulate.

The scale of deviations is also substantial: typical manufacturing sites can experience hundreds of deviations per year; chemistry, manufacturing and controls data preparation for regulatory submissions remains significantly manual when systems are fragmented; and regulatory submissions and tech transfer processes can experience delays of several months.

The technical architecture contributes to this pattern. Batch data exists in parallel within the manufacturing system, quality assurance system, supply chain system and regulatory system, yet these systems do not maintain semantic awareness of their interdependencies. Without semantic connection, investigation requires manual cross-system correlation rather than systematic analysis.

The industry has invested substantially in cloud data lakes and enterprise systems in recent years, yet return on investment metrics remain unclear. Migrating fragmented data to centralised cloud locations does not resolve underlying system fragmentation. In short, the core issue remains semantic disconnection, not data accessibility.


Traditional data integration approaches

Traditional data integration approaches consolidate information into unified systems or data lakes. This improves data accessibility, but does not establish semantic meaning. When systems employ different vocabularies and contexts for equivalent concepts, consolidation alone cannot resolve the underlying problem.

Current industry benchmarking quantifies this challenge. Approximately 51% of laboratory instruments operate manually.2 Manufacturing equipment ranges from 35% to 57% manual operation depending on product category and, despite substantial cloud investment and enterprise system deployment, operational efficiency improvements have remained modest.2 This pattern reflects a structural issue: migrating fragmented data to centralised locations does not address the semantic gap. A ‘batch’ in the manufacturing system, a ‘lot’ in the quality assurance system and a ‘shipment’ in the supply chain system represent different states or transformations of the same entity, yet the systems do not recognise this equivalence.
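To make this concrete, here is a minimal sketch in Python; the record layouts, field names and identifiers are entirely hypothetical, not drawn from any specific product:

```python
# Hypothetical records for the same physical entity, as three systems
# might store it. All field names and identifiers are illustrative.

mes_record = {"batch_id": "B-2024-0471", "line": "L3", "start": "2024-06-01T06:00Z"}
qa_record  = {"lot_number": "LOT-0471-A", "disposition": "quarantine"}
scm_record = {"shipment_ref": "SHP-88213", "contents": ["LOT-0471-A"]}

# A centralised data lake can hold all three rows side by side, but nothing
# in the data itself says that "batch_id", "lot_number" and the shipment
# contents refer to the same material. That equivalence lives in people's
# heads, or in hand-maintained crosswalk tables like this one:
crosswalk = {"B-2024-0471": {"qa": "LOT-0471-A", "scm": "SHP-88213"}}
```

Each such crosswalk is bespoke engineering that must be maintained by hand as the surrounding systems change.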

Traditional integration requires ongoing bespoke engineering to maintain system connections. Each system modification creates new disconnections. The result is that organisations with sophisticated information technology infrastructure continue to rely on manual processes for critical operational decisions. The fundamental issue is semantic rather than technological, and resolution requires a different architectural approach.
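A back-of-envelope illustration of why that maintenance burden compounds, assuming a hypothetical landscape of a dozen systems:

```python
# Point-to-point integration needs a bespoke mapping for every pair of
# systems, so connections grow quadratically with the landscape. Mapping
# each system once to a shared semantic model grows only linearly.
systems = 12  # hypothetical count: MES, LIMS, QMS, ERP, WMS, historians, ...
point_to_point = systems * (systems - 1) // 2  # 66 connections to maintain
via_shared_model = systems                     # 12 mappings to one ontology
print(point_to_point, via_shared_model)        # 66 12
```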


Semantic layering architecture

Semantic layering does not replace existing systems. Rather, it establishes an intelligence layer above existing infrastructure that creates semantic connections. The architecture comprises three components:


The raw data layer: this contains existing fragmented data across all systems, which remains unchanged and independent. Manufacturing systems, quality assurance databases, supply chain platforms and regulatory repositories continue operating without modification.

The ontology and enrichment layer: this sits above the raw data, establishing formal definitions of business concepts and the semantic transformations that turn raw records into connected data. Ontologies specify the semantic definition of a batch, which elements are interdependent, how material sourcing relates to process parameters and how quality outcomes relate to environmental conditions (a minimal code sketch follows this list).

The consumption layer: applications and users access semantically connected data through this layer, which exposes a unified business data fabric. This enables users and analytical systems to query information using familiar business language rather than system-specific database schemas.
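As an illustration of what the ontology and enrichment layer might assert, the sketch below uses the open-source Python library rdflib; the namespace, concept names and identifiers are assumptions for illustration, not a reference implementation:

```python
from rdflib import Graph, Literal, Namespace, RDF

# Hypothetical ontology namespace; all concept and record names are illustrative.
PH = Namespace("http://example.org/pharma#")

g = Graph()
g.bind("ph", PH)

# Enrichment: assert that the manufacturing batch, the QA lot and the
# supply chain shipment reference all describe one physical entity, and
# attach a relationship the ontology defines (here, raw material lineage).
batch = PH["B-2024-0471"]
g.add((batch, RDF.type, PH.Batch))
g.add((batch, PH.knownToQualityAs, Literal("LOT-0471-A")))  # QA vocabulary
g.add((batch, PH.shippedUnder, Literal("SHP-88213")))       # SCM vocabulary
g.add((batch, PH.usedMaterial, PH["RM-LOT-9917"]))          # material lineage
```

The source systems are untouched; the triples simply record equivalences and relationships that previously lived in analysts' heads.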

The practical result is that when batch data in manufacturing, batch records in quality assurance and batch tracking in supply chain are semantically connected, the system can recognise these relationships. Inference becomes possible: the system can respond to queries such as ‘which other batches used the same raw material source?’ without manual investigation. This approach preserves existing system investments while adding semantic capability. Implementation timelines are shorter than those of traditional integration because existing infrastructure remains in place, and the maintenance burden is reduced because system modifications do not cascade through complex data transformation pipelines.
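At the consumption layer, that question becomes a single declarative query over the connected graph rather than a manual correlation exercise. A self-contained sketch, again using rdflib with hypothetical identifiers:

```python
from rdflib import Graph, Namespace, RDF

PH = Namespace("http://example.org/pharma#")
g = Graph()

# Two hypothetical batches sharing a raw material lot; identifiers illustrative.
for batch_id in ("B-2024-0471", "B-2024-0502"):
    g.add((PH[batch_id], RDF.type, PH.Batch))
    g.add((PH[batch_id], PH.usedMaterial, PH["RM-LOT-9917"]))

# The business-language question from the text, expressed once as SPARQL.
results = g.query("""
    PREFIX ph: <http://example.org/pharma#>
    SELECT DISTINCT ?other WHERE {
        ph:B-2024-0471 ph:usedMaterial ?material .
        ?other ph:usedMaterial ?material .
        FILTER (?other != ph:B-2024-0471)
    }
""")
for row in results:
    print(row.other)  # -> http://example.org/pharma#B-2024-0502
```

The query states the business question once; the semantic layer resolves which system records are involved.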

Operationally, this transforms data from a compliance documentation requirement into a resource for operational decision-making, enabling faster analysis, more efficient operations and reduced development timelines.


Industry drivers for implementation

Three factors establish operational drivers for this transformation:


Advanced therapies scale requirements
Cell and gene therapies involve development costs approaching $2bn and per-dose manufacturing costs that require substantial process optimisation for market viability.3 Manual, labour-intensive processes present scaling constraints. Regulatory approval structures mean process modifications require significant validation investment. Scaling these therapies requires decision-making systems based on comprehensive operational understanding.


Supply chain operational resilience
Recent global disruptions have demonstrated supply chain vulnerability. The IQVIA Institute for Human Data Science estimates that the biopharma industry loses approximately $35bn annually as a result of failures in temperature-controlled logistics – from lost product, clinical trial loss and replacement costs, to wasted logistics costs and the costs of root-cause analysis.4 Distributed manufacturing through contract manufacturing organisations, contract packaging and third-party logistics creates coordination complexity across multiple partners. Real-time visibility and coordination across the supply network have become operationally necessary.


Regulatory requirements evolution
Regulatory guidance increasingly emphasises artificial intelligence life cycle management, data integrity and validation standards.5 International guidelines require system validation commensurate with operational criticality.6 Quality by design frameworks require manufacturers to establish comprehensive process understanding. Regulatory bodies increasingly require outcome-based compliance supported by auditable data systems.

These factors converge on a single operational requirement. Pharmaceutical manufacturers must establish data understanding capabilities that extend across their operations. This requires understanding system relationships and interdependencies across the entire organisation. This capability enables downstream requirements including faster investigation response, regulatory documentation efficiency, supply chain coordination and automated decision support for advanced therapies.



Conclusion

Pharmaceutical manufacturing operations accumulate substantial data across multiple systems. The operational challenge is not data volume, but rather semantic understanding across system boundaries. Semantic layering establishes the missing infrastructure for this understanding. By establishing formal ontologies that define business concepts and their relationships, organisations can extract insights from existing data systems and enable faster operational decisions.

Current industry investments in cloud infrastructure and enterprise systems represent significant capital allocation. Semantic layering leverages these investments rather than requiring replacement. Adding semantic meaning to existing data systems allows organisations to improve return on existing infrastructure investment. The progression from fragmented data systems to integrated understanding, from reactive investigation to systematic analysis, and from manual process to informed decision-making becomes possible when pharmaceutical manufacturers establish common data meaning across enterprise operations.

This capability represents an operational efficiency requirement for competitive pharmaceutical manufacturing in 2025 and beyond.



Patrick Hyett is CEO and co-founder of Plvs Ultra. With 31 years of pharmaceutical industry experience, including leadership roles at GSK’s Digital Innovation Centre, he has spent his career addressing data connectivity challenges in manufacturing and regulatory operations. He founded Plvs Ultra to create practical solutions for connecting previously isolated pharmaceutical data sources and enabling shorter investigation times.
