Mereology in Scientific Knowledge Graphs

Fylo Team

June 6, 2025

Key Points

Mereology provides formal theoretical foundations for part-whole relationships that enable hierarchical organization and integration of fragmented scientific knowledge
Hierarchical emergence creates novel properties at higher levels through complex interactions among lower-level components, with “downward causation” allowing higher structures to influence lower components
Fylo transforms scientific papers into interconnected knowledge graphs using mereological principles for part-whole relationship analysis
Scientific discourse graphs utilize mereological hierarchies to organize knowledge from atomic data units to comprehensive paradigms
Mereological principles support modular knowledge organization across open science platforms, enabling collaborative contributions, multiple valid decompositions, and evolutionary growth

Mereology as a foundational framework for scientific discourse graphs

What mereology brings to knowledge representation

Mereology, the formal study of part-whole relationships, provides mathematical precision and conceptual clarity for organizing scientific knowledge into hierarchical, modular structures. This article examines how mereological principles enable the systematic decomposition of complex scientific discourse into atomic units that can be flexibly recombined, supporting both collaborative knowledge building and emergent understanding across multiple scales of analysis.

Mereological Knowledge Graphs

The convergence of classical mereological theory with modern knowledge representation systems offers solutions to fundamental challenges in scientific communication: fragmentation across disciplines ^[1] , ^[2] , difficulty in synthesizing diverse findings, and barriers to collaborative contribution. By treating scientific knowledge as composed of formal part-whole relationships, we can create systems that support multiple valid decompositions, enable distributed contribution without requiring complete knowledge of the whole, and facilitate the emergence of higher-order insights from lower-level components.

Classical mereology meets modern knowledge systems

Foundational principles for knowledge organization

Classical Extensional Mereology builds on three core axioms for the parthood relation that translate directly to knowledge representation: reflexivity (every piece of knowledge is part of itself), transitivity (if an observation is part of the evidence for a claim, and that claim is part of a theory, then the observation is part of the theory), and antisymmetry (if two items are parts of each other, they are the same item, which rules out circular dependencies between distinct knowledge components) ^[3] , ^[4] , ^[5] , ^[6] , ^[7] , ^[8] , ^[9] , ^[10] . These principles establish parthood as a partial ordering relation, giving mathematical rigor to intuitive notions about how scientific knowledge builds from evidence to claims to theories.
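
As a minimal sketch, the three axioms can be checked mechanically on a small parthood relation. The node names ("observation", "claim", "theory") and both helper functions are illustrative assumptions, not part of any existing system; parthood is modelled as a set of (part, whole) pairs.

```python
from itertools import product

def is_partial_order(elements, part_of):
    """Check reflexivity, antisymmetry and transitivity of `part_of` over `elements`."""
    reflexive = all((x, x) in part_of for x in elements)
    antisymmetric = all(
        x == y or not ((x, y) in part_of and (y, x) in part_of)
        for x, y in product(elements, repeat=2)
    )
    transitive = all(
        (x, z) in part_of
        for (x, y) in part_of
        for (y2, z) in part_of
        if y == y2
    )
    return reflexive and antisymmetric and transitive

def reflexive_transitive_closure(elements, direct_parts):
    """Smallest reflexive, transitive relation containing `direct_parts`."""
    closure = set(direct_parts) | {(x, x) for x in elements}
    changed = True
    while changed:
        changed = False
        for (x, y), (y2, z) in product(list(closure), repeat=2):
            if y == y2 and (x, z) not in closure:
                closure.add((x, z))
                changed = True
    return closure

# Hypothetical three-node example: only the direct part-whole links are stated.
elements = {"observation", "claim", "theory"}
direct = {("observation", "claim"), ("claim", "theory")}
part_of = reflexive_transitive_closure(elements, direct)
```

Storing only the direct links and deriving the closure keeps the stated relation small, while the check guards against an edit that would make two distinct items parts of each other.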

The supplementation principles ensure meaningful decompositions: any proper part of a whole must be “supplemented” by another part of that whole that is disjoint from it, which rules out degenerate breakdowns where a theory has exactly one proper part ^[11] , ^[12] , ^[13] , ^[14] , ^[15] . Extensionality guarantees that composite objects with identical proper parts are themselves identical, which is crucial for maintaining consistency when the same knowledge components are accessed through different pathways or perspectives.
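
The two principles can be illustrated with a short sketch. It assumes the relation passed in is already reflexive and transitive, and every name (`with_reflexive`, `claim_a`, `theory_2`, and so on) is hypothetical.

```python
def with_reflexive(elements, pairs):
    """Add the (x, x) pairs so the relation is reflexive."""
    return set(pairs) | {(e, e) for e in elements}

def proper_parts(whole, part_of):
    return {x for (x, y) in part_of if y == whole and x != whole}

def overlaps(a, b, elements, part_of):
    """Two items overlap if some item is part of both."""
    return any((z, a) in part_of and (z, b) in part_of for z in elements)

def satisfies_weak_supplementation(elements, part_of):
    """Every proper part x of y needs a part z of y that is disjoint from x."""
    for y in elements:
        for x in proper_parts(y, part_of):
            if not any(
                (z, y) in part_of and not overlaps(z, x, elements, part_of)
                for z in elements
            ):
                return False
    return True

def extensionality_violations(elements, part_of):
    """Pairs of distinct composite items that share exactly the same proper parts."""
    composites = [e for e in sorted(elements) if proper_parts(e, part_of)]
    return [
        (a, b)
        for i, a in enumerate(composites)
        for b in composites[i + 1:]
        if proper_parts(a, part_of) == proper_parts(b, part_of)
    ]

# A theory with two disjoint claims is a meaningful decomposition.
items = {"theory", "claim_a", "claim_b"}
ok = with_reflexive(items, {("claim_a", "theory"), ("claim_b", "theory")})

# A theory whose only proper part is a single claim is degenerate.
degenerate_items = {"theory", "claim_a"}
degenerate = with_reflexive(degenerate_items, {("claim_a", "theory")})

# Two differently named theories built from the same claims violate extensionality.
dup_items = items | {"theory_2"}
dup = ok | with_reflexive({"theory_2"}, {("claim_a", "theory_2"), ("claim_b", "theory_2")})
```

In a knowledge graph, an extensionality violation is often a useful signal in its own right: two nodes with identical parts may be duplicates that should be merged.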

Major ontological frameworks like SUMO and DOLCE already incorporate these mereological principles, demonstrating their practical applicability ^[16] . These upper-level ontologies use part-whole relationships to structure domain knowledge systematically, enabling hierarchical classification, compositional reasoning about complex entities, and consistent handling of inheritance across knowledge bases.

The emergence paradox in knowledge hierarchies

Recent advances in emergence theory reveal how higher-level knowledge structures exert downward causation on their components through contextual constraints and systemic coherence requirements ^[17] , ^[18] , ^[19] , ^[20] . In knowledge systems, this manifests as theoretical frameworks constraining how evidence is interpreted, or paradigm-level assumptions shaping which questions researchers pursue. The phenomenon involves “configurational forces”—novel causal powers arising from specific arrangements of knowledge components rather than the components themselves ^[21] .

This bidirectional causality creates a dynamic tension: while knowledge builds bottom-up from evidence to theories, established theoretical structures simultaneously shape how new evidence is gathered and interpreted ^[22] , ^[23] . Understanding this interplay is crucial for designing systems that support both rigorous empirical grounding and theoretical innovation.

Scientific discourse as mereological structure

Natural hierarchies in scientific knowledge

Scientific knowledge exhibits inherent mereological organization across five primary levels: data → evidence → claims → theories → paradigms. Each level represents both a “whole” composed of lower-level parts and a “part” contributing to higher-level structures ^[24] . Raw observations aggregate into processed evidence, evidence supports specific claims, claims integrate into coherent theories, and theories operate within overarching paradigms.
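
One way to make the five levels concrete is an ordered enumeration plus a check that every part sits below its whole. This is a sketch under the assumption that parts must be at a strictly lower level than their wholes; the node names are invented for illustration.

```python
from enum import IntEnum

class Level(IntEnum):
    DATA = 1
    EVIDENCE = 2
    CLAIM = 3
    THEORY = 4
    PARADIGM = 5

def invalid_edges(levels, part_of_edges):
    """Return (part, whole) edges whose part is not at a strictly lower level."""
    return [(p, w) for p, w in part_of_edges if levels[p] >= levels[w]]

# Hypothetical nodes, one per level up to a theory.
levels = {
    "run_17_raw": Level.DATA,
    "dose_response_fit": Level.EVIDENCE,
    "claim_x": Level.CLAIM,
    "theory_y": Level.THEORY,
}
edges = [
    ("run_17_raw", "dose_response_fit"),
    ("dose_response_fit", "claim_x"),
    ("claim_x", "theory_y"),
]
```

Calling `invalid_edges(levels, edges)` returns an empty list for this chain, while an edge that makes a theory part of a claim would be reported.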

This hierarchy reflects how scientists actually work: they don’t simply accumulate facts but organize them into increasingly abstract and general structures. The mereological framework captures this process formally, enabling computational systems to mirror human scientific reasoning patterns.

Different scientific disciplines exhibit distinct mereological patterns. Hierarchical structures in physics and mathematics show strong vertical coherence with cumulative building upward through general principles. Horizontal structures in linguistics and sociology display more segmented, domain-specific organization with specialized languages and weaker vertical integration. These differences suggest that effective knowledge representation systems must support variable mereological organizations rather than imposing uniform structures.

Atomic decomposition in practice

The Discourse Graphs protocol exemplifies practical mereological decomposition by breaking research into “atomic elements” that function like “Lego bricks”—modular components that can be shared, remixed, and updated independently ^[25] . This approach distinguishes evidence (empirical observations) from claims (proposed answers), enabling multiple interpretations of the same evidence and supporting decentralized knowledge exchange.

Recent work on “atomic reasoning” for scientific table claim verification demonstrates how complex scientific assertions can be decomposed into atomic semantic units, processed independently using modular reasoning skills, then synthesized through compositional aggregation ^[26] , ^[27] . This enables fine-grained verification while maintaining awareness of how parts relate to wholes.

Fylo’s implementation of mereological principles

Architecture embodying part-whole relationships

Fylo Core transforms “siloed PDFs” into a “shared graph” of knowledge through systematic mereological decomposition ^[28] , ^[29] . The platform breaks scientific papers into fundamental units—evidence nodes capturing empirical observations, claim nodes representing propositional statements, and relationship edges encoding part-whole connections. These atomic units can be recombined flexibly, with the same evidence potentially supporting different claims or claims drawing support from varied evidence combinations.
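
The node and edge types described above can be sketched as a small data structure. This is not Fylo's actual data model; the class names, fields, and example texts are assumptions made purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    id: str
    kind: str  # "evidence" or "claim"
    text: str

@dataclass
class DiscourseGraph:
    nodes: dict = field(default_factory=dict)
    # claim id -> ids of the evidence nodes that are parts of its support
    parts: dict = field(default_factory=lambda: defaultdict(set))

    def add(self, node):
        self.nodes[node.id] = node

    def link(self, evidence_id, claim_id):
        """Record that an evidence node is part of the support for a claim."""
        if self.nodes[evidence_id].kind != "evidence":
            raise ValueError(f"{evidence_id} is not an evidence node")
        if self.nodes[claim_id].kind != "claim":
            raise ValueError(f"{claim_id} is not a claim node")
        self.parts[claim_id].add(evidence_id)

    def claims_using(self, evidence_id):
        """All claims that draw on a given piece of evidence."""
        return sorted(c for c, ev in self.parts.items() if evidence_id in ev)

# The same (hypothetical) evidence supports two competing claims.
graph = DiscourseGraph()
graph.add(Node("e1", "evidence", "Hypothetical observation from paper A"))
graph.add(Node("c1", "claim", "First interpretation"))
graph.add(Node("c2", "claim", "Competing interpretation"))
graph.link("e1", "c1")
graph.link("e1", "c2")
```

Keeping evidence and claims as separate node kinds is what allows `claims_using("e1")` to return both interpretations without duplicating the evidence itself.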

Although detailed documentation of Fylo’s six-layer architecture remains limited, the system implements hierarchical structuring in which individual statements function as parts of larger arguments, arguments form parts of broader theoretical frameworks, and papers become compositions of interconnected discourse units. This multi-level organization enables researchers to work at their preferred level of abstraction while maintaining connections across scales.

Mereological disclosure through UI design

Fylo’s interface patterns embody mereological disclosure—a sophisticated application of progressive disclosure that maintains awareness of both parts and wholes during information exploration. Accordions and expandable elements reveal part-whole relationships incrementally, allowing users to drill down from summary views to detailed evidence while preserving context. Tree structures enable traversal between general concepts and specific implementations, supporting both top-down decomposition and bottom-up composition of knowledge.

The system’s use of Directed Acyclic Graphs ensures coherent part-whole relationships while preventing circular dependencies that would violate mereological principles ^[30] . Path generation through these DAGs creates multiple valid routes through knowledge structures, embodying the mereological principle that the same whole can be reached through different combinations of parts.
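
The two properties mentioned here, acyclicity and multiple routes to the same whole, can be sketched with a depth-first cycle check and a simple path enumeration. The adjacency-dict representation and node names are assumptions for illustration, not a description of Fylo's internals.

```python
def has_cycle(edges):
    """Depth-first search with three colours; a back edge to a GRAY node is a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in edges.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(edges))

def all_paths(edges, start, goal):
    """Enumerate every part-to-whole path from `start` to `goal` (assumes no cycles)."""
    if start == goal:
        return [[goal]]
    return [
        [start] + rest
        for nxt in edges.get(start, ())
        for rest in all_paths(edges, nxt, goal)
    ]

# Hypothetical graph: edges point from a part to the wholes it belongs to.
edges = {
    "evidence_1": ["claim_a", "claim_b"],
    "claim_a": ["theory"],
    "claim_b": ["theory"],
}
paths = all_paths(edges, "evidence_1", "theory")
```

Here `paths` contains two routes, one through each claim. Running `has_cycle` before accepting a new edge is one simple way to keep the structure a DAG.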

Open science through mereological decomposition

Practical implementations enabling collaboration

The Open Research Knowledge Graph (ORKG) demonstrates large-scale application of mereological principles by decomposing research papers into semantic components that can be compared and synthesized across publications ^[31] , ^[32] , ^[33] , ^[34] . This transformation from document-based to component-based knowledge enables machine-processable research synthesis while supporting incremental knowledge building where new contributions link to existing components.

Wikidata’s success, with 117+ million data items organized through hierarchical part-whole relationships, shows that mereological structures can scale while supporting collaborative editing ^[35] , ^[36] , ^[37] , ^[38] . Contributors modify specific knowledge components without affecting the whole system, while property-based organization allows multiple valid decompositions of the same entities.

Supporting evolutionary research processes

Mereological decomposition uniquely enables the “openness” that Yanai & Lercher describe in their Nature Biotechnology paper ^[39] , ^[40] . By making the research process transparent through modular components rather than hiding it within traditional publications, systems can support the natural evolution of ideas through variation and selection. Researchers can “follow the data” by branching from existing knowledge components in new directions without rebuilding entire theoretical frameworks.

The Collective Knowledge framework validates this approach in industrial ML/AI optimization, where research projects decompose into reusable components with unified APIs ^[41] , ^[42] , ^[43] , ^[44] . This “plug-and-play” model enables parallel development of different knowledge modules while maintaining compatibility through standardized interfaces.

Implementation principles for knowledge graphs

Design patterns for mereological systems

Successful implementations share key patterns: hierarchical decomposition with clear part-whole relationships at defined levels, interface standardization through common APIs and metadata formats, semantic interoperability via standardized vocabularies ^[45] , and flexible granularity supporting multiple decomposition levels based on use cases.

Technical approaches center on RDF/OWL frameworks for expressing mereological relationships, SPARQL queries for navigating part-whole hierarchies, and graph databases optimized for mereological operations. These technologies enable both human researchers and AI systems to traverse knowledge structures efficiently while maintaining formal correctness.
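
Navigating a part-whole hierarchy usually means following the parthood predicate transitively; in SPARQL 1.1 this is what a property path such as `ex:partOf+` expresses. The sketch below reproduces that traversal in plain Python over subject-predicate-object tuples, without an RDF library. The `ex:` identifiers and the `ex:partOf` and `ex:statedIn` predicates are hypothetical.

```python
from collections import deque

# Hypothetical triples; only ex:partOf edges should be followed.
TRIPLES = [
    ("ex:obs1", "ex:partOf", "ex:evidence1"),
    ("ex:evidence1", "ex:partOf", "ex:claim1"),
    ("ex:claim1", "ex:partOf", "ex:theory1"),
    ("ex:claim1", "ex:statedIn", "ex:paper1"),
]

def transitive_wholes(triples, start, predicate="ex:partOf"):
    """All nodes reachable from `start` via one or more `predicate` edges."""
    out = {}
    for s, p, o in triples:
        if p == predicate:
            out.setdefault(s, []).append(o)
    seen, queue = set(), deque(out.get(start, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(out.get(node, []))
    return seen

wholes = transitive_wholes(TRIPLES, "ex:obs1")
```

The result includes the evidence, claim, and theory that contain the observation, but not the paper, because `ex:statedIn` is a different predicate.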

Managing multiple valid decompositions

Real-world knowledge rarely admits single correct decompositions. Wikidata’s property-based system demonstrates how to handle multiple classification schemes through qualifier mechanisms for contextual part-whole relationships and community-driven resolution of conflicts ^[46] . Materials science knowledge graphs like MatKG show domain-specific applications where materials admit multiple valid decompositions based on different properties—structural, functional, or compositional ^[47] , ^[48] , ^[49] .

Systems must balance consistency with flexibility, implementing validation systems to check part-whole relationship integrity while allowing schema evolution as knowledge develops. Version control becomes crucial, maintaining consistency across different versions of knowledge components while supporting growth through incremental additions and hierarchical expansions.

Theoretical foundations and future directions

Context-dependent mereology in knowledge systems

Traditional mereology assumes fixed part-whole relationships, but knowledge representation demands contextual mereology where relationships vary based on analytical context, intended use, and domain-specific requirements ^[50] , ^[51] . The same entity might be atomic in one context but composite in another—a research finding could be an indivisible unit for policy applications but decomposable into methods and results for methodological analysis.

This context-dependency requires adaptive systems that adjust part-whole relationships based on query context, support multi-perspective analysis of the same knowledge structures, and recognize discipline-specific part-whole conventions. Dynamic boundaries must evolve as knowledge develops, theoretical frameworks shift, and research priorities change.
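
A minimal sketch of context-dependent parthood keys each decomposition by both the entity and the analytical context. It follows the research-finding example above; the identifiers (`finding_42`, `policy`, `methodology`) and function names are hypothetical.

```python
# (entity, context) -> parts of the entity under that context.
# A missing key means the entity is treated as atomic in that context.
DECOMPOSITIONS = {
    ("finding_42", "methodology"): ["methods", "results"],
}

def parts_in_context(entity, context, decompositions):
    """Return the parts of `entity` as seen from `context` (empty if atomic)."""
    return list(decompositions.get((entity, context), []))

def is_atomic(entity, context, decompositions):
    return not parts_in_context(entity, context, decompositions)
```

With this table, `is_atomic("finding_42", "policy", DECOMPOSITIONS)` is true while the methodological view yields `["methods", "results"]`. The formal axioms can then be checked separately within each context rather than across all of them at once.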

Integration with machine learning and AI

Future systems will likely combine formal mereological principles with data-driven approaches ^[52] , ^[53] , ^[54] , ^[55] , ^[56] , ^[57] . Machine learning can identify implicit part-whole relationships in unstructured text, suggest optimal decompositions based on usage patterns, and detect emergent structures arising from component interactions. However, maintaining mereological consistency while leveraging statistical methods remains an open challenge requiring hybrid architectures that preserve formal guarantees while enabling flexible learning.

Conclusion: Towards mereological scientific infrastructure

Mereology provides more than theoretical elegance—it offers practical solutions to pressing challenges in scientific communication and collaboration ^[58] , ^[59] . By treating knowledge as formally structured through part-whole relationships, we can build systems that support distributed contribution without sacrificing coherence, enable multiple perspectives without losing consistency, and facilitate emergence of higher-order insights while maintaining empirical grounding.

The convergence of mereological theory, open science principles, and modern computational capabilities points toward new forms of scientific infrastructure. These systems will move beyond document-centric models toward dynamic knowledge graphs where ideas flow freely across disciplinary boundaries, researchers contribute at multiple scales simultaneously, and collective intelligence emerges from well-structured collaborative processes.

Success requires balancing formal rigor with practical flexibility, supporting both human intuition and computational precision, and maintaining local autonomy while enabling global coherence. The examples of ORKG, Wikidata, and emerging platforms like Fylo demonstrate that these goals are achievable. By grounding scientific discourse systems in mereological principles, we can create infrastructure that amplifies human scientific capabilities while preserving the rigor and reliability that define scientific knowledge.