Metadata

Metadata gives data context and structure, which makes querying efficient and responses accurate. There are two types: Schema Metadata and Semantic Metadata. This documentation describes both types and provides examples in YAML format.

1. Schema Metadata

Schema Metadata describes the structure of tabular data: how it is organized into rows and columns, how tables or datasets relate to one another (for example, through foreign keys and joins), and what type each column holds (e.g., integer, string, decimal). This metadata is the foundation for querying and interacting with data, because queries rely on it for correct relationships and data integrity across the system.

Key Elements of Schema Metadata

  • Entities and Attributes: Defines primary data storage units (e.g., tables, collections) and their attributes (columns, fields).
  • Data Structure and Columns: Describes how each table or dataset is laid out, listing its columns and their data types (e.g., integer, string, decimal).
  • Column Types and Validation: Specifies the types of data each column can hold (e.g., text, number, date) and any validation or integrity constraints applied to the column (e.g., uniqueness, required values).
  • Data Relationships: Describes how different datasets or tables are related through mechanisms like foreign keys and joins, establishing links between related data.

Example: Schema Metadata in YAML Format

Schema Metadata Example
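
As a rough illustration of the key elements listed above, the following sketch describes two related tables. Every key name (tables, columns, constraints, relationships) and every table and column name is a hypothetical choice made for this illustration, not the product's actual metadata format.

```yaml
# Hypothetical layout: all keys and names below are illustrative assumptions.
tables:
  - name: customers            # entity (table)
    columns:
      - name: customer_id      # attribute (column)
        type: integer
        constraints: [primary_key, required, unique]
      - name: email
        type: string
        constraints: [required, unique]
  - name: orders
    columns:
      - name: order_id
        type: integer
        constraints: [primary_key, required, unique]
      - name: customer_id
        type: integer
        constraints: [required]
      - name: order_total
        type: decimal
      - name: order_date
        type: date

# Data relationships: each order references exactly one customer.
relationships:
  - name: orders_to_customers
    type: many_to_one
    from: orders.customer_id   # foreign key column
    to: customers.customer_id  # referenced key column
```

In this sketch, column types and constraints sit next to each column, while relationships are declared separately so that a join between orders and customers can be derived from the foreign key.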

2. Semantic Metadata

Semantic Metadata enriches data with context, meaning, and business logic. It goes beyond basic data structure by defining the relationships, rules, and calculations that give data analytical value. It can draw on multiple Schema Metadata sources and represents both the attributes and the computed relationships between columns and data elements.

Key Elements of Semantic Metadata

  • Contextual Enrichment: Adds detailed context to data, capturing not just structural information but also the meaning behind data elements and their interrelations.
  • Derived Metrics: Defines computed values that help in analysis, such as aggregations or complex calculations between data points (e.g., TotalRevenue, AverageOrderValue).
  • Complex Relationships: Captures computed relationships across multiple Schema Metadata sources, representing how different data attributes are related (e.g., joins, computed values between tables or entities).
  • Aggregations: Details how data should be aggregated for reporting or analysis (e.g., summing, averaging, or grouping data).

Example: Semantic Metadata in YAML Format

Semantic Metadata Example
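
As a rough illustration of the key elements listed above, the following sketch defines the TotalRevenue and AverageOrderValue metrics over hypothetical orders and customers tables. Every key name (semantic_model, sources, joins, attributes, metrics, expression) and every table and column name is an assumption made for this illustration, not the product's actual metadata format.

```yaml
# Hypothetical layout: all keys and names below are illustrative assumptions.
semantic_model:
  name: sales
  description: Revenue analysis across orders and customers.
  sources: [orders, customers]   # Schema Metadata sources this model draws on

  # Complex relationships: how the sources are combined.
  joins:
    - left: orders.customer_id
      right: customers.customer_id
      type: inner

  # Contextual enrichment: business meaning attached to raw columns.
  attributes:
    - name: OrderDate
      source: orders.order_date
      description: Calendar date on which the order was placed.

  # Derived metrics and aggregations.
  metrics:
    - name: TotalRevenue
      description: Sum of all order totals.
      expression: "SUM(orders.order_total)"
    - name: AverageOrderValue
      description: Total revenue divided by the number of distinct orders.
      expression: "SUM(orders.order_total) / COUNT(DISTINCT orders.order_id)"
```

The expressions here are written as SQL-style strings purely for readability; the syntax a real system accepts for calculations may differ.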

Conclusion

Metadata is essential for structuring and interpreting data. Schema Metadata defines how data is structured and organized, including the relationships between tables, columns, and foreign keys, so that it can be queried efficiently.

Semantic Metadata, on the other hand, adds context by defining business rules, calculations, and the meaning behind the data. It connects Schema Metadata elements and establishes relationships between them, enabling deeper analysis. Together, the two types of metadata help produce precise answers to your queries.

For comprehensive information on metadata structuring and management, please refer to the Getting Started with Metadata section.