
Data Product Creation in Analytical Data Sharing

Vendia enables you to create meaningful, curated data products for secure sharing with both internal and external consumers, regardless of their data platform. Data products can be distributed in multiple formats and are governed by robust access controls and transformation options.

What are Data Products?

Data products are curated, reusable datasets designed to be consumed by internal teams or by external customers and partners, who use them to generate insights.

Data products in Vendia are built on open data standards and support packaging and distribution across clouds, regions, and formats like Iceberg, Delta Share, and CSV, enabling Data Providers to meet their customers where they are.

Key Benefits

  • Democratization: Enables data teams to make their data estate accessible to consumers inside and outside the organization
  • Seamless Consumption: Data can be seamlessly consumed, produced, shared, and monetized across diverse business ecosystems
  • Innovation Driver: Unlocks innovation, drives efficiency, and helps maintain competitiveness

Best Practices for Creating Data Products

Follow these best practices to ensure your data products are valuable, compliant, and easily consumable by your intended audience.

Start with the Business Goal or End User in Mind

  • Capture end user goals
  • Understand the target data platform, data formats, and update frequency
  • Ensure data quality and implement cleansing procedures
  • Understand data sensitivity and privacy requirements for compliance

Data Product Design

  • Identify the data sources you need to import from to build your standard or bespoke data product
  • If you use a data catalog, look for datasets that have passed quality checks and are complete
  • Review metadata and documentation for comprehensiveness
  • Ensure your source data is accessible from a permissions perspective
  • Reuse existing data assets where possible instead of creating duplicates for each data product

Data Product Creation, Governance, and Deployment

  • Create data product versions for standard and bespoke data products
  • Tag data products with relevant metadata (e.g., sensitive data, PII)
  • Define appropriate data contracts to ensure smooth collaboration with your partners
  • Provide a clear and concise description for each data product
  • Include sample SQL queries to help consumers understand how to use the data
  • Supply a data dictionary to explain the schema and fields in your data product
  • Ensure your data products are available in the regions and open data formats (Iceberg, Delta Share, CSV) that your customers need
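
As an illustration of the checklist above, the following is a minimal Python sketch of a pre-publication check. It is not Vendia's API: the `DataProduct` and `Column` classes, their field names, and the `publishing_gaps` helper are hypothetical, and the example product, region name, and query are invented for illustration. The sketch only shows how a description, tags, a data dictionary, a sample SQL query, and the distribution formats could be tracked together and checked before sharing.

```python
from dataclasses import dataclass, field

# Formats named in this guide. Everything else in this sketch is hypothetical.
ALLOWED_FORMATS = {"Iceberg", "Delta Share", "CSV"}


@dataclass
class Column:
    name: str
    dtype: str
    description: str = ""   # data dictionary entry for this field
    tags: tuple = ()        # e.g. ("PII",) for sensitive fields


@dataclass
class DataProduct:
    name: str
    version: str
    description: str
    formats: tuple
    regions: tuple
    columns: list = field(default_factory=list)
    sample_queries: list = field(default_factory=list)

    def sensitive_columns(self):
        """Return the names of columns tagged as PII."""
        return [c.name for c in self.columns if "PII" in c.tags]

    def publishing_gaps(self):
        """List checklist items that are still missing for this product."""
        gaps = []
        if not self.description.strip():
            gaps.append("missing product description")
        if not self.sample_queries:
            gaps.append("no sample SQL query")
        for fmt in self.formats:
            if fmt not in ALLOWED_FORMATS:
                gaps.append(f"unsupported format {fmt!r}")
        for col in self.columns:
            if not col.description.strip():
                gaps.append(f"column {col.name!r} has no data dictionary entry")
        return gaps


# Invented example product; names and values are placeholders.
product = DataProduct(
    name="orders_summary",
    version="1.0.0",
    description="Daily order totals per region with customer emails masked.",
    formats=("Iceberg", "CSV"),
    regions=("us-east-1",),
    columns=[
        Column("order_date", "date", "Calendar day of the order"),
        Column("customer_email", "string", tags=("PII",)),
    ],
    sample_queries=[
        "SELECT order_date, COUNT(*) FROM orders_summary GROUP BY order_date"
    ],
)

gaps = product.publishing_gaps()
pii = product.sensitive_columns()
print(gaps)
print(pii)
```

Here the check flags `customer_email` because it is tagged as PII but has no data dictionary entry. A check like this could run before each new data product version is shared.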

Steps to Create a Data Product

  1. Ingest and Prepare Data

    • Ingest data from supported sources (e.g., Snowflake, Cloudera, MySQL, PostgreSQL, Amazon S3)
    • Apply filtering, masking, and joins as needed to curate your dataset
  2. Define the Data Product

    • Select one or more Vendia Tables containing the data you wish to share
    • Optionally, create new tables with additional transformations for your data product
    • Add metadata to help consumers understand the product’s content and purpose
  3. Choose Distribution Format(s)

    • Export your data product in one or more of the following formats:
      • Apache Iceberg
      • Delta Share
      • CSV
  4. Invite Consumers and Manage Access

    • Specify which partners or consumers can access your data product
    • Send email invitations; external users will receive instructions to access the data
    • Monitor usage
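
To make the curation and export steps above concrete, here is a minimal Python sketch that uses only the standard library. It does not call Vendia: the sample records, the `mask`, `curate`, and `to_csv` helpers, and the column names are all hypothetical. It assumes masking is done by replacing a sensitive value with a truncated SHA-256 digest, which is one possible approach and not necessarily how Vendia masks data.

```python
import csv
import hashlib
import io

# Invented source records standing in for two ingested tables.
orders = [
    {"order_id": 1, "customer_id": "c1", "region": "EU", "amount": 120.0},
    {"order_id": 2, "customer_id": "c2", "region": "US", "amount": 75.5},
    {"order_id": 3, "customer_id": "c1", "region": "EU", "amount": 30.0},
]
customers = {
    "c1": {"email": "ana@example.com", "segment": "enterprise"},
    "c2": {"email": "bo@example.com", "segment": "smb"},
}

FIELDNAMES = ["order_id", "segment", "customer_ref", "amount"]


def mask(value):
    """Replace a sensitive value with a short, stable SHA-256 digest.

    Unsalted hashing is pseudonymization, not anonymization; treat it
    as a placeholder for whatever masking policy applies to your data.
    """
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def curate(orders, customers, region):
    """Filter orders by region, join customer attributes, mask the email."""
    rows = []
    for order in orders:
        if order["region"] != region:                 # filter
            continue
        customer = customers[order["customer_id"]]    # join
        rows.append({
            "order_id": order["order_id"],
            "segment": customer["segment"],
            "customer_ref": mask(customer["email"]),  # mask
            "amount": order["amount"],
        })
    return rows


def to_csv(rows):
    """Serialize curated rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


curated = curate(orders, customers, region="EU")
csv_text = to_csv(curated)
print(csv_text)
```

The raw email never appears in the curated rows, while the stable digest still lets a consumer group orders by customer. Only the CSV format is shown here, since Iceberg and Delta Share output would need additional libraries.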

Example Use Cases

  • Partner Analytics: Share a curated, masked dataset with a partner for analytics in their preferred platform
  • Multi-format Distribution: Distribute a data product in multiple formats to meet the needs of different consumers
  • Data Democratization: Share data products with users outside Vendia

Learn More