
Analytical Data Sharing FAQ

Why should I use Vendia’s analytical data sharing?

Vendia enables you to share analytical datasets (from small to terabyte scale) securely and with proper governance across organizational boundaries, clouds, and regions. Data teams, including data engineers and data product managers, can share raw, curated, and bespoke datasets with collaborators or consumers in a platform-agnostic way.

Common use cases include:

  • A data seller sharing a bespoke data product residing in Snowflake with a specific consumer, with certain fields redacted, for access exclusively through Databricks.
  • A retailer sharing raw data (with sensitive user information removed) with partners for data enrichment. Partners on Vendia receive updates in near-real time for analytics.
  • Data stewards and data governors of operating companies needing to ensure trusted, private, and controlled data sharing across organizational boundaries, while maintaining end-to-end data lineage for audit purposes.

Key features and benefits include:

  • No infrastructure provisioning or management: Share data products ready for querying in various formats without additional infrastructure. Data consumers can use their preferred query/processing engines.
  • Point-and-click sharing and collaboration: Data teams can share across multiple data platforms through Vendia’s intuitive interface.
  • Trust, control, and privacy: Apply granular access controls through RBAC and data filtering for sensitive information. Get end-to-end visibility of data lineage across organizational boundaries. All transactions are ledgered in Vendia’s proprietary blockchain for auditability.
  • Cross-platform compatibility: Meet partners and data consumers where they are. Distribute data products in various formats, allowing consumers to access data from their preferred platforms.

What ingestion sources are supported for analytical data sharing?

Currently, analytical data sharing supports the following ingestion sources, with more coming soon:

  • Snowflake
  • Cloudera CDP
  • PostgreSQL and PostgreSQL-compatible data sources, such as:
    • Amazon Redshift
    • Amazon RDS PostgreSQL
    • Amazon Aurora PostgreSQL
  • Vendia Tables (native Vendia data format based on Apache Iceberg)

What external destination formats are supported for analytical data sharing?

Vendia can create Data Products for downstream consumption by data consumers in the following formats:

  • Apache Iceberg
  • Delta Lake (Delta Share)
  • CSV

How does analytical data sharing with Vendia work?

Sharing data with your collaborators or data consumers is a straightforward process:

  1. Data Ingestion: First, your data must be ingested into Vendia. See the supported ingestion sources listed earlier in this FAQ.
  2. Data Processing: Once data is ingested, the platform applies granular access and governance policies to the data based on settings you define.
  3. Data Sharing/Distribution: Depending on your use case, your data will either be:
    • Shared with a collaboration partner that is also using the Vendia platform as Vendia Tables (based on Apache Iceberg)
      or
    • Distributed into endpoints using a specified data format that a data consumer can directly query via their preferred query engine or tools
  4. Access Management: Vendia handles the data mappings, transformations, security, and access management between the parties, so that trust, control, and privacy are preserved throughout.

This approach ensures that you maintain full control over your data while enabling secure sharing across organizational boundaries.

What types of data transformations are supported for analytical data sharing?

Vendia provides compute capabilities to transform and prepare your data for data sharing and collaboration. Data teams can perform the following without writing any additional code:

  • Data Filtering: Apply row-based filters and entitlements to limit data or create tenant-specific bespoke datasets.
  • Dynamic Data Masking: Apply masking policies to mask or redact sensitive data (PII, confidential data) before sharing with partners, helping you meet geography-based compliance and data privacy requirements.
  • Data Joins: Combine multiple datasets from different sources to create meaningful, curated datasets before sharing data with partners.

These capabilities allow you to create precise data products that meet your partners’ needs while maintaining data governance requirements.
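
To make the three transformation types above concrete, here is a minimal conceptual sketch in plain Python. It does not use any Vendia API (Vendia applies these transformations without code); the sample records, field names, and helper functions are all hypothetical and exist only to show what a row filter, a column mask, and a join do to a dataset.

```python
# Hypothetical sample data; field names are illustrative only.
orders = [
    {"order_id": 1, "customer_id": "c1", "region": "EU", "amount": 120.0},
    {"order_id": 2, "customer_id": "c2", "region": "US", "amount": 75.5},
    {"order_id": 3, "customer_id": "c1", "region": "EU", "amount": 40.0},
]
customers = [
    {"customer_id": "c1", "email": "ana@example.com", "segment": "retail"},
    {"customer_id": "c2", "email": "bob@example.com", "segment": "wholesale"},
]


def filter_rows(rows, predicate):
    """Row-based filtering: keep only the rows the predicate allows."""
    return [dict(row) for row in rows if predicate(row)]


def mask_columns(rows, columns, mask="***"):
    """Column masking: replace the values of sensitive columns with a mask."""
    return [
        {key: (mask if key in columns else value) for key, value in row.items()}
        for row in rows
    ]


def join_rows(left, right, key):
    """Inner join of two datasets on a shared key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Build a partner-specific dataset: EU orders only, enriched with customer
# attributes, with the email column redacted before sharing.
eu_orders = filter_rows(orders, lambda row: row["region"] == "EU")
joined = join_rows(eu_orders, customers, "customer_id")
shared = mask_columns(joined, {"email"})
```

In this sketch the filter runs first so that only the rows a partner is entitled to are ever joined, and the mask runs last so that no unredacted copy of the sensitive column leaves the pipeline.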

How do I share data with collaboration partners within Vendia and why would I want to do so?

Sharing data with collaboration partners on Vendia supports data enrichment and data cleanroom scenarios across disparate source systems. This involves the following steps:

  1. Create a shared table: Define what data you want to share in the shared table by:

    • Selecting your source system (one of the supported ingestion sources)
    • Applying data filtering and masking policies if needed
    • Defining a schedule for updates
  2. Define access controls: Specify access controls and permissions for partners and collaborators in your multi-party sharing ecosystem.

Collaboration partners who receive the shared table(s) can create derived tables or their own data products from the shared table(s), allowing them to use the data in their own environments while you maintain control over the originally shared table.

How do I share or distribute data with consumers outside Vendia?

Vendia’s key value proposition for sharing analytical data is helping organizations achieve data democratization. Data providers can share meaningful, curated data products with consumers who are not on Vendia. This involves the following steps:

  1. Create a data product: Define a meaningful, curated data product by:

    • Selecting Vendia Tables which have necessary filtering/masking applied, or creating new Vendia Tables with the required filtering/masking (Note: a data product can comprise multiple Vendia Tables)
    • Defining the update schedule for your data product
    • Adding metadata to help your partners understand the purpose and content of your data product
  2. Select the destination data format(s): You can create the data product in formats that your consumers prefer:

    • Apache Iceberg
    • Delta Share
    • CSV
    • Any combination of the above (for partners requiring access via multiple data formats)
  3. Invite partner(s): Specify which external partners can receive an email invitation to access your data product(s).

Vendia streamlines this process through a user-friendly interface that guides you through each step while enforcing your organization’s data governance policies.

I am a data provider and want to share data with a consumer on a different data platform. Do I need to have an account on the destination platform?

No, as a data provider using Vendia’s analytical data sharing solution, you do not need to have an account on your consumer’s preferred destination platform. After ingesting your data into Vendia, you can share it with consumers regardless of what platform they use. You maintain control over your data through Vendia’s platform while your consumers can access it through their preferred query engine or tools.

Vendia’s platform enables you to share meaningful, curated data products with consumers who are not on Vendia, facilitating true data democratization.

I am a data consumer and not on Vendia. How do I access data shared with me?

As a data consumer who is not on Vendia, you will receive an email invitation from your data provider to access the data product(s) being shared with you. When you accept the invitation, a Vendia account is created for you automatically, enabling you to view the shared data product(s).

You will be guided through a set of steps and instructions to access your data product from your preferred platform or query engine (e.g., Snowflake, Databricks, Redshift, BigQuery, etc.).
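
As one illustration of the consumer side, the sketch below reads a data product delivered as CSV using only Python's standard library. It is not Vendia's access procedure, which is provided in the guided steps; the column names are hypothetical, and an in-memory string stands in for the actual file or endpoint you would be given.

```python
import csv
import io

# Stand-in for a CSV data product. In practice this would be a file or
# endpoint supplied through the provider's invitation; the columns are
# hypothetical.
sample_csv = io.StringIO(
    "order_id,region,amount\n"
    "1,EU,120.0\n"
    "3,EU,40.0\n"
)


def total_by_region(handle):
    """Sum the amount column per region from a CSV file-like object."""
    totals = {}
    for row in csv.DictReader(handle):
        totals[row["region"]] = totals.get(row["region"], 0.0) + float(row["amount"])
    return totals


totals = total_by_region(sample_csv)
```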

What security features are supported for analytical data sharing?

Vendia’s analytical data sharing solution supports the following security controls:

  • SOC 2 compliance
  • Data encryption at rest
  • Data encryption in transit
  • Column-level masking and filtering
  • Row-level masking and filtering
  • Data access control via RBAC and sharing policies
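
For readers unfamiliar with RBAC, the following is a minimal conceptual sketch of a role-based permission check in Python. The role names, actions, and the is_allowed helper are hypothetical and do not reflect Vendia's actual roles or API; the point is only that permissions attach to roles, and a request is allowed when the caller's role includes the requested action.

```python
# Hypothetical roles and the actions each role may perform.
ROLE_PERMISSIONS = {
    "data_steward": {"read", "share", "manage_policies"},
    "partner_analyst": {"read"},
}


def is_allowed(role, action):
    """Return True only if the role is known and grants the action."""
    return action in ROLE_PERMISSIONS.get(role, set())
```

Unknown roles fall through to an empty permission set, so the check denies by default.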

Additional Resources

For more information about Vendia’s analytical data sharing capabilities, please refer to our documentation or contact Vendia Support with specific questions about your use case.