Ingesting Data from Amazon S3

Connecting to Amazon S3

To connect to Amazon S3 and ingest CSV files, you need to provide the following information:

  • Name: A friendly name for the connection, so you can identify it and reuse it when ingesting additional files
  • Role ARN: The ARN of the AWS role that Vendia will assume to access your S3 bucket
  • S3 Bucket Name: The name of the S3 bucket containing your CSV file
  • Bucket Region: The AWS region where your S3 bucket is located (e.g., us-east-1)
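Before entering these values, it can help to check their format locally. The following is a minimal sketch, not part of Vendia's product or API: the `S3ConnectionSettings` class and its regular expressions are hypothetical, and the patterns only approximate the general AWS naming rules for role ARNs (standard `aws` partition only), bucket names, and region codes.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Loose format checks. These are illustrative assumptions, not Vendia's validation rules.
# The role ARN pattern covers the standard "aws" partition only.
ROLE_ARN_RE = re.compile(r"arn:aws:iam::[0-9]{12}:role/[\w+=,.@/-]+")
BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
REGION_RE = re.compile(r"[a-z]{2}(-[a-z]+)+-[0-9]")


@dataclass
class S3ConnectionSettings:
    """Hypothetical container for the four required fields plus the optional key."""
    name: str
    role_arn: str
    bucket_name: str
    bucket_region: str
    encryption_key: Optional[str] = None

    def problems(self) -> List[str]:
        """Return readable problems; an empty list means the formats look plausible."""
        found = []
        if not self.name.strip():
            found.append("Name must not be empty")
        if not ROLE_ARN_RE.fullmatch(self.role_arn):
            found.append("Role ARN should look like arn:aws:iam::<12-digit account>:role/<role name>")
        if not BUCKET_RE.fullmatch(self.bucket_name):
            found.append("S3 Bucket Name should be 3-63 lowercase letters, digits, dots, or hyphens")
        if not REGION_RE.fullmatch(self.bucket_region):
            found.append("Bucket Region should look like us-east-1")
        return found


settings = S3ConnectionSettings(
    name="Production Data Bucket",
    role_arn="arn:aws:iam::123456789012:role/VendiaS3AccessRole",
    bucket_name="my-company-data-bucket",
    bucket_region="us-east-1",
)
print(settings.problems())
```

A format check like this catches typos only. It cannot confirm that the role exists or that the bucket is in the stated region.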

Prerequisites

Before connecting, you must update your IAM role’s trust relationship to allow Vendia to access your S3 bucket.

Update Trust Relationship

Allow Vendia to access your S3 bucket by adding a trust relationship to your IAM role. This grants Vendia’s AWS accounts permission to assume your role and access the S3 bucket on your behalf.

Note: The actual Vendia AWS account numbers are provided within the product UI when setting up your S3 connection.

Follow these steps to update the trust relationship:

  1. Go to the AWS IAM console
  2. Find the role you’re using for Vendia access
  3. Click on the “Trust relationships” tab
  4. Click “Edit trust policy”
  5. Add or merge the trust relationship policy shown below (replace VENDIA_ACCOUNT_ID1 and VENDIA_ACCOUNT_ID2 with the account numbers from the product UI)
  6. Click “Update trust policy”

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::VENDIA_ACCOUNT_ID1:root",
          "arn:aws:iam::VENDIA_ACCOUNT_ID2:root"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
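If you prefer to generate the trust policy rather than edit the placeholders by hand, a small script can fill in the account numbers. This is a sketch under assumptions: `build_trust_policy` is a hypothetical helper, and the two account IDs in the usage lines are made-up placeholders, not real Vendia accounts. Use the values shown in the product UI.

```python
import json
import re
from typing import Dict, List


def build_trust_policy(vendia_account_ids: List[str]) -> Dict:
    """Build the trust policy document for the given 12-digit AWS account IDs."""
    for account_id in vendia_account_ids:
        # Reject unreplaced placeholders such as "VENDIA_ACCOUNT_ID1".
        if not re.fullmatch(r"[0-9]{12}", account_id):
            raise ValueError(f"Not a 12-digit AWS account ID: {account_id!r}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": [f"arn:aws:iam::{a}:root" for a in vendia_account_ids]
                },
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder IDs for illustration only.
policy = build_trust_policy(["111111111111", "222222222222"])
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the “Edit trust policy” editor. If the role already trusts other principals, merge the new statement into the existing policy instead of overwriting it.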

Required IAM Permissions

The IAM role must have the following permissions for the S3 bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}
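The same policy can be produced for a specific bucket with a few lines of Python. This is an illustrative sketch: the function name `build_s3_read_policy` is hypothetical, and the bucket name in the usage line is the example value used elsewhere on this page.

```python
import json
from typing import Dict


def build_s3_read_policy(bucket_name: str) -> Dict:
    """Build the read-only permissions policy shown above for one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                # s3:ListBucket applies to the bucket ARN,
                # s3:GetObject applies to the object ARNs under it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


print(json.dumps(build_s3_read_policy("my-company-data-bucket"), indent=2))
```

Both resource entries are needed: listing is a bucket-level action, while reading is an object-level action, so dropping either ARN leaves one of the two actions without a matching resource.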

Advanced Settings (Optional)

Client-Side Encryption (CSE)

If your S3 objects are encrypted using client-side encryption, you can provide the encryption master key:

  • Encryption Key (Optional): The encryption master key for client-side encryption (CSE)

Supported File Formats

Currently, Vendia supports ingesting CSV files from Amazon S3. The CSV files should follow standard formatting conventions:

  • Comma-separated values
  • Optional header row
  • UTF-8 encoding (recommended)
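To check that a file follows these conventions before uploading it, you can parse it locally with Python's standard `csv` module. This is a sketch of a local pre-check, not a description of how Vendia parses files: the `parse_csv` helper and the generated `column_N` names for headerless files are assumptions made for illustration.

```python
import csv
import io
from typing import List, Tuple


def parse_csv(data: bytes, has_header: bool = True) -> Tuple[List[str], List[List[str]]]:
    """Decode UTF-8 bytes and split them into (column names, data rows)."""
    # "utf-8-sig" decodes plain UTF-8 and also drops a leading byte order mark.
    text = data.decode("utf-8-sig")
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return [], []
    if has_header:
        return rows[0], rows[1:]
    # No header row: invent positional names (hypothetical naming scheme).
    names = [f"column_{i}" for i in range(1, len(rows[0]) + 1)]
    return names, rows


sample = b'id,name\r\n1,"Doe, John"\r\n2,Jane\r\n'
header, body = parse_csv(sample)
print(header)
print(body)
```

A file that is not valid UTF-8 raises `UnicodeDecodeError` at the `decode` call, which is a quick way to spot encoding problems on a small sample.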

Supported Data Types

Vendia supports the following data types for CSV columns:

| Data Type | Description | Example Values |
| --- | --- | --- |
| STRING | Text data of any length | “John Doe”, “Product Name” |
| INTEGER | 32-bit integer numbers | 123, -456, 0 |
| LONG | 64-bit integer numbers | 1234567890123, -987654321 |
| FLOAT | Floating-point decimal numbers | 3.14, -2.5, 1.23E+10 |
| BOOLEAN | True/false values | true, false, 1, 0 |
| DATE | Date values | 2023-01-18, 1/18/2023 |
| TIMESTAMP | Date and time values | 2024-06-08 17:28:00 |
| BINARY | Binary data encoded as base64 or hex | base64 encoded data |
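To see whether the values in a column fit one of these types, you can test them locally against the example formats in the table. The sketch below is an assumption-heavy approximation: Vendia's actual parsing rules are not specified here, the helper names are hypothetical, the accepted date and timestamp formats are only the ones shown as examples, and BINARY is checked as base64 only (hex is not handled).

```python
import base64
from datetime import datetime

INT32_RANGE = (-2**31, 2**31 - 1)
INT64_RANGE = (-2**63, 2**63 - 1)


def _bounded_int(raw, bounds):
    value = int(raw)
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{raw!r} does not fit in the range {low}..{high}")
    return value


def _parse_bool(raw):
    lowered = raw.strip().lower()
    if lowered in ("true", "1"):
        return True
    if lowered in ("false", "0"):
        return False
    raise ValueError(f"{raw!r} is not a recognized boolean")


def _parse_date(raw):
    # Only the two formats from the examples: 2023-01-18 and 1/18/2023.
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"{raw!r} is not a recognized date")


PARSERS = {
    "STRING": str,
    "INTEGER": lambda raw: _bounded_int(raw, INT32_RANGE),
    "LONG": lambda raw: _bounded_int(raw, INT64_RANGE),
    "FLOAT": float,
    "BOOLEAN": _parse_bool,
    "DATE": _parse_date,
    "TIMESTAMP": lambda raw: datetime.strptime(raw.strip(), "%Y-%m-%d %H:%M:%S"),
    "BINARY": lambda raw: base64.b64decode(raw, validate=True),
}


def fits_type(raw: str, type_name: str) -> bool:
    """Return True if the raw CSV value parses as the named type."""
    try:
        PARSERS[type_name](raw)
        return True
    except ValueError:
        return False


print(fits_type("1234567890123", "INTEGER"))  # too large for 32 bits
print(fits_type("1234567890123", "LONG"))
```

A check like this is mainly useful for choosing between INTEGER and LONG, or for spotting a date column that mixes formats, before you configure the ingestion.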

Example Configuration

Here’s an example of a typical S3 connection configuration:

| Field | Example Value |
| --- | --- |
| Name | Production Data Bucket |
| Role ARN | arn:aws:iam::123456789012:role/VendiaS3AccessRole |
| S3 Bucket Name | my-company-data-bucket |
| Bucket Region | us-east-1 |
| Encryption Key | (optional, only if using CSE) |

Best Practices

  • Use IAM roles instead of access keys for enhanced security
  • Apply the principle of least privilege when setting up IAM permissions
  • Ensure your S3 bucket and objects are accessible from the specified region
  • Consider using S3 bucket policies to further restrict access
  • Test the connection with a small sample file first

Troubleshooting

If you encounter connection issues:

  1. Verify the trust relationship includes the Vendia account IDs (provided in the product UI)
  2. Confirm the IAM role has the required S3 permissions
  3. Check that the bucket region matches the specified region
  4. Ensure the bucket name is spelled correctly
  5. Verify that the objects you want to ingest exist in the bucket
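The first check in this list can be scripted against a copy of the role's trust policy JSON. This is a rough sketch, not a full IAM policy evaluator: `missing_vendia_principals` is a hypothetical helper, it only recognizes statements that allow `sts:AssumeRole` by exact name, it ignores conditions, and the account IDs below are placeholders rather than real Vendia accounts.

```python
from typing import Dict, List


def missing_vendia_principals(trust_policy: Dict, account_ids: List[str]) -> List[str]:
    """Return the account IDs that the trust policy does not allow to assume the role."""
    trusted = set()
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):  # e.g. the wildcard "*"
            continue
        aws = principal.get("AWS", [])
        trusted.update([aws] if isinstance(aws, str) else aws)
    # A principal may be written as a root ARN or as a bare account ID.
    return [
        a for a in account_ids
        if a not in trusted and f"arn:aws:iam::{a}:root" not in trusted
    ]


current_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111111111111:root"},
            "Action": "sts:AssumeRole",
        }
    ],
}
# Placeholder IDs for illustration only.
print(missing_vendia_principals(current_policy, ["111111111111", "222222222222"]))
```

Any account ID in the returned list still needs to be added to the role's trust relationship.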

Next Steps

After successfully connecting to your S3 bucket, you can:

  • Browse and select specific CSV files to ingest
  • Configure CSV parsing options (headers, delimiters, etc.)
  • Set up data transformations and mappings