S3

How to configure and use S3 storage with Datograde

You can use Datograde to upload and manage files using Amazon S3 or compatible storage services.

Typical things to store include:

  • Large datasets and files
  • Training data for AI models
  • Documents, images, and other media files
  • Backup files and archives

Configuring S3 Storage

Before you can start uploading files, you'll need to configure your S3 credentials.

Setting up S3 Configuration

  1. Navigate to the S3 Integration page in your account settings
  2. Fill in the required fields:
    • Bucket Name: Your S3 bucket name
    • Region: The AWS region (e.g., us-east-1)
    • Endpoint URL: Your S3 endpoint (e.g., https://s3.amazonaws.com)
    • Access Key ID: Your AWS access key
    • Secret Key: Your AWS secret key
  3. Click Save Configuration to store your credentials

Once configured, you'll see a green checkmark indicating that S3 is properly connected. You can edit these settings anytime by clicking Edit Configuration.
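If you keep these values in a script or a secrets manager before pasting them into the form, a quick sanity check can catch typos early. The sketch below is only an illustration: `S3Config` and `validate` are hypothetical names, not part of any Datograde API, and the checks are assumptions based on common S3 conventions (for example, bucket names of 3 to 63 lowercase characters), not on what Datograde itself validates.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse


# Hypothetical container mirroring the form fields above.
@dataclass(frozen=True)
class S3Config:
    bucket_name: str
    region: str
    endpoint_url: str
    access_key_id: str
    secret_key: str


# Common S3 bucket naming shape: 3-63 characters, lowercase letters, digits,
# dots and hyphens, starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def validate(config: S3Config) -> list[str]:
    """Return a list of problems; an empty list means the fields look plausible."""
    problems = []
    if not _BUCKET_RE.match(config.bucket_name):
        problems.append("bucket_name: expected 3-63 lowercase letters, digits, dots or hyphens")
    if not config.region.strip():
        problems.append("region: must not be empty (e.g., us-east-1)")
    parsed = urlparse(config.endpoint_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("endpoint_url: must be a full http(s) URL")
    if not config.access_key_id or not config.secret_key:
        problems.append("credentials: access key ID and secret key are both required")
    return problems


if __name__ == "__main__":
    config = S3Config(
        bucket_name="my-bucket",
        region="us-east-1",
        endpoint_url="https://s3.amazonaws.com",
        access_key_id="AKIAEXAMPLE",      # placeholder, not a real key
        secret_key="example-secret",      # placeholder, not a real key
    )
    print(validate(config) or "configuration looks plausible")
```

A check like this only confirms that the fields are well-formed. Whether the credentials actually grant access to the bucket is confirmed by the green checkmark after saving.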

Uploading Files

Single Upload

  1. Locate the File Upload section below the S3 configuration
  2. Click Choose Files or drag files into the upload area
  3. Select one or multiple files from your computer
  4. Click Upload Files to start the upload
  5. Wait for the confirmation message indicating successful upload

'Magic' S3 Collections

When you create a new upload in S3, it's identified by a 'Batch ID'. Multiple files in the same upload are grouped together under the same Batch ID. You can add more files to the same batch by clicking + File next to any batch.

When you create a new batch, Datograde will automatically create a new Collection for you, synced to all the files in the batch. Each file in the batch will be an Entry in the Collection.
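The relationship between batches, Collections, and Entries can be pictured with a small sketch. This is a conceptual model only: the `Collection` class and `build_collections` function are hypothetical names chosen for illustration and do not reflect how Datograde implements the sync internally.

```python
from dataclasses import dataclass, field


# Hypothetical model: one Collection per Batch ID, one Entry per file.
@dataclass
class Collection:
    batch_id: str
    entries: list[str] = field(default_factory=list)


def build_collections(uploads):
    """Group (batch_id, file_name) pairs into one Collection per batch.

    Files that share a Batch ID end up as Entries of the same Collection,
    in the order they were uploaded.
    """
    collections: dict[str, Collection] = {}
    for batch_id, file_name in uploads:
        collection = collections.setdefault(batch_id, Collection(batch_id))
        collection.entries.append(file_name)
    return collections


if __name__ == "__main__":
    uploads = [
        ("batch-001", "train.csv"),
        ("batch-001", "labels.csv"),
        ("batch-002", "scan.png"),
        ("batch-001", "extra.csv"),  # added later via "+ File"
    ]
    for collection in build_collections(uploads).values():
        print(collection.batch_id, collection.entries)
```

Adding a file to an existing batch with + File corresponds to appending another pair with the same Batch ID: the Collection gains an Entry and no new Collection is created.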

You can also get the same behavior by creating a new Collection that points to an S3 folder or key.

Let's say you have a folder in S3 called incoming-emails. When you create a Collection with this folder as its data source, Datograde lets you view, edit, and manage all the files in that folder. Each file directly inside the folder becomes an Entry in the Collection, and so does each subfolder. For a file Entry, the File column contains just that one file; for a subfolder Entry, the File column contains all the files in the subfolder.
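To make the folder-to-Entry mapping concrete, here is a minimal sketch that takes a list of S3 object keys and groups them the way the paragraph above describes. The function name `entries_for_folder` is hypothetical, and the exact rules are assumptions (for instance, that nested subfolders roll up into their top-level subfolder); Datograde's actual behavior may differ.

```python
def entries_for_folder(keys, folder):
    """Map object keys under `folder` to Entries.

    Returns a dict of entry name -> list of keys:
      - a file directly inside the folder becomes an Entry holding one key
      - a subfolder becomes an Entry (name ends with "/") holding every
        key beneath it
    """
    prefix = folder.rstrip("/") + "/"
    entries: dict[str, list[str]] = {}
    for key in keys:
        # Skip keys outside the folder and zero-byte "folder marker" objects.
        if not key.startswith(prefix) or key.endswith("/"):
            continue
        relative = key[len(prefix):]
        head, separator, _rest = relative.partition("/")
        # A separator means the key lives inside a subfolder named `head`.
        name = head + "/" if separator else head
        entries.setdefault(name, []).append(key)
    return entries


if __name__ == "__main__":
    keys = [
        "incoming-emails/welcome.eml",
        "incoming-emails/2024/invoice.eml",
        "incoming-emails/2024/receipt.eml",
        "other-folder/notes.txt",
    ]
    for name, files in entries_for_folder(keys, "incoming-emails").items():
        print(name, files)
```

In this example the folder yields two Entries: `welcome.eml` with a single file, and `2024/` with both files it contains.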

You can press + Add to add new files to a subfolder, or + New Entry to add a new file to the folder.

Managing Upload History

Your uploads are organized in batches for easy management:

  • View all your uploads in the Upload History table
  • Each batch shows:
    • Batch ID for reference
    • List of files in the batch
    • Upload date and time
  • Click any file name to open it directly in your browser
  • Add more files to an existing batch by clicking + File next to any batch

Adding Files to Existing Batches

  1. Find the batch you want to add files to in the Upload History
  2. Click the + File button next to the batch
  3. Select additional files in the popup window
  4. Click Upload Files to add them to the existing batch

Best Practices

  • Keep your S3 credentials secure and never share them
  • Use meaningful file names to easily identify your uploads
  • Organize related files in the same batch for better management
  • Regularly review your upload history to manage storage
