Overview

The Data Connectors component provides custom Airbyte connectors that extract data from 9 different e-commerce platforms, enabling Trendteller to aggregate data from 11 brands into a unified analytics platform.

Technology Stack

Airbyte CDK

Connector Development Kit for building integrations

TypeScript

Type-safe connector development

Lerna

Monorepo management for multiple connectors

Jest

Comprehensive testing framework

Connector Architecture

Monorepo Structure

The connectors are organized in a Lerna monorepo:
airbyte-connectors/
├── sources/
│   ├── source-bling/         # Bling ERP connector
│   ├── source-bling-v3/      # Bling API v3
│   ├── source-vnda/          # VNDA platform
│   ├── source-vnda-moda/     # VNDA fashion variant
│   ├── source-shoppub/       # Shoppub marketplace
│   ├── source-tiny/          # Tiny ERP
│   ├── source-tiny-v3/       # Tiny API v3
│   ├── source-microvix/      # Microvix retail
│   ├── source-braavo/        # Braavo e-commerce
│   ├── source-totvs-moda/    # Totvs fashion ERP
│   ├── source-varejo-online/ # Varejo Online
│   ├── source-google-shopping/ # Google Shopping
│   └── ...
├── destinations/
│   ├── destination-crm360/   # CRM360 integration
│   └── destination-shops/    # Shops destination
└── shared/
    └── common-utils/         # Shared utilities

Source Connectors

Available Sources

Sources fall into three groups: ERP systems, e-commerce platforms, and other integrations. The ERP system connectors are:
Bling (source-bling, source-bling-v3)
  • Orders, products, customers, inventory
  • Invoices, payments, shipping
  • Incremental sync support
Tiny (source-tiny, source-tiny-v3)
  • Complete ERP data extraction
  • Multi-entity support
  • Real-time inventory updates
Totvs Moda (source-totvs-moda)
  • Fashion industry specific
  • Size/color variant handling
  • Collection management

Sync Modes

Full Refresh
When to use: Small datasets, or sources without incremental support
  • Replaces all existing data
  • Ensures consistency
  • Higher resource usage
Example: Product catalogs, brand configurations

Incremental
When to use: Large datasets with timestamp fields
  • Only syncs new/modified records
  • Uses cursor field (e.g., updated_at)
  • Transfers less data than a full refresh
Example: Orders, customer updates, inventory changes

Change Data Capture (CDC)
When to use: Real-time requirements
  • Tracks all changes at source
  • Minimal latency
  • Requires source support
Example: Real-time order processing
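
The cursor-based sync described above can be sketched as follows. This is a minimal illustration, not the connectors' actual code: it assumes records carry an ISO 8601 UTC `updated_at` string, and `OrderRecord`, `StreamState`, and `readIncremental` are hypothetical names.

```typescript
// Hypothetical record shape: only the cursor field matters for this sketch.
interface OrderRecord {
  id: string
  updated_at: string // ISO 8601 UTC timestamp, assumed to sort correctly as a string
}

// Hypothetical state persisted between syncs; null means "never synced".
interface StreamState {
  cursor: string | null
}

// Returns the records newer than the saved cursor, plus the state for the next sync.
function readIncremental(
  records: OrderRecord[],
  state: StreamState
): { emitted: OrderRecord[]; nextState: StreamState } {
  const cursor = state.cursor
  const emitted = records.filter((r) => cursor === null || r.updated_at > cursor)

  // Advance the cursor to the newest timestamp seen; keep the old one if nothing was emitted.
  let nextCursor = cursor
  for (const r of emitted) {
    if (nextCursor === null || r.updated_at > nextCursor) {
      nextCursor = r.updated_at
    }
  }
  return { emitted, nextState: { cursor: nextCursor } }
}
```

A first run with a `null` cursor behaves like a full refresh; later runs emit only records whose `updated_at` is strictly greater than the stored cursor.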

Destination Connectors

Available Destinations

CRM360

destination-crm360: Syncs customer and order data to CRM360 for marketing campaigns and customer engagement.

Shops

destination-shops: Pushes consolidated product and inventory data to the Shops platform for multi-channel selling.

Connector Development

Creating a New Connector

1. Generate Connector Scaffold

cd airbyte-connectors
lerna create source-new-platform sources/
2. Implement Source Interface

import { AirbyteSourceRunner } from '@airbyte/cdk'

export class SourceNewPlatform {
  async spec() {
    // Define configuration schema
  }

  async check(config) {
    // Test connection
  }

  async discover(config) {
    // Return available streams
  }

  async read(config, catalog, state) {
    // Extract data
  }
}
3. Add Stream Definitions

const streams = {
  orders: {
    name: 'orders',
    json_schema: OrderSchema,
    supported_sync_modes: ['full_refresh', 'incremental'],
    source_defined_cursor: true,
    default_cursor_field: ['updated_at']
  }
}
4. Write Tests

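// The import path is an assumption; adjust it to the connector's layout
import { SourceNewPlatform } from '../src/index'

describe('SourceNewPlatform', () => {
  const source = new SourceNewPlatform()
  // Placeholder: load real credentials from an untracked file such as secrets/config.json
  const config = {}

  it('should connect successfully', async () => {
    const result = await source.check(config)
    expect(result.status).toBe('SUCCEEDED')
  })
})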
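
The test in step 4 expects `check` to resolve to an object with a `status` field. One way to satisfy it is sketched below, assuming the API call is injected so it can be stubbed in tests. `PlatformConfig`, its `api_key` field, `checkConnection`, and `ping` are hypothetical names; the `SUCCEEDED` and `FAILED` values follow the Airbyte protocol's connection status.

```typescript
// Shape of the result the test asserts on.
interface ConnectionStatus {
  status: 'SUCCEEDED' | 'FAILED'
  message?: string
}

// Hypothetical configuration; a real connector defines its fields in spec().
interface PlatformConfig {
  api_key: string
}

// `ping` stands in for a cheap authenticated API call, such as fetching a single record.
async function checkConnection(
  config: PlatformConfig,
  ping: (apiKey: string) => Promise<void>
): Promise<ConnectionStatus> {
  if (config.api_key.trim() === '') {
    return { status: 'FAILED', message: 'api_key is required' }
  }
  try {
    await ping(config.api_key)
    return { status: 'SUCCEEDED' }
  } catch (err) {
    // Report the failure instead of throwing, so the caller gets a usable message.
    const reason = err instanceof Error ? err.message : String(err)
    return { status: 'FAILED', message: `Connection check failed: ${reason}` }
  }
}
```

Returning `FAILED` with a message rather than throwing lets the caller show the reason to whoever is configuring the connector.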

Stream Configuration

Each stream defines:
  • Schema: JSON schema for data validation
  • Sync modes: Supported synchronization methods
  • Cursor field: Field for incremental sync
  • Primary key: Unique identifier(s)
  • Partitioning: How data is divided for sync
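
The properties above can be captured in a typed definition with a small consistency check. This is a sketch under assumptions: the field names follow the Airbyte protocol's stream object, while `StreamDefinition` and `validateStream` are illustrative names, and partitioning is left out because its shape depends on the source API.

```typescript
type SyncMode = 'full_refresh' | 'incremental'

// Field names follow the Airbyte protocol's stream object; the interface name is illustrative.
interface StreamDefinition {
  name: string
  json_schema: Record<string, unknown>
  supported_sync_modes: SyncMode[]
  source_defined_cursor?: boolean
  default_cursor_field?: string[]
  source_defined_primary_key?: string[][]
}

// Returns a list of problems; an empty list means the definition is consistent.
function validateStream(stream: StreamDefinition): string[] {
  const problems: string[] = []
  if (stream.supported_sync_modes.length === 0) {
    problems.push(`${stream.name}: at least one sync mode is required`)
  }
  const incremental = stream.supported_sync_modes.includes('incremental')
  if (incremental && (stream.default_cursor_field ?? []).length === 0) {
    problems.push(`${stream.name}: incremental sync needs a cursor field`)
  }
  return problems
}

// Example definition with a primary key; the schema is a placeholder.
const ordersStream: StreamDefinition = {
  name: 'orders',
  json_schema: { type: 'object' },
  supported_sync_modes: ['full_refresh', 'incremental'],
  source_defined_cursor: true,
  default_cursor_field: ['updated_at'],
  source_defined_primary_key: [['id']]
}
```

Running `validateStream` over every stream in a unit test catches an incremental stream that was declared without a cursor before it reaches a real sync.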

Testing Strategy

Test Levels

Tests are organized in three levels: unit, integration, and acceptance tests. Unit tests cover individual functions and utilities:
  • API request formatting
  • Response parsing
  • Data transformation
  • Error handling
npm run test:unit -- source-bling

Test Configuration

# acceptance-test-config.yml
connector_image: airbyte/source-bling:dev
tests:
  spec:
    - spec_path: "source_bling/spec.json"
  connection:
    - config_path: "secrets/config.json"
  discovery:
    - config_path: "secrets/config.json"
  basic_read:
    - config_path: "secrets/config.json"
      configured_catalog_path: "integration_tests/configured_catalog.json"

Error Handling

Retry Strategies

Transient errors are retried with exponential backoff and jitter:
  • Network timeouts
  • Rate limit errors (429)
  • Server errors (5xx)
const retry = exponentialBackoff({
  maxRetries: 5,
  initialDelay: 1000,
  maxDelay: 30000
})
Errors that a retry cannot fix fail fast and are logged:
  • Authentication failures (401)
  • Invalid configuration (400)
  • Resource not found (404)
These errors require user intervention to resolve.
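
A helper with the same options as above could look like the sketch below. It is an assumption about how such a helper might work, not the connectors' actual `exponentialBackoff`: `withRetry`, `isRetryable`, `statusOf`, and `backoffDelay` are hypothetical names, errors are assumed to expose an HTTP `status` property, and the jitter variant is "full jitter" (a random delay between zero and the capped exponential delay).

```typescript
interface RetryOptions {
  maxRetries: number
  initialDelay: number // milliseconds
  maxDelay: number // milliseconds
}

// Reads an HTTP status from an error object, if it has one (assumed error shape).
function statusOf(err: unknown): number | undefined {
  if (typeof err === 'object' && err !== null && 'status' in err) {
    const status = (err as { status: unknown }).status
    return typeof status === 'number' ? status : undefined
  }
  return undefined
}

// 429 and 5xx are retried. Errors without a status are treated as network failures
// and retried too, which is a simplification.
function isRetryable(err: unknown): boolean {
  const status = statusOf(err)
  return status === undefined || status === 429 || status >= 500
}

// Full jitter: a random delay between 0 and min(maxDelay, initialDelay * 2^attempt).
function backoffDelay(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random
): number {
  const capped = Math.min(opts.maxDelay, opts.initialDelay * 2 ** attempt)
  return Math.floor(random() * capped)
}

// Calls fn up to maxRetries + 1 times; non-retryable errors are rethrown immediately.
async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (err) {
      if (!isRetryable(err) || attempt >= opts.maxRetries) throw err
      await sleep(backoffDelay(attempt, opts))
    }
  }
}
```

The `random` and `sleep` parameters exist only so tests can run deterministically and without real delays.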

Logging

Comprehensive logging for debugging:
  • Request/response logging (sanitized)
  • Sync progress and statistics
  • Error details with context
  • Performance metrics
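
Sanitizing request and response logs usually means redacting credentials before anything is written. A minimal sketch follows; the list of sensitive key names is an assumption and should be matched to the real configuration fields, and `sanitize` is a hypothetical helper.

```typescript
// Key fragments treated as sensitive; this list is an assumption, not taken from the connectors.
const SENSITIVE_KEYS = ['authorization', 'api_key', 'apikey', 'token', 'password', 'secret']

function isSensitive(key: string): boolean {
  const lower = key.toLowerCase()
  return SENSITIVE_KEYS.some((fragment) => lower.includes(fragment))
}

// Recursively copies a value, replacing sensitive fields with a placeholder.
// The input is never mutated, so the original request can still be sent.
function sanitize(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => sanitize(item))
  }
  if (typeof value === 'object' && value !== null) {
    const out: Record<string, unknown> = {}
    for (const [key, inner] of Object.entries(value)) {
      out[key] = isSensitive(key) ? '[REDACTED]' : sanitize(inner)
    }
    return out
  }
  return value
}
```

Calling `sanitize` on the request object just before logging keeps secrets out of sync logs without touching the request that is actually sent.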

Performance Optimization

Batching

Fetch multiple records per API call to reduce overhead

Pagination

Efficiently handle large datasets with cursor-based pagination

Parallel Streams

Sync independent streams concurrently

Caching

Cache API responses for repeated requests
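
Batching and cursor-based pagination can be combined in an async generator, sketched below. The page shape (`items` plus `nextCursor`) is an assumption, since each platform's API names these differently; `Page`, `FetchPage`, and `paginate` are hypothetical names.

```typescript
// Hypothetical page shape: a batch of items plus a cursor for the next page.
interface Page<T> {
  items: T[]
  nextCursor: string | null // null when there are no more pages
}

// Function that fetches one page; injected so it can wrap any HTTP client.
type FetchPage<T> = (cursor: string | null, limit: number) => Promise<Page<T>>

// Yields records one at a time while requesting them in batches of `limit`.
async function* paginate<T>(fetchPage: FetchPage<T>, limit = 100): AsyncGenerator<T> {
  let cursor: string | null = null
  do {
    const page: Page<T> = await fetchPage(cursor, limit)
    yield* page.items
    cursor = page.nextCursor
  } while (cursor !== null)
}
```

A caller consumes it with `for await (const record of paginate(fetchOrders)) { ... }`, where `fetchOrders` is whatever function wraps the platform's list endpoint. Only one page is held in memory at a time.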

Deployment

Building Connectors

# Build all connectors
lerna run build

# Build specific connector
lerna run build --scope=source-bling

Docker Images

Connectors are packaged as Docker images:
FROM airbyte/integration-base:latest

COPY source-bling /airbyte/integration_code
# Install dependencies inside the copied connector directory
WORKDIR /airbyte/integration_code
RUN npm install

ENTRYPOINT ["node", "/airbyte/integration_code/main.js"]

Deployment Targets

Connectors can be deployed to:
  • Airbyte Cloud: Hosted Airbyte service
  • Self-hosted: On-premise Airbyte instance
  • Kubernetes: Scalable container orchestration

Monitoring

Sync Metrics

Track connector performance:
  • Sync duration: Time to complete sync
  • Records synced: Number of records extracted
  • Data volume: Bytes transferred
  • Error rate: Failed syncs percentage
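
These four metrics can be derived from per-sync results. The sketch below assumes each sync reports its duration, record count, byte count, and outcome; `SyncResult`, `SyncSummary`, and `summarize` are hypothetical names, and averaging the duration is one choice among several.

```typescript
// Hypothetical per-sync result as a scheduler or log parser might report it.
interface SyncResult {
  durationMs: number
  records: number
  bytes: number
  succeeded: boolean
}

interface SyncSummary {
  totalSyncs: number
  avgDurationMs: number
  recordsSynced: number
  bytesTransferred: number
  errorRate: number // percentage of failed syncs, 0 to 100
}

function summarize(results: SyncResult[]): SyncSummary {
  const total = results.length
  if (total === 0) {
    // Avoid dividing by zero when no syncs have run yet.
    return { totalSyncs: 0, avgDurationMs: 0, recordsSynced: 0, bytesTransferred: 0, errorRate: 0 }
  }
  const sum = (pick: (r: SyncResult) => number): number =>
    results.reduce((acc, r) => acc + pick(r), 0)
  const failed = results.filter((r) => !r.succeeded).length
  return {
    totalSyncs: total,
    avgDurationMs: sum((r) => r.durationMs) / total,
    recordsSynced: sum((r) => r.records),
    bytesTransferred: sum((r) => r.bytes),
    errorRate: (failed / total) * 100
  }
}
```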

Health Checks

Automated monitoring:
  • Connection health (daily checks)
  • API quota usage
  • Sync schedule adherence
  • Data freshness alerts
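
A data freshness alert reduces to comparing each stream's last successful sync against a maximum age. The sketch below assumes the last-sync times are already available as a map; `LastSyncTimes` and `staleStreams` are hypothetical names.

```typescript
// Maps each stream name to the time of its last successful sync.
type LastSyncTimes = Record<string, Date>

// Returns the names of streams whose last successful sync is older than maxAgeMs, sorted by name.
function staleStreams(
  lastSync: LastSyncTimes,
  maxAgeMs: number,
  now: Date = new Date()
): string[] {
  return Object.entries(lastSync)
    .filter(([, syncedAt]) => now.getTime() - syncedAt.getTime() > maxAgeMs)
    .map(([name]) => name)
    .sort()
}
```

Passing `now` explicitly keeps the check deterministic in tests; in production the default of the current time applies.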
