Axelered AI

Connectors (Crawlers)

Automatically import documents from external sources into your collections using S3, Web, Google Drive, or FTP/SFTP crawlers.

Beyond direct uploads, documents can be automatically fetched from external sources using Connectors (also called Crawlers). Connectors link your collection to remote file systems and websites, polling for new or updated files on a schedule you define.

 ┌──────────────┐    ┌──────────────────┐    ┌──────────────────┐
 │  External    │───►│   Crawler        │───►│   Collection     │
 │  Source      │    │   Worker         │    │   Documents      │
 │              │    │                  │    │                  │
 │  • S3 Bucket │    │  1. List source  │    │  Ingested &      │
 │  • Web Page  │    │  2. Download new │    │  indexed for     │
 │  • Google Dr │    │  3. Upload to    │    │  search/chat     │
 │  • FTP/SFTP  │    │     collection   │    │                  │
 └──────────────┘    └──────────────────┘    └──────────────────┘

Every file discovered by a connector goes through the same ingestion pipeline (parse → chunk → embed) as a manually uploaded document, and then appears in your collection like any other document.
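The "list source, download new" steps in the diagram amount to comparing the current source listing against what has already been imported. The sketch below illustrates that idea only; it is not the crawler's actual implementation, and how the service detects updated (as opposed to new) files is not described here. The function name `new_files` and the one-path-per-line file format are assumptions made for illustration.

```shell
# Hypothetical sketch: print paths that appear in the source listing
# but not in the list of already-imported paths.
#   $1: file with the current source listing, one path per line
#   $2: file with paths already imported, one path per line
new_files() {
  awk -v seenfile="$2" '
    BEGIN {
      # Load every already-imported path into a lookup table.
      while ((getline line < seenfile) > 0) seen[line] = 1
    }
    # Print only listing entries that are not in the lookup table.
    !($0 in seen)
  ' "$1"
}
```

Calling `new_files listing.txt imported.txt` prints only the paths that still need to be downloaded and uploaded to the collection.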


📂 Supported Source Types

Type          Source                              Authentication               Description
S3            S3-compatible bucket (AWS, MinIO)   Access key / secret in URL   Sync files from a bucket prefix.
WEB           Website URL                         None / Basic / Bearer        Crawl and index web pages recursively.
GOOGLE_DRIVE  Google Drive folder                 OAuth 2.0                    Watch a shared folder for new files.
FTP           FTP server                          Username / password          Download files from a remote directory.
SFTP          SFTP (SSH) server                   Username / password          Securely download files over SSH.

🛠️ Connector Management

Connectors are managed at the collection level. For a detailed technical reference of every field and parameter, see the API Connector Reference.

Create a Connector

Each connector links your collection to an external data source. You can define connectors for any of the supported source types: S3, Web, Google Drive, FTP, and SFTP.

# Example: Creating an S3 Connector
curl -X POST "https://api.axelered.com/v1/w/{workspace_id}/col/{collection_id}/crawls" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "crawlType": "S3",
    "url": "s3://admin:password@minio.local:9000/my-bucket/data",
    "cronSchedule": "0 */6 * * *"
  }'
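When the request body is assembled in a script, quoting mistakes in the JSON are easy to make. The helper below is a minimal sketch that builds the same body as the example above from three arguments. The field names (`crawlType`, `url`, `cronSchedule`) come from that example; the function names `json_escape` and `connector_payload` are hypothetical, and the escaping covers only backslashes and double quotes, not control characters.

```shell
# Hypothetical helper: escape backslashes and double quotes so a value
# can be embedded in a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: build the create-connector request body.
#   $1: crawl type (e.g. S3), $2: source URL, $3: cron schedule
connector_payload() {
  printf '{"crawlType":"%s","url":"%s","cronSchedule":"%s"}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "$3")"
}

# Usage (replace the {placeholders} and API key before running):
# curl -X POST "https://api.axelered.com/v1/w/{workspace_id}/col/{collection_id}/crawls" \
#   -H "Authorization: Bearer YOUR_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d "$(connector_payload S3 's3://admin:password@minio.local:9000/my-bucket/data' '0 */6 * * *')"
```

Quoting the cron argument matters: an unquoted `*` would be expanded by the shell against the current directory before the function ever sees it.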

Read, List & Update

To read, list, or update existing connectors and their sync schedules, see the corresponding endpoints in the API Connector Reference.

Delete a Connector

Deleting a connector immediately halts its sync schedule. Documents already ingested will remain in the collection.

curl -X DELETE "https://api.axelered.com/v1/w/{workspace_id}/col/{collection_id}/crawls/{crawl_id}" \
  -H "Authorization: Bearer YOUR_API_KEY"

⚙️ Operations & Scheduling

Manual Trigger

Trigger a crawl run immediately, regardless of the cron schedule:

curl -X POST "https://api.axelered.com/v1/w/{workspace_id}/col/{collection_id}/crawls/{crawl_id}/start" \
  -H "Authorization: Bearer YOUR_API_KEY"

Pause / Resume

Pausing prevents new scheduled runs from starting; items that are already being processed will complete. Resuming re-enables the schedule.

curl -X POST "https://api.axelered.com/v1/w/{workspace_id}/col/{collection_id}/crawls/{crawl_id}/pause" \
  -H "Authorization: Bearer YOUR_API_KEY"
curl -X POST "https://api.axelered.com/v1/w/{workspace_id}/col/{collection_id}/crawls/{crawl_id}/resume" \
  -H "Authorization: Bearer YOUR_API_KEY"
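The start, pause, and resume endpoints share one URL shape and differ only in the final path segment. A small wrapper can make that explicit and reject typos before any request is sent. This is a sketch under that assumption; the function name `crawl_action_url` is hypothetical, and the base URL and path layout are copied from the examples in this section.

```shell
# Hypothetical helper: print the URL for a connector action.
#   $1: workspace id, $2: collection id, $3: crawl id
#   $4: action, one of start | pause | resume
crawl_action_url() {
  case "$4" in
    start|pause|resume) ;;   # the three actions shown in this section
    *) echo "unknown action: $4" >&2; return 1 ;;
  esac
  printf 'https://api.axelered.com/v1/w/%s/col/%s/crawls/%s/%s\n' \
    "$1" "$2" "$3" "$4"
}

# Usage (ids and API key are placeholders):
# curl -X POST "$(crawl_action_url WORKSPACE_ID COLLECTION_ID CRAWL_ID pause)" \
#   -H "Authorization: Bearer YOUR_API_KEY"
```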

Scheduling with Cron

Connectors support standard 5-field cron expressions (* * * * *). The fields are, in order: minute, hour, day of month, month, and day of week.

Example        Schedule
0 */6 * * *    Every 6 hours
0 0 * * *      Daily at midnight
*/30 * * * *   Every 30 minutes
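A common mistake is passing a 6-field expression (with seconds) where a 5-field one is expected. The check below is a rough client-side sanity test, not the service's own validation: it only confirms that the expression has exactly five whitespace-separated fields made of digits and the characters `*`, `/`, `,`, and `-`. It does not check value ranges, and whether the API accepts names such as `MON` or `JAN` is not stated here, so the sketch rejects them. The function name `is_five_field_cron` is hypothetical.

```shell
# Hypothetical helper: succeed only if $1 looks like a 5-field cron expression.
# The body runs in a subshell, so "set -f" does not leak to the caller.
is_five_field_cron() (
  set -f            # disable globbing so "*" stays literal
  set -- $1         # split the expression on whitespace into $1..$n
  [ "$#" -eq 5 ] || exit 1
  for field in "$@"; do
    case "$field" in
      *[!0-9*/,-]*) exit 1 ;;   # reject anything but digits * / , -
    esac
  done
  exit 0
)

# Usage:
# is_five_field_cron "0 */6 * * *" && echo "looks valid"
```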
