Architecture

AIDRAN is a unified TypeScript system running entirely on Vercel. One language, one deployment target, one database. No microservices, no message queues, no Python. Every pipeline stage is a serverless function triggered by a cron schedule.

One language, one platform, zero split-brain drift.

System Overview

Data flows through a six-stage pipeline, grouped into three phases: collect, understand, publish. The stages are decoupled — they communicate exclusively through the database. No runtime coupling, no shared state, no message bus.

Collect
01 Sources: 7 platforms
02 Ingest: normalize + dedup

Understand
03 Process: embed + analyze
04 Detect: signals + clusters

Publish
05 Narrate: AI editorial
06 Serve: API + SSE + ISR

All six stages read from and write to a single Neon Postgres database.
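Every stage follows the same shape: a route handler invoked by Vercel Cron that claims pending work from Postgres, processes it, and writes results back. A minimal, dependency-free sketch of that pattern (the in-memory array stands in for Neon, and all names here are illustrative, not AIDRAN's actual code):

```typescript
// Sketch of the shared stage pattern: claim pending rows, process, persist.
// The in-memory "db" stands in for Neon Postgres; names are hypothetical.
type QueueRow = { id: number; body: string; processed: boolean };

export const db: QueueRow[] = [
  { id: 1, body: "hello", processed: false },
  { id: 2, body: "world", processed: true },
];

// Each pipeline stage has this shape: pull unprocessed rows, transform, save.
export function runStage(batchSize: number): number {
  const pending = db.filter((r) => !r.processed).slice(0, batchSize);
  for (const row of pending) {
    row.body = row.body.toUpperCase(); // stand-in for embed/analyze/etc.
    row.processed = true; // progress lives in the database,
  } // so stages never call each other
  return pending.length; // cron response: rows handled this run
}
```

A real stage would run Drizzle queries against Neon and return a JSON response from the route handler; the shape of the loop is the same.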

Technology Stack

11 technologies across four layers. Every choice optimizes for simplicity and operational cost over theoretical scalability.

Runtime & Language

Next.js 16: Application framework
TypeScript: End-to-end language

Data Layer

Neon: Serverless Postgres
Drizzle ORM: Schema-as-code

AI & ML

Claude (Anthropic): AI analysis & generation
Voyage AI: Embeddings
Vercel AI SDK: AI orchestration

Infrastructure & Frontend

Vercel: Deployment & runtime
Tailwind CSS v4: Styling
PostHog: Observability
Framer Motion: Animation

Pipeline Data Flow

In detail, the pipeline breaks down into 10 stages, each implemented as an independent serverless function. Stages communicate through the database — no runtime coupling, no shared state.

01 Ingest (hourly to every 30 min)

7 source adapters normalize public discourse into a shared schema.

Reddit, Bluesky, HN, YouTube, GDELT, arXiv, X
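The normalize + dedup step reduces to mapping each platform post into the shared record shape, then dropping anything whose source-scoped ID has been seen before. A sketch, with an in-memory Set standing in for the seen_source_ids table (field names are illustrative, not the real schema):

```typescript
// Shared record shape every adapter normalizes into (illustrative fields).
type DiscourseRecord = { sourceId: string; source: string; text: string };

// Stand-in for the seen_source_ids table.
const seenSourceIds = new Set<string>(["reddit:abc"]);

// Keep only records whose source-scoped ID has not been ingested before.
export function dedup(records: DiscourseRecord[]): DiscourseRecord[] {
  const fresh: DiscourseRecord[] = [];
  for (const r of records) {
    const key = `${r.source}:${r.sourceId}`;
    if (seenSourceIds.has(key)) continue; // already ingested, skip
    seenSourceIds.add(key); // remember for future runs
    fresh.push(r);
  }
  return fresh;
}
```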
02 Embed (every 5 min)

Voyage AI vectorizes new records (512-dim, cosine distance).

Voyage AI, pgvector
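Cosine distance over the 512-dim vectors is what pgvector computes server-side; for reference, the same measure in plain TypeScript:

```typescript
// Cosine similarity between two equal-length embedding vectors.
// pgvector's cosine-distance operator returns 1 minus this value.
export function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```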
03 Analyze (every 10 min)

Claude Haiku extracts sentiment, entities, key phrases, emotional register.

Claude Haiku, Batch API
04 Detect Signals (every 30 min)

7 detectors scan for anomalies — volume spikes, sentiment shifts, entity surges.

Z-score, sliding window, TF-IDF
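A volume-spike detector of this kind reduces to a z-score over a sliding window of counts: flag the newest bucket if it sits far above the window's mean. A sketch (the threshold value is an assumed parameter, not taken from AIDRAN):

```typescript
// Flag the latest count if it sits more than `threshold` standard
// deviations above the mean of the preceding window.
export function isVolumeSpike(counts: number[], threshold = 3): boolean {
  const window = counts.slice(0, -1); // history
  const latest = counts[counts.length - 1]; // newest bucket
  const mean = window.reduce((s, x) => s + x, 0) / window.length;
  const variance =
    window.reduce((s, x) => s + (x - mean) ** 2, 0) / window.length;
  const std = Math.sqrt(variance);
  if (std === 0) return latest > mean; // flat history: any rise counts
  return (latest - mean) / std > threshold;
}
```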
05 Cluster (weekly, Monday 6 AM)

K-means++ groups related discourse into narrative threads with TF-IDF labels.

K-means++, cosine similarity
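K-means++ differs from plain k-means only in seeding: each new centroid is sampled with probability proportional to its squared distance from the nearest centroid chosen so far. A sketch of that seeding step (Euclidean distance for brevity, where the pipeline uses cosine; the random source is injectable so the sketch stays testable):

```typescript
type Vec = number[];

// Squared Euclidean distance between two equal-length vectors.
function sqDist(a: Vec, b: Vec): number {
  return a.reduce((s, x, i) => s + (x - b[i]) ** 2, 0);
}

// K-means++ seeding: D^2-weighted sampling of initial centroids.
export function kppSeed(points: Vec[], k: number, rand: () => number): Vec[] {
  const centroids: Vec[] = [points[Math.floor(rand() * points.length)]];
  while (centroids.length < k) {
    // Squared distance from each point to its nearest chosen centroid.
    const d2 = points.map((p) =>
      Math.min(...centroids.map((c) => sqDist(p, c)))
    );
    const total = d2.reduce((s, x) => s + x, 0);
    let r = rand() * total; // sample an index proportionally to d2
    let idx = 0;
    while (idx < points.length - 1 && r > d2[idx]) {
      r -= d2[idx];
      idx++;
    }
    centroids.push(points[idx]);
  }
  return centroids;
}
```

Far-away points get large d2 weights, so the initial centroids spread across the embedding space instead of clumping.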
06 Meta-Patterns (twice daily)

5 detectors find cross-topic convergence, platform anomalies, aggregate trends.

Cross-correlation, gap analysis
07 Narrate (every 6 hrs, 8 AM–8 PM)

Claude Sonnet generates editorial stories from signals and clusters.

Claude Sonnet, Vercel AI SDK
08 Dispatch (every 2 hrs, 8:30 AM–10:30 PM)

Breaking-news dispatches generated from high-severity signals.

Saliency scoring, cadence enforcement
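Cadence enforcement is essentially a rate-limit check against the publishing log: a dispatch only goes out if enough time has passed since the last one for that topic. A sketch with hypothetical names (the Map stands in for publishing_log):

```typescript
// Minimal cadence gate: allow a new dispatch for a topic only if the
// last one is older than `minGapMs`. `lastPublished` mimics publishing_log.
const lastPublished = new Map<string, number>(); // topic -> epoch ms

export function canDispatch(
  topic: string,
  now: number,
  minGapMs: number
): boolean {
  const last = lastPublished.get(topic);
  if (last !== undefined && now - last < minGapMs) return false; // too soon
  lastPublished.set(topic, now); // record the publish we are about to do
  return true;
}
```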
09 Publish (every 5 min)

Saliency scoring, cadence enforcement, cross-platform distribution.

Bluesky API, ISR revalidation
10 Snapshot (continuous, on write)

Daily rollups per topic — volume, sentiment, source breakdown.

Postgres aggregates, materialized views
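The snapshot rollup is a grouped aggregate: in SQL, a GROUP BY over topic and day. The equivalent reduction in TypeScript terms, with an illustrative row shape:

```typescript
// One day's pre-computed rollup per topic: volume, mean sentiment,
// and per-source counts. Field names are illustrative.
type Row = { topic: string; sentiment: number; source: string };
type Snapshot = {
  volume: number;
  avgSentiment: number;
  bySource: Map<string, number>;
};

export function rollup(rows: Row[]): Map<string, Snapshot> {
  const out = new Map<string, Snapshot>();
  for (const r of rows) {
    const s =
      out.get(r.topic) ??
      { volume: 0, avgSentiment: 0, bySource: new Map<string, number>() };
    // Maintain a running mean so no separate raw sum is stored.
    s.avgSentiment = (s.avgSentiment * s.volume + r.sentiment) / (s.volume + 1);
    s.volume += 1;
    s.bySource.set(r.source, (s.bySource.get(r.source) ?? 0) + 1);
    out.set(r.topic, s);
  }
  return out;
}
```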

Database Schema

All tables are defined in Drizzle ORM. The database is a single Neon Postgres instance with pgvector enabled.

discourse: Raw records from all sources — the core data table
topics: Editorial beats with keywords, subreddits, category, status
narratives: AI-generated stories, dispatches, and beat narratives
signals: Detected anomalies — volume spikes, sentiment shifts, etc.
entities: Named entities extracted from discourse analysis
clusters: K-means++ narrative groupings with top terms
meta_patterns: Cross-topic patterns and convergence signals
embedding_queue: Records pending Voyage AI vectorization
analysis_queue: Records pending Claude Haiku analysis
seen_source_ids: Dedup table — prevents re-ingesting known records
ingestion_log: Per-run observability for each source worker
daily_topic_snapshots: Pre-computed daily rollups per topic
publishing_log: Cross-platform publishing history and cadence tracking
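To make the core table concrete, here is a hand-written sketch of what a discourse row might carry. This is not the actual Drizzle schema; every field name here is an assumption inferred from the pipeline description:

```typescript
// Illustrative shape of a row in the core `discourse` table.
// The real columns live in the Drizzle schema; these names are guesses.
interface DiscourseRow {
  id: number;
  source: string; // one of the 7 platform adapters
  sourceId: string; // platform-native ID, deduped via seen_source_ids
  text: string;
  embedding: number[] | null; // 512-dim pgvector, filled by the embed stage
  sentiment: number | null; // filled by the analyze stage
  createdAt: Date;
}

export const example: DiscourseRow = {
  id: 1,
  source: "hn",
  sourceId: "41000000",
  text: "Show HN: example post",
  embedding: null, // sits in embedding_queue until vectorized
  sentiment: null, // sits in analysis_queue until analyzed
  createdAt: new Date("2025-01-01T00:00:00Z"),
};
```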

Cron Schedule

18 scheduled jobs orchestrate the full pipeline. All run as Vercel serverless functions with a 5-minute execution budget.

Ingest

Reddit: hourly
Bluesky: every 30 min
Google News: every 4 hours
Hacker News: hourly
arXiv: every 4 hours
YouTube: every 4 hours
X / Twitter: daily at 2 PM UTC

Process

Embed: every 5 min
Analyze: every 10 min
Batch Results: every 2 min

Detect

Signals: every 30 min
Cluster: weekly, Monday 6 AM
Meta-Patterns: 8 AM & 8 PM

Generate

Narrate: every 6 hours
Dispatch: every 2 hours

Distribute

Publish: every 5 min
Bluesky Publish: hourly
Bluesky Post: every 30 min
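On Vercel, schedules like these live in vercel.json as cron expressions pointing at route handlers. A fragment showing the shape — the paths are hypothetical; only the cadences come from the schedule above:

```json
{
  "crons": [
    { "path": "/api/cron/ingest-reddit", "schedule": "0 * * * *" },
    { "path": "/api/cron/embed", "schedule": "*/5 * * * *" },
    { "path": "/api/cron/cluster", "schedule": "0 6 * * 1" },
    { "path": "/api/cron/dispatch", "schedule": "30 8-22/2 * * *" }
  ]
}
```

The last entry encodes "every 2 hours, 8:30 AM–10:30 PM": minute 30 of every second hour from 8 through 22.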

Key Decisions

Every architectural choice optimizes for simplicity and operational cost over theoretical scalability. The system serves a single editorial product, not a platform.

Unified Next.js (TypeScript end-to-end)

Eliminates split-brain type drift between Python backend and TS frontend

Instead of Python backend + TS frontend

Neon (serverless Postgres + pgvector)

Native Postgres compatibility, serverless driver for Vercel, embeddings co-located with metadata

Instead of Supabase, PlanetScale, local Postgres

Drizzle

Type-safe, schema-as-code, lightweight, zero-dependency runtime

Instead of Prisma, TypeORM, raw SQL

Vercel AI SDK + Claude

Streaming, server-side generation, editorial voice via system prompts

Instead of OpenAI GPT-4, local models

Voyage AI (voyage-4-lite, 512-dim)

Consistent with v1, shared embedding space, cosine similarity

Instead of OpenAI ada-002, Cohere embed

TypeScript (Vercel Cron + serverless)

One language, deployable to same platform, 5-min execution budget

Instead of Python workers, separate microservices

Server-Sent Events

Unidirectional, CDN-friendly, no extra infrastructure

Instead of WebSockets, polling, Pusher
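What makes SSE infrastructure-free is that the wire format is plain text over one long-lived HTTP response. A helper that frames a single event (the event name in the test is illustrative):

```typescript
// Frame a server-sent event: an `event:` line, a `data:` payload,
// terminated by a blank line. This is essentially the whole wire protocol.
export function formatSSE(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```

In a Next.js route handler, these strings are enqueued into a ReadableStream served with Content-Type: text/event-stream; the browser side is just `new EventSource(url)`.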

Server Components + ISR (5 min)

Narratives pre-rendered, not client-fetched. Data visualizations secondary to prose.

Instead of client-side SPA, full SSR
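In the App Router, 5-minute ISR is a one-line segment config on the page file. `revalidate` is a real Next.js export; the page component itself is omitted here:

```typescript
// Next.js App Router segment config: regenerate this static page
// at most once every 300 seconds (5 minutes), matching the
// pipeline's 5-minute publish cadence.
export const revalidate = 300;
```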

Vercel

ISR, edge functions, cron, single deployment target for everything

Instead of AWS, Railway, self-hosted