# AI Microservice Architecture

## System Overview

The AI Microservice is a production-ready Python service built with FastAPI and Celery. It handles AI tasks, social media publishing, and automation for the AI Business Market platform, and both its API layer and its worker layer scale horizontally.

## Architecture Diagram

```
┌─────────────────────────────────────────────────────────────────┐
│                         Laravel Platform                         │
│                    (AI Business Market)                          │
└────────────────┬────────────────────────────────┬────────────────┘
                 │                                │
                 │ HTTP/REST API                  │ Webhooks/Callbacks
                 │                                │
┌────────────────▼────────────────────────────────▼────────────────┐
│                      FastAPI Application                          │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │  Routers: /ai/jobs, /publish, /rag, /reels, /admin      │   │
│  └──────────────────────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │  Middleware: CORS, Auth, Rate Limiting, Logging          │   │
│  └──────────────────────────────────────────────────────────┘   │
└────────────────┬────────────────────────────────┬────────────────┘
                 │                                │
                 │ Enqueue Tasks                  │ Store Jobs
                 │                                │
┌────────────────▼────────────────┐  ┌───────────▼────────────────┐
│       Redis (Message Broker)    │  │   PostgreSQL (Database)    │
│  - Task Queue                   │  │  - Job Records             │
│  - Result Backend               │  │  - Cost Tracking           │
│  - Cache                        │  │  - Audit Logs              │
└────────────────┬────────────────┘  └────────────────────────────┘
                 │
                 │ Consume Tasks
                 │
┌────────────────▼────────────────────────────────────────────────┐
│                    Celery Workers (Async)                        │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │  Job Processor: AI Tasks, Publishing, RAG                │   │
│  └──────────────────────────────────────────────────────────┘   │
└────────┬───────────────┬──────────────┬────────────┬────────────┘
         │               │              │            │
         │               │              │            │
┌────────▼──────┐ ┌─────▼─────┐ ┌─────▼─────┐ ┌───▼──────────┐
│  OpenAI API   │ │Anthropic  │ │  Social   │ │ Vector Store │
│  - GPT-4      │ │  Claude   │ │  Media    │ │  - FAISS     │
│  - Embeddings │ │           │ │  APIs     │ │  - Weaviate  │
└───────────────┘ └───────────┘ └───────────┘ └──────────────┘
```

## Component Details

### 1. FastAPI Application Layer

**Responsibilities:**
- HTTP request handling
- Request validation (Pydantic schemas)
- Authentication & authorization
- Rate limiting
- API documentation (OpenAPI/Swagger)

**Key Files:**
- [`app/main.py`](app/main.py) - Application entry point
- [`app/routers/`](app/routers/) - API route handlers
- [`app/schemas.py`](app/schemas.py) - Request/response models

### 2. Celery Task Queue

**Responsibilities:**
- Asynchronous task processing
- Job scheduling and retries
- Worker management
- Task result tracking

**Key Files:**
- [`app/celery_app.py`](app/celery_app.py) - Celery configuration and tasks
- [`app/services/job_processor.py`](app/services/job_processor.py) - AI job processing
- [`app/services/publisher_processor.py`](app/services/publisher_processor.py) - Publishing logic

**Task Queues:**
- `ai_jobs` - AI task processing
- `publishers` - Social media publishing
- `callbacks` - Webhook callbacks
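
The queue split above can be expressed as a routing table in the shape that Celery's `task_routes` setting accepts (a glob pattern mapped to a queue). This is a minimal sketch: the task module names are hypothetical, only the three queue names come from this document, and `resolve_queue` is an illustrative helper rather than part of Celery.

```python
from fnmatch import fnmatchcase

# Hypothetical task names; the real ones live in app/celery_app.py.
TASK_ROUTES = {
    "app.tasks.ai.*": {"queue": "ai_jobs"},
    "app.tasks.publish.*": {"queue": "publishers"},
    "app.tasks.callbacks.*": {"queue": "callbacks"},
}


def resolve_queue(task_name: str, default: str = "celery") -> str:
    """Return the queue for a task name using first-match glob routing."""
    for pattern, route in TASK_ROUTES.items():
        if fnmatchcase(task_name, pattern):
            return route["queue"]
    # "celery" is Celery's default queue name when no route matches.
    return default
```

Keeping one queue per workload lets each worker pool be sized independently, so a burst of publishing jobs does not starve AI jobs.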

### 3. Database Layer

**PostgreSQL Schema:**

```sql
-- AI Jobs
CREATE TABLE ai_jobs (
    id VARCHAR(36) PRIMARY KEY,
    job_type VARCHAR(50) NOT NULL,
    tenant_id VARCHAR(36) NOT NULL,
    owner_id VARCHAR(36) NOT NULL,
    status VARCHAR(20) NOT NULL,
    payload JSON NOT NULL,
    result JSON,
    error_message TEXT,
    callback_url VARCHAR(500),
    provider VARCHAR(50),
    model_used VARCHAR(100),
    tokens_input INTEGER DEFAULT 0,
    tokens_output INTEGER DEFAULT 0,
    cost_usd FLOAT DEFAULT 0.0,
    created_at TIMESTAMP NOT NULL,
    started_at TIMESTAMP,
    completed_at TIMESTAMP
);
CREATE INDEX idx_tenant_status ON ai_jobs (tenant_id, status);
CREATE INDEX idx_owner_created ON ai_jobs (owner_id, created_at);

-- Publish Jobs
CREATE TABLE publish_jobs (
    id VARCHAR(36) PRIMARY KEY,
    tenant_id VARCHAR(36) NOT NULL,
    platform VARCHAR(50) NOT NULL,
    status VARCHAR(20) NOT NULL,
    content JSON NOT NULL,
    platform_post_id VARCHAR(255),
    retry_count INTEGER DEFAULT 0,
    created_at TIMESTAMP NOT NULL,
    published_at TIMESTAMP
);
CREATE INDEX idx_tenant_platform ON publish_jobs (tenant_id, platform);

-- Cost Tracking
CREATE TABLE cost_tracking (
    id VARCHAR(36) PRIMARY KEY,
    tenant_id VARCHAR(36) NOT NULL,
    job_id VARCHAR(36),
    provider VARCHAR(50) NOT NULL,
    model VARCHAR(100) NOT NULL,
    tokens_total INTEGER DEFAULT 0,
    cost_usd FLOAT NOT NULL,
    created_at TIMESTAMP NOT NULL
);
CREATE INDEX idx_tenant_created ON cost_tracking (tenant_id, created_at);
```

### 4. AI Provider Layer

**Supported Providers:**
- **OpenAI**: GPT-4, GPT-3.5, Embeddings
- **Anthropic**: Claude 3 (Opus, Sonnet, Haiku)
- **Local LLM**: Ollama integration

**Key Features:**
- Provider abstraction
- Automatic cost calculation
- Token counting
- PII redaction before API calls
- Retry logic with exponential backoff

**Key Files:**
- [`app/services/ai_providers.py`](app/services/ai_providers.py)
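
Automatic cost calculation can be sketched as a price lookup multiplied by token counts. The provider name, model name, and prices below are placeholders and not real provider pricing. `calculate_cost_usd` is a hypothetical helper, not necessarily what `app/services/ai_providers.py` does.

```python
from decimal import Decimal

# Placeholder prices in USD per 1,000 tokens: (input price, output price).
PRICE_PER_1K = {
    ("example-provider", "example-model"): (Decimal("0.01"), Decimal("0.03")),
}


def calculate_cost_usd(provider: str, model: str, tokens_input: int, tokens_output: int) -> Decimal:
    """Cost = input tokens * input price + output tokens * output price, per 1,000 tokens."""
    price_in, price_out = PRICE_PER_1K[(provider, model)]
    cost = (Decimal(tokens_input) * price_in + Decimal(tokens_output) * price_out) / Decimal(1000)
    # Round to micro-dollars so many small jobs sum without drift.
    return cost.quantize(Decimal("0.000001"))
```

`Decimal` is used here so that per-job amounts add up exactly when aggregated per tenant.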

### 5. Social Media Publishers

**Supported Platforms:**
- Twitter/X
- Facebook Pages
- Instagram Business
- LinkedIn
- YouTube (planned)

**Key Features:**
- Platform-specific adapters
- Media upload handling
- Idempotent publishing
- Automatic retries
- Post status tracking

**Key Files:**
- [`app/services/social_publishers.py`](app/services/social_publishers.py)
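
One common way to get idempotent publishing is to derive a stable key from the tenant, platform, and content, and to skip the platform call when that key has already been published. The sketch below assumes this approach. `idempotency_key` and `InMemoryPublishLog` are hypothetical names, and the in-memory dict stands in for a unique-key lookup in the database.

```python
import hashlib
import json


def idempotency_key(tenant_id: str, platform: str, content: dict) -> str:
    """Derive a stable key so the same post for the same tenant and platform maps to one job."""
    # sort_keys makes the key independent of dict ordering.
    canonical = json.dumps(content, sort_keys=True, separators=(",", ":"))
    raw = f"{tenant_id}:{platform}:{canonical}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class InMemoryPublishLog:
    """Stand-in for persistent storage of already-published keys (hypothetical)."""

    def __init__(self) -> None:
        self._posts: dict[str, str] = {}

    def publish_once(self, key: str, publish) -> str:
        """Call publish() only if this key is new; always return the platform post id."""
        if key not in self._posts:
            self._posts[key] = publish()
        return self._posts[key]
```

With this shape, a retried task that reaches the publisher twice returns the original `platform_post_id` instead of creating a duplicate post.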

### 6. Vector Store (RAG)

**Supported Stores:**
- **FAISS**: File-based, fast similarity search
- **Weaviate**: Cloud-native vector database
- **PGVector**: PostgreSQL extension (planned)

**Key Features:**
- Document ingestion
- Embedding generation
- Similarity search
- Multi-tenant isolation
- Document metadata

**Key Files:**
- [`app/services/vector_store.py`](app/services/vector_store.py)
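
Similarity search with multi-tenant isolation can be illustrated without any vector database. The sketch below is a brute-force cosine search over an in-memory list. It is not how FAISS or Weaviate work internally, and the document shape (`tenant_id`, `embedding`) is an assumption made for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index: list[dict], tenant_id: str, query: list[float], k: int = 3) -> list[dict]:
    """Return the top-k documents for one tenant, most similar first."""
    # Filter by tenant before ranking so other tenants' documents can never be returned.
    candidates = [doc for doc in index if doc["tenant_id"] == tenant_id]
    ranked = sorted(candidates, key=lambda d: cosine_similarity(query, d["embedding"]), reverse=True)
    return ranked[:k]
```

Filtering before ranking is the important design choice: isolation does not depend on the caller discarding results afterwards.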

### 7. Security Layer

**Features:**
- PII redaction (email, phone, SSN, credit cards)
- Data encryption at rest
- Secure token management
- HMAC signature validation
- API key authentication

**Key Files:**
- [`app/utils/security.py`](app/utils/security.py)
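
HMAC signature validation for webhooks can be sketched with the standard library. SHA-256 and a hex-encoded signature are assumptions here, since the document does not state the algorithm or encoding, and the function names are illustrative.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, signature: str) -> bool:
    """Recompute the signature and compare in constant time to avoid timing leaks."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, signature)
```

The signature must be computed over the exact bytes received, before any JSON parsing, because re-serialized JSON may differ from what the sender signed.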

### 8. Observability Layer

**Metrics (Prometheus):**
- Job counts by type/status
- LLM token usage
- API latency
- Cost tracking
- Worker health

**Logging (Structured JSON):**
- Request/response logs
- Error tracking
- Performance metrics
- Audit trails

**Key Files:**
- [`app/utils/observability.py`](app/utils/observability.py)
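
Structured JSON logging can be done with a custom `logging.Formatter`. This is a minimal sketch: the field names (`job_id`, `tenant_id`) and the class name are assumptions, not the format that `app/utils/observability.py` necessarily emits.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical context fields, supplied through logging's `extra=` argument.
        for field in ("job_id", "tenant_id"):
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)
```

A handler would use it through `handler.setFormatter(JsonFormatter())`, and call sites would pass context as `logger.info("job done", extra={"job_id": job_id})`.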

## Data Flow

### AI Job Processing Flow

```
1. Laravel → POST /v1/ai/jobs
   ↓
2. FastAPI validates request
   ↓
3. Create AIJob record in PostgreSQL
   ↓
4. Enqueue task to Redis
   ↓
5. Return job_id immediately
   ↓
6. Celery worker picks up task
   ↓
7. Process job (call LLM, generate content)
   ↓
8. Update job status & result in PostgreSQL
   ↓
9. Track costs in cost_tracking table
   ↓
10. Send callback to Laravel (if configured)
```
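
Steps 3 to 8 of the flow above can be simulated in memory to show why the API can return immediately. In this sketch a dict stands in for the `ai_jobs` table and a `queue.Queue` stands in for Redis. The status strings and function names are assumptions for illustration.

```python
import queue
import uuid

jobs: dict[str, dict] = {}   # stand-in for the ai_jobs table
task_queue = queue.Queue()   # stand-in for the Redis broker


def submit_job(job_type: str, payload: dict) -> str:
    """Steps 3-5: persist the job, enqueue it, and return the id without waiting."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"job_type": job_type, "payload": payload, "status": "queued", "result": None}
    task_queue.put(job_id)
    return job_id


def run_worker_once(handler) -> None:
    """Steps 6-8: take one task, process it, and record the outcome."""
    job_id = task_queue.get_nowait()
    job = jobs[job_id]
    job["status"] = "processing"
    try:
        job["result"] = handler(job["payload"])
        job["status"] = "completed"
    except Exception as exc:
        # Record the failure on the job instead of crashing the worker.
        job["status"] = "failed"
        job["error_message"] = str(exc)
```

The caller only ever sees `job_id`. It learns the outcome later by polling the job record or through the callback in step 10.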

### Publishing Flow

```
1. Laravel → POST /v1/publish
   ↓
2. FastAPI validates request
   ↓
3. Create PublishJob record
   ↓
4. Enqueue to publishers queue
   ↓
5. Return publish_job_id
   ↓
6. Celery worker processes
   ↓
7. Call social media API
   ↓
8. Update job with platform_post_id
   ↓
9. Send callback to Laravel
```

### RAG Query Flow

```
1. Laravel → POST /v1/rag/query
   ↓
2. Generate query embedding
   ↓
3. Search vector store (FAISS/Weaviate)
   ↓
4. Retrieve top-k similar documents
   ↓
5. Build context from documents
   ↓
6. Call LLM with context + query
   ↓
7. Return answer with sources
```
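
Steps 5 to 7 above (build context, call the LLM, return sources) depend on how the prompt is assembled. A minimal sketch follows, assuming retrieved documents are dicts with `id` and `text` keys and that context is capped by a character budget. The prompt wording and the `build_prompt` name are illustrative.

```python
def build_prompt(query: str, documents: list[dict], max_chars: int = 2000) -> tuple[str, list[str]]:
    """Concatenate retrieved documents into a context block and collect their source ids."""
    parts, sources, used = [], [], 0
    for doc in documents:
        text = doc["text"]
        if used + len(text) > max_chars:
            break  # stop before the context exceeds the budget
        parts.append(f"[{doc['id']}] {text}")
        sources.append(doc["id"])
        used += len(text)
    context = "\n".join(parts)
    prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
    return prompt, sources
```

Returning the source ids alongside the prompt makes it possible to attach exactly the documents that were shown to the model to the final answer.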

## Scalability Considerations

### Horizontal Scaling

**API Layer:**
- Stateless design
- Load balancer ready
- Multiple instances supported

**Worker Layer:**
- Independent worker processes
- Auto-scaling based on queue length
- Worker pools per queue type

### Performance Optimization

**Caching:**
- Redis for frequently accessed data
- Embedding cache for repeated queries
- API response caching

**Database:**
- Connection pooling
- Indexed queries
- Async operations

**Queue Management:**
- Priority queues
- Task routing
- Rate limiting per tenant
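
Per-tenant rate limiting is often implemented as a token bucket. The sketch below keeps state in process memory for clarity. A deployment with multiple API instances would need shared state (for example in Redis), which is not shown. The class name and parameters are hypothetical.

```python
import time


class TenantRateLimiter:
    """Token bucket per tenant: `rate` tokens refill per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self._buckets: dict[str, tuple[float, float]] = {}  # tenant -> (tokens, last_seen)

    def allow(self, tenant_id: str) -> bool:
        """Consume one token for the tenant if available; otherwise reject the request."""
        now = self.clock()
        tokens, last = self._buckets.get(tenant_id, (float(self.capacity), now))
        # Refill in proportion to elapsed time, never above capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens < 1:
            self._buckets[tenant_id] = (tokens, now)
            return False
        self._buckets[tenant_id] = (tokens - 1, now)
        return True
```

The bucket permits short bursts up to `capacity` while holding the long-run average to `rate` requests per second for each tenant.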

## Security Architecture

### Authentication Flow

```
Client Request
    ↓
[API Gateway/Load Balancer]
    ↓
[FastAPI Middleware]
    ↓
Verify X-API-Key header
    ↓
Check rate limits
    ↓
[Route Handler]
```
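
The `X-API-Key` check in the flow above can be sketched as a lookup against stored key digests with a constant-time comparison. Storing SHA-256 digests rather than raw keys is an assumption of this example, and the key, tenant id, and function name are placeholders.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical store: SHA-256 digests of issued keys mapped to tenant ids.
API_KEY_DIGESTS = {
    hashlib.sha256(b"demo-key-not-for-production").hexdigest(): "tenant-demo",
}


def authenticate(headers: dict) -> Optional[str]:
    """Return the tenant id for a valid X-API-Key header, else None."""
    presented = headers.get("X-API-Key", "")
    digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    for known_digest, tenant_id in API_KEY_DIGESTS.items():
        # compare_digest avoids leaking how many leading characters matched.
        if hmac.compare_digest(digest, known_digest):
            return tenant_id
    return None
```

The returned tenant id is what the rate-limit step would then use as its bucket key.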

### Data Protection

**At Rest:**
- Encrypted database fields
- Secure token storage
- Encrypted backups

**In Transit:**
- HTTPS/TLS
- Signed webhooks
- Encrypted API keys

**PII Handling:**
- Automatic redaction
- No PII in logs
- Compliance ready
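
Automatic redaction is typically pattern-based. The sketch below covers the four categories named in the Security Layer (email, phone, SSN, credit card) with deliberately simple regular expressions. They target US-style formats only, will miss some real-world variants, and are not the patterns used by `app/utils/security.py`.

```python
import re

# Order matters: longer digit patterns (SSN, card) run before the phone pattern.
PII_PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("CARD", re.compile(r"\b(?:\d[ -]?){13,16}\b")),
    ("PHONE", re.compile(r"\b\d{3}[ .-]?\d{3}[ .-]?\d{4}\b")),
]


def redact(text: str) -> str:
    """Replace each detected PII span with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text
```

Applying `redact` to both outbound LLM prompts and log messages keeps the two guarantees above (redaction before API calls, no PII in logs) backed by the same rules.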

## Monitoring & Alerting

### Key Metrics

**Application:**
- Request rate
- Error rate
- Response time (p50, p95, p99)

**Jobs:**
- Queue length
- Processing time
- Success/failure rate

**Costs:**
- Token usage per tenant
- Cost per job type
- Daily/monthly spend

### Health Checks

**Liveness:** `/health/liveness`
- Service is running

**Readiness:** `/health/readiness`
- Database connected
- Redis available
- Workers active
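
A readiness endpoint usually aggregates one check per dependency and reports failure if any of them fails. The sketch below assumes each check is a zero-argument callable returning a bool. The response shape and the 200/503 status codes are a common convention, not something this document specifies.

```python
from typing import Callable


def readiness(checks: dict[str, Callable[[], bool]]) -> tuple[int, dict]:
    """Run each dependency check; report 200 only if all pass, else 503."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "unavailable"
        except Exception as exc:
            # A check that raises counts as a failed dependency.
            results[name] = f"error: {exc}"
    healthy = all(value == "ok" for value in results.values())
    body = {"status": "ready" if healthy else "not_ready", "checks": results}
    return (200 if healthy else 503), body
```

Liveness stays separate and cheap on purpose: a slow database should take an instance out of rotation, not get it restarted.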

## Disaster Recovery

### Backup Strategy

**Database:**
- Daily automated backups
- Point-in-time recovery
- Cross-region replication

**Vector Store:**
- Regular index snapshots
- Document metadata backup

### Failure Handling

**Job Failures:**
- Automatic retries (3 attempts)
- Exponential backoff
- Dead letter queue
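
The retry policy above (3 attempts, exponential backoff, dead letter queue) can be sketched as a plain function. The 1-second base delay and the use of a list as the dead letter queue are assumptions. In the service itself this behaviour would more likely come from Celery's retry options.

```python
import time


def run_with_retries(task, max_attempts: int = 3, base_delay: float = 1.0,
                     sleep=time.sleep, dead_letters=None):
    """Call task(); on failure wait base_delay * 2**attempt and retry, up to max_attempts."""
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception as exc:
            if attempt == max_attempts - 1:
                # Out of attempts: park the failure for inspection, then re-raise.
                if dead_letters is not None:
                    dead_letters.append({"error": str(exc), "attempts": max_attempts})
                raise
            sleep(base_delay * 2 ** attempt)
```

With the defaults, the waits between attempts are 1 second and then 2 seconds. Adding random jitter to each delay is a common refinement that is not shown here.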

**Service Failures:**
- Circuit breakers
- Graceful degradation
- Fallback providers
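
Circuit breakers and fallback providers combine naturally: a provider whose circuit is open is skipped, and the next provider in the list is tried. The sketch below is a minimal version with hypothetical names and thresholds. It omits a distinct half-open state beyond allowing calls again once the reset interval has passed.

```python
import time


class CircuitBreaker:
    """Open after `threshold` consecutive failures; allow calls again after `reset_after` seconds."""

    def __init__(self, threshold: int = 3, reset_after: float = 30.0, clock=time.monotonic) -> None:
        self.threshold, self.reset_after, self.clock = threshold, reset_after, clock
        self.failures = 0
        self.opened_at = None

    def available(self) -> bool:
        if self.opened_at is None:
            return True
        return self.clock() - self.opened_at >= self.reset_after

    def record(self, success: bool) -> None:
        if success:
            self.failures, self.opened_at = 0, None
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()


def call_with_fallback(providers, prompt: str) -> str:
    """Try (name, call, breaker) tuples in order, skipping providers whose circuit is open."""
    for _name, call, breaker in providers:
        if not breaker.available():
            continue
        try:
            result = call(prompt)
        except Exception:
            breaker.record(False)
            continue
        breaker.record(True)
        return result
    raise RuntimeError("all providers unavailable")
```

Skipping an open circuit avoids paying a timeout on every request while a provider is down, which is what keeps degradation graceful rather than slow.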

## Future Enhancements

1. **Multi-region deployment**
2. **GraphQL API support**
3. **Streaming responses**
4. **Advanced caching strategies**
5. **ML model serving**
6. **Real-time analytics dashboard**
7. **A/B testing framework**
8. **Advanced cost optimization**