# 🗄️ **RECORA MongoDB Collections Schema**

## 📊 **Database Overview**

**Database Name**: `recora`  
**Connection**: MongoDB with auto-generated indexes  
**Engine**: Go with `go.mongodb.org/mongo-driver`

---

## 🏢 **Core Collections**

### **1. `tenants` - Tenant Management**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",                    // Auto-generated incremental ID
    "name": "Demo Company",              // Human-readable name
    "status": "active",                  // active|inactive
    "keys": [                            // HMAC authentication keys
        {
            "key_id": "k1",
            "hmac_secret": BinData(...),  // Binary HMAC secret
            "active": true,
            "created_at": ISODate("...")
        }
    ],
    "created_at": ISODate("2024-01-15T10:30:00Z"),
    "updated_at": ISODate("2024-01-15T10:30:00Z")
}
```

### **2. `apps` - Application Management**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",                    // Foreign key to tenants
    "app_id": "1",                       // Auto-generated per tenant
    "name": "Streaming Platform",       // Human-readable app name
    "status": "active",                  // active|inactive
    "adapter": "ott",                    // ott|pharma|retail|consulting
    "policy_chain": ["content_filter", "geo_filter"],
    "facet_weights": {                   // Domain-specific weights
        "genre": 0.3,
        "year": 0.2,
        "rating": 0.5
    },
    "scoring_weights": {                 // Recommendation scoring
        "covisit": 0.3,
        "trending": 0.4,
        "content": 0.3,
        "fresh_bonus": 0.1,
        "seen_penalty": 0.2
    },
    "mmr_lambda": 0.6,                   // Diversity parameter
    "retention_days": {
        "interactions": 120,             // Days to keep user interactions
        "audit": 730                     // Days to keep audit logs
    },
    "limits": {
        "rpm": 2000                      // Requests per minute limit
    },
    "created_at": ISODate("2024-01-15T10:30:00Z"),
    "updated_at": ISODate("2024-01-15T10:30:00Z")
}
```
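
The `mmr_lambda` field points to Maximal Marginal Relevance (MMR) re-ranking, but this document does not show how the engine applies it. The following is a minimal sketch of the standard MMR selection rule, assuming each candidate carries a relevance `score` and that a pairwise `similarity` function exists. The names `mmrRerank`, `candidates`, and `sameGenre` are illustrative and not taken from the RECORA codebase.

```javascript
// Minimal MMR sketch: repeatedly pick the candidate that maximizes
//   lambda * relevance - (1 - lambda) * (max similarity to already-picked items).
// All names here are hypothetical.
function mmrRerank(candidates, similarity, lambda, k) {
    const selected = [];
    const remaining = candidates.slice();
    while (selected.length < k && remaining.length > 0) {
        let bestIdx = 0;
        let bestVal = -Infinity;
        remaining.forEach((cand, idx) => {
            // Highest similarity to anything already picked (floored at 0,
            // which is also the value used when nothing is picked yet).
            const maxSim = selected.reduce(
                (m, s) => Math.max(m, similarity(cand, s)), 0);
            const val = lambda * cand.score - (1 - lambda) * maxSim;
            if (val > bestVal) {
                bestVal = val;
                bestIdx = idx;
            }
        });
        selected.push(remaining.splice(bestIdx, 1)[0]);
    }
    return selected;
}

// Toy similarity: 1 when two items share a genre, otherwise 0.
const sameGenre = (a, b) => (a.genre === b.genre ? 1 : 0);

const reranked = mmrRerank(
    [
        { c_id: "a", score: 0.9, genre: "Action" },
        { c_id: "b", score: 0.85, genre: "Action" },
        { c_id: "c", score: 0.6, genre: "Drama" }
    ],
    sameGenre, 0.6, 2);
console.log(reranked.map((x) => x.c_id)); // [ 'a', 'c' ]
```

With `lambda` at 0.6, the second Action title loses to the lower-scored Drama title because its similarity penalty outweighs its relevance advantage. A `lambda` closer to 1 would favor raw relevance instead.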

### **3. `counters` - Auto-Incremental ID Management**
```javascript
// Global tenant ID counter
{
    "_id": "tenant_id",
    "sequence_value": 5                  // Next tenant will be "6"
}

// Per-tenant app ID counters
{
    "_id": "app_id_1",                   // For tenant "1"
    "sequence_value": 3                  // Next app for tenant "1" will be "4"
}

{
    "_id": "app_id_2",                   // For tenant "2"
    "sequence_value": 1                  // Next app for tenant "2" will be "2"
}
```
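
One way to allocate the next ID from these counter documents is an atomic increment. The sketch below assumes a mongosh-style collection whose `findOneAndUpdate` returns the updated document. `nextSequence` and `appCounterId` are hypothetical helper names, and the real Go service may do this differently.

```javascript
// Sketch of atomic ID allocation against the `counters` collection.
// `counters` is any object exposing a mongosh-style findOneAndUpdate
// that returns the updated document (an assumption of this example).
function nextSequence(counters, counterId) {
    const doc = counters.findOneAndUpdate(
        { _id: counterId },
        { $inc: { sequence_value: 1 } },
        { upsert: true, returnDocument: "after" }
    );
    // IDs are stored as strings in tenants/apps, e.g. "6".
    return String(doc.sequence_value);
}

// Counter key for a tenant's apps, following the "app_id_<tenant_id>" pattern.
function appCounterId(tenantId) {
    return `app_id_${tenantId}`;
}
```

In mongosh this could be called as `nextSequence(db.counters, "tenant_id")` or `nextSequence(db.counters, appCounterId("1"))`. Because the increment and the read happen in a single operation, two concurrent callers should not receive the same value.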

---

## 📦 **Content & Data Collections**

### **4. `items` - Content/Product Catalog**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",
    "item_id": "movie_12345",            // Unique item identifier
    "type": "content",                   // content|product|service|expert
    "text": {
        "title": "The Great Adventure",
        "desc": "Epic adventure movie..."
    },
    "facets": {                          // Domain-specific metadata
        "app_id": "1",                   // ⭐ Added during ingestion
        "adapter": "ott",                // ⭐ Added during ingestion
        
        // OTT-specific facets
        "genre": ["Action", "Adventure"],
        "duration_minutes": 120,
        "release_year": 2023,
        "rating": "PG-13",
        "director": "John Smith",
        "cast": ["Actor One", "Actor Two"],
        "language": "English",
        "quality": "4K",
        
        // OR Retail-specific facets
        "category": "Electronics",
        "brand": "TechBrand",
        "price": 299.99,
        "currency": "USD",
        "color": ["Black", "White"],
        
        // OR Pharma-specific facets
        "therapeutic_area": ["Cardiology"],
        "dosage_form": "Tablet",
        "strength": "10mg",
        "manufacturer": "PharmaCorp"
    },
    "duration_seconds": 7200,           // Content duration
    "availability": {
        "geo": ["US", "CA", "UK"],       // Geographic availability
        "device": ["mobile", "tv", "web"], // Device support
        "time": {
            "start": "2023-01-01T00:00:00Z",
            "end": "2024-12-31T23:59:59Z"
        }
    },
    "compliance": {                      // Regulatory compliance
        "age_18_plus": false,
        "hcp_only": false,               // Healthcare professional only
        "on_label": true
    },
    "created_at": ISODate("2024-01-15T10:30:00Z"),
    "updated_at": ISODate("2024-01-15T10:30:00Z")
}
```
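
The `availability` block can drive a serve-time filter. The following is a hedged sketch of such a check: `isAvailable` is a hypothetical helper, and treating a missing list or time bound as "no restriction" is an assumption, not documented behavior.

```javascript
// Hypothetical availability check against an item's `availability` block.
// Missing lists or time bounds are treated as "no restriction" (assumption).
function isAvailable(item, { geo, device, now }) {
    const a = item.availability || {};
    if (a.geo && !a.geo.includes(geo)) return false;
    if (a.device && !a.device.includes(device)) return false;
    if (a.time) {
        if (a.time.start && now < new Date(a.time.start)) return false;
        if (a.time.end && now > new Date(a.time.end)) return false;
    }
    return true;
}

const sampleItem = {
    item_id: "movie_12345",
    availability: {
        geo: ["US", "CA", "UK"],
        device: ["mobile", "tv", "web"],
        time: { start: "2023-01-01T00:00:00Z", end: "2024-12-31T23:59:59Z" }
    }
};

const when = new Date("2024-01-15T10:30:00Z");
console.log(isAvailable(sampleItem, { geo: "US", device: "tv", now: when })); // true
console.log(isAvailable(sampleItem, { geo: "DE", device: "tv", now: when })); // false
```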

### **5. `interactions` - User Events & Analytics**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",
    "user_id": "user_12345",
    "item_id": "movie_12345",            // Optional - can be null for search events
    "event_type": "view",                // view|play|purchase|add_to_cart|contact_submit|search
    "value": 1.0,                        // Event value (price, rating, duration, etc.)
    "ts": ISODate("2024-01-15T10:30:00Z"), // Event timestamp
    "session_id": "session_abc123",      // Optional session grouping
    "context": {                         // Event metadata
        "app_id": "1",                   // ⭐ Added during ingestion
        "adapter": "ott",                // ⭐ Added during ingestion
        
        "device": "mobile",              // mobile|desktop|tv|tablet
        "platform": "ios",               // ios|android|web|roku|etc
        "location": "US",                // Geographic location
        "referrer": "home_page",         // How user reached this item
        "position": 3,                   // Position in recommendation list
        
        // Domain-specific context
        "watch_duration": 2700,          // OTT: seconds watched (2700 / 7200 = 37.5%)
        "watch_percentage": 37.5,        // OTT: % of content watched
        "quality_played": "HD",          // OTT: playback quality
        
        "quantity": 1,                   // Retail: items purchased
        "payment_method": "credit_card", // Retail: payment type
        "order_id": "ORD-123",          // Retail: order reference
        
        "prescription_id": "RX123",      // Pharma: prescription reference
        "doctor_id": "DOC789",          // Pharma: prescribing doctor
        
        "project_budget": "50k-100k",   // Consulting: project size
        "timeline": "3-6 months"        // Consulting: project timeline
    }
}
```

---

## 🔧 **System Collections**

### **6. `idempotency_keys` - Duplicate Prevention** 
*(Currently commented out in the code, but the schema remains available)*
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",
    "app_id": "1", 
    "idem_key": "unique-request-id-123",  // Client-provided idempotency key
    "created_at": ISODate("2024-01-15T10:30:00Z")
}
```
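
A common way to use such a collection is to insert the key and treat a duplicate-key error as "already processed". The sketch below assumes a unique index on `(tenant_id, app_id, idem_key)`, which this document does not list, and a mongosh-style `insertOne` that throws an error with `code` 11000 on duplicates. `claimIdempotencyKey` is a hypothetical helper.

```javascript
// Sketch: claim an idempotency key by inserting it. A duplicate-key error
// (MongoDB error code 11000) means the request was seen before.
// Relies on an assumed unique index on (tenant_id, app_id, idem_key).
function claimIdempotencyKey(collection, tenantId, appId, idemKey) {
    try {
        collection.insertOne({
            tenant_id: tenantId,
            app_id: appId,
            idem_key: idemKey,
            created_at: new Date()
        });
        return true; // first time this key was seen
    } catch (err) {
        if (err && err.code === 11000) return false; // duplicate request
        throw err; // anything else is a real failure
    }
}
```

A caller would process the request only when the function returns `true` and return the earlier response (or a no-op) otherwise.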

### **7. `watermarks` - Data Pipeline Checkpoints**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",
    "app_id": "1",
    "name": "daily_batch_process",        // Process name
    "value": "2024-01-15",               // Last processed value
    "updated_at": ISODate("2024-01-15T10:30:00Z")
}
```

---

## 🎯 **Recommendation Engine Collections**

### **8. `user_aliases` - Identity Mapping**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",
    "app_id": "1",
    "external_id": "email:user@domain.com", // External identifier
    "u_id": "user_12345",                    // Internal user ID
    "created_at": ISODate("2024-01-15T10:30:00Z")
}
```

### **9. `continue_watching` - User Progress**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",
    "app_id": "1", 
    "u_id": "user_12345",
    "c_id": "movie_12345",               // Content ID
    "progress_sec": 2700,                // Seconds watched
    "duration_sec": 7200,                // Total content duration
    "updated_at": ISODate("2024-01-15T10:30:00Z")
}
```

### **10. `user_top_n` - Personalized Recommendations Cache**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",
    "app_id": "1",
    "u_id": "user_12345",
    "items": [                           // Precomputed top recommendations
        {
            "c_id": "movie_12345",
            "score": 0.85
        },
        {
            "c_id": "movie_67890", 
            "score": 0.78
        }
    ],
    "updated_at": ISODate("2024-01-15T10:30:00Z")
}
```

### **11. `pair_counts` - Co-occurrence Matrix**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",
    "app_id": "1",
    "src": "movie_12345",                // Source item
    "dst": "movie_67890",                // Destination item  
    "w": 0.65,                          // Co-occurrence weight
    "updated_at": ISODate("2024-01-15T10:30:00Z")
}
```
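
To illustrate how `pair_counts` rows could feed a co-visitation signal, the sketch below sums `w` over every pair whose `src` is an item the user has already seen and drops destinations the user has seen too. This aggregation rule and the name `covisitCandidates` are assumptions for illustration, not the engine's documented algorithm.

```javascript
// Sketch of co-visitation candidate scoring from pair_counts rows.
// pairRows: [{ src, dst, w }], seenItems: item IDs the user interacted with.
function covisitCandidates(pairRows, seenItems) {
    const seen = new Set(seenItems);
    const totals = new Map();
    for (const row of pairRows) {
        // Keep only pairs that start at a seen item and lead to an unseen one.
        if (!seen.has(row.src) || seen.has(row.dst)) continue;
        totals.set(row.dst, (totals.get(row.dst) || 0) + row.w);
    }
    return [...totals.entries()]
        .map(([c_id, score]) => ({ c_id, score }))
        .sort((x, y) => y.score - x.score);
}

const pairRows = [
    { src: "movie_12345", dst: "movie_67890", w: 0.65 },
    { src: "movie_11111", dst: "movie_67890", w: 0.2 },
    { src: "movie_12345", dst: "movie_22222", w: 0.4 },
    { src: "movie_12345", dst: "movie_11111", w: 0.9 } // dropped: already seen
];
const covisit = covisitCandidates(pairRows, ["movie_12345", "movie_11111"]);
console.log(covisit.map((c) => c.c_id)); // [ 'movie_67890', 'movie_22222' ]
```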

### **12. `content_embeddings` - Vector Representations**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",
    "app_id": "1",
    "kind": "item",                      // item|user
    "item_id": "movie_12345",            // Content or user ID
    "vector": [0.1, 0.5, -0.3, ...],   // Dense vector representation
    "updated_at": ISODate("2024-01-15T10:30:00Z"),
    "meta": {                           // Additional vector metadata
        "model": "sentence-transformers",
        "version": "v1.0"
    }
}
```
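
Stored vectors are typically compared with cosine similarity. The source does not say which metric RECORA uses, so the following is only a generic sketch; `cosineSimilarity` is an illustrative name.

```javascript
// Cosine similarity between two dense vectors of equal length.
function cosineSimilarity(a, b) {
    if (a.length !== b.length) throw new Error("vector length mismatch");
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    // Convention chosen here: a zero vector is similar to nothing.
    if (normA === 0 || normB === 0) return 0;
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Comparing vectors only makes sense when both were produced by the same `meta.model` and `meta.version`, so a caller would normally filter on those fields first.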

---

## 📈 **Analytics & Monitoring Collections**

### **13. `reco_requests` - Recommendation Logs**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",
    "app_id": "1",
    "user_id": "user_12345",
    "request_id": "req_abc123",          // Unique request identifier
    "context": "home_page",              // Request context
    "limit": 10,                         // Number of items requested
    "served": [                          // Items served in response
        {
            "c_id": "movie_12345",
            "rank": 1,
            "score": 0.85,
            "why": {                     // Explanation for recommendation
                "covisit": 0.3,
                "trending": 0.2,
                "content": 0.35
            }
        }
    ],
    "req_ts": ISODate("2024-01-15T10:30:00Z")
}
```
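
In the sample above, the `why` components add up to the item's `score`. One plausible way to produce such a breakdown is a weighted sum of per-signal values. The sketch below assumes that model: `explainScore` is hypothetical, the signal values are made up, and the engine's actual formula (including how `fresh_bonus` and `seen_penalty` enter) is not specified in this document.

```javascript
// Sketch: build a score and its per-signal explanation as weight * signal.
// Signals without a configured weight contribute 0.
function explainScore(signals, weights) {
    const why = {};
    let score = 0;
    for (const name of Object.keys(signals)) {
        const contribution = (weights[name] || 0) * signals[name];
        why[name] = contribution;
        score += contribution;
    }
    return { score, why };
}

const explained = explainScore(
    { covisit: 1.0, trending: 0.5, content: 0.5 },   // made-up signal values
    { covisit: 0.3, trending: 0.4, content: 0.3 }    // weights as in apps.scoring_weights
);
console.log(explained.why, explained.score);
```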

### **14. `reco_outcomes` - Recommendation Performance**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1", 
    "app_id": "1",
    "request_id": "req_abc123",          // Links back to reco_requests
    "user_id": "user_12345",
    "c_id": "movie_12345",               // Item the user acted on (viewed, clicked, purchased)
    "event_type": "view",                // Outcome event type
    "ts": ISODate("2024-01-15T10:30:00Z")
}
```

### **15. `reco_audit` - Governance & Compliance**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",
    "app_id": "1",
    "user_id": "user_12345",
    "request_id": "req_abc123",
    "policies": ["content_filter", "geo_filter"], // Applied policies
    "items": [                           // Detailed audit trail
        {
            "c_id": "movie_12345",
            "score": 0.85,
            "why": {
                "covisit": 0.3,
                "trending": 0.4,
                "content": 0.15,
                "policy_applied": "content_filter"
            }
        }
    ],
    "ts": ISODate("2024-01-15T10:30:00Z")
}
```

### **16. `daily_kpi` - Aggregated Metrics**
```javascript
{
    "_id": ObjectId("..."),
    "tenant_id": "1",
    "app_id": "1", 
    "date": "2024-01-15",               // Aggregation date
    "exp_id": "homepage_exp_1",         // Experiment ID
    "variant": "control",               // A/B test variant
    "served": 10000,                    // Total recommendations served
    "ctr": 0.15,                       // Click-through rate
    "play60_rate": 0.35,               // 60-second play rate
    "goal_rate": 0.08,                 // Conversion rate
    "avg_watch_sec": 1800,             // Average watch time
    "diversity_at_30": 0.7,            // Diversity in top 30
    "freshness_at_30": 0.6,            // Freshness in top 30
    "generated_at": ISODate("2024-01-16T02:00:00Z")
}
```
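
A metric such as `ctr` can be derived by joining `reco_requests` to `reco_outcomes` on `request_id`. The sketch below uses one possible definition: the share of served items that received at least one outcome. That definition and the name `clickThroughRate` are assumptions, since the document does not define how the KPI job computes it.

```javascript
// Sketch: CTR = (served items with at least one outcome) / (items served).
// requests: reco_requests-shaped docs, outcomes: reco_outcomes-shaped docs.
function clickThroughRate(requests, outcomes) {
    const clicked = new Set(outcomes.map((o) => `${o.request_id}|${o.c_id}`));
    let served = 0;
    let hits = 0;
    for (const req of requests) {
        for (const item of req.served) {
            served += 1;
            if (clicked.has(`${req.request_id}|${item.c_id}`)) hits += 1;
        }
    }
    return served === 0 ? 0 : hits / served;
}

const kpiRequests = [
    {
        request_id: "req_abc123",
        served: [{ c_id: "m1" }, { c_id: "m2" }, { c_id: "m3" }, { c_id: "m4" }]
    }
];
const kpiOutcomes = [
    { request_id: "req_abc123", c_id: "m2", event_type: "view" }
];
console.log(clickThroughRate(kpiRequests, kpiOutcomes)); // 0.25
```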

---

## 🔍 **Key Indexes & Performance**

### **Critical Indexes:**
```javascript
// tenants collection
db.tenants.createIndex({"tenant_id": 1}, {unique: true})

// apps collection  
db.apps.createIndex({"tenant_id": 1, "app_id": 1}, {unique: true})
db.apps.createIndex({"tenant_id": 1})  // Optional: the compound index above already serves tenant_id-only queries

// interactions collection (most queried)
// Note: the first index is a prefix of the next two, so it is optional
db.interactions.createIndex({"tenant_id": 1, "user_id": 1})
db.interactions.createIndex({"tenant_id": 1, "user_id": 1, "context.app_id": 1})
db.interactions.createIndex({"tenant_id": 1, "user_id": 1, "context.adapter": 1})
db.interactions.createIndex({"tenant_id": 1, "ts": -1})

// items collection
db.items.createIndex({"tenant_id": 1, "item_id": 1}, {unique: true})
db.items.createIndex({"tenant_id": 1, "facets.app_id": 1})

// counters collection
// No explicit index needed: MongoDB automatically creates a unique index on "_id"
```

---

## 🎯 **Usage Patterns**

### **Data Ingestion Flow:**
1. **Create Tenant** → `tenants` collection
2. **Create App** → `apps` collection  
3. **Ingest Items** → `items` collection (with app_id & adapter in facets)
4. **Ingest Events** → `interactions` collection (with app_id & adapter in context)
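
Steps 3 and 4 stamp the owning app's `app_id` and `adapter` onto each record. A minimal sketch of that enrichment follows, assuming the app document has already been loaded. `stampItem` and `stampInteraction` are hypothetical names and not the ingestion service's actual functions.

```javascript
// Sketch of ingestion-time enrichment: copy app_id and adapter from the
// app document into item facets and interaction context.
function stampItem(item, app) {
    return {
        ...item,
        tenant_id: app.tenant_id,
        facets: { ...item.facets, app_id: app.app_id, adapter: app.adapter }
    };
}

function stampInteraction(event, app) {
    return {
        ...event,
        tenant_id: app.tenant_id,
        context: { ...event.context, app_id: app.app_id, adapter: app.adapter }
    };
}

const demoApp = { tenant_id: "1", app_id: "1", adapter: "ott" };
const stampedEvent = stampInteraction(
    { user_id: "user_12345", event_type: "view", context: { device: "mobile" } },
    demoApp
);
console.log(stampedEvent.context); // { device: 'mobile', app_id: '1', adapter: 'ott' }
```

Both helpers return new objects rather than mutating their input, which keeps the raw payload available for logging or retries.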

### **Analytics Queries:**
```javascript
// All user interactions
db.interactions.find({"tenant_id": "1", "user_id": "user_123"})

// Filter by adapter
db.interactions.find({
    "tenant_id": "1", 
    "user_id": "user_123",
    "context.adapter": "ott"
})

// Filter by app
db.interactions.find({
    "tenant_id": "1",
    "user_id": "user_123", 
    "context.app_id": "1"
})

// Combined filter
db.interactions.find({
    "tenant_id": "1",
    "user_id": "user_123",
    "context.app_id": "1",
    "context.adapter": "ott"
})
```
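
The four queries above differ only in which optional `context` fields they include, so the filter can be built by one small helper. This is a sketch; `interactionFilter` is a hypothetical name.

```javascript
// Build an interactions filter. tenantId and userId are always included,
// appId and adapter only when provided.
function interactionFilter({ tenantId, userId, appId, adapter }) {
    const filter = { tenant_id: tenantId, user_id: userId };
    if (appId !== undefined) filter["context.app_id"] = appId;
    if (adapter !== undefined) filter["context.adapter"] = adapter;
    return filter;
}

console.log(interactionFilter({ tenantId: "1", userId: "user_123", adapter: "ott" }));
// { tenant_id: '1', user_id: 'user_123', 'context.adapter': 'ott' }
```

In mongosh the result can be passed straight to `db.interactions.find(...)`. Keeping `tenant_id` and `user_id` first matches the leading fields of the interactions indexes.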

---

## 🚀 **Schema Evolution**

### **Recent Additions (v2.0):**
- ✅ **Auto-incremental IDs**: `counters` collection for tenant_id and app_id generation
- ✅ **App management**: Enhanced `apps` collection with adapter field
- ✅ **Adapter context**: Added `adapter` field to interactions and items
- ✅ **User analytics**: Enhanced query patterns for cross-domain analytics

### **Schema Benefits:**
- 🎯 **Multi-tenant**: Tenant-scoped documents carry `tenant_id`, so queries and indexes isolate tenants logically
- 📱 **Multi-app**: Support for multiple apps per tenant
- 🔧 **Multi-domain**: Adapter-specific metadata and analytics
- 📊 **Comprehensive analytics**: User behavior across domains
- 🚀 **Scalable**: Compound indexes lead with `tenant_id` to match the main query patterns

---

**📊 Total Collections: 16+**  
**🎯 Primary Use: Domain-agnostic recommendation engine with comprehensive analytics**  
**⚡ Optimized for: Multi-tenant, multi-app, cross-domain user behavior tracking**
