# Datavault — Hetzner Cloud Server
Migrated from local Parallels VM (10.211.55.13) on 2026-03-03.
---
## Server
| Field | Value |
|---|---|
| Provider | Hetzner Cloud |
| Plan | CX22 (2 vCPU, 4GB RAM, 40GB disk) |
| OS | Ubuntu 24.04.3 LTS aarch64 |
| Public IP | 89.167.102.34 |
| Tailscale IP | 100.113.209.23 |
| Tailscale hostname | autosys-vault |
| SSH | root (key-based only) |
| Vault user | vault (uid 1000) |
| Location | Helsinki, Finland (hel1) |
---
## DNS & URLs
All DNS managed at [Squarespace Domains](https://domains.squarespace.com) under autosysapp.com.
| Subdomain | A Record | Purpose |
|---|---|---|
| vault.autosysapp.com | 89.167.102.34 | CouchDB (Obsidian LiveSync) |
| dashboard.autosysapp.com | 89.167.102.34 | Vault Web UI (password protected) |
TLS certificates auto-provisioned by Caddy via Let's Encrypt.
---
## Web Access
### Obsidian LiveSync (CouchDB)
- **URL:** https://vault.autosysapp.com
- **Database:** obsidian_vault
- **Username:** admin
- **Password:** CouchVault2026
- **Docs:** ~52,854 (chunked from ~5.8k Obsidian files)
- **Max document size:** 50MB
### Vault Dashboard (Management UI)
- **URL:** https://dashboard.autosysapp.com
- **Username:** mason
- **Password:** vault2026
- **Features:** RAG query, client browser, job triggers, Slack/reMarkable/GDrive config, job status
### Vault API (Tailscale only)
- **URL:** http://100.113.209.23:8900
- **Health:** http://100.113.209.23:8900/health
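
A minimal sketch of polling the health endpoint from another Tailscale device, using only the Python standard library. The helper names `health_url` and `fetch_health` are hypothetical, and the shape of the JSON body is an assumption: this document only says the endpoint reports DB and Redis status.

```python
import json
import urllib.request

# Tailscale-only base URL taken from this document.
VAULT_API = "http://100.113.209.23:8900"


def health_url(base_url: str = VAULT_API) -> str:
    """Build the /health URL, tolerating a trailing slash on the base URL."""
    return base_url.rstrip("/") + "/health"


def fetch_health(base_url: str = VAULT_API, timeout: float = 5.0) -> dict:
    """GET /health and return the decoded body.

    Assumption: the endpoint returns a JSON object. Its keys are not
    documented here, so callers should inspect the result themselves.
    """
    with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_health()` only works from a machine on the tailnet; from elsewhere the request should time out, because UFW only allows port 8900 on the tailscale0 interface.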
---
## Credentials
| Service | Username | Password | Port | Bind |
|---|---|---|---|---|
| CouchDB | admin | CouchVault2026 | 5984 | 127.0.0.1 |
| PostgreSQL | vaultdb_user | BNE9tWic7O6Xdfk5kKdkrJ5VHnINfO1Q | 5432 | 127.0.0.1 |
| Redis | — | VaultRedis2026 | 6379 | 127.0.0.1 |
| Dashboard (Caddy) | mason | vault2026 | 443 | 0.0.0.0 |
- **PostgreSQL database:** vaultdb (pgvector extension, ~2,189 indexed documents)
- **CouchDB cookie:** brumbrum42vault
- **CouchDB CORS:** enabled, require_valid_user=true
---
## Firewall (UFW)
Default: deny incoming, allow outgoing.
| Rule | Port/Interface | Purpose |
|---|---|---|
| ALLOW IN | tailscale0 (all) | Full access over Tailscale VPN |
| ALLOW IN | 443/tcp | HTTPS (Caddy: CouchDB + Dashboard) |
| ALLOW IN | 41641/udp | Tailscale WireGuard |
All backend services bind to 127.0.0.1. Only Caddy (443) is publicly reachable.
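
One way to spot-check the loopback-only claim is to parse the output of `ss -ltn` and list every TCP listener that is not bound to a loopback address. This is a sketch, not part of the deployed tooling; `non_loopback_listeners` is a hypothetical helper name.

```python
import ipaddress


def non_loopback_listeners(ss_output: str) -> list[tuple[str, int]]:
    """Return (address, port) for each LISTEN socket not bound to loopback.

    Expects the text printed by `ss -ltn`; the header line is skipped
    because its first field is not "LISTEN".
    """
    listeners = []
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue
        # The local address is the fourth column, e.g. "127.0.0.1:5984".
        addr, _, port = fields[3].rpartition(":")
        # Drop IPv6 brackets and any "%interface" suffix before parsing.
        host = addr.strip("[]").split("%", 1)[0]
        try:
            loopback = ipaddress.ip_address(host).is_loopback
        except ValueError:
            loopback = False  # "*" means all interfaces
        if not loopback:
            listeners.append((addr, int(port)))
    return listeners
```

Feed it the captured output, for example `non_loopback_listeners(subprocess.run(["ss", "-ltn"], capture_output=True, text=True).stdout)`. Caddy is expected to appear in the result, and any listener on a Tailscale address will appear too, so read the list as a prompt for review rather than a pass/fail check.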
---
## Always-On Services
| Service | Description | User | Port |
|---|---|---|---|
| vault-api | FastAPI RAG API + Web UI | vault | 8900 |
| vault-slack-bot | AutoBot (Slack Socket Mode) | vault | — |
| vault-watcher | Filesystem watcher (inotify) | vault | — |
| vault-worker | RQ job queue worker | vault | — |
| livesync-bridge | CouchDB <-> filesystem sync (Deno) | vault | — |
| couchdb | Apache CouchDB 3.5.1 | couchdb | 5984 |
| redis-server | Redis key-value store | redis | 6379 |
| postgresql@16-main | PostgreSQL 16 + pgvector | postgres | 5432 |
| caddy | Reverse proxy + auto-TLS | caddy | 80, 443 |
---
## Scheduled Jobs (systemd timers)
| Timer | Schedule | Description |
|---|---|---|
| vault-stray-images | Every 30s | Move stray images to vault_files/ |
| vault-gmail-ingest | Every 5 min | Ingest emails by Gmail label prefix Clients/ |
| vault-indexer | Every 10 min | Index new/changed vault files into PostgreSQL |
| vault-unsub-monitor | Every 10 min | Monitor email unsubscribe status |
| vault-slack-ingest | Every 15 min | Ingest Slack channel messages |
| vault-gdoc-sync | Every 30 min | Sync Google Docs to vault |
| vault-gdrive-ingest | Every 30 min | Ingest Google Drive files |
| vault-remarkable-ingest | Every hour | Ingest reMarkable notebooks and PDFs |
| vault-reconcile | Daily 04:00 UTC | Reconcile vault state |
| vault-task-digest | Weekly Sun 16:00 UTC | Generate task due digest |
---
## API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check (DB + Redis status) |
| GET | / | Web UI home page |
| GET | /clients | List all clients |
| GET | /clients/{client}/entities | List entities for a client |
| GET | /clients/{client}/entities/{type} | List entities by type |
| GET | /entities/search | Search entities |
| POST | /query | RAG query (semantic search + LLM) |
| GET | /preview/{client}/{path} | Preview a document |
| GET | /ui/client/{name} | Client detail page |
| GET | /ui/doc/{id} | Document detail page |
| GET | /ui/query | Query interface |
| GET | /ui/jobs | Job status dashboard |
| POST | /api/trigger/{service} | Manually trigger an ingest service |
| POST | /api/timer/{name}/toggle | Enable/disable a timer |
| GET | /api/digest/config | Get digest configuration |
| POST | /api/digest/config | Update digest configuration |
| GET | /ui/slack | Slack config page |
| POST | /ui/slack | Update Slack config |
| GET | /api/slack/channels | List Slack channels |
| GET | /ui/remarkable | reMarkable config page |
| POST | /ui/remarkable | Update reMarkable config |
| GET | /ui/gdrive | Google Drive config page |
| POST | /ui/gdrive | Update GDrive config |
| GET | /ui/unsub | Unsubscribe monitor page |
| POST | /ui/unsub | Update unsub config |
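
A hedged sketch of calling the RAG endpoint from a script. The request body is an assumption: the table above does not document the schema of `POST /query`, so the `question` field is hypothetical and should be checked against `/opt/vault/api/main.py` before use. `build_query_request` and `send_request` are illustrative names.

```python
import json
import urllib.request


def build_query_request(base_url: str, question: str) -> urllib.request.Request:
    """Build a JSON POST to /query.

    Assumption: the API accepts {"question": "..."}; the real field name
    is not stated in this document.
    """
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_request(req: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the response, assumed to be JSON."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Over Tailscale the base URL would be `http://100.113.209.23:8900`. Going through `https://dashboard.autosysapp.com` would additionally require the HTTP basic auth that Caddy enforces.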
---
## File Paths
### Application
| Path | Contents |
|---|---|
| /opt/vault/ | Application root |
| /opt/vault/.venv/ | Python 3.12.3 virtual environment |
| /opt/vault/.env | Environment variables and secrets |
| /opt/vault/api/main.py | FastAPI application (72KB) |
| /opt/vault/ingestors/ | All ingestor modules |
| /opt/vault/indexer/ | File indexer + watcher |
| /opt/vault/taskqueue/ | RQ worker + jobs |
| /opt/vault/scripts/ | Slack bot, utilities |
| /opt/vault/config/ | YAML configs |
| /opt/vault/state/ | Sync cursors and state |
| /opt/vault/logs/ | Application logs |
### Ingestors
| File | Description |
|---|---|
| ingestors/gmail_ingest.py | Gmail ingestion by label |
| ingestors/slack_ingest.py | Slack channel message ingestion |
| ingestors/gdrive_ingest.py | Google Drive file ingestion |
| ingestors/gdoc_ingest.py | Google Docs sync |
| ingestors/remarkable_ingest.py | reMarkable Cloud (notebooks + PDFs) |
| ingestors/unsub_monitor.py | Email unsubscribe monitoring |
| ingestors/enex_convert.py | Evernote ENEX conversion |
### Config Files
| File | Description |
|---|---|
| config/slack_channels.yaml | Slack channels to ingest + bot token |
| config/gdrive_config.yaml | Google Drive folder mappings |
| config/remarkable_config.yaml | reMarkable device/user tokens + folder mappings |
| config/remarkable_cloud_cache.json | Cached reMarkable cloud item list |
| config/digest_config.yaml | Task digest configuration |
### State Files (sync cursors — critical, do not delete)
| File | Description |
|---|---|
| state/slack_cursors.json | Slack channel read positions |
| state/slack_canvas_state.json | Slack canvas sync state |
| state/gdoc_tracking.json | Google Docs revision tracking |
| state/gdrive_sync_state.json | Google Drive sync state (775KB) |
| state/remarkable_sync_state.json | reMarkable doc hash tracking |
| state/unsub_state.json | Email unsubscribe state |
### Google OAuth
| File | Description |
|---|---|
| /opt/vault/gmail_credentials.json | OAuth client credentials |
| /opt/vault/gmail_token.json | OAuth refresh token (may need re-auth from new IP) |
### Data
| Path | Contents |
|---|---|
| /srv/obsidian/Vault/ | Obsidian vault root (synced via LiveSync) |
| /home/vault/.cache/huggingface/ | Sentence transformer model (all-MiniLM-L6-v2, 88MB) |
### LiveSync Bridge
| Path | Contents |
|---|---|
| /opt/livesync-bridge/ | LiveSync bridge application (Deno) |
| /opt/livesync-bridge/dat/config.json | Bridge configuration |
| /opt/livesync-bridge/main.ts | Entry point |
### System Config
| Path | Contents |
|---|---|
| /etc/caddy/Caddyfile | Caddy reverse proxy config |
| /etc/redis/redis.conf | Redis config (requirepass set) |
| /home/vault/.deno/bin/deno | Deno 2.7.2 runtime |
---
## Caddy Configuration
```
vault.autosysapp.com {
    reverse_proxy localhost:5984
}

dashboard.autosysapp.com {
    basicauth {
        mason <bcrypt-hash>
    }
    reverse_proxy localhost:8900
}
```
---
## Environment Variables (.env)
| Variable | Value | Description |
|---|---|---|
| OPENAI_API_KEY | sk-proj-3hJW...juIA | OpenAI API key for RAG queries |
| CHAT_MODEL | gpt-4.1-mini | LLM model for chat |
| VISION_MODEL | gpt-4.1-mini | LLM model for vision tasks |
| EMBED_MODEL | all-MiniLM-L6-v2 | Local sentence transformer |
| EMBED_DIM | 384 | Embedding dimension |
| POSTGRES_DB | vaultdb | Database name |
| POSTGRES_USER | vaultdb_user | Database user |
| POSTGRES_PASSWORD | BNE9tWic7O6Xdfk5kKdkrJ5VHnINfO1Q | Database password |
| VAULT_ROOT | /srv/obsidian/Vault | Obsidian vault path |
| GMAIL_LABEL_PREFIX | Clients/ | Gmail label filter |
| MAX_CONTEXT_CHUNKS | 12 | RAG context window |
| REDIS_URL | redis://:VaultRedis2026@localhost:6379/0 | Redis connection |
| SLACK_BOT_TOKEN | (empty) | Loaded from config/slack_channels.yaml |
| SLACK_APP_TOKEN | xapp-1-A0AFT...8593 | Slack Socket Mode token |
| SIGNL4_API_KEY | 4b67a...814f | SIGNL4 alerting integration |
| HF_HUB_OFFLINE | 1 | Use local HuggingFace cache only |
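
For scripts that need these values outside the running services, a dependency-free reader for `KEY=VALUE` files can be sketched as follows. How the application itself loads `/opt/vault/.env` is not stated in this document, so `parse_env` is a hypothetical helper, not the project's loader.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments.

    Only the first "=" splits the line, so values such as a Redis URL
    that contain "=" or ":" are preserved. Matching surrounding quotes
    are removed. An empty value (e.g. SLACK_BOT_TOKEN=) stays "".
    """
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values
```

Typical use would be `env = parse_env(open("/opt/vault/.env").read())` followed by `int(env["EMBED_DIM"])`. The file holds secrets, so the result should never be printed or logged wholesale.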
---
## Data Flow
```
External Sources --> Ingestors --> /srv/obsidian/Vault/ --> Indexer --> PostgreSQL (embeddings)
                                            |
                                     LiveSync Bridge
                                            |
                                    CouchDB (chunks)
                                            |
                                Obsidian LiveSync Plugin
                                            |
                                 Obsidian (all devices)
```
1. **Ingestors** pull from Gmail, Slack, Google Drive, Google Docs, reMarkable Cloud
2. **Files** land in /srv/obsidian/Vault/Clients/{client}/{category}/
3. **LiveSync bridge** detects filesystem changes, chunks files into CouchDB
4. **Obsidian clients** sync via LiveSync plugin over HTTPS to CouchDB
5. **Indexer** generates embeddings (all-MiniLM-L6-v2), stores in PostgreSQL with pgvector
6. **Vault API** serves semantic search (RAG) queries against embeddings
7. **Slack bot (AutoBot)** answers questions via the API in DMs to Mason (U08EGEVRYSE)
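
Steps 5 and 6 come down to ranking stored chunk embeddings by similarity to the query embedding and keeping the best few. The sketch below shows that ranking in plain Python with cosine similarity. It is an illustration only: on the server the search runs inside PostgreSQL via pgvector, the vectors have 384 dimensions, and the distance metric the indexer actually uses is not stated in this document. `cosine_similarity` and `top_chunks` are hypothetical names.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_chunks(
    query_vec: list[float],
    chunks: list[tuple[str, list[float]]],
    limit: int = 12,  # mirrors MAX_CONTEXT_CHUNKS in .env
) -> list[str]:
    """Return the ids of the `limit` chunks most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, chunk[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in ranked[:limit]]
```

With toy two-dimensional vectors, `top_chunks([1.0, 0.0], [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.7, 0.7])], limit=2)` keeps `a` and `c`, the two chunks pointing closest to the query direction.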
---
## Obsidian LiveSync Client Setup
To connect a new Obsidian client:
1. Install "Self-hosted LiveSync" community plugin
2. Settings:
   - **URI:** https://vault.autosysapp.com
   - **Database:** obsidian_vault
   - **Username:** admin
   - **Password:** CouchVault2026
   - **Sync mode:** LiveSync
3. Enable "Sync on start" and "Periodic sync"
4. Leave end-to-end encryption disabled; none is configured for this vault
---
## Slack Integration
- **Bot name:** AutoBot
- **Socket Mode:** Enabled (xapp token in .env)
- **Bot token:** Stored in config/slack_channels.yaml
- **Mason's Slack user ID:** U08EGEVRYSE
- **Rule:** DMs to Mason only, never post to public channels
- **Slack app:** A0AFTAB1QH4
---
## reMarkable Integration
- **Cloud API:** hash-based sync (eu.tectonic.remarkable.com)
- **Auth:** webapp-prod.cloud.remarkable.engineering
- **Config:** config/remarkable_config.yaml (device_token, user_token, folder mappings)
- **Output:** Clients/{client}/remarkable/{subfolders}/{name}.md or .pdf
- **546 cloud items** (130 folders, 416 documents), ~143 mapped to vault
- **PDF support:** native PDFs written directly, notebooks rendered as .md with page images
- **State:** state/remarkable_sync_state.json (keyed by doc_id)
- **Gotcha:** move-stray-images.sh excludes */remarkable/* to avoid relocating PDFs
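
The output layout and the stray-image exclusion can be expressed as two small path helpers. This is a sketch under assumptions: the function names are hypothetical, and `move-stray-images.sh` may implement its exclusion differently (for example with `find -not -path`); only the `*/remarkable/*` pattern itself comes from this document.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def remarkable_output_path(
    client: str, subfolders: list[str], name: str, is_pdf: bool
) -> PurePosixPath:
    """Build Clients/{client}/remarkable/{subfolders}/{name}.md or .pdf."""
    suffix = ".pdf" if is_pdf else ".md"
    return PurePosixPath("Clients", client, "remarkable", *subfolders, name + suffix)


def is_excluded_from_stray_move(path: str) -> bool:
    """True if the path matches the */remarkable/* exclusion pattern."""
    return fnmatchcase(path, "*/remarkable/*")
```

Every path produced by `remarkable_output_path` matches the exclusion pattern, which is the property that keeps native PDFs and page images from being relocated to `vault_files/`.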
---
## Google Integration
- **Gmail:** OAuth (gmail_credentials.json + gmail_token.json), label prefix Clients/
- **Google Drive:** OAuth via same credentials, config in config/gdrive_config.yaml
- **Google Docs:** Synced, with revision tracking in state/gdoc_tracking.json
- **Note:** OAuth tokens may need re-auth from new server IP (run scripts/reauth_google.py)
---
## Software Versions
| Package | Version |
|---|---|
| Python | 3.12.3 |
| PostgreSQL | 16 + pgvector |
| CouchDB | 3.5.1 |
| Redis | 7.x |
| Caddy | 2.x (auto-TLS) |
| Deno | 2.7.2 |
| Tailscale | 1.94.2 |
| rclone | 1.73.1 |
| fail2ban | active |
| tesseract-ocr | installed |
| unattended-upgrades | enabled |
---
## Backup
- **rclone cron (vault user):** Daily at 03:00 UTC
- **Command:** rclone sync /srv/obsidian/Vault gdrive:ObsidianVaultBackup
- **Log:** /opt/vault/logs/backup.log
- **Status:** rclone remote not yet configured (needs `rclone config` for Google Drive)
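
Because the remote is not configured yet, a wrapper that refuses to run until `rclone listremotes` shows the remote can make the failure explicit instead of leaving it in the cron log. The sketch below assumes the remote will be named `gdrive`, as in the command above. The helper names are hypothetical, and the cron job currently calls `rclone` directly.

```python
import subprocess

SOURCE = "/srv/obsidian/Vault"
REMOTE = "gdrive"  # remote name taken from the backup command above
DEST = f"{REMOTE}:ObsidianVaultBackup"


def remote_configured(listremotes_output: str, remote: str = REMOTE) -> bool:
    """Check `rclone listremotes` output, which prints one "name:" per line."""
    names = {line.strip().rstrip(":") for line in listremotes_output.splitlines()}
    return remote in names


def backup_command(dry_run: bool = False) -> list[str]:
    """Build the rclone sync command, optionally with --dry-run."""
    cmd = ["rclone", "sync", SOURCE, DEST]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def run_backup(dry_run: bool = True) -> None:
    """Run the backup only if the remote exists; raise otherwise."""
    remotes = subprocess.run(
        ["rclone", "listremotes"], capture_output=True, text=True, check=True
    ).stdout
    if not remote_configured(remotes):
        raise RuntimeError(f"rclone remote {REMOTE!r} is not configured")
    subprocess.run(backup_command(dry_run), check=True)
```

`run_backup()` defaults to a dry run, which is a reasonable first step after `rclone config`, since `rclone sync` deletes destination files that are missing from the source.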
---
## Security
- All backend services bind to 127.0.0.1 (no direct public access); only Caddy listens on a public interface
- CouchDB accessed publicly only via Caddy (TLS + CouchDB native auth, require_valid_user=true)
- Dashboard accessed publicly via Caddy (TLS + HTTP basic auth)
- Vault API accessible only via Tailscale
- SSH accessible only via Tailscale (port 22 public rule to be removed post-migration)
- fail2ban active (SSH + Caddy)
- unattended-upgrades enabled for security patches
- Redis password protected (requirepass)
- CORS enabled on CouchDB (required for LiveSync plugin)
---
## Tailscale Network
| Device | IP | OS | Status |
|---|---|---|---|
| autosys-vault (this server) | 100.113.209.23 | Linux | Online |
| autosys3 | 100.108.147.39 | Windows | Online |
| iphone-15-pro-max | 100.87.241.74 | iOS | Offline |
| laras-macbook-pro | 100.69.164.90 | macOS | Offline |
| mason-ftview-vm | 100.100.20.8 | Windows | Offline |
---
## Admin Commands
```bash
# Check all services
systemctl status vault-api vault-slack-bot vault-watcher vault-worker livesync-bridge
# Check all timers
systemctl list-timers | grep vault
# View live logs
journalctl -f -u vault-api
journalctl -f -u vault-slack-bot
journalctl -u vault-gmail-ingest --since "1 hour ago"
# Restart a service
systemctl restart vault-api
# Trigger an ingest manually
systemctl start vault-gmail-ingest
# CouchDB status
curl -s http://admin:CouchVault2026@localhost:5984/obsidian_vault | python3 -m json.tool
# PostgreSQL doc count
sudo -u postgres psql -d vaultdb -c "SELECT count(*) FROM documents;"
# API health check
curl -s http://localhost:8900/health
# Disk usage
df -h /
# Check replication progress during migration (compare doc_count against the source)
curl -s http://admin:CouchVault2026@localhost:5984/obsidian_vault | python3 -c "import json,sys; d=json.load(sys.stdin); print(f'Docs: {d[\"doc_count\"]:,}')"
```
---
## Migration Notes
- Migrated from Parallels VM (10.211.55.13) on Mason's MacBook
- Both servers ARM64 (aarch64) + Python 3.12.3 — venv and model cache copied directly
- Old datavault services stopped and disabled post-migration
- CouchDB data transferred via native replication over HTTPS