Clients/Autosys/Autobot/vault-outstanding-steps.md

Content

# AI Client Vault - Outstanding Steps

## ~~1. Slack Bot Setup~~
- ~~Create a Slack app at https://api.slack.com/apps~~
- ~~Add OAuth scopes: `channels:history`, `channels:read`, `users:read`~~
- ~~Install to workspace and copy the Bot User OAuth Token (`xoxb-...`)~~
- ~~Provide the token to be added to `/opt/vault/.env` as `SLACK_BOT_TOKEN`~~
- ~~Configure channel-to-client mapping in `/opt/vault/config/slack_channels.yaml`:~~
  ```yaml
  channels:
    general-gq3: GQ3
    project-reno: Reno
  ```
- ~~Code is ready: `ingestors/slack_ingest.py` + `vault-slack-ingest.timer` (5 min)~~

## 2. rclone Google Drive Backup
- Needs interactive browser OAuth via `rclone config` on datavault
- Run: `ssh datavault.local` then `sudo -u vault rclone config`
  - Choose: New remote → Google Drive → accept defaults → browser auth
- Name the remote `gdrive`
- Cron job already configured for 3 AM nightly backup
- Backs up: vault files, database dump, config to `gdrive:vault-backup/`
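
Once the `gdrive` remote exists, the backup step can be checked and run along these lines. This is a minimal sketch, not the cron job that is already configured: the staging directory `/opt/vault/backup-staging` and both function names are hypothetical. Only the remote name and the `gdrive:vault-backup/` target come from the notes above. `rclone listremotes` and `rclone copy` are standard rclone subcommands.

```python
import subprocess

REMOTE_NAME = "gdrive"                # remote name from the notes above
REMOTE_TARGET = "gdrive:vault-backup/"  # backup target from the notes above


def remote_is_configured(listremotes_output: str, name: str = REMOTE_NAME) -> bool:
    """Check the text printed by `rclone listremotes` for the given remote.

    rclone prints one remote per line with a trailing colon, e.g. "gdrive:".
    """
    remotes = {line.strip().rstrip(":") for line in listremotes_output.splitlines()}
    return name in remotes


def build_backup_command(source_dir: str, target: str = REMOTE_TARGET) -> list[str]:
    """Return the rclone invocation that copies source_dir to the remote."""
    return ["rclone", "copy", source_dir, target]


def run_backup(source_dir: str) -> int:
    """Run the copy and return rclone's exit code (non-zero means failure)."""
    return subprocess.run(build_backup_command(source_dir), check=False).returncode
```

Feeding the output of `sudo -u vault rclone listremotes` to `remote_is_configured` is a quick way to confirm the OAuth step worked before the 3 AM job runs.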

---

## Feature Ideas

### Obsidian Plugin - Vault Search
Build a custom Obsidian plugin that connects to the Vault RAG API (port 8900) directly from Obsidian. Features:
- ~~**Command palette search**: `Ctrl+Shift+V` opens a query modal, ask questions in natural language, get answers with source links that open directly in Obsidian~~
- **Sidebar panel**: Persistent search panel showing entity matches, related documents, and recent queries
- **Inline suggestions**: When editing a client note, auto-suggest related entities and documents from the same client
- **Entity hover cards**: Hover over a known entity (server name, contact, etc.) to see a popup with all stored info about it
- **Status bar widget**: Shows sync status, index health, and document count
- **Right-click context**: Select text in any note → "Search Vault for this" sends it as a query
- Tech: Obsidian plugin API (TypeScript), calls `POST /query` and `GET /entities/search` endpoints
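
The two endpoint calls can be sketched as below. The plugin itself would be TypeScript; this Python version only illustrates the request shapes. The host `localhost`, the JSON field `query`, and the query parameter `q` are assumptions, since the notes name only the port and the two endpoints.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8900"  # port from the notes; host is an assumption


def build_query_request(question: str) -> urllib.request.Request:
    """Build a POST /query request. The JSON field name "query" is assumed."""
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_entity_search_request(term: str) -> urllib.request.Request:
    """Build a GET /entities/search request. The "q" parameter name is assumed."""
    params = urllib.parse.urlencode({"q": term})
    return urllib.request.Request(
        f"{BASE_URL}/entities/search?{params}", method="GET"
    )
```

Passing either request to `urllib.request.urlopen` would send it. The builders are kept separate so the request shape can be inspected without a running API.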

### Client Onboarding Automation
- Auto-detect new Gmail labels matching `Clients/*` pattern and create folder structure
- Template system: when a new client folder is created, auto-populate with a `_template.md` containing standard fields (contacts, servers, VPN info, project scope)
- Onboarding checklist that tracks what data sources are connected per client
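
The label detection and folder templating could look roughly like this. It is a sketch under stated assumptions: the label list is passed in as plain strings (fetching labels from Gmail is out of scope here), nested labels such as `Clients/Reno/Invoices` are treated as belonging to client `Reno`, and the template body and function names are hypothetical.

```python
import fnmatch
from pathlib import Path

# Hypothetical template body covering the standard fields listed above.
TEMPLATE_BODY = (
    "# {client}\n\n## Contacts\n\n## Servers\n\n## VPN info\n\n## Project scope\n"
)


def client_names(labels: list[str]) -> list[str]:
    """Return sorted client names for labels matching the Clients/* pattern."""
    names = set()
    for label in labels:
        if fnmatch.fnmatchcase(label, "Clients/*"):
            name = label.split("/")[1]  # first path segment after "Clients/"
            if name:
                names.add(name)
    return sorted(names)


def ensure_client_folder(root: Path, client: str) -> bool:
    """Create the client folder with its _template.md; return True if new."""
    folder = root / "Clients" / client
    if folder.exists():
        return False  # existing clients are left untouched
    folder.mkdir(parents=True)
    template = TEMPLATE_BODY.format(client=client)
    (folder / "_template.md").write_text(template, encoding="utf-8")
    return True
```

Returning a boolean from `ensure_client_folder` gives the onboarding checklist a simple signal for when a client was first seen.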

### Email Thread Intelligence
- ~~Group emails by thread ID and build conversation summaries~~ **DONE** (thread files in `emails/threads/` with GPT-4.1-mini summaries)
- Detect action items and deadlines from email threads using GPT
- Auto-create task entities from emails containing phrases like "please do", "by Friday", "action required"
- Weekly digest: auto-generated summary of client activity across all channels
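
A cheap phrase filter could run before any GPT call so that only likely action items are sent for extraction. The sketch below is an assumption about how such a pre-filter might work, not the existing pipeline: the patterns cover only the three example phrases above, and the sentence splitting is deliberately naive.

```python
import re

# Patterns for the example phrases in the notes; "by Friday" is widened to any weekday.
ACTION_PATTERNS = [
    re.compile(r"\bplease do\b", re.IGNORECASE),
    re.compile(
        r"\bby (monday|tuesday|wednesday|thursday|friday|saturday|sunday)\b",
        re.IGNORECASE,
    ),
    re.compile(r"\baction required\b", re.IGNORECASE),
]


def find_action_sentences(text: str) -> list[str]:
    """Return the sentences that contain at least one action phrase."""
    # Naive split on whitespace that follows ".", "!" or "?".
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if any(p.search(s) for p in ACTION_PATTERNS)]
```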

### Slack Lists Ingestion — BLOCKED
Slack Lists are used for task tracking with comment threads on each item. Important project context lives in these comment threads.
- **API probed (2026-02-19)**: `lists:read` scope added, tested against Autosys Tasks list (F09DBBTNQFJ, 202 items)
- **Task data available**: All fields work — status, assignee, due date, priority, type, client, description, subtasks. Linked Message field gives `channel_id` + `ts` for `conversations.replies`.
- **BLOCKER: List item comments are NOT exposed via API**. The comment threads added directly on task items in the Slack UI have no API surface: `items.info` returns no conversation/thread metadata, `files.info` shows `comments_count: 0`, and the download export is CSV-only. These comment threads are the most valuable data.
- **Decision**: Skip this integration until Slack exposes list item comments. Task metadata alone (status/assignee/dates) isn't worth the complexity without the discussion threads.
- **Key finding**: List comment threads live in a pseudo-channel whose `channel_id` is the list file ID with the leading `F` replaced by `C` (e.g. list `F09DBBTNQFJ` -> channel `C09DBBTNQFJ`). Each task item's comments form a thread under that channel, keyed by the item's `thread_ts`. The URL format is `/archives/{channel}/p{ts}?thread_ts={item_ts}`. A bot token gets `channel_not_found` for this channel, so Slack has not opened this channel type to bot tokens yet.
- **Revisit when**: Slack grants bot access to list pseudo-channels, or adds comment endpoints to the Lists API
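
For when access opens up, the ID mapping and URL format from the key finding can be captured in two small helpers. This is a sketch: the function names and sample timestamps are hypothetical, and stripping the dot from `ts` in the `p{ts}` segment is an assumption carried over from regular Slack message permalinks.

```python
def list_channel_id(list_file_id: str) -> str:
    """Map a list file ID (F...) to its pseudo-channel ID (C...)."""
    if not list_file_id.startswith("F"):
        raise ValueError(f"expected a file ID starting with 'F': {list_file_id!r}")
    return "C" + list_file_id[1:]


def comment_thread_path(list_file_id: str, ts: str, item_ts: str) -> str:
    """Build the /archives/... path for one comment in a list item's thread."""
    channel = list_channel_id(list_file_id)
    # Assumption: the p{ts} segment drops the dot, as in normal message permalinks.
    compact_ts = ts.replace(".", "")
    return f"/archives/{channel}/p{compact_ts}?thread_ts={item_ts}"
```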

### Document Change Tracking
- Track what changed between re-syncs of Google Docs (diff detection)
- Changelog view per document showing what was added/removed/modified
- Notification system: alert when a critical document (e.g., alarm procedures) is modified
- Version history browser in the web UI
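
Diff detection between two synced snapshots can be prototyped with the standard library. The sketch below assumes each re-sync yields the document as plain text. The function name and the line-level granularity are illustrative choices, not a description of existing code.

```python
import difflib


def summarize_changes(old_text: str, new_text: str) -> dict[str, list[str]]:
    """Return the lines added and removed between two document snapshots.

    A modified line shows up as one removed line plus one added line.
    """
    added: list[str] = []
    removed: list[str] = []
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        if line.startswith("+ "):
            added.append(line[2:])
        elif line.startswith("- "):
            removed.append(line[2:])
        # Lines starting with "  " (unchanged) or "? " (hints) are ignored.
    return {"added": added, "removed": removed}
```

A non-empty result for a document flagged as critical would be the trigger for the notification described above.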

### Smart Alerts and Monitoring
- Watch for new entities that match critical patterns (e.g., new server IPs, credential references)
- Stale document detection: flag documents that haven't been updated in X days
- Missing data alerts: detect clients with no recent emails, no contacts, or incomplete entity profiles
- Integration with SIGNL4/PagerDuty for critical document change alerts
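
Stale document detection reduces to comparing each document's last update against a cutoff. A minimal sketch, assuming the index can supply a mapping from document path to last-updated timestamp (the mapping shape and function name are hypothetical):

```python
from datetime import datetime, timedelta


def stale_documents(
    last_updated: dict[str, datetime], now: datetime, max_age_days: int
) -> list[str]:
    """Return sorted paths of documents not updated within max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(path for path, updated in last_updated.items() if updated < cutoff)
```

Passing `now` in explicitly keeps the check deterministic and easy to test.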

### Multi-Modal Ingestion
- **Attachment extraction**: Pull PDF/image attachments from Gmail and index them (OCR via tesseract for images)
- **Voice memo transcription**: Drop audio files in a client folder, auto-transcribe with Whisper, index the text
- **Photo/whiteboard capture**: OCR handwritten notes or whiteboard photos dropped into client folders
- **Calendar integration**: Pull Google Calendar events tagged with client names, index meeting notes

### Client Dashboard Enhancements
- **Timeline view**: Chronological activity feed per client across all sources (emails, docs, Slack, manual notes)
- **Relationship graph**: Visual network showing connections between clients, contacts, servers, and systems
- **Search analytics**: Track most-queried topics to identify knowledge gaps
- **Export reports**: Generate client summary PDFs with all entities, recent activity, and key documents
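
The timeline view is essentially a merge of per-source event lists ordered by time. The sketch below assumes each source can be normalized into a common event record. The `Event` fields and the function name are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    when: datetime
    source: str   # e.g. "email", "docs", "slack", "note"
    summary: str


def client_timeline(*sources: list[Event]) -> list[Event]:
    """Merge events from all sources into one feed, newest first."""
    merged = [event for source in sources for event in source]
    return sorted(merged, key=lambda event: event.when, reverse=True)
```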

### Team Collaboration
- **Multi-user access**: API key authentication for the web UI, role-based access per client
- **Shared annotations**: Add notes/tags to entities and documents visible to all team members
- **Query history**: Shared log of questions asked, useful for team knowledge sharing
- **Audit trail**: Track who queried what and when for compliance

### Local LLM Fallback
- Run Ollama with a small model (Phi-3, Llama 3.2) as a fallback when OpenAI is unreachable
- Use local LLM for non-critical tasks (summaries, simple entity extraction) to reduce API costs
- Keep GPT-4.1-mini for high-accuracy tasks (complex entity extraction, RAG answers)
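
The routing rule described above can be expressed independently of any client library. In this sketch the two backends are plain callables, and the names `complete_with_fallback`, `primary`, and `fallback` are hypothetical. A real OpenAI or Ollama client raises its own exception types, which would need to be added to the `except` clause.

```python
from typing import Callable

Completion = Callable[[str], str]


def complete_with_fallback(
    prompt: str,
    primary: Completion,
    fallback: Completion,
    critical: bool = True,
) -> tuple[str, str]:
    """Return (backend_name, text) for the backend that answered.

    Non-critical tasks go straight to the local model to save API cost.
    Critical tasks use the remote model and fall back only when it is unreachable.
    """
    if not critical:
        return "local", fallback(prompt)
    try:
        return "remote", primary(prompt)
    except (ConnectionError, TimeoutError):
        # Assumption: unreachable backends surface as these built-in errors.
        return "local", fallback(prompt)
```

Returning the backend name alongside the text makes it possible to log how often the fallback path is taken.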

### Mobile Access
- Progressive Web App (PWA) wrapper around the web UI
- Quick query shortcut from phone home screen
- Push notifications for critical document changes or new client emails

Extracted Entities

| Type | Key | Value | Confidence | Evidence |
| --- | --- | --- | --- | --- |
| server | datavault hostname | datavault.local | 95% | Run: `ssh datavault.local` then `sudo -u vault rclone config` |
| server | Vault RAG API port | 8900 | 90% | Vault RAG API (port 8900) directly from Obsidian |
| server | Google Drive remote name | gdrive | 90% | Name the remote `gdrive` |
| task | rclone Google Drive backup setup | Run interactive OAuth config on datavault, cron job at 3 AM nightly | 90% | Run: `ssh datavault.local` then `sudo -u vault rclone config` |
| task | Slack Lists API access | Wait for Slack to grant bot access to list pseudo-channels or add comment endpoints | 85% | Revisit when Slack grants bot access to list pseudo-channels |

Updated: 2026-03-06 05:16:56.561534