Introduction

This is the technical overview of Cognitive Companion, written for developers and operators who want to deploy, inspect, or extend the system. If you are a family member or care partner deciding whether the system fits your household, start with Cognitive Companion for families, which covers what the system does without the implementation detail.

Cognitive Companion is a privacy-first, on-premise AI system for senior care in multigenerational households. It combines safety monitoring, cognitive support, and a personal knowledge repository. Camera feeds and sensor data flow through composable rule-based pipelines; vision and language models run entirely on local hardware.

The problem

Seniors experiencing cognitive decline face a difficult tradeoff: full-time monitoring that strips away independence, or no monitoring at all. Existing solutions tend toward one extreme or the other, and each has a cost:

  • Basic motion sensors trigger too many false alarms and lack context awareness.
  • Cloud-based AI cameras send private footage off-premises and require internet connectivity.
  • Full automation systems remove the daily routines that maintain cognitive function.

The approach

Cognitive Companion addresses this through six design choices:

  1. Context-aware perception. Vision LLMs analyze what is happening in each frame rather than flagging all motion. A person in the kitchen at noon is routine; at 3 AM, it warrants attention.
  2. Composable pipelines. Each rule assembles its own graph of steps and edges. Rules are not constrained to a fixed trigger-action template.
  3. Reminders over automation. The system delivers suggestions and alerts without acting on the senior's behalf, preserving daily routines and decision-making agency.
  4. Local inference. All models run on-premise through vLLM, llama.cpp, and sibling services. Camera frames stay in local MinIO storage. No footage leaves the network unless an outbound channel is explicitly configured.
  5. Multigenerational interfaces. Caregivers receive alerts through Telegram, webhooks, or the admin console. Seniors interact through voice, popup notifications, e-ink displays, and TTS.
  6. Personal knowledge repository. Caregivers curate facts about people, places, and routines. The system generates narrated info cards, review-gated quizzes, and a voice Q&A interface backed by RAG, helping seniors stay connected to their own history.
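To make the "graph of steps and edges" idea from item 2 concrete, here is a minimal sketch of a pipeline whose steps return the name of an output port, with edges keyed by (step, port). The `Pipeline` and `Step` classes, the port names, and the night-time hours are assumptions made for illustration; this overview does not show the project's actual schema.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical types: the real step and edge schema is not shown in this overview.
Context = dict
Handler = Callable[[Context], str]  # returns the name of the output port taken


@dataclass
class Step:
    name: str
    handler: Handler


@dataclass
class Pipeline:
    steps: dict = field(default_factory=dict)
    # Maps (source step, output port) to the target step.
    edges: dict = field(default_factory=dict)

    def add_step(self, name: str, handler: Handler) -> None:
        self.steps[name] = Step(name, handler)

    def connect(self, source: str, port: str, target: str) -> None:
        self.edges[(source, port)] = target

    def run(self, start: str, ctx: Context, max_steps: int = 100) -> list:
        """Walk the graph from `start`, following the port each step returns."""
        visited = []
        current: Optional[str] = start
        while current is not None and len(visited) < max_steps:
            visited.append(current)
            port = self.steps[current].handler(ctx)
            # A port with no outgoing edge ends the run.
            current = self.edges.get((current, port))
        return visited


def is_night(ctx: Context) -> str:
    # Condition step: the same room is routine at noon but notable at 3 AM.
    # The 22:00-06:00 range is an illustrative assumption.
    return "true" if ctx["hour"] < 6 or ctx["hour"] >= 22 else "false"


def notify(ctx: Context) -> str:
    message = f"Activity in {ctx['room']} at {ctx['hour']}:00"
    ctx["alerts"] = ctx.get("alerts", []) + [message]
    return "done"


pipeline = Pipeline()
pipeline.add_step("condition", is_night)
pipeline.add_step("notification", notify)
pipeline.connect("condition", "true", "notification")
```

Because the "false" port of the condition step has no outgoing edge, a noon event stops after the condition, while a 3 AM event continues to the notification step. A rule with a different shape only needs different steps and edges, not a different executor.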

How it works

Event flow:

  1. Edge devices (cameras, sensors, RTSP streams) send data to the backend or stream into the continuous-tracking service.
  2. The EventAggregator batches frames per sensor with windowing and cooldown.
  3. The RulesEngine matches each event against enabled rules using context filters, dependencies, and rate limits.
  4. Each matching rule's composable pipeline executes via the PipelineExecutor.
  5. Pipeline steps perform person identification, scene analysis, presence queries, LLM reasoning, condition branching, wait and resume, activity recording, daily reports, knowledge retrieval, and more.
  6. Outputs flow to any combination of channels: PWA popup, Telegram, e-ink display, HA Speaker TTS, PWA TTS announcement, PWA Realtime AI, and outbound webhooks.
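Step 2 above describes per-sensor batching with windowing and cooldown. The sketch below shows one plausible reading of that behavior. The `FrameBatcher` class, its parameter names, and the default durations are hypothetical and are not taken from the EventAggregator's code.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FrameBatcher:
    """Hypothetical stand-in for per-sensor windowing and cooldown."""

    window_s: float = 5.0     # frames arriving within this span form one batch
    cooldown_s: float = 30.0  # minimum gap after a batch before buffering resumes
    _pending: dict = field(default_factory=lambda: defaultdict(list))
    _window_start: dict = field(default_factory=dict)
    _last_emit: dict = field(default_factory=dict)

    def add(self, sensor_id: str, frame: str, now: float):
        """Buffer a frame; return the batch when the window closes, else None."""
        last = self._last_emit.get(sensor_id)
        if last is not None and now - last < self.cooldown_s:
            return None  # still cooling down: the frame is dropped
        start = self._window_start.setdefault(sensor_id, now)
        self._pending[sensor_id].append(frame)
        if now - start < self.window_s:
            return None  # window still open: keep buffering
        # The first frame at or past the window length closes the window
        # and is included in the emitted batch.
        batch = self._pending.pop(sensor_id)
        del self._window_start[sensor_id]
        self._last_emit[sensor_id] = now
        return batch
```

In this sketch each sensor keeps its own window and cooldown, so a burst from one camera does not delay or suppress events from another. Timestamps are passed in explicitly rather than read from a clock, which keeps the behavior easy to test.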

Key capabilities

| Capability | Description |
|---|---|
| 23 pipeline step types | llm_call, person_identification, scene_analysis, image_crop, semantic_memory_query, semantic_memory_write, object_trend_analysis, presence_query, home_state, notification, ha_action, activity_detection, activity_session_start, activity_session_end, daily_report, verification, condition, wait, interactive_prompt, info_card, quiz_start, recamera_media_poll, cts_window_poll. |
| 7 notification channels | pwa_popup_text, pwa_realtime_ai, pwa_tts_announcement, telegram, eink, ha_speaker_tts, webhook. |
| 13 context filters | room, time_range, day_of_week, person_presence, person_activity, room_transition, person_movement_memory, scene_contains, scene_trend, home_state, presence_status, presence_dwell, dementia_signal. |
| 8 trigger types | sensor_event, cron, manual, webhook (HMAC), telegram (bot command), occupancy_duration, cts_window, dementia_signal. |
| Person tracking | ArcFace face recognition fused with HA presence sensors, with whole-house location. |
| Multi-camera tracking | Optional continuous-tracking-service for floor-plane Kalman world tracking, Bayesian identity resolution, and dementia signal generation. |
| Activity tracking | Detect and record activities; duration-aware sessions; end-of-day wellness rollup with optional LLM summary. |
| Voice companion | Realtime conversations via Google Gemini Live with WebSocket audio and tool calling. |
| Knowledge repository | Caregiver-curated facts with narrated info cards, review-gated quizzes, and voice Q&A backed by RAG (Triton embeddings + pgvector + LLM synthesis). |
| Visual pipeline builder | Graph canvas with nodes, edges, output ports, dynamic step palette, and per-step config dialogs. |
| Presence fusion | Priority-ordered chain: bed sensor, CTS, HA device tracker, fallback. Configured in config/presence.yaml. |
| E-ink displays | Per-device notification images with template editor and refresh suppression. |
| MCP tool server | Over 30 tools (read-only plus rule triggering, rule authoring, and interactive response recording). |
| Plugin systems | Step handlers, channels, and filters auto-discovered as Python files. |
| RBAC | API keys, hardware device keys, and fnmatch permission patterns. |
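Two access-control entries in the table, fnmatch permission patterns and HMAC-signed webhook triggers, can be illustrated with the Python standard library alone. This is a sketch under assumptions: the permission strings such as `rules:*`, the helper names, and the choice of SHA-256 with a hex-encoded signature are illustrative and not confirmed by this overview.

```python
import hashlib
import hmac
from fnmatch import fnmatchcase


def is_allowed(granted_patterns, permission: str) -> bool:
    """True if any granted pattern matches the requested permission string."""
    # fnmatchcase is used so matching stays case-sensitive on every platform.
    return any(fnmatchcase(permission, pattern) for pattern in granted_patterns)


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC of the request body. SHA-256 is an assumption of this sketch."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature: str) -> bool:
    """Check a received signature against one computed from the shared secret."""
    # compare_digest avoids leaking how many characters matched through timing.
    return hmac.compare_digest(sign(secret, body), signature)
```

With grants such as `["rules:*", "events:read"]`, a key could trigger or author rules but only read events. A webhook receiver would compute the signature over the raw request body before parsing it, since re-serialized JSON may not match the bytes that were signed.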

Technology stack

| Layer | Technology |
|---|---|
| Backend | Python 3.14, FastAPI, SQLAlchemy 2.0, Pydantic 2, APScheduler |
| Frontend | Vue 3, Vuetify 3, Vite |
| Database | PostgreSQL 18 (TimescaleDB, PostGIS, pgvectorscale) with Alembic migrations |
| Vision LLM | Cosmos-Reason2-8B via vLLM |
| General LLM | Gemma 4 26B via llama.cpp llama-server |
| Voice | Google Gemini 2.5 Flash (Live API) |
| Face recognition | InsightFace buffalo_l with ArcFace embeddings |
| Scene analysis | YOLO11x, Florence-2-large, CLIP ViT-L/14 |
| Semantic memory | PostgreSQL + pgvectorscale |
| Knowledge embeddings | embeddinggemma-300m via Triton Inference Server |
| Multi-camera tracking | YOLO26L + SOLIDER-REID + RTMPose + floor-plane Kalman world tracker (Triton for inference) |
| Object storage | MinIO (S3-compatible) |
| Logging | Python stdlib logging via a thin BoundLogger |


Released under the AGPL-3.0 License.