
Privacy & data residency

Beside is built so that your captured activity, your wiki, and the model that indexes them all live on your machine. This page is the concrete version of that promise: which components run locally, where bytes are stored, what crosses the process boundary, and which knobs you have to make the boundary stricter.

If you remember only one sentence: out of the box, with the default config, no screenshots, no audio, no transcripts, no embeddings, and no wiki pages ever leave your computer.

What runs on your machine

Every part of the pipeline that handles raw user data is local:

  • Capture: node capture and the native macOS sidecar both run in your user session. Screenshots are written to your local filesystem; system audio is recorded via the macOS Core Audio Tap or ScreenCaptureKit backend that you choose, never streamed off-device.
  • OCR — done locally via Tesseract (tesseract.js) inside a runtime worker. Image bytes are read from your storage root, OCR runs in-process, and only the resulting text is persisted.
  • Audio transcription — done locally via the whisper CLI subprocess (configurable via capture.audio.whisper_command, default whisper, default model base). The runtime hands it a path to a file in ~/.beside/raw/audio/inbox/, reads the transcript back, and (if delete_audio_after_transcribe: true, the default) removes the audio file.
  • Indexing & wiki generation — runs through the configured model adapter. The default is Ollama, which means a local HTTP daemon at http://127.0.0.1:11434 running open-weight models (Gemma by default).
  • Embeddings — generated by the same local Ollama instance via the nomic-embed-text model and stored in your local SQLite database.
  • Storage — every byte (events, screenshots, audio, frames, embeddings, sessions, meetings, hook records, wiki pages) lives under ~/.beside/. There is no Beside cloud, no telemetry endpoint, and no upload step in the runtime.
  • MCP server — binds explicitly to 127.0.0.1:3456, so only processes running on your own machine can reach it. (See Export & MCP for transport details.)

There is no "Beside backend." The product, in its default configuration, literally cannot phone home — there is no code path that opens an outbound network connection with your data.
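One way to check the loopback claim for yourself is to test whether the endpoints named above resolve to a loopback address. The following is a minimal Python sketch, not part of Beside; the helper name `is_loopback_endpoint` is hypothetical, and the two URLs are the defaults quoted on this page.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_endpoint(url: str) -> bool:
    """Return True if the URL's host is a loopback IP address or 'localhost'."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        # Assumption: 'localhost' resolves to loopback, as it does on a normal setup.
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other DNS name is treated as non-local.
        return False


# Defaults quoted in this page (Ollama daemon and MCP server).
DEFAULT_ENDPOINTS = {
    "ollama": "http://127.0.0.1:11434",
    "mcp": "http://127.0.0.1:3456",
}

for name, url in DEFAULT_ENDPOINTS.items():
    print(f"{name}: loopback={is_loopback_endpoint(url)}")
```

A hosted adapter's base_url would fail this check, which is the signal that prompts are leaving the machine.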

What can cross the network (and only when you choose it)

A small number of things involve the network, and they are explicit and opt-in:

| Surface | When it talks to the network | Carries your data? |
| --- | --- | --- |
| Ollama auto-install (init) | One-time download of the Ollama binary + model weights from ollama.com and the chosen model registry. | No, only software downloads. |
| model.plugin: openai (or any hosted adapter you configure) | Every prompt sent to the configured base_url. | Yes, the prompt content. Disabled by default. |
| MCP HTTP server | When an agent on your machine connects to http://127.0.0.1:3456. | Loopback only. The agent itself may be hosted (Claude, ChatGPT); see below. |
| Desktop auto-update | Periodic check against the GitHub releases feed for sagivo/beside. | No, only release metadata. Can be disabled. |
| Plugin install (manual) | Pulling a third-party plugin from npm/GitHub. | No, only the plugin code. |

That’s the full list. If a future plugin you install adds a new network call, it does so under its own identity, not Beside’s.

What MCP agents actually receive

The MCP exporter is the most common way Beside ends up integrated with a hosted agent (Claude Desktop, Cursor, ChatGPT desktop). Even then the privacy boundary is preserved by design:

  • The MCP server does not return raw screenshots. It returns MemoryChunks — wiki excerpts, summaries, structured day events — capped at text_excerpt_chars (default 5000).
  • Frame access is gated through queries that return text and metadata, not asset bytes. An agent never sees your image files unless you explicitly build a tool that exposes them.
  • The MCP server binds to 127.0.0.1 by default, so even on a shared network the protocol surface is unreachable from another machine.
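The text-only, capped shape of what an agent receives can be illustrated roughly as follows. This is a sketch under stated assumptions: the `MemoryChunk` fields and the `to_chunk` helper are hypothetical (the real schema is not shown on this page); only the 5000-character default comes from the text above.

```python
from dataclasses import dataclass

TEXT_EXCERPT_CHARS = 5000  # default quoted above


@dataclass(frozen=True)
class MemoryChunk:
    # Hypothetical fields for illustration. Note there is no bytes field:
    # the chunk carries text and metadata only, never image data.
    source: str
    timestamp: str
    text: str


def to_chunk(source: str, timestamp: str, text: str,
             limit: int = TEXT_EXCERPT_CHARS) -> MemoryChunk:
    """Build a chunk whose text is cut off at the excerpt limit."""
    return MemoryChunk(source=source, timestamp=timestamp, text=text[:limit])
```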

If you connect a hosted agent (e.g. Claude on the web), the agent provider sees whatever excerpts the agent sends through its own pipeline as part of your conversation. That is the agent’s responsibility, not Beside’s, but it’s a real boundary worth knowing about, which is exactly why local agents (Cursor, local Claude Desktop) plus local Ollama is the strictest default.

Defence in depth — the privacy controls in config.yaml

The defaults are conservative; the knobs let you go stricter.

Exclusion lists (capture-time)

These prevent the data from ever being captured in the first place — they’re matched before a screenshot is taken or an event is written.

capture:
  excluded_apps:
    - 1Password
    - Bitwarden
    - Keychain Access
  excluded_url_patterns:
    - "^https://.*\\.bank\\."
    - "^https://admin\\."

excluded_apps is a list of substrings or regexes matched against the frontmost app name and bundle ID. excluded_url_patterns is a list of regexes matched against the active URL. Both are evaluated by capture and by hooks, so a hook never sees a surface that capture skipped.
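The matching rule can be approximated in a few lines of Python. This is an illustrative sketch, not Beside's matcher: the function names are hypothetical, and case-insensitive matching for app names is an assumption. It is mainly useful for testing your own patterns before putting them in config.yaml.

```python
import re
from typing import Iterable, Optional


def _app_matches(pattern: str, value: str) -> bool:
    """Substring match first, then regex. Case-insensitivity is an assumption."""
    if pattern.lower() in value.lower():
        return True
    try:
        return re.search(pattern, value, re.IGNORECASE) is not None
    except re.error:
        # Not a valid regex: only the substring check applies.
        return False


def should_skip(app_name: str, bundle_id: str, url: Optional[str],
                excluded_apps: Iterable[str],
                excluded_url_patterns: Iterable[str]) -> bool:
    """Return True if the frontmost surface should not be captured."""
    for pattern in excluded_apps:
        if _app_matches(pattern, app_name) or _app_matches(pattern, bundle_id):
            return True
    if url is not None:
        return any(re.search(p, url) is not None for p in excluded_url_patterns)
    return False


# The patterns from the YAML above, written as Python raw strings.
EXCLUDED_APPS = ["1Password", "Bitwarden", "Keychain Access"]
EXCLUDED_URLS = [r"^https://.*\.bank\.", r"^https://admin\."]
```

Note that `^https://.*\.bank\.` only matches hosts containing a literal `.bank.` label, so `https://mybank.example` would not be excluded.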

Sensitive keyword redaction

capture:
  privacy:
    blur_password_fields: true
    pause_on_screen_lock: true
    sensitive_keywords:
      - password
      - api_key
      - secret

blur_password_fields asks the OS / accessibility layer to blank password fields before a screenshot is taken. pause_on_screen_lock halts the capture loop entirely while the screen is locked. sensitive_keywords is also reused by the audio transcript worker, which redacts matching tokens from transcripts before they land in storage.
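To get a feel for what keyword redaction does to a transcript, here is a small sketch. It assumes that a "matching token" is any whitespace-delimited token containing a keyword (case-insensitive) and that the replacement marker is `[REDACTED]`; neither detail is specified on this page, and the `redact` helper is hypothetical.

```python
import re
from typing import Iterable


def redact(text: str, keywords: Iterable[str], mask: str = "[REDACTED]") -> str:
    """Replace every whitespace-delimited token that contains a keyword."""
    escaped = [re.escape(k) for k in keywords if k]
    if not escaped:
        return text
    keyword_re = re.compile("|".join(escaped), re.IGNORECASE)

    def replace_token(match):
        token = match.group()
        return mask if keyword_re.search(token) else token

    # \S+ walks the tokens and leaves the original whitespace untouched.
    return re.sub(r"\S+", replace_token, text)


SENSITIVE_KEYWORDS = ["password", "api_key", "secret"]
```

Under this interpretation the keyword itself is masked, not the value spoken after it, so keyword lists complement the exclusion lists rather than replace them.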

Accessibility text limits

capture:
  accessibility:
    enabled: true
    timeout_ms: 1500
    max_chars: 8000
    max_elements: 4000
    excluded_apps: []

Accessibility text reads are bounded both by time (timeout_ms) and volume (max_chars, max_elements). A slow app cannot stall capture, and an app with an oversized accessibility tree (a giant DOM, for example) cannot flood your storage.
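The three limits interact roughly as in the sketch below. This is an illustration of a bounded read, not Beside's accessibility reader: `read_bounded` and its injectable `clock` parameter are hypothetical, and a real implementation would also need the underlying accessibility call itself to be cancellable.

```python
import time
from typing import Callable, Iterable, List


def read_bounded(elements: Iterable[str],
                 timeout_ms: int = 1500,
                 max_chars: int = 8000,
                 max_elements: int = 4000,
                 clock: Callable[[], float] = time.monotonic) -> List[str]:
    """Collect element text until any one of the three limits is hit."""
    deadline = clock() + timeout_ms / 1000
    collected: List[str] = []
    used_chars = 0
    for count, text in enumerate(elements):
        # Stop on the element limit or when the time budget is spent.
        if count >= max_elements or clock() >= deadline:
            break
        remaining = max_chars - used_chars
        if remaining <= 0:
            break
        piece = text[:remaining]  # truncate the last piece to fit the budget
        collected.append(piece)
        used_chars += len(piece)
    return collected
```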

Hook prompt and image caps

hooks:
  max_image_bytes: 2097152      # 2 MiB
  max_prompt_chars: 14000
  max_records_per_hook: 2000

Even when a hook is wired to a hosted model, the engine won’t feed it a multi-megabyte screenshot or a novel’s worth of OCR text — there are hard ceilings, enforced by the runtime, that hooks cannot override.
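A ceiling of this kind could look like the sketch below. The constants mirror the YAML above; the function name is hypothetical, and dropping an oversized image (rather than downscaling it) is an assumption, since this page does not say what the runtime does with one.

```python
from typing import Optional, Tuple

MAX_IMAGE_BYTES = 2 * 1024 * 1024  # 2 MiB, matches max_image_bytes above
MAX_PROMPT_CHARS = 14_000          # matches max_prompt_chars above


def clamp_hook_input(prompt: str,
                     image: Optional[bytes]) -> Tuple[str, Optional[bytes]]:
    """Apply the prompt and image ceilings before a hook calls its model."""
    safe_prompt = prompt[:MAX_PROMPT_CHARS]
    # Assumption: an image over the limit is withheld entirely.
    if image is not None and len(image) > MAX_IMAGE_BYTES:
        image = None
    return safe_prompt, image
```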

Storage retention & vacuum

Activity bytes don’t live forever. The vacuum tiers them and ultimately deletes them on a schedule you control:

storage:
  local:
    retention_days: 365
    vacuum:
      compress_after_minutes: 60
      thumbnail_after_days: 30
      delete_after_days: 180

After delete_after_days, the asset bytes for old frames are removed from disk; only the metadata (timestamp, app, OCR text) stays. Lower these numbers to keep less raw data on disk.
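The tiering can be read as a simple function of a frame's age. The sketch below uses the default thresholds from the YAML above; the tier labels and the `vacuum_tier` helper are hypothetical names chosen for illustration, not identifiers from Beside.

```python
from datetime import timedelta


def vacuum_tier(age: timedelta,
                compress_after_minutes: int = 60,
                thumbnail_after_days: int = 30,
                delete_after_days: int = 180) -> str:
    """Map a frame's age to the state its asset bytes would be in."""
    if age >= timedelta(days=delete_after_days):
        return "metadata-only"  # asset bytes removed, metadata kept
    if age >= timedelta(days=thumbnail_after_days):
        return "thumbnail"
    if age >= timedelta(minutes=compress_after_minutes):
        return "compressed"
    return "original"
```

With the "Regulated team" value of delete_after_days: 30, a 45-day-old frame would already be metadata-only, provided thumbnail_after_days is lowered to match.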

Process & identity guarantees

A few quieter properties worth calling out:

  • No telemetry. The runtime ships no analytics, no error reporter, no usage pings. There is nothing to opt out of because there is nothing enabled.
  • No account. Beside doesn’t ask you to sign in. There is no Beside-side identity for your data to be associated with.
  • Per-hook isolation. Capture hooks live in their own SQLite namespace enforced by the storage layer (hookPut / hookList are scoped to the caller’s hookId). One hook cannot read another hook’s records, even when installed in the same process.
  • MIT, open source, auditable. Every line of capture, storage, indexing, and export code is in github.com/sagivo/beside. Verifying the privacy claims above is a git clone away: there is no closed binary in the path between your screen and your wiki.
  • Privacy-flagged events. RawEvent.privacy_filtered is honoured by every downstream consumer. If a capture plugin marks an event as filtered, the frame builder, hook engine, and exports skip it.
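Per-hook isolation of the kind described above can be sketched with Python's built-in sqlite3 module. This is not Beside's storage layer: the table layout, the `HookStore` class, and the method names (loosely modelled on hookPut / hookList) are assumptions. The point is that the caller's hook ID is bound when the handle is created, so every query is scoped to it.

```python
import sqlite3
from typing import List, Tuple


class HookStore:
    """A storage handle bound to one hook ID; every query is scoped to it."""

    def __init__(self, conn: sqlite3.Connection, hook_id: str) -> None:
        self._conn = conn
        self._hook_id = hook_id
        conn.execute(
            "CREATE TABLE IF NOT EXISTS hook_records ("
            " hook_id TEXT NOT NULL,"
            " record_key TEXT NOT NULL,"
            " record_value TEXT NOT NULL,"
            " PRIMARY KEY (hook_id, record_key))"
        )

    def hook_put(self, key: str, value: str) -> None:
        # The hook ID comes from the handle, never from the caller's arguments.
        self._conn.execute(
            "INSERT OR REPLACE INTO hook_records VALUES (?, ?, ?)",
            (self._hook_id, key, value),
        )

    def hook_list(self) -> List[Tuple[str, str]]:
        rows = self._conn.execute(
            "SELECT record_key, record_value FROM hook_records"
            " WHERE hook_id = ? ORDER BY record_key",
            (self._hook_id,),
        )
        return list(rows)
```

Two hooks sharing one connection each see only their own rows, even when they use the same key.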

Recipes for stricter postures

"Air-gapped"

The strictest realistic posture. No outbound network calls at runtime.

index:
  model:
    plugin: ollama          # local
export:
  plugins:
    - { name: markdown, enabled: true }
    - { name: mcp, enabled: true }       # 127.0.0.1 loopback only

Run beside start --offline if you want the runtime to refuse the Ollama auto-install path even on a fresh machine.

"Regulated team"

Local model, narrow capture, short retention, MCP only to local agents.

capture:
  plugin: node
  excluded_apps: [1Password, Bitwarden, Keychain Access, "Activity Monitor"]
  excluded_url_patterns:
    - "^https://.*\\.bank\\."
    - "^https://admin\\."
    - "^https://.*\\.workday\\."
  privacy:
    sensitive_keywords: [password, api_key, secret, ssn, dob]

storage:
  local:
    retention_days: 90
    vacuum:
      delete_after_days: 30

index:
  model:
    plugin: ollama

"Hosted model, but only for indexing"

Use OpenAI for the heavy reorganisation pass; keep capture, OCR, audio, and embeddings local. (Requires a small custom adapter that wraps openai for chat and ollama for embeddings — Beside is happy to load a custom adapter under index.model.plugin: my-adapter.)

Summary

Beside is engineered around one default: nothing leaves your machine unless you explicitly opt in. Local capture, local Tesseract OCR, local Whisper transcription, local Ollama models, local SQLite, local Markdown wiki, and a loopback-only MCP server are not a configuration — they’re the shape of the product. Every knob this page describes is there to make that boundary stricter, never to soften it.