Gemini Consistency Tester · Vanilla HTML/CSS/JS

Configuration

Your key is stored locally and never sent anywhere except to Google APIs.

Generation Config

Controls randomness. Use 0.0 for the most deterministic output, 1.0 for more creative output.
Restricts sampling to the most probable tokens. Higher values allow more diverse output.
Controls the token budget for internal reasoning. Set to 0 to disable.
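
As a rough sketch of how these three controls could map onto a request, the helper below builds a `generationConfig` object using the Gemini REST field names `temperature`, `topP`, and `thinkingConfig.thinkingBudget`. The function name `buildGenerationConfig` is hypothetical, and treating the second control as top-p is an assumption; if the control is actually top-k, the field would be `topK` instead.

```javascript
// Hypothetical helper: collects the three UI values into the shape the
// Gemini REST API accepts under "generationConfig".
// Assumption: the second control is top-p (use topK if it is top-k).
function buildGenerationConfig({ temperature, topP, thinkingBudget }) {
  return {
    temperature,                         // 0.0 = most deterministic
    topP,                                // cumulative-probability cutoff
    thinkingConfig: { thinkingBudget },  // 0 disables internal reasoning
  };
}

// Example: settings aimed at maximum consistency.
const consistentConfig = buildGenerationConfig({
  temperature: 0,
  topP: 1,
  thinkingBudget: 0,
});
```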

Task Definition

A clear, deterministic prompt is the most important factor for consistency.
Images are sent as inline base64 data. Large files increase latency; prefer ≤ 2MB each.
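
To illustrate why large images add latency, the sketch below estimates the base64 payload size for a file (base64 emits 4 characters per 3 input bytes, padded) and wraps already-encoded data as an inline part. The helper names are hypothetical, reading "2MB" as 2 × 1024 × 1024 bytes is an assumption, and the `inlineData` shape follows the Gemini REST API.

```javascript
// Assumption: "2MB" is interpreted as 2 * 1024 * 1024 bytes.
const MAX_IMAGE_BYTES = 2 * 1024 * 1024;

// Length in characters of the padded base64 encoding of byteCount bytes.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// True when the raw file size is within the suggested limit.
function isWithinSizeHint(byteCount) {
  return byteCount <= MAX_IMAGE_BYTES;
}

// Wrap base64 image data as an inline part for the request body.
function toInlinePart(mimeType, base64Data) {
  return { inlineData: { mimeType, data: base64Data } };
}
```

A file at the 2 MB limit grows by roughly a third once encoded, which is the overhead each request carries for every attached image.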
Idle.
(Count of the most common normalized response ÷ N, the number of runs)
(Average word-set overlap vs. the majority response)

Runs

Each request is sent sequentially with the same prompt & images.
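
A minimal sketch of such a sequential loop follows. It awaits each request before starting the next and records the fields shown in the runs table. The names `runSequential` and `sendOnce` are hypothetical, the retry limit of 3 is an assumption (the table's Attempts column suggests retries, but their number is not stated), and retries here fire immediately with no backoff.

```javascript
// sendOnce(i) is a caller-supplied async function that performs one request
// and resolves to the response text. It is injected so the loop stays
// independent of any particular API client.
async function runSequential(n, sendOnce, maxAttempts = 3) {
  const rows = [];
  for (let i = 0; i < n; i++) {
    const startedAt = Date.now();
    let attempts = 0;
    let status = "error";
    let output = "";
    while (attempts < maxAttempts) {
      attempts++;
      try {
        output = await sendOnce(i); // next run does not start until this settles
        status = "ok";
        break;
      } catch (err) {
        output = String(err && err.message ? err.message : err);
      }
    }
    rows.push({
      index: i + 1,
      status,
      startedAt,
      durationMs: Date.now() - startedAt,
      attempts,
      output: output.slice(0, 80), // truncated for display
    });
  }
  return rows;
}
```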
# | Status | Start Time | Duration | Attempts | Output (truncated)
How the consistency scores are computed
  1. Exact Match: Normalize each response (lowercase, trim, collapse whitespace). Find the most frequent string. Consistency = frequency_of_mode ÷ N.
  2. Token Jaccard: Tokenize responses into lowercased word sets. Take the majority response (mode) and compute Jaccard similarity |A∩B| / |A∪B| with every other response, then average.
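
The two scores above can be sketched as follows. This is an illustrative version, not necessarily the page's own code: the function names are hypothetical, words are split on whitespace (so punctuation stays attached), ties for the mode go to the first response seen, two empty word sets are treated as identical, and the Jaccard average is taken over all N responses, including those equal to the mode.

```javascript
// Lowercase, trim, and collapse runs of whitespace.
function normalize(text) {
  return text.toLowerCase().trim().replace(/\s+/g, " ");
}

// Most frequent normalized string and its count (first seen wins ties).
function findMode(responses) {
  const counts = new Map();
  for (const r of responses) {
    const key = normalize(r);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  let mode = null;
  let best = 0;
  for (const [key, count] of counts) {
    if (count > best) { mode = key; best = count; }
  }
  return { mode, count: best };
}

// Exact Match: frequency_of_mode / N.
function exactMatchScore(responses) {
  if (responses.length === 0) return 0;
  return findMode(responses).count / responses.length;
}

// Lowercased word set; assumption: split on whitespace only.
function tokenize(text) {
  return new Set(normalize(text).split(" ").filter(Boolean));
}

// |A ∩ B| / |A ∪ B|; two empty sets count as identical.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 1;
  let intersection = 0;
  for (const token of a) if (b.has(token)) intersection++;
  return intersection / (a.size + b.size - intersection);
}

// Token Jaccard: average similarity of each response to the mode.
function tokenJaccardScore(responses) {
  if (responses.length === 0) return 0;
  const modeTokens = tokenize(findMode(responses).mode);
  let total = 0;
  for (const r of responses) total += jaccard(modeTokens, tokenize(r));
  return total / responses.length;
}
```

For the responses "The cat sat.", "the  cat sat. ", and "The dog sat.", the first two normalize to the same string, so Exact Match is 2/3; the third shares two of four distinct words with the mode, so Token Jaccard is (1 + 1 + 0.5) / 3 ≈ 0.83.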