Monago Atrium — API Documentation

AI Governance Gateway for enterprise LLM access. Drop-in, OpenAI-compatible, with PII protection, configurable enforcement, cost visibility, and a signed audit trail on every request.

Overview

Monago Atrium sits between your application and large language model (LLM) providers (OpenAI, Anthropic, and others) as a Policy Enforcement Point. Every request passes through the gateway before reaching a provider, and every response passes back through it. At this checkpoint, Atrium applies governance: it detects and redacts personally identifiable information (PII), enforces per-tenant policy, scans for prompt injection, accounts for cost, and writes a tamper-evident audit record.

Atrium is designed around Zero Trust principles (aligned with NIST SP 800-207) at the data-governance layer: no request is implicitly trusted, every request is evaluated against policy, and security enforcement is top-down — a security baseline set at the tenant level cannot be weakened by a workspace or user below it.

What Atrium gives you

Multi-layer PII detection — structured identifiers (national ID, tax ID, phone, bank account, email) plus named-entity recognition for unstructured PII such as person, organization, and location names.
Configurable enforcement — choose how detected PII is handled per policy: allow, warn, partial mask, full redaction, or block.
Top-down policy — a tenant's security baseline is a floor; teams can tighten it but never loosen it.
Cost visibility — estimated per-request cost is returned on every call and aggregated for reporting.
Audit trail — every request produces a signed audit record, surfaced via a request-level audit ID.
Drop-in compatibility — the chat endpoint mirrors the OpenAI Chat Completions API, so existing clients and SDKs work with minimal changes.

Supported providers

A single Atrium API works across providers — the governance layer is provider-agnostic, and routing is resolved from the model identifier.

Provider	Status
OpenAI	Available
Anthropic	Available
AWS Bedrock	On the roadmap
Google Vertex AI	On the roadmap
Alibaba Cloud Model Studio (Qwen)	On the roadmap

Your code does not change per provider: keep the same OpenAI-compatible request shape and switch the model field. Which providers and models are permitted is governed by your policy's model allowlist.
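As a minimal sketch of that point, the helper below builds the same request body for two models. The helper name `build_chat_body` and the second model identifier are assumptions made for illustration; the identifiers your tenant can actually use depend on its model allowlist.

```python
import json


def build_chat_body(model: str, prompt: str) -> dict:
    """Return an OpenAI-compatible chat body; only `model` varies per provider."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


prompt = "Summarize the onboarding policy."
openai_body = build_chat_body("gpt-4o-mini", prompt)
# Hypothetical identifier for a model served by another provider.
other_body = build_chat_body("claude-example-model", prompt)

# The two bodies differ only in the model field.
differing_keys = [k for k in openai_body if openai_body[k] != other_body[k]]
print(differing_keys)
print(json.dumps(other_body, indent=2))
```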

Who this is for

Evaluators / security & compliance reviewers — start with Overview, Governance Headers, PII Enforcement Modes, and Compliance.
Developers — start with Authentication and API Reference.

Getting Started

Base URL

https://api.monago.io

Authentication

All requests authenticate with an API key passed as a Bearer token in the Authorization header.

Authorization: Bearer sk-mng-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

API keys are issued and managed in the Monago console. A key is scoped to a tenant (and optionally a workspace); the governance policy that applies to a request is resolved from that scope. Keep keys secret and rotate them if exposed.
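One way to keep the key out of source code is to read it from the environment when building the request. The sketch below uses only the Python standard library and does not send anything. The environment variable name `MONAGO_API_KEY` and the helper `build_chat_request` are assumptions, not part of the Atrium API.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.monago.io"


def build_chat_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST to the chat endpoint."""
    if not api_key.startswith("sk-mng-"):
        raise ValueError("expected a key with the sk-mng- prefix")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# MONAGO_API_KEY is a hypothetical variable name; the fallback is a placeholder.
api_key = os.environ.get("MONAGO_API_KEY", "sk-mng-YOUR_KEY")
request = build_chat_request(
    api_key,
    {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]},
)
print(request.get_method(), request.full_url)
```

Once a real key is in place, the request can be sent with `urllib.request.urlopen(request)`.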

Quickstart (cURL)

curl -X POST https://api.monago.io/v1/chat/completions \
  -H "Authorization: Bearer sk-mng-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "user", "content": "Summarize the onboarding policy." }
    ]
  }'

Quickstart (Python SDK)

pip install monago-atrium

from monago_atrium import Monago

client = Monago(api_key="sk-mng-YOUR_KEY")

result = client.chat(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize the onboarding policy."}],
)

print(result.content)
print(result.audit_id)        # signed audit reference
print(result.pii_redacted)    # PII types redacted before the provider saw them
print(result.cost_idr)        # estimated cost for this request

API Reference

Create a chat completion

POST /v1/chat/completions

OpenAI-compatible chat completion, governed by your policy. Supports both standard and streaming responses.

Request body

Field	Type	Required	Description
`model`	string	yes	Model identifier (e.g. `gpt-4o-mini`). Must be permitted by your policy's model allowlist.
`messages`	array	yes	Conversation messages. Each item has `role` (`system` / `user` / `assistant`) and `content` (string). 1–200 messages.
`stream`	boolean	no	When `true`, returns a Server-Sent Events stream. Defaults to `false`.
`max_tokens`	integer	no	Maximum tokens to generate (1–200000).
`temperature`	number	no	Sampling temperature (0.0–2.0).

{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Draft a one-line welcome message." }
  ],
  "max_tokens": 256,
  "temperature": 0.7,
  "stream": false
}
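The limits in the table above can be checked on the client before a request is sent. The following is a sketch of such a check, not the gateway's own validation; the function name `validate_chat_body` is hypothetical, and the gateway remains the authority on what it accepts.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_chat_body(body: dict) -> list:
    """Return a list of problems; an empty list means the body passes these checks."""
    problems = []

    model = body.get("model")
    if not isinstance(model, str) or not model:
        problems.append("model: required non-empty string")

    messages = body.get("messages")
    if not isinstance(messages, list) or not 1 <= len(messages) <= 200:
        problems.append("messages: required array of 1-200 items")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict):
                problems.append(f"messages[{i}]: must be an object")
                continue
            if msg.get("role") not in ALLOWED_ROLES:
                problems.append(f"messages[{i}].role: must be system, user, or assistant")
            if not isinstance(msg.get("content"), str):
                problems.append(f"messages[{i}].content: must be a string")

    max_tokens = body.get("max_tokens")
    if max_tokens is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(max_tokens, int) and not isinstance(max_tokens, bool)
        if not is_int or not 1 <= max_tokens <= 200000:
            problems.append("max_tokens: must be an integer in 1-200000")

    temperature = body.get("temperature")
    if temperature is not None:
        is_num = isinstance(temperature, (int, float)) and not isinstance(temperature, bool)
        if not is_num or not 0.0 <= temperature <= 2.0:
            problems.append("temperature: must be a number in 0.0-2.0")

    if "stream" in body and not isinstance(body["stream"], bool):
        problems.append("stream: must be a boolean")

    return problems


good_body = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Draft a one-line welcome message."}],
    "max_tokens": 256,
    "temperature": 0.7,
    "stream": False,
}
bad_body = {"model": "", "messages": [], "max_tokens": 0, "temperature": 2.5, "stream": "yes"}

print(validate_chat_body(good_body))
for problem in validate_chat_body(bad_body):
    print(problem)
```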

Response body

The response shape mirrors the OpenAI Chat Completions format.

Field	Type	Description
`id`	string	Completion identifier.
`object`	string	Always `chat.completion`.
`created`	integer	Unix timestamp.
`model`	string	Resolved provider model.
`choices`	array	Generated choices, each with `index`, `message` (`role`, `content`), and `finish_reason`.
`usage`	object	`prompt_tokens`, `completion_tokens`, `total_tokens`.

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1779896327,
  "model": "gpt-4o-mini-2024-07-18",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Welcome aboard — glad to have you!" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 23, "completion_tokens": 11, "total_tokens": 34 }
}

Governance results (PII detection, policy decision, cost, audit reference) are returned as response headers, not in the body; see Governance Headers.
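Because the body follows the OpenAI shape, reading it needs no Atrium-specific code. The sketch below parses a trimmed copy of the example response (the message text is shortened) and converts `created` from Unix epoch seconds to a timezone-aware datetime. The helper name `summarize_completion` is an assumption.

```python
import json
from datetime import datetime, timezone

RAW_RESPONSE = """
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1779896327,
  "model": "gpt-4o-mini-2024-07-18",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Welcome aboard!" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 23, "completion_tokens": 11, "total_tokens": 34 }
}
"""


def summarize_completion(completion: dict) -> dict:
    """Extract the fields most callers need from a chat completion body."""
    first_choice = completion["choices"][0]
    return {
        "text": first_choice["message"]["content"],
        "finish_reason": first_choice["finish_reason"],
        "model": completion["model"],
        # `created` is Unix epoch seconds; keep the result in UTC.
        "created_at": datetime.fromtimestamp(completion["created"], tz=timezone.utc),
        "total_tokens": completion["usage"]["total_tokens"],
    }


summary = summarize_completion(json.loads(RAW_RESPONSE))
print(summary["text"])
print(summary["created_at"].isoformat())
```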

Governance Headers

Every response carries governance metadata in X-Monago-* headers. These let your application observe what the gateway did without parsing the body.

Header	Present	Description
`X-Monago-Audit-Id`	always	Signed audit reference for this request. Use it to look up the full audit record.
`X-Monago-Provider`	always	Upstream provider that served the request.
`X-Monago-Latency-Ms`	always	Gateway-measured latency in milliseconds.
`X-Monago-Pii-Detected`	always	`true` / `false` — whether any PII was detected.
`X-Monago-Pii-Redacted`	when PII found	Comma-separated list of PII types redacted before the request reached the provider (e.g. `nik,phone_id,PERSON`).
`X-Monago-Policy-Decision`	always	The policy outcome, e.g. `allowed`.
`X-Monago-Decision-Source`	always	Source of the decision (policy engine identifier).
`X-Monago-Cost-Idr`	when computed	Estimated cost of the request in IDR.
`X-Monago-Block-Reason`	on block	Present when a request is blocked by policy.

PII redaction happens before the provider sees the request. When X-Monago-Pii-Redacted lists a type, that data was replaced with a placeholder in the text sent upstream — the provider never receives the original value.
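A sketch of reading these headers into a typed object follows. The class and function names are hypothetical, and the sample header values (audit ID, provider name, latency, decision source, cost) are invented for illustration. The cost is kept as a string because its exact numeric format is not specified here.

```python
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class GovernanceResult:
    audit_id: Optional[str]
    provider: Optional[str]
    latency_ms: Optional[int]
    pii_detected: bool
    pii_redacted: List[str]
    policy_decision: Optional[str]
    decision_source: Optional[str]
    cost_idr: Optional[str]
    block_reason: Optional[str]


def parse_governance_headers(headers: Mapping[str, str]) -> GovernanceResult:
    """Collect the X-Monago-* headers from a response into one object."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {name.lower(): value.strip() for name, value in headers.items()}
    latency = h.get("x-monago-latency-ms", "")
    redacted = h.get("x-monago-pii-redacted", "")
    return GovernanceResult(
        audit_id=h.get("x-monago-audit-id"),
        provider=h.get("x-monago-provider"),
        latency_ms=int(latency) if latency.isdigit() else None,
        pii_detected=h.get("x-monago-pii-detected", "").lower() == "true",
        # Absent header means nothing was redacted: empty list.
        pii_redacted=[t.strip() for t in redacted.split(",") if t.strip()],
        policy_decision=h.get("x-monago-policy-decision"),
        decision_source=h.get("x-monago-decision-source"),
        cost_idr=h.get("x-monago-cost-idr"),
        block_reason=h.get("x-monago-block-reason"),
    )


# Invented sample values, for illustration only.
sample_headers = {
    "X-Monago-Audit-Id": "aud_example123",
    "X-Monago-Provider": "openai",
    "X-Monago-Latency-Ms": "412",
    "X-Monago-Pii-Detected": "true",
    "X-Monago-Pii-Redacted": "nik,phone_id,PERSON",
    "X-Monago-Policy-Decision": "allowed",
    "X-Monago-Decision-Source": "policy-engine-example",
}
result = parse_governance_headers(sample_headers)
print(result.audit_id, result.pii_redacted)
```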

PII Enforcement Modes

How detected PII is handled is controlled by your policy's PII action. Each mode is a point on a strictness ladder; a stricter mode set at the tenant level cannot be overridden to a looser mode by a workspace.

Mode	Behavior	Sent to provider?
`allow`	Pass the request through unchanged.	Original
`warn`	Pass through, but record detection in the audit log.	Original
`partial_mask`	Redact PII keeping a short tail (industry "card-number" style, e.g. `**--**-0123`) so users can recognize their own data.	Masked (tail kept)
`full_mask`	Redact PII completely with no remainder (e.g. `[REDACTED_NIK]`).	Masked (zero-tail)
`block`	Reject the request. It is not forwarded to the provider.	No

Strictness order: allow < warn < partial_mask < full_mask < block.

When PII is detected under the block mode, the request is rejected with HTTP 403 and error code PII_BLOCKED (see Errors).
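The strictness ladder can be modeled as an ordered enum, with the effective mode being the stricter of the tenant floor and the workspace setting. This is an illustrative model of the rule described above, not Atrium's policy engine; the names `PiiMode` and `effective_mode` are assumptions.

```python
from enum import IntEnum
from typing import Optional


class PiiMode(IntEnum):
    """PII actions ordered from loosest to strictest."""
    ALLOW = 0
    WARN = 1
    PARTIAL_MASK = 2
    FULL_MASK = 3
    BLOCK = 4

    @classmethod
    def from_name(cls, name: str) -> "PiiMode":
        # Accept the lowercase spellings used in policy, e.g. "full_mask".
        return cls[name.upper()]


def effective_mode(tenant_floor: PiiMode, workspace_mode: Optional[PiiMode] = None) -> PiiMode:
    """A workspace may tighten the tenant floor but never loosen it."""
    if workspace_mode is None:
        return tenant_floor
    return max(tenant_floor, workspace_mode)


floor = PiiMode.from_name("partial_mask")
loosened = effective_mode(floor, PiiMode.from_name("allow"))
tightened = effective_mode(floor, PiiMode.from_name("block"))
print(loosened.name, tightened.name)
```

A workspace that asks for `allow` under a `partial_mask` floor still gets `partial_mask`, while one that asks for `block` gets `block`.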

Streaming

Set "stream": true to receive a Server-Sent Events (SSE) stream. The stream interleaves governance events with content events, so you can observe policy, PII, routing, and provider stages in real time, followed by the model output.

curl -N -X POST https://api.monago.io/v1/chat/completions \
  -H "Authorization: Bearer sk-mng-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "gpt-4o-mini", "stream": true,
        "messages": [{ "role": "user", "content": "Hello" }] }'

Event types:

Event	Payload
`governance`	A pipeline stage update: `policy`, `pii`, `routing`, `provider`, `audit`. Includes status and a short metadata summary.
`content`	An OpenAI-style chat completion chunk with a `delta`.
`done`	Final event with `usage`, `audit_id`, and `session_id`.

Example governance event:

event: governance
data: {"step": "pii", "status": "flagged", "meta": "3 types · PERSON, nik, phone_id", "detail": {"types": ["PERSON", "nik", "phone_id"]}}

On streaming responses, the audit reference is always available, both in the audit governance event and in the final done event. By design, some headers returned on non-streaming responses are not repeated on the SSE channel.
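A stream like this can be consumed with a small SSE parser. The sketch below works on already-decoded lines and makes several assumptions: each event carries an `event:` name matching the table above, each payload is JSON on `data:` lines, and content chunks put their text in `choices[0].delta.content` as in OpenAI-style chunks. The sample stream, including its audit and session IDs, is invented for illustration.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs from decoded SSE lines."""
    event_name, data_parts = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event_name, json.loads("\n".join(data_parts))
            event_name, data_parts = "message", []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].lstrip(" "))
    if data_parts:
        # The stream ended without a trailing blank line.
        yield event_name, json.loads("\n".join(data_parts))


# Invented sample stream, for illustration only.
SAMPLE_STREAM = """\
event: governance
data: {"step": "pii", "status": "flagged", "detail": {"types": ["PERSON", "nik"]}}

event: content
data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}

event: content
data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}

event: done
data: {"usage": {"total_tokens": 9}, "audit_id": "aud_example123", "session_id": "sess_example456"}
"""

text_parts, steps, audit_id = [], [], None
for name, payload in iter_sse_events(SAMPLE_STREAM.splitlines()):
    if name == "governance":
        steps.append(payload["step"])
    elif name == "content":
        delta = payload["choices"][0].get("delta", {})
        text_parts.append(delta.get("content") or "")
    elif name == "done":
        audit_id = payload.get("audit_id")

text = "".join(text_parts)
print(steps, text, audit_id)
```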

Errors

Errors are returned with an appropriate HTTP status and a stable machine-readable code.

Code	HTTP	Meaning
`PII_BLOCKED`	403	Request blocked because PII was detected and policy is set to `block`.
`MODEL_NOT_ALLOWED`	403	The requested model is not in your policy's allowlist.
`OUTSIDE_ALLOWED_HOURS`	403	Request made outside the policy's permitted hours.
`MAX_TOKENS_EXCEEDED`	403	Requested `max_tokens` exceeds the policy limit.
`TRIAL_EXPIRED`	403	The tenant's trial period has ended.
`UNKNOWN_MODEL`	400	The model identifier is not recognized.
`NO_ACTIVE_CREDENTIAL`	400	No active upstream provider credential is configured for this scope.
`PROVIDER_NOT_CONFIGURED`	503	The target provider is not configured.
`PROVIDER_RATE_LIMITED`	429	The upstream provider rate-limited the request.
`PROVIDER_TIMEOUT`	504	The upstream provider timed out.
`PROVIDER_UPSTREAM_ERROR`	502	The upstream provider returned an error.

Error responses include the governance headers where available, so a blocked or failed request can still be traced via X-Monago-Audit-Id.
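Since the codes are stable, a client can branch on them instead of on message text. The sketch below copies the code-to-status mapping from the table above and groups the codes into handling categories. The grouping is a suggestion rather than documented Atrium behavior: it assumes only upstream-provider failures are worth retrying. How the code is carried in the error body is not shown here, so the function takes the code as a plain string.

```python
# Error code -> HTTP status, copied from the table above.
ERROR_STATUS = {
    "PII_BLOCKED": 403,
    "MODEL_NOT_ALLOWED": 403,
    "OUTSIDE_ALLOWED_HOURS": 403,
    "MAX_TOKENS_EXCEEDED": 403,
    "TRIAL_EXPIRED": 403,
    "UNKNOWN_MODEL": 400,
    "NO_ACTIVE_CREDENTIAL": 400,
    "PROVIDER_NOT_CONFIGURED": 503,
    "PROVIDER_RATE_LIMITED": 429,
    "PROVIDER_TIMEOUT": 504,
    "PROVIDER_UPSTREAM_ERROR": 502,
}

# Assumption: transient upstream failures may succeed on a later attempt.
RETRYABLE = {"PROVIDER_RATE_LIMITED", "PROVIDER_TIMEOUT", "PROVIDER_UPSTREAM_ERROR"}


def classify_error(code: str) -> str:
    """Map an error code to a coarse handling category."""
    if code not in ERROR_STATUS:
        return "unknown"
    if code in RETRYABLE:
        return "retry"
    if ERROR_STATUS[code] == 403:
        # Rejected by policy or account state; retrying unchanged will not help.
        return "denied"
    # Remaining codes point at the request or at provider configuration.
    return "fix_request_or_config"


for code in ("PROVIDER_TIMEOUT", "PII_BLOCKED", "UNKNOWN_MODEL"):
    print(code, ERROR_STATUS[code], classify_error(code))
```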

SDK

The Python SDK wraps the API and surfaces governance results as first-class fields.

pip install monago-atrium

from monago_atrium import Monago

client = Monago(api_key="sk-mng-YOUR_KEY")

result = client.chat(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Help me write a customer reply."}],
)

# Model output
print(result.content)

# Governance surface
print(result.audit_id)      # signed audit reference
print(result.pii_detected)  # bool
print(result.pii_redacted)  # list of redacted PII types
print(result.policy_decision)
print(result.cost_idr)      # estimated cost, when available

The SDK is open-core and tracks the API's governance contract. Streaming support in the SDK and SDKs for additional languages are on the roadmap.

Compliance

Atrium is built to support enterprise data-governance obligations, particularly for regulated sectors such as banking and financial services in Indonesia.

Data minimization at the perimeter. PII can be redacted or blocked at the gateway, so sensitive identifiers need never leave your trust boundary toward a third-party LLM provider. This supports obligations under Indonesia's Personal Data Protection Law (UU PDP).
Top-down enforcement (Zero Trust aligned). Security policy is set at the tenant level as a non-negotiable floor; teams can apply stricter controls but cannot weaken the baseline. This follows the Zero Trust principle (NIST SP 800-207) of explicit, top-down policy enforcement at a dedicated enforcement point. (Atrium is aligned with these principles; "Zero Trust Architecture" is a framework, not a certification.)
Auditability. Every request produces a signed, tamper-evident audit record referenced by X-Monago-Audit-Id, supporting after-the-fact review and regulatory reporting.
Cost transparency. Per-request cost estimates support internal chargeback and budget governance.

For data-residency and on-premise deployment options, contact the Monago team.

Notes & Conventions

The chat endpoint is intentionally OpenAI-compatible: in most cases you can point an existing OpenAI client at the Atrium base URL and supply your sk-mng- key as its API key.
Governance behavior (which models are allowed, how PII is handled, permitted hours) is determined by the policy bound to your API key's scope, configured in the Monago console — not by request parameters.
All timestamps are Unix epoch seconds. Values in X-Monago-Cost-Idr are in Indonesian Rupiah (IDR) and are estimates for reporting, not provider-billed amounts.
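For reporting, per-request estimates can be summed on the client side. The sketch below assumes the header value parses as a plain decimal number, which is not specified above, and uses hypothetical team labels supplied by your own application. `Decimal` avoids floating-point drift when adding many small amounts.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable, Optional, Tuple


def total_cost_by_team(records: Iterable[Tuple[str, Optional[str]]]) -> Dict[str, Decimal]:
    """Sum X-Monago-Cost-Idr values per team, skipping requests with no estimate."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for team, cost_header in records:
        if cost_header is None:
            continue  # the header is only present when a cost was computed
        totals[team] += Decimal(cost_header)
    return dict(totals)


# (team label, X-Monago-Cost-Idr value) pairs; values are invented.
records = [
    ("support", "12.50"),
    ("support", "8.25"),
    ("sales", "3.10"),
    ("sales", None),
]
totals = total_cost_by_team(records)
for team, amount in sorted(totals.items()):
    print(team, amount)
```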