Monago Atrium — API Documentation
AI Governance Gateway for enterprise LLM access. Drop-in, OpenAI-compatible, with PII protection, configurable enforcement, cost visibility, and a signed audit trail on every request.
Overview
Monago Atrium sits between your application and large language model (LLM) providers (OpenAI, Anthropic, and others) as a Policy Enforcement Point. Every request passes through the gateway before reaching a provider, and every response passes back through it. At this checkpoint, Atrium applies governance: it detects and redacts personally identifiable information (PII), enforces per-tenant policy, scans for prompt injection, accounts for cost, and writes a tamper-evident audit record.
Atrium is designed around Zero Trust principles (aligned with NIST SP 800-207) at the data-governance layer: no request is implicitly trusted, every request is evaluated against policy, and security enforcement is top-down — a security baseline set at the tenant level cannot be weakened by a workspace or user below it.
What Atrium gives you
- Multi-layer PII detection — structured identifiers (national ID, tax ID, phone, bank account, email) plus named-entity recognition for unstructured PII such as person, organization, and location names.
- Configurable enforcement — choose how detected PII is handled per policy: allow, warn, partial mask, full redaction, or block.
- Top-down policy — a tenant's security baseline is a floor; teams can tighten it but never loosen it.
- Cost visibility — estimated per-request cost is returned on every call and aggregated for reporting.
- Audit trail — every request produces a signed audit record, surfaced via a request-level audit ID.
- Drop-in compatibility — the chat endpoint mirrors the OpenAI Chat Completions API, so existing clients and SDKs work with minimal changes.
Supported providers
A single Atrium API works across providers — the governance layer is provider-agnostic, and routing is resolved from the model identifier.
| Provider | Status |
|---|---|
| OpenAI | Available |
| Anthropic | Available |
| AWS Bedrock | On the roadmap |
| Google Vertex AI | On the roadmap |
| Alibaba Cloud Model Studio (Qwen) | On the roadmap |
Your code does not change per provider: keep the same OpenAI-compatible request shape and switch the model field. Which providers and models are permitted is governed by your policy's model allowlist.
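As a rough illustration of the point above, the sketch below builds two request bodies that differ only in the model field. The helper name build_chat_request and the second model identifier are placeholders invented for this example, not names confirmed by this documentation; use whichever identifiers your policy's allowlist permits.

```python
import json


def build_chat_request(model: str, prompt: str) -> dict:
    """Return an OpenAI-compatible chat completion body for the given model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


prompt = "Summarize the onboarding policy."

# "gpt-4o-mini" appears in this documentation; the second identifier is a
# hypothetical placeholder standing in for a model from another provider.
openai_body = build_chat_request("gpt-4o-mini", prompt)
other_body = build_chat_request("example-anthropic-model", prompt)

# Collect the top-level fields whose values differ between the two bodies.
differing = {key for key in openai_body if openai_body[key] != other_body[key]}

print(json.dumps(openai_body, indent=2))
print(sorted(differing))
```

Only the model value changes between the two bodies; the messages array and every other field stay the same.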
Who this is for
- Evaluators / security & compliance reviewers — start with Overview, Governance Headers, PII Enforcement Modes, and Compliance.
- Developers — start with Authentication and API Reference.
Getting Started
Base URL
https://api.monago.io
Authentication
All requests authenticate with a workspace API key passed as a Bearer token.
Authorization: Bearer sk-mng-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
API keys are issued and managed in the Monago console. A key is scoped to a tenant (and optionally a workspace); the governance policy that applies to a request is resolved from that scope. Keep keys secret and rotate them if exposed.
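One way to keep the key out of source code is to read it from the environment and attach it as a Bearer token. The sketch below uses only the Python standard library and builds the request without sending it. The environment variable name MONAGO_API_KEY and the helper build_request are assumptions made for this example.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.monago.io"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated chat completion request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# MONAGO_API_KEY is an assumed variable name; the fallback is a placeholder.
api_key = os.environ.get("MONAGO_API_KEY", "sk-mng-YOUR_KEY")
request = build_request(
    api_key,
    {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]},
)

print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```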
Quickstart (cURL)
curl -X POST https://api.monago.io/v1/chat/completions \
-H "Authorization: Bearer sk-mng-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [
{ "role": "user", "content": "Summarize the onboarding policy." }
]
}'
Quickstart (Python SDK)
pip install monago-atrium
from monago_atrium import Monago
client = Monago(api_key="sk-mng-YOUR_KEY")
result = client.chat(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Summarize the onboarding policy."}],
)
print(result.content)
print(result.audit_id) # signed audit reference
print(result.pii_redacted) # PII types redacted before the provider saw them
print(result.cost_idr) # estimated cost for this request
API Reference
Create a chat completion
POST /v1/chat/completions
OpenAI-compatible chat completion, governed by your policy. Supports both standard and streaming responses.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Model identifier (e.g. gpt-4o-mini). Must be permitted by your policy's model allowlist. |
| messages | array | yes | Conversation messages. Each item has role (system / user / assistant) and content (string). 1–200 messages. |
| stream | boolean | no | When true, returns a Server-Sent Events stream. Defaults to false. |
| max_tokens | integer | no | Maximum tokens to generate (1–200000). |
| temperature | number | no | Sampling temperature (0.0–2.0). |
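The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch of such a check, not part of the Atrium API or SDK; the function name validate_chat_request and the wording of the messages are invented for this example, and the gateway remains the authority on what it accepts.

```python
ALLOWED_ROLES = ("system", "user", "assistant")


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_chat_request(body: dict) -> list:
    """Return a list of problems; an empty list means the body looks valid."""
    problems = []

    if not isinstance(body.get("model"), str) or not body["model"]:
        problems.append("model: required non-empty string")

    messages = body.get("messages")
    if not isinstance(messages, list) or not 1 <= len(messages) <= 200:
        problems.append("messages: required array of 1-200 items")
    else:
        for i, msg in enumerate(messages):
            role = msg.get("role") if isinstance(msg, dict) else None
            content = msg.get("content") if isinstance(msg, dict) else None
            if role not in ALLOWED_ROLES:
                problems.append(f"messages[{i}].role: must be one of {ALLOWED_ROLES}")
            if not isinstance(content, str):
                problems.append(f"messages[{i}].content: must be a string")

    if "stream" in body and not isinstance(body["stream"], bool):
        problems.append("stream: must be a boolean")

    if "max_tokens" in body:
        max_tokens = body["max_tokens"]
        if not _is_int(max_tokens) or not 1 <= max_tokens <= 200000:
            problems.append("max_tokens: must be an integer in 1-200000")

    if "temperature" in body:
        temp = body["temperature"]
        is_number = isinstance(temp, (int, float)) and not isinstance(temp, bool)
        if not is_number or not 0.0 <= temp <= 2.0:
            problems.append("temperature: must be a number in 0.0-2.0")

    return problems


good = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Draft a one-line welcome message."}],
    "max_tokens": 256,
    "temperature": 0.7,
    "stream": False,
}
bad = {"model": "gpt-4o-mini", "messages": [], "temperature": 3.0}

print(validate_chat_request(good))
print(validate_chat_request(bad))
```

Policy-level limits (for example, a lower max_tokens ceiling) are enforced by the gateway and are not visible to a check like this.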
{
"model": "gpt-4o-mini",
"messages": [
{ "role": "system", "content": "You are a helpful assistant." },
{ "role": "user", "content": "Draft a one-line welcome message." }
],
"max_tokens": 256,
"temperature": 0.7,
"stream": false
}
Response body
The response shape mirrors the OpenAI Chat Completions format.
| Field | Type | Description |
|---|---|---|
| id | string | Completion identifier. |
| object | string | Always chat.completion. |
| created | integer | Unix timestamp. |
| model | string | Resolved provider model. |
| choices | array | Generated choices, each with index, message (role, content), and finish_reason. |
| usage | object | prompt_tokens, completion_tokens, total_tokens. |
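The fields in the table above can be read with the standard json module. The sketch below parses a sample body and pulls out the assistant text, the finish reason, and the token usage. The helper name extract_reply is an assumption for this example, and the sample values are illustrative only.

```python
import json


def extract_reply(response: dict) -> tuple:
    """Return (content, finish_reason, total_tokens) from a completion body."""
    first_choice = response["choices"][0]
    return (
        first_choice["message"]["content"],
        first_choice["finish_reason"],
        response["usage"]["total_tokens"],
    )


# Illustrative sample shaped like the documented response body.
raw = """
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1779896327,
  "model": "gpt-4o-mini-2024-07-18",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Welcome aboard!"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 23, "completion_tokens": 11, "total_tokens": 34}
}
"""

content, finish_reason, total_tokens = extract_reply(json.loads(raw))
print(content)
print(finish_reason, total_tokens)
```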
{
"id": "chatcmpl-...",
"object": "chat.completion",
"created": 1779896327,
"model": "gpt-4o-mini-2024-07-18",
"choices": [
{
"index": 0,
"message": { "role": "assistant", "content": "Welcome aboard — glad to have you!" },
"finish_reason": "stop"
}
],
"usage": { "prompt_tokens": 23, "completion_tokens": 11, "total_tokens": 34 }
}
Governance results (PII detection, policy decision, cost, audit reference) are returned as response headers (see Governance Headers).
Governance Headers
Every response carries governance metadata in X-Monago-* headers. These let your application observe what the gateway did without parsing the body.
| Header | Always present | Description |
|---|---|---|
| X-Monago-Audit-Id | yes | Signed audit reference for this request. Use it to look up the full audit record. |
| X-Monago-Provider | yes | Upstream provider that served the request. |
| X-Monago-Latency-Ms | yes | Gateway-measured latency in milliseconds. |
| X-Monago-Pii-Detected | yes | true / false: whether any PII was detected. |
| X-Monago-Pii-Redacted | when PII found | Comma-separated list of PII types redacted before the request reached the provider (e.g. nik,phone_id,PERSON). |
| X-Monago-Policy-Decision | yes | The policy outcome, e.g. allowed. |
| X-Monago-Decision-Source | yes | Source of the decision (policy engine identifier). |
| X-Monago-Cost-Idr | when computed | Estimated cost of the request in IDR. |
| X-Monago-Block-Reason | on block | Present when a request is blocked by policy. |
PII redaction happens before the provider sees the request. When X-Monago-Pii-Redacted lists a type, that data was replaced with a placeholder in the text sent upstream, so the provider never receives the original value.
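An application that wants these values as typed fields could collect the X-Monago-* headers into a small structure. The sketch below is one possible approach under stated assumptions: the class and function names are invented for this example, the audit ID and decision source values are placeholders, and parsing latency and cost as plain numbers is an assumption about the header format.

```python
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class GovernanceResult:
    audit_id: Optional[str]
    provider: Optional[str]
    latency_ms: Optional[float]
    pii_detected: bool
    pii_redacted: List[str]
    policy_decision: Optional[str]
    decision_source: Optional[str]
    cost_idr: Optional[float]
    block_reason: Optional[str]


def _to_float(value: Optional[str]) -> Optional[float]:
    # Assumes a plain numeric string; returns None when the header is absent.
    return float(value) if value is not None else None


def parse_governance_headers(headers: Mapping[str, str]) -> GovernanceResult:
    # HTTP header names are case-insensitive, so normalize to lowercase.
    h = {name.lower(): value for name, value in headers.items()}
    redacted = h.get("x-monago-pii-redacted", "")
    return GovernanceResult(
        audit_id=h.get("x-monago-audit-id"),
        provider=h.get("x-monago-provider"),
        latency_ms=_to_float(h.get("x-monago-latency-ms")),
        pii_detected=h.get("x-monago-pii-detected", "false").lower() == "true",
        pii_redacted=[t.strip() for t in redacted.split(",") if t.strip()],
        policy_decision=h.get("x-monago-policy-decision"),
        decision_source=h.get("x-monago-decision-source"),
        cost_idr=_to_float(h.get("x-monago-cost-idr")),
        block_reason=h.get("x-monago-block-reason"),
    )


# Placeholder header values for illustration only.
sample_headers = {
    "X-Monago-Audit-Id": "example-audit-id",
    "X-Monago-Provider": "openai",
    "X-Monago-Latency-Ms": "412",
    "X-Monago-Pii-Detected": "true",
    "X-Monago-Pii-Redacted": "nik,phone_id,PERSON",
    "X-Monago-Policy-Decision": "allowed",
    "X-Monago-Decision-Source": "example-policy-engine",
}

result = parse_governance_headers(sample_headers)
print(result.pii_detected, result.pii_redacted)
print(result.cost_idr, result.block_reason)
```

Headers that are only conditionally present (cost, redacted types, block reason) come back as None or an empty list, so callers can tell "absent" apart from a real value.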
PII Enforcement Modes
How detected PII is handled is controlled by your policy's PII action. Each mode is a point on a strictness ladder; a stricter mode set at the tenant level cannot be overridden to a looser mode by a workspace.
| Mode | Behavior | Sent to provider? |
|---|---|---|
| allow | Pass the request through unchanged. | Original |
| warn | Pass through, but record detection in the audit log. | Original |
| partial_mask | Redact PII keeping a short tail (industry "card-number" style, e.g. ****-****-****-0123) so users can recognize their own data. | Masked (tail kept) |
| full_mask | Redact PII completely with no remainder (e.g. [REDACTED_NIK]). | Masked (zero-tail) |
| block | Reject the request. It is not forwarded to the provider. | No |
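To make the difference between the two masking modes concrete, the sketch below reproduces the example outputs from the table. It illustrates the behavior only and is not Atrium's implementation; the function names, the tail length of four characters, and the sample input number are all assumptions.

```python
def partial_mask(value: str, keep: int = 4, mask_char: str = "*") -> str:
    """Mask every letter or digit except the last `keep`, keeping separators."""
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(total - keep, 0)
    out = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            out.append(mask_char)
            to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)


def full_mask(pii_type: str) -> str:
    """Replace the value entirely with a typed placeholder (zero tail)."""
    return f"[REDACTED_{pii_type.upper()}]"


print(partial_mask("4111-1111-1111-0123"))
print(full_mask("nik"))
```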
Strictness order: allow < warn < partial_mask < full_mask < block.
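The "floor" behavior implied by this ordering can be modeled as taking the stricter of the tenant and workspace settings. The sketch below is a conceptual model under that assumption, not Atrium's policy engine; the function name effective_mode is invented for this example.

```python
from typing import Optional

# Modes listed from least to most strict, as documented.
STRICTNESS = ["allow", "warn", "partial_mask", "full_mask", "block"]
RANK = {mode: rank for rank, mode in enumerate(STRICTNESS)}


def effective_mode(tenant_mode: str, workspace_mode: Optional[str] = None) -> str:
    """Return the mode that applies: the workspace can tighten, never loosen."""
    for mode in (tenant_mode, workspace_mode):
        if mode is not None and mode not in RANK:
            raise ValueError(f"unknown PII mode: {mode!r}")
    if workspace_mode is None:
        return tenant_mode
    # Pick whichever of the two modes ranks higher on the strictness ladder.
    return max(tenant_mode, workspace_mode, key=RANK.__getitem__)


print(effective_mode("partial_mask", "allow"))   # looser request is ignored
print(effective_mode("warn", "block"))           # stricter request is honored
```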
When a request is blocked, the API returns 403 with error code PII_BLOCKED (see Errors).
Streaming
Set "stream": true to receive a Server-Sent Events (SSE) stream. The stream interleaves governance events with content events, so you can observe policy, PII, routing, and provider stages in real time, followed by the model output.
curl -N -X POST https://api.monago.io/v1/chat/completions \
-H "Authorization: Bearer sk-mng-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{ "model": "gpt-4o-mini", "stream": true,
"messages": [{ "role": "user", "content": "Hello" }] }'
Event types:
| Event | Payload |
|---|---|
| governance | A pipeline stage update: policy, pii, routing, provider, audit. Includes status and a short metadata summary. |
| content | An OpenAI-style chat completion chunk with a delta. |
| done | Final event with usage, audit_id, and session_id. |
Example governance event:
event: governance
data: {"step": "pii", "status": "flagged", "meta": "3 types · PERSON, nik, phone_id",
"detail": {"types": ["PERSON", "nik", "phone_id"]}}
On streaming responses, the audit reference is always available (in the audit governance event and the final done event). Some non-streaming headers are not repeated on the SSE channel by design.
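A client consuming this stream needs to separate governance, content, and done events. The sketch below is a small SSE line parser run against a hand-written sample, so it does not touch the network. The exact payload fields in the sample (beyond those named above) and the helper name iter_sse_events are assumptions for this example.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, object]]:
    """Yield (event_name, payload) pairs from Server-Sent Events lines."""
    event, data_lines = None, []

    def flush():
        raw = "\n".join(data_lines)
        try:
            payload = json.loads(raw)
        except ValueError:
            payload = raw  # keep non-JSON data as a plain string
        return (event or "message", payload)

    for raw_line in lines:
        line = raw_line.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield flush()
            event, data_lines = None, []
        elif line.startswith(":"):
            continue  # SSE comment line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].lstrip())
    if data_lines:
        yield flush()


# Hand-written sample; field values are placeholders.
sample = """\
event: governance
data: {"step": "pii", "status": "flagged", "detail": {"types": ["PERSON", "nik"]}}

event: content
data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}

event: content
data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}

event: done
data: {"usage": {"total_tokens": 5}, "audit_id": "example-audit-id", "session_id": "example-session-id"}
"""

events = list(iter_sse_events(sample.splitlines()))
names = [name for name, _ in events]
text = "".join(
    payload["choices"][0]["delta"].get("content", "")
    for name, payload in events
    if name == "content"
)
audit_id = events[-1][1]["audit_id"]

print(names)
print(text, audit_id)
```

With a real HTTP response, the same generator can be fed decoded lines from the response body as they arrive.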
Errors
Errors are returned with an appropriate HTTP status and a stable machine-readable code.
| Code | HTTP | Meaning |
|---|---|---|
| PII_BLOCKED | 403 | Request blocked because PII was detected and policy is set to block. |
| MODEL_NOT_ALLOWED | 403 | The requested model is not in your policy's allowlist. |
| OUTSIDE_ALLOWED_HOURS | 403 | Request made outside the policy's permitted hours. |
| MAX_TOKENS_EXCEEDED | 403 | Requested max_tokens exceeds the policy limit. |
| TRIAL_EXPIRED | 403 | The tenant's trial period has ended. |
| UNKNOWN_MODEL | 400 | The model identifier is not recognized. |
| NO_ACTIVE_CREDENTIAL | 400 | No active upstream provider credential is configured for this scope. |
| PROVIDER_NOT_CONFIGURED | 503 | The target provider is not configured. |
| PROVIDER_RATE_LIMITED | 429 | The upstream provider rate-limited the request. |
| PROVIDER_TIMEOUT | 504 | The upstream provider timed out. |
| PROVIDER_UPSTREAM_ERROR | 502 | The upstream provider returned an error. |
All error responses include the governance audit headers where available, so a blocked or failed request is still traceable via X-Monago-Audit-Id.
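Because the codes are stable, a client can branch on them rather than on message text. The sketch below groups the documented codes into handling categories and computes a capped exponential backoff schedule for the retryable ones. The grouping, the category names, and the backoff parameters are this example's assumptions, not guidance from Atrium.

```python
# Upstream conditions that may succeed on a later attempt (assumed grouping).
RETRYABLE = {"PROVIDER_RATE_LIMITED", "PROVIDER_TIMEOUT", "PROVIDER_UPSTREAM_ERROR"}
# Policy outcomes: retrying the same request will not change the result.
POLICY = {"PII_BLOCKED", "MODEL_NOT_ALLOWED", "OUTSIDE_ALLOWED_HOURS", "MAX_TOKENS_EXCEEDED"}
# Account or setup problems that need an operator or a corrected request.
CONFIGURATION = {"TRIAL_EXPIRED", "UNKNOWN_MODEL", "NO_ACTIVE_CREDENTIAL", "PROVIDER_NOT_CONFIGURED"}


def classify_error(code: str) -> str:
    """Map a documented error code to a coarse handling category."""
    if code in RETRYABLE:
        return "retry"
    if code in POLICY:
        return "policy"
    if code in CONFIGURATION:
        return "configuration"
    return "unknown"


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0) -> list:
    """Return capped exponential delays in seconds, one per retry attempt."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


print(classify_error("PROVIDER_TIMEOUT"))
print(classify_error("PII_BLOCKED"))
print(backoff_delays(5))
```

Whatever the category, logging the X-Monago-Audit-Id alongside the code keeps a failed request traceable.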
SDK
The Python SDK wraps the API and surfaces governance results as first-class fields.
pip install monago-atrium
from monago_atrium import Monago
client = Monago(api_key="sk-mng-YOUR_KEY")
result = client.chat(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Help me write a customer reply."}],
)
# Model output
print(result.content)
# Governance surface
print(result.audit_id) # signed audit reference
print(result.pii_detected) # bool
print(result.pii_redacted) # list of redacted PII types
print(result.policy_decision)
print(result.cost_idr) # estimated cost, when available
The SDK is open-core and tracks the API's governance contract. Streaming support in the SDK and additional language SDKs are on the roadmap.
Compliance
Atrium is built to support enterprise data-governance obligations, particularly for regulated sectors such as banking and financial services in Indonesia.
- Data minimization at the perimeter. PII can be redacted or blocked at the gateway, so sensitive identifiers need never leave your trust boundary toward a third-party LLM provider. This supports obligations under Indonesia's Personal Data Protection Law (UU PDP).
- Top-down enforcement (Zero Trust aligned). Security policy is set at the tenant level as a non-negotiable floor; teams can apply stricter controls but cannot weaken the baseline. This follows the Zero Trust principle (NIST SP 800-207) of explicit, top-down policy enforcement at a dedicated enforcement point. (Atrium is aligned with these principles; "Zero Trust Architecture" is a framework, not a certification.)
- Auditability. Every request produces a signed, tamper-evident audit record referenced by X-Monago-Audit-Id, supporting after-the-fact review and regulatory reporting.
- Cost transparency. Per-request cost estimates support internal chargeback and budget governance.
For data-residency and on-premise deployment options, contact the Monago team.
Notes & Conventions
- The chat endpoint is intentionally OpenAI-compatible: in most cases you can point an existing OpenAI client at the Atrium base URL and add your sk-mng- key.
- Governance behavior (which models are allowed, how PII is handled, permitted hours) is determined by the policy bound to your API key's scope, configured in the Monago console, not by request parameters.
- All timestamps are Unix epoch seconds; all monetary estimates in X-Monago-Cost-Idr are in Indonesian Rupiah and are estimates for reporting, not provider-billed amounts.