Monago Atrium — API Documentation

AI Governance Gateway for enterprise LLM access. Drop-in, OpenAI-compatible, with PII protection, configurable enforcement, cost visibility, and a signed audit trail on every request.


Overview

Monago Atrium sits between your application and large language model (LLM) providers (OpenAI, Anthropic, and others) as a Policy Enforcement Point. Every request passes through the gateway before reaching a provider, and every response passes back through it. At this checkpoint, Atrium applies governance: it detects and redacts personally identifiable information (PII), enforces per-tenant policy, scans for prompt injection, accounts for cost, and writes a tamper-evident audit record.

Atrium is designed around Zero Trust principles (aligned with NIST SP 800-207) at the data-governance layer: no request is implicitly trusted, every request is evaluated against policy, and security enforcement is top-down — a security baseline set at the tenant level cannot be weakened by a workspace or user below it.

What Atrium gives you

  • Multi-layer PII detection — structured identifiers (national ID, tax ID, phone, bank account, email) plus named-entity recognition for unstructured PII such as person, organization, and location names.
  • Configurable enforcement — choose how detected PII is handled per policy: allow, warn, partial mask, full redaction, or block.
  • Top-down policy — a tenant's security baseline is a floor; teams can tighten it but never loosen it.
  • Cost visibility — estimated per-request cost is returned on every call and aggregated for reporting.
  • Audit trail — every request produces a signed audit record, surfaced via a request-level audit ID.
  • Drop-in compatibility — the chat endpoint mirrors the OpenAI Chat Completions API, so existing clients and SDKs work with minimal changes.

Supported providers

A single Atrium API works across providers — the governance layer is provider-agnostic, and routing is resolved from the model identifier.

| Provider | Status |
| --- | --- |
| OpenAI | Available |
| Anthropic | Available |
| AWS Bedrock | On the roadmap |
| Google Vertex AI | On the roadmap |
| Alibaba Cloud Model Studio (Qwen) | On the roadmap |

Your code does not change per provider: keep the same OpenAI-compatible request shape and switch the model field. Which providers and models are permitted is governed by your policy's model allowlist.
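As a minimal sketch of what "switch the model field" means in practice, the snippet below builds two request payloads that differ only in model. The helper name build_chat_request and the second model identifier are illustrative placeholders, not part of the Atrium API; the identifiers actually available depend on your policy's model allowlist.

```python
import json


def build_chat_request(model: str, prompt: str) -> dict:
    """Return an OpenAI-style chat payload; only `model` varies per provider."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


PROMPT = "Summarize the onboarding policy."

# "gpt-4o-mini" appears elsewhere in this guide. The second identifier is a
# placeholder (assumption), not a confirmed Atrium model name.
openai_req = build_chat_request("gpt-4o-mini", PROMPT)
other_req = build_chat_request("your-anthropic-model", PROMPT)

# Everything except the model field is identical between the two payloads.
differing = {key for key in openai_req if openai_req[key] != other_req[key]}
print(json.dumps(sorted(differing)))  # prints ["model"]
```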


Getting Started

Base URL

https://api.monago.io

Authentication

All requests authenticate with a workspace API key passed as a Bearer token.

Authorization: Bearer sk-mng-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

API keys are issued and managed in the Monago console. A key is scoped to a tenant (and optionally a workspace); the governance policy that applies to a request is resolved from that scope. Keep keys secret and rotate them if exposed.

Quickstart (cURL)

curl -X POST https://api.monago.io/v1/chat/completions \
  -H "Authorization: Bearer sk-mng-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "user", "content": "Summarize the onboarding policy." }
    ]
  }'
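Where neither cURL nor the SDK is available, the same call can be sketched with the Python standard library alone. The helper names build_request and send are illustrative, not part of any Monago package; the URL, headers, and body mirror the cURL example above.

```python
import json
import urllib.request

BASE_URL = "https://api.monago.io"


def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the chat completions endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(req: urllib.request.Request, timeout: float = 30.0):
    """Send the request; return (parsed JSON body, response headers as a dict).

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so callers
    that want to inspect error responses should catch that exception.
    """
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read()), dict(resp.headers.items())


req = build_request("sk-mng-YOUR_KEY", "gpt-4o-mini", "Summarize the onboarding policy.")
print(req.get_method(), req.full_url)
```

Calling send(req) performs the network request; it is left out of the snippet so the example can be read and run without a valid key.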

Quickstart (Python SDK)

pip install monago-atrium

from monago_atrium import Monago

client = Monago(api_key="sk-mng-YOUR_KEY")

result = client.chat(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize the onboarding policy."}],
)

print(result.content)
print(result.audit_id)        # signed audit reference
print(result.pii_redacted)    # PII types redacted before the provider saw them
print(result.cost_idr)        # estimated cost for this request

API Reference

Create a chat completion

POST /v1/chat/completions

OpenAI-compatible chat completion, governed by your policy. Supports both standard and streaming responses.

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | yes | Model identifier (e.g. gpt-4o-mini). Must be permitted by your policy's model allowlist. |
| messages | array | yes | Conversation messages. Each item has role (system / user / assistant) and content (string). 1–200 messages. |
| stream | boolean | no | When true, returns a Server-Sent Events stream. Defaults to false. |
| max_tokens | integer | no | Maximum tokens to generate (1–200000). |
| temperature | number | no | Sampling temperature (0.0–2.0). |

Example request body:

{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Draft a one-line welcome message." }
  ],
  "max_tokens": 256,
  "temperature": 0.7,
  "stream": false
}
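The documented field constraints can be checked on the client before a request is sent. The following is a minimal sketch, assuming only the limits listed in the request body table; validate_chat_request is a hypothetical helper, and the gateway remains the authority on what it accepts (policy limits such as a lower max_tokens cap are not visible to the client).

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_chat_request(body: dict) -> list:
    """Return a list of problems found; an empty list means the body looks valid."""
    errors = []

    model = body.get("model")
    if not isinstance(model, str) or not model:
        errors.append("model must be a non-empty string")

    messages = body.get("messages")
    if not isinstance(messages, list) or not 1 <= len(messages) <= 200:
        errors.append("messages must be a list of 1 to 200 items")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict):
                errors.append(f"messages[{i}] must be an object")
                continue
            if msg.get("role") not in ALLOWED_ROLES:
                errors.append(f"messages[{i}].role must be system, user, or assistant")
            if not isinstance(msg.get("content"), str):
                errors.append(f"messages[{i}].content must be a string")

    if "stream" in body and not isinstance(body["stream"], bool):
        errors.append("stream must be a boolean")

    max_tokens = body.get("max_tokens")
    if max_tokens is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) \
                or not 1 <= max_tokens <= 200000:
            errors.append("max_tokens must be an integer in 1..200000")

    temperature = body.get("temperature")
    if temperature is not None:
        if isinstance(temperature, bool) or not isinstance(temperature, (int, float)) \
                or not 0.0 <= temperature <= 2.0:
            errors.append("temperature must be a number in 0.0..2.0")

    return errors


ok = {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hi"}]}
bad = {"model": "gpt-4o-mini", "messages": [], "temperature": 3}
print(validate_chat_request(ok))        # prints []
print(len(validate_chat_request(bad)))  # prints 2
```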

Response body

The response shape mirrors the OpenAI Chat Completions format.

| Field | Type | Description |
| --- | --- | --- |
| id | string | Completion identifier. |
| object | string | Always chat.completion. |
| created | integer | Unix timestamp. |
| model | string | Resolved provider model. |
| choices | array | Generated choices, each with index, message (role, content), and finish_reason. |
| usage | object | prompt_tokens, completion_tokens, total_tokens. |

Example response:

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1779896327,
  "model": "gpt-4o-mini-2024-07-18",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Welcome aboard — glad to have you!" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 23, "completion_tokens": 11, "total_tokens": 34 }
}
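A short sketch of reading such a response in Python follows. The dictionary is an abbreviated copy of the example above (the message text is shortened), and the variable names are illustrative.

```python
from datetime import datetime, timezone

# Abbreviated copy of the example response shown above.
response = {
    "id": "chatcmpl-...",
    "object": "chat.completion",
    "created": 1779896327,
    "model": "gpt-4o-mini-2024-07-18",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Welcome aboard!"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 23, "completion_tokens": 11, "total_tokens": 34},
}

# The generated text lives in the first choice's message.
reply = response["choices"][0]["message"]["content"]

# `created` is Unix epoch seconds; convert it to an aware UTC datetime.
created_at = datetime.fromtimestamp(response["created"], tz=timezone.utc)

usage = response["usage"]
print(reply)                  # prints Welcome aboard!
print(created_at.isoformat())
print(usage["total_tokens"])  # prints 34
```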

Governance results (PII detection, policy decision, cost, audit reference) are returned as response headers; see Governance Headers.


Governance Headers

Every response carries governance metadata in X-Monago-* headers. These let your application observe what the gateway did without parsing the body.

| Header | Always present | Description |
| --- | --- | --- |
| X-Monago-Audit-Id | yes | Signed audit reference for this request. Use it to look up the full audit record. |
| X-Monago-Provider | yes | Upstream provider that served the request. |
| X-Monago-Latency-Ms | yes | Gateway-measured latency in milliseconds. |
| X-Monago-Pii-Detected | yes | true / false: whether any PII was detected. |
| X-Monago-Pii-Redacted | when PII found | Comma-separated list of PII types redacted before the request reached the provider (e.g. nik,phone_id,PERSON). |
| X-Monago-Policy-Decision | yes | The policy outcome, e.g. allowed. |
| X-Monago-Decision-Source | yes | Source of the decision (policy engine identifier). |
| X-Monago-Cost-Idr | when computed | Estimated cost of the request in IDR. |
| X-Monago-Block-Reason | on block | Present when a request is blocked by policy. |

PII redaction happens before the provider sees the request. When X-Monago-Pii-Redacted lists a type, that data was replaced with a placeholder in the text sent upstream — the provider never receives the original value.
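One way an application might collect these headers into a typed object is sketched below. GovernanceResult and parse_governance are hypothetical names, the sample header values are made up for illustration, and parsing the latency and cost values as floats is an assumption about their wire format.

```python
from dataclasses import dataclass, field
from typing import List, Mapping, Optional


@dataclass
class GovernanceResult:
    audit_id: Optional[str]
    provider: Optional[str]
    latency_ms: Optional[float]
    pii_detected: bool
    pii_redacted: List[str] = field(default_factory=list)
    policy_decision: Optional[str] = None
    decision_source: Optional[str] = None
    cost_idr: Optional[float] = None
    block_reason: Optional[str] = None


def _to_float(value: Optional[str]) -> Optional[float]:
    """Parse a numeric header; missing or empty values become None."""
    return float(value) if value not in (None, "") else None


def parse_governance(headers: Mapping[str, str]) -> GovernanceResult:
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {name.lower(): value.strip() for name, value in headers.items()}
    redacted = h.get("x-monago-pii-redacted", "")
    return GovernanceResult(
        audit_id=h.get("x-monago-audit-id"),
        provider=h.get("x-monago-provider"),
        latency_ms=_to_float(h.get("x-monago-latency-ms")),
        pii_detected=h.get("x-monago-pii-detected", "false").lower() == "true",
        pii_redacted=[t.strip() for t in redacted.split(",") if t.strip()],
        policy_decision=h.get("x-monago-policy-decision"),
        decision_source=h.get("x-monago-decision-source"),
        cost_idr=_to_float(h.get("x-monago-cost-idr")),
        block_reason=h.get("x-monago-block-reason"),
    )


# Sample values are invented for illustration only.
sample_headers = {
    "X-Monago-Audit-Id": "aud_example",
    "X-Monago-Provider": "openai",
    "X-Monago-Latency-Ms": "412",
    "X-Monago-Pii-Detected": "true",
    "X-Monago-Pii-Redacted": "nik,phone_id,PERSON",
    "X-Monago-Policy-Decision": "allowed",
}
result = parse_governance(sample_headers)
print(result.pii_redacted)  # prints ['nik', 'phone_id', 'PERSON']
```

Optional headers (cost, block reason, redacted types) default to None or an empty list, so the same parser can be used on successful and blocked responses.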


PII Enforcement Modes

How detected PII is handled is controlled by your policy's PII action. Each mode is a point on a strictness ladder; a stricter mode set at the tenant level cannot be overridden to a looser mode by a workspace.

| Mode | Behavior | Sent to provider? |
| --- | --- | --- |
| allow | Pass the request through unchanged. | Original |
| warn | Pass through, but record detection in the audit log. | Original |
| partial_mask | Redact PII keeping a short tail (industry "card-number" style, e.g. ****-****-****-0123) so users can recognize their own data. | Masked (tail kept) |
| full_mask | Redact PII completely with no remainder (e.g. [REDACTED_NIK]). | Masked (zero-tail) |
| block | Reject the request. It is not forwarded to the provider. | No |

Strictness order: allow < warn < partial_mask < full_mask < block.

When a request is blocked, the API returns 403 with error code PII_BLOCKED (see Errors).
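The "floor" rule on the strictness ladder can be illustrated with a few lines of Python. This is a sketch of the rule as described here, not Atrium's internal resolution logic; effective_mode is a hypothetical helper.

```python
from typing import Optional

# Ordered from least to most strict, as documented above.
STRICTNESS = ["allow", "warn", "partial_mask", "full_mask", "block"]


def effective_mode(tenant_mode: str, workspace_mode: Optional[str] = None) -> str:
    """Return the mode that applies when a workspace sits under a tenant baseline.

    The tenant mode is a floor: a workspace may tighten it but not loosen it,
    so the stricter of the two wins. Unknown mode names raise ValueError.
    """
    if workspace_mode is None:
        STRICTNESS.index(tenant_mode)  # validate the name
        return tenant_mode
    return max(tenant_mode, workspace_mode, key=STRICTNESS.index)


print(effective_mode("partial_mask", "allow"))  # prints partial_mask
print(effective_mode("warn", "block"))          # prints block
print(effective_mode("full_mask"))              # prints full_mask
```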


Streaming

Set "stream": true to receive a Server-Sent Events (SSE) stream. The stream interleaves governance events with content events, so you can observe policy, PII, routing, and provider stages in real time, followed by the model output.

curl -N -X POST https://api.monago.io/v1/chat/completions \
  -H "Authorization: Bearer sk-mng-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "gpt-4o-mini", "stream": true,
        "messages": [{ "role": "user", "content": "Hello" }] }'

Event types:

| Event | Payload |
| --- | --- |
| governance | A pipeline stage update: policy, pii, routing, provider, audit. Includes status and a short metadata summary. |
| content | An OpenAI-style chat completion chunk with a delta. |
| done | Final event with usage, audit_id, and session_id. |

Example governance event:

event: governance
data: {"step": "pii", "status": "flagged", "meta": "3 types · PERSON, nik, phone_id", "detail": {"types": ["PERSON", "nik", "phone_id"]}}

On streaming responses, the audit reference is always available (in the audit governance event and the final done event). Some non-streaming headers are not repeated on the SSE channel by design.
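A client consuming this stream needs to split it into events and dispatch on the event name. The sketch below is a minimal SSE line parser run against a hand-written sample; iter_sse_events is a hypothetical helper, the sample IDs are invented, and the exact fields inside content chunks (choices[0].delta.content, following the OpenAI chunk shape) are an assumption.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def _decode(data_lines: List[str]):
    """Join data lines and parse JSON, falling back to the raw string."""
    text = "\n".join(data_lines)
    try:
        return json.loads(text)
    except ValueError:
        return text


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, object]]:
    """Yield (event_name, payload) pairs from an iterable of SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                      # blank line terminates an event
            if data:
                yield event, _decode(data)
            event, data = "message", []
        elif line.startswith(":"):          # SSE comment / keep-alive
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))
    if data:                                # stream ended without a blank line
        yield event, _decode(data)


# Hand-written sample; identifiers and values are illustrative only.
sample = [
    'event: governance',
    'data: {"step": "pii", "status": "flagged", "detail": {"types": ["PERSON", "nik"]}}',
    '',
    'event: content',
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    '',
    'event: done',
    'data: {"audit_id": "aud_example", "session_id": "sess_example", "usage": {"total_tokens": 5}}',
    '',
]

text, audit_id, stages = "", None, []
for name, payload in iter_sse_events(sample):
    if name == "governance":
        stages.append((payload["step"], payload["status"]))
    elif name == "content":
        text += payload["choices"][0]["delta"].get("content", "")
    elif name == "done":
        audit_id = payload["audit_id"]

print(stages)    # prints [('pii', 'flagged')]
print(text)      # prints Hello
print(audit_id)  # prints aud_example
```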


Errors

Errors are returned with an appropriate HTTP status and a stable machine-readable code.

| Code | HTTP | Meaning |
| --- | --- | --- |
| PII_BLOCKED | 403 | Request blocked because PII was detected and policy is set to block. |
| MODEL_NOT_ALLOWED | 403 | The requested model is not in your policy's allowlist. |
| OUTSIDE_ALLOWED_HOURS | 403 | Request made outside the policy's permitted hours. |
| MAX_TOKENS_EXCEEDED | 403 | Requested max_tokens exceeds the policy limit. |
| TRIAL_EXPIRED | 403 | The tenant's trial period has ended. |
| UNKNOWN_MODEL | 400 | The model identifier is not recognized. |
| NO_ACTIVE_CREDENTIAL | 400 | No active upstream provider credential is configured for this scope. |
| PROVIDER_NOT_CONFIGURED | 503 | The target provider is not configured. |
| PROVIDER_RATE_LIMITED | 429 | The upstream provider rate-limited the request. |
| PROVIDER_TIMEOUT | 504 | The upstream provider timed out. |
| PROVIDER_UPSTREAM_ERROR | 502 | The upstream provider returned an error. |

All error responses include the governance audit headers where available, so a blocked or failed request is still traceable via X-Monago-Audit-Id.
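Because the codes are stable, client code can branch on them rather than on message text. The grouping below is one possible client-side policy, not guidance from Monago: treating the three PROVIDER_* transient codes as retryable is an assumption, and classify_error and backoff_delays are hypothetical helpers.

```python
# Assumption: upstream rate limits, timeouts, and provider errors are transient.
RETRYABLE = {"PROVIDER_RATE_LIMITED", "PROVIDER_TIMEOUT", "PROVIDER_UPSTREAM_ERROR"}

# Policy outcomes: retrying the same request will not change the result.
POLICY = {
    "PII_BLOCKED",
    "MODEL_NOT_ALLOWED",
    "OUTSIDE_ALLOWED_HOURS",
    "MAX_TOKENS_EXCEEDED",
    "TRIAL_EXPIRED",
}

# Request or configuration problems that need a fix before resending.
CONFIG = {"UNKNOWN_MODEL", "NO_ACTIVE_CREDENTIAL", "PROVIDER_NOT_CONFIGURED"}


def classify_error(code: str) -> str:
    """Map a documented error code to a coarse handling category."""
    if code in RETRYABLE:
        return "retry"
    if code in POLICY:
        return "policy"
    if code in CONFIG:
        return "config"
    return "unknown"


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0) -> list:
    """Exponential backoff schedule in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


print(classify_error("PROVIDER_TIMEOUT"))  # prints retry
print(classify_error("PII_BLOCKED"))       # prints policy
print(backoff_delays(4))                   # prints [0.5, 1.0, 2.0, 4.0]
```

Whichever category applies, logging X-Monago-Audit-Id alongside the code keeps the failed request traceable.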


SDK

The Python SDK wraps the API and surfaces governance results as first-class fields.

pip install monago-atrium

from monago_atrium import Monago

client = Monago(api_key="sk-mng-YOUR_KEY")

result = client.chat(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Help me write a customer reply."}],
)

# Model output
print(result.content)

# Governance surface
print(result.audit_id)      # signed audit reference
print(result.pii_detected)  # bool
print(result.pii_redacted)  # list of redacted PII types
print(result.policy_decision)
print(result.cost_idr)      # estimated cost, when available

The SDK is open-core and tracks the API's governance contract. Streaming support in the SDK and SDKs for additional languages are on the roadmap.


Compliance

Atrium is built to support enterprise data-governance obligations, particularly for regulated sectors such as banking and financial services in Indonesia.

  • Data minimization at the perimeter. PII can be redacted or blocked at the gateway, so sensitive identifiers need never leave your trust boundary toward a third-party LLM provider. This supports obligations under Indonesia's Personal Data Protection Law (UU PDP).
  • Top-down enforcement (Zero Trust aligned). Security policy is set at the tenant level as a non-negotiable floor; teams can apply stricter controls but cannot weaken the baseline. This follows the Zero Trust principle (NIST SP 800-207) of explicit, top-down policy enforcement at a dedicated enforcement point. (Atrium is aligned with these principles; "Zero Trust Architecture" is a framework, not a certification.)
  • Auditability. Every request produces a signed, tamper-evident audit record referenced by X-Monago-Audit-Id, supporting after-the-fact review and regulatory reporting.
  • Cost transparency. Per-request cost estimates support internal chargeback and budget governance.

For data-residency and on-premise deployment options, contact the Monago team.


Notes & Conventions

  • The chat endpoint is intentionally OpenAI-compatible: in most cases you can point an existing OpenAI client at the Atrium base URL and add your sk-mng- key.
  • Governance behavior (which models are allowed, how PII is handled, permitted hours) is determined by the policy bound to your API key's scope, configured in the Monago console — not by request parameters.
  • All timestamps are Unix epoch seconds. Values in X-Monago-Cost-Idr are in Indonesian Rupiah (IDR) and are estimates for reporting, not provider-billed amounts.
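Pointing an existing OpenAI client at Atrium might look like the sketch below, which uses the official openai Python package. Two assumptions are made: that the base URL should include the /v1 prefix (the OpenAI client appends /chat/completions to whatever base URL it is given), and that the key prefix check is a useful local sanity test. atrium_client_kwargs and make_openai_client are hypothetical helper names.

```python
# Assumption: the OpenAI client appends "/chat/completions" to this base URL,
# which yields the documented endpoint https://api.monago.io/v1/chat/completions.
ATRIUM_BASE_URL = "https://api.monago.io/v1"


def atrium_client_kwargs(api_key: str) -> dict:
    """Return constructor arguments that redirect an OpenAI client to Atrium."""
    if not api_key.startswith("sk-mng-"):
        raise ValueError("expected a Monago workspace key starting with 'sk-mng-'")
    return {"base_url": ATRIUM_BASE_URL, "api_key": api_key}


def make_openai_client(api_key: str):
    """Create an OpenAI client bound to Atrium (requires `pip install openai`)."""
    from openai import OpenAI  # imported lazily so this module loads without it

    return OpenAI(**atrium_client_kwargs(api_key))


print(atrium_client_kwargs("sk-mng-YOUR_KEY")["base_url"])  # prints https://api.monago.io/v1
```

With the package installed, make_openai_client("sk-mng-YOUR_KEY").chat.completions.create(model="gpt-4o-mini", messages=[...]) would then send the request through the gateway. Governance headers are not part of the parsed completion object, so reading them requires access to the raw HTTP response.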