Quickstart Guide
Add usage tracking to your AI app in under 10 minutes.
1 Get your API key
Create an API key in Configure and keep it server-side only.
Security Notice
Never expose USAGETAP_API_KEY in the browser or commit it to git.
2 Install the SDK
Install the package with your preferred manager:
npm install @usagetap/sdk openai
# or
pnpm add @usagetap/sdk openai
# or
yarn add @usagetap/sdk openai
The SDK enforces server-only usage (it throws if imported client-side).
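To picture what a server-only check involves, here is a minimal sketch of how such a guard might work. It is not the SDK's actual implementation: `assertServerOnly` and its parameter are hypothetical names, and the real detection logic may differ.

```typescript
// Hypothetical guard: throws when a browser-like global object is detected.
// This only illustrates the idea; the SDK's own check is not documented here.
function assertServerOnly(globalObject: { window?: unknown; document?: unknown }): void {
  const looksLikeBrowser =
    typeof globalObject.window !== 'undefined' &&
    typeof globalObject.document !== 'undefined';
  if (looksLikeBrowser) {
    throw new Error('This module must only be imported in server-side code.');
  }
}

// A plain object stands in for the global scope so the sketch is easy to test.
// In a real module you would pass globalThis instead.
assertServerOnly({}); // server-like environment: does not throw
```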
3 Configure environment variables
Add a .env.local file:
# .env.local
USAGETAP_API_KEY=ut_example_server_key
USAGETAP_BASE_URL=https://api.usagetap.com
OPENAI_API_KEY=sk-proj-example-openai-key
See .env.example for more options.
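The route examples below read these variables with a non-null assertion (`!`), which hides a missing value until the first request fails. One possible alternative is to validate them at startup. The sketch below assumes a hypothetical `requireEnv` helper that is not part of the SDK.

```typescript
// Hypothetical helper: fail fast when a required variable is missing or empty.
function requireEnv(name: string, env: Record<string, string | undefined>): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error('Missing required environment variable: ' + name);
  }
  return value;
}

// An in-memory map keeps the example self-contained.
// In a server module you would pass process.env instead.
const exampleEnv: Record<string, string | undefined> = {
  USAGETAP_API_KEY: 'ut_example_server_key',
  USAGETAP_BASE_URL: 'https://api.usagetap.com',
};
const exampleApiKey = requireEnv('USAGETAP_API_KEY', exampleEnv);
```

Because the helper throws on a missing key, a misconfigured deployment fails at boot rather than on the first tracked call.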
4 Add tracking
Choose your integration approach:
Option A: Explicit begin/end (learn the fundamentals)
Shows the complete flow: begin (with idempotency) → use entitlements → call vendor → end with usage. Always call end, even on error; the example does this in its catch block.
// app/api/chat/route.ts - Explicit begin/end with idempotency
import OpenAI from 'openai';
import { UsageTapClient } from '@usagetap/sdk';
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });
export async function POST(req: Request) {
const { messages } = await req.json();
// 1. Begin call with idempotency
const begin = await usageTap.beginCall({
customerId: 'quickstart-user',
feature: 'chat.send',
requested: { standard: true, premium: true },
idempotencyKey: crypto.randomUUID(), // Safe retries
});
try {
// 2. Use entitlements to select model
const model = begin.data.allowed.premium ? 'gpt-4o' : 'gpt-4o-mini';
// 3. Call OpenAI
const completion = await openai.chat.completions.create({
model,
messages,
});
// 4. Report usage back
await usageTap.endCall({
callId: begin.data.callId,
modelUsed: model,
inputTokens: completion.usage?.prompt_tokens ?? 0,
responseTokens: completion.usage?.completion_tokens ?? 0,
});
return Response.json({ content: completion.choices[0]?.message?.content });
} catch (error) {
// Always call end, even on error
await usageTap.endCall({
callId: begin.data.callId,
error: { code: 'OPENAI_ERROR', message: String(error) },
});
throw error;
}
}
Option B: withUsage wrapper (recommended for production)
Handles begin/end automatically with proper error handling and idempotency. Choose model from entitlements, set usage with ctx.setUsage().
// app/api/chat/route.ts - withUsage wrapper (recommended)
import OpenAI from 'openai';
import { UsageTapClient } from '@usagetap/sdk';
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });
function selectCapabilities(allowed: {
standard?: boolean;
premium?: boolean;
reasoningLevel?: 'LOW' | 'MEDIUM' | 'HIGH' | null;
search?: boolean;
}) {
const tier = allowed.premium ? 'premium' : 'standard';
const model = tier === 'premium' ? 'gpt5' : 'gpt5-mini';
const reasoningEffort =
allowed.reasoningLevel === 'HIGH'
? 'high'
: allowed.reasoningLevel === 'MEDIUM'
? 'medium'
: allowed.reasoningLevel === 'LOW'
? 'low'
: undefined;
return {
model,
reasoning: reasoningEffort ? { effort: reasoningEffort } : undefined,
tools: allowed.search ? [{ type: 'web_search' as const }] : undefined,
};
}
export async function POST(req: Request) {
const { messages } = await req.json();
const prompt = messages
.map((m: { role: string; content: string }) => `${m.role}: ${m.content}`)
.join('\n');
const completion = await usageTap.withUsage(
{
customerId: 'quickstart-user',
feature: 'chat.send',
requested: { standard: true, premium: true, search: true, reasoningLevel: 'HIGH' },
},
async ({ begin, setUsage }) => {
const { model, reasoning, tools } = selectCapabilities(begin.data.allowed);
const response = await openai.responses.create({
model,
input: prompt,
reasoning,
tools,
});
setUsage({
modelUsed: model,
inputTokens: response.usage?.input_tokens ?? response.usage?.prompt_tokens ?? 0,
responseTokens: response.usage?.output_tokens ?? response.usage?.completion_tokens ?? 0,
reasoningTokens: reasoning ? response.usage?.reasoning_tokens ?? 0 : 0,
searches: tools?.length ? response.usage?.web_search_queries ?? 0 : 0,
});
return response;
},
);
return new Response(completion.output_text ?? '', {
headers: { 'Content-Type': 'text/plain' },
});
}
Option C: wrapOpenAI (zero-boilerplate)
Automatically handles begin/end, idempotency, and entitlement-aware model selection. Just use OpenAI SDK normally.
// app/api/chat/route.ts - Minimal wrapOpenAI
import OpenAI from 'openai';
import { UsageTapClient } from '@usagetap/sdk';
import { wrapOpenAI } from '@usagetap/sdk/openai';
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });
const ai = wrapOpenAI(openai, usageTap, {
defaultContext: {
customerId: 'quickstart-user',
feature: 'chat.send',
requested: { standard: true, premium: true, search: true, reasoningLevel: 'HIGH' },
},
});
export async function POST(req: Request) {
const { messages } = await req.json();
// wrapOpenAI automatically selects gpt5 (premium) or gpt5-mini (standard)
// based on begin.data.allowed
const completion = await ai.chat.completions.create(
{ messages },
{
usageTap: {
requested: { standard: true, premium: true, search: true, reasoningLevel: 'MEDIUM' },
},
},
);
return new Response(completion.choices[0]?.message?.content ?? '', {
headers: { 'Content-Type': 'text/plain' },
});
}
- Replace quickstart-user with a real authenticated user ID
- Use crypto.randomUUID() for idempotencyKey to ensure safe retries
- Always call endCall() in a finally block, even on errors
5 Test locally
Start dev server and send a test request:
curl -X POST http://localhost:3000/api/chat \
-H "Content-Type: application/json" \
-d '{"messages":[{"role":"user","content":"Hello!"}]}'
Look for a streaming response and UsageTap logging.
Verify in dashboard
- Real-time token & cost metrics
- Per-customer feature usage
- Quota and plan attribution
- Error vs success breakdown
Success!
You're now tracking AI usage with UsageTap.
Next steps
📚 SDK Documentation
Explore wrapOpenAI, Express middleware, React hooks, streaming helpers, checkUsage(), and advanced overrides.
🤖 LLM Prompt Kits
Copy-paste instructions for Cursor / Copilot to scaffold integrations fast.
📊 Usage Plans
Configure quota tiers and feature flags for customers.
💬 Get Help
Questions? Reach out via support or community.
Ensure the customer subscription exists first
Call createCustomer() (or POST /customers) before call_begin. The endpoint is fully idempotent and returns the same snapshot as checkUsage().
const customer = await usageTap.createCustomer({
customerId: "cust_123",
customerFriendlyName: "Acme AI",
customerEmail: "billing@acme.ai",
});
console.log("New customer?", customer.data.newCustomer);
console.log("Plan:", customer.data.plan);
console.log("Allowed:", customer.data.allowed);
Repeat calls reuse the same subscription and flip newCustomer to false, so it is safe to run on every login or provisioning hook.
Change a customer's plan
Use changePlan() to switch customers between usage plans with flexible timing strategies:
const result = await usageTap.changePlan({
customerId: "cust_123",
planId: "plan_premium_v2",
strategy: "IMMEDIATE_RESET", // or "IMMEDIATE_PRORATED" or "AT_NEXT_REPLENISH"
});
console.log("Plan changed:", result.data.success);
console.log("New subscription:", result.data.subscription);
Strategies: IMMEDIATE_RESET switches immediately and resets usage counters; IMMEDIATE_PRORATED switches immediately and prorates existing usage; AT_NEXT_REPLENISH (default) schedules the change for the next billing cycle.
Check usage without creating a call
Need to display current quota status or plan details without tracking a vendor call? Use checkUsage():
const usageStatus = await usageTap.checkUsage({ customerId: "cust_123" });
console.log("Meters:", usageStatus.data.meters);
console.log("Allowed:", usageStatus.data.allowed);
console.log("Plan:", usageStatus.data.plan);
console.log("Balances:", usageStatus.data.balances);
Returns the same rich snapshot as call_begin (meters, entitlements, subscription details, plan info, balances) but without creating a call record. Perfect for dashboard widgets, pre-flight checks, or displaying quota status.
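For a dashboard widget, each meter in the snapshot can be turned into a short status line. The sketch below assumes the meter fields shown in the call_begin example envelope in this guide (remaining, limit, used, unlimited, ratio). `describeMeter` is a hypothetical helper, not an SDK function.

```typescript
// Field names follow the meters entries in the call_begin example envelope.
interface Meter {
  remaining: number | null;
  limit: number | null;
  used: number | null;
  unlimited: boolean;
  ratio: number | null;
}

// Hypothetical formatter for displaying one meter.
function describeMeter(name: string, meter: Meter): string {
  if (meter.unlimited) {
    return name + ': unlimited';
  }
  if (meter.remaining === null || meter.limit === null) {
    return name + ': unavailable';
  }
  // Guard against a zero limit to avoid dividing by zero.
  const percentLeft = meter.limit > 0 ? Math.round((meter.remaining / meter.limit) * 100) : 0;
  return name + ': ' + meter.remaining + ' of ' + meter.limit + ' remaining (' + percentLeft + '%)';
}

// Values copied from the example envelope.
const standardCallsLine = describeMeter('standardCalls', {
  remaining: 12, limit: 20, used: 8, unlimited: false, ratio: 0.6,
});
// standardCallsLine is "standardCalls: 12 of 20 remaining (60%)"
```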
`call_begin` envelope (live response)
UsageTap now responds with the canonical { result, data, correlationId } envelope. The SDK sends the required Accept: application/vnd.usagetap.v1+json header for you:
{
"result": {
"status": "ACCEPTED",
"code": "CALL_BEGIN_SUCCESS",
"timestamp": "2025-10-04T18:21:37.482Z"
},
"data": {
"callId": "call_123",
"startTime": "2025-10-04T18:21:37.482Z",
"policy": "DOWNGRADE",
"newCustomer": false,
"canceled": false,
"allowed": {
"standard": true,
"premium": true,
"audio": false,
"image": false,
"search": true,
"reasoningLevel": "MEDIUM"
},
"entitlementHints": {
"suggestedModelTier": "standard",
"reasoningLevel": "MEDIUM",
"policy": "DOWNGRADE",
"downgrade": {
"reason": "PREMIUM_QUOTA_EXHAUSTED",
"fallbackTier": "standard"
}
},
"meters": {
"standardCalls": {
"remaining": 12,
"limit": 20,
"used": 8,
"unlimited": false,
"ratio": 0.6
},
"premiumCalls": {
"remaining": null,
"limit": null,
"used": null,
"unlimited": true,
"ratio": null
},
"standardTokens": {
"remaining": 800,
"limit": 1000,
"used": 200,
"unlimited": false,
"ratio": 0.8
}
},
"remainingRatios": {
"standardCalls": 0.6,
"standardTokens": 0.8
},
"subscription": {
"id": "sub_123",
"usagePlanVersionId": "plan_2025_01",
"planName": "Pro",
"planVersion": "2025-01",
"limitType": "DOWNGRADE",
"reasoningLevel": "MEDIUM",
"lastReplenishedAt": "2025-10-04T00:00:00.000Z",
"nextReplenishAt": "2025-11-04T00:00:00.000Z",
"subscriptionVersion": 14
},
"models": {
"standard": ["gpt5-mini"],
"premium": ["gpt5"]
},
"idempotency": {
"key": "call_123",
"source": "derived"
}
},
"correlationId": "corr_abc123"
}
The envelope now includes richer metadata: entitlementHints summarizes the recommended model tier and downgrade rationale, meters provides per-counter snapshots with remaining quotas and ratios, remainingRatios offers compact lookups, subscription contains plan identity and replenishment timestamps, and models surfaces vendor hints (standard vs premium model shortlists). data.idempotency.key always matches callId. When you omit idempotencyKey in the request, the backend derives a deterministic hash from organization, customer, feature, and requested entitlements.
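To illustrate what "deterministic" means for the derived key, here is a rough sketch of one way such a key could be computed. It is not the backend's algorithm, which is not documented here: the function names, the `idem_` prefix, the field separator, and the FNV-1a hash are all assumptions chosen to keep the example dependency-free. A production implementation would more likely use a cryptographic hash.

```typescript
// Entitlement flags as passed in the `requested` field of the examples above.
type Requested = Record<string, boolean | string | null>;

// FNV-1a (32-bit) over UTF-16 code units; illustrative only.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Multiply by the FNV prime 16777619 modulo 2^32, expressed as shifts.
    hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return ('0000000' + hash.toString(16)).slice(-8);
}

// Hypothetical derivation: same inputs always produce the same key.
function deriveIdempotencyKey(
  organizationId: string,
  customerId: string,
  feature: string,
  requested: Requested,
): string {
  // Sort entitlement names so property order does not change the key.
  const canonicalRequested = Object.keys(requested)
    .sort()
    .map((name) => name + '=' + String(requested[name]))
    .join('&');
  const material = [organizationId, customerId, feature, canonicalRequested].join('|');
  return 'idem_' + fnv1a(material);
}
```

The canonicalization step matters more than the choice of hash: without sorting, `{ standard: true, premium: true }` and `{ premium: true, standard: true }` would yield different keys for the same logical request.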
Unified `/call` shortcut
Prefer a one-shot API flow without the SDK? Call the public `/call` endpoint: UsageTap runs begin → optional vendor call → end for you and returns both envelopes in one response.
const baseUrl = process.env.USAGETAP_BASE_URL ?? "https://api.usagetap.com";
const res = await fetch(`${baseUrl}/call`, {
method: 'POST',
headers: {
Authorization: 'Bearer ' + process.env.USAGETAP_API_KEY,
Accept: 'application/vnd.usagetap.v1+json',
'Content-Type': 'application/json',
},
body: JSON.stringify({
customerId: 'cust_demo',
requested: { standard: true },
feature: 'chat.completions',
vendor: {
url: 'https://api.openai.com/v1/chat/completions',
method: 'POST',
headers: {
Authorization: 'Bearer ' + process.env.OPENAI_API_KEY,
'Content-Type': 'application/json',
},
body: {
model: 'gpt-4o-mini',
messages: [{ role: 'user', content: 'Hi!' }],
},
responseType: 'json',
},
}),
});
const envelope = await res.json();
if (envelope.result.code !== 'CALL_SUCCESS') {
console.error('UsageTap call failed', envelope);
}
Omit the vendor block if you just want begin → end with your own usage numbers. Vendor errors downgrade to `CALL_VENDOR_WARNING`, but UsageTap still records the end-of-call telemetry so quotas stay accurate.
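Since a vendor failure surfaces as `CALL_VENDOR_WARNING` rather than a hard error, callers may want to tell the three outcomes apart. The sketch below uses only the two result codes named in this section. `classifyCallResult` and the `CallOutcome` labels are hypothetical, and any other code is treated as a failure by assumption.

```typescript
// Only the fields this sketch reads; the full envelope carries more.
interface CallEnvelope {
  result: { status?: string; code: string };
  correlationId?: string;
}

type CallOutcome = 'ok' | 'vendor-warning' | 'failed';

// Hypothetical classifier for the /call response envelope.
function classifyCallResult(envelope: CallEnvelope): CallOutcome {
  switch (envelope.result.code) {
    case 'CALL_SUCCESS':
      return 'ok';
    case 'CALL_VENDOR_WARNING':
      // The vendor call failed, but usage was still recorded.
      return 'vendor-warning';
    default:
      return 'failed';
  }
}
```

A caller could retry the vendor request on `vendor-warning` and log the correlationId on `failed`.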
`call_end` success response
When you finalize usage, the envelope mirrors the same structure and surfaces a `metered` summary derived from the Dynamo counters:
{
"result": {
"status": "ACCEPTED",
"code": "CALL_END_SUCCESS",
"timestamp": "2025-10-04T18:21:52.103Z"
},
"data": {
"callId": "call_123",
"costUSD": 0,
"metered": {
"tokens": 768,
"calls": 1,
"searches": 1
}
},
"correlationId": "corr_abc123"
}
metered is derived from the raw Dynamo deltas and reports the amounts consumed. Additional meters (audio seconds, reasoning tokens, balances) will populate in later phases without breaking the contract. BLOCK policy violations still return HTTP 429 with an error envelope.
Premium detection: UsageTap automatically classifies calls as premium when the model's output token price exceeds $4 per million. You can override this by passing isPremium: true or isPremium: false in your call_end request.
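The premium rule can be expressed as a small predicate. This sketch mirrors the rule as stated, with an explicit override taking precedence over the price threshold. The function and constant names are hypothetical, and the prices in the usage lines are made-up inputs, not real model pricing.

```typescript
// Threshold from the rule above: output price above $4 per million tokens.
const PREMIUM_OUTPUT_PRICE_PER_MILLION_USD = 4;

// Hypothetical predicate; an explicit isPremium override always wins.
function isPremiumCall(outputPricePerMillionUSD: number, isPremiumOverride?: boolean): boolean {
  if (isPremiumOverride !== undefined) {
    return isPremiumOverride;
  }
  return outputPricePerMillionUSD > PREMIUM_OUTPUT_PRICE_PER_MILLION_USD;
}

const pricedAbove = isPremiumCall(10);          // true: 10 exceeds 4
const pricedAtLimit = isPremiumCall(4);         // false: the price must exceed 4, not equal it
const forcedStandard = isPremiumCall(10, false); // false: the override wins
```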