Setting Usage Limits and Budgets

This guide explains how administrators configure token usage limits per user and team, allocate budgets across the organization, and monitor consumption in SecureAI.

Prerequisites

Before configuring usage limits, ensure you have:

Admin access to your SecureAI instance.
An understanding of your organization's LLM provider contract, including pricing tiers and any vendor-imposed rate limits.
A plan for how you want to distribute capacity -- by team, by role, or evenly across all users.

How Usage Limits Work

SecureAI tracks token consumption at three levels:

Level	What It Controls	Example
Organization	Total token budget across your entire SecureAI instance	10 million tokens per month
Team	Budget allocated to a department or group	Parts team gets 4M tokens/month
User	Per-user cap within their team's allocation	Each parts analyst gets up to 500K tokens/month

When a user reaches their limit, SecureAI blocks further usage until the limit resets or an administrator raises the cap. The user cannot start new conversations, and existing conversations remain visible but cannot accept new messages.
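
The three-level check can be pictured with a short sketch. This is only an illustration of the idea, not SecureAI's implementation: the `Budget` class and `is_blocked` function are hypothetical names, and the sketch assumes that reaching the cap at any level (user, team, or organization) blocks the user.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """A token budget at one level (hypothetical model, not a SecureAI type)."""
    limit: int     # tokens allowed in the current period
    used: int = 0  # tokens consumed so far in the period


def is_blocked(user: Budget, team: Budget, org: Budget) -> bool:
    """Assumption: a user is blocked once any level has reached its cap."""
    return any(b.used >= b.limit for b in (user, team, org))


org = Budget(limit=10_000_000, used=6_000_000)
team = Budget(limit=4_000_000, used=3_000_000)
user = Budget(limit=500_000, used=499_000)

print(is_blocked(user, team, org))  # False: every level still has headroom
user.used += 1_000                  # the user reaches the 500K cap
print(is_blocked(user, team, org))  # True: the user-level cap is reached
```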

Accessing Usage Settings

Log in to SecureAI as an administrator.
Navigate to Admin Panel > Settings > Usage & Budgets.
You will see three tabs: Organization Budget, Team Allocations, and User Limits.

Setting Organization-Level Budgets

The organization budget is the top-level cap for all token consumption.

Open the Organization Budget tab.
Set the Monthly Token Budget -- this is the total number of tokens available across all users and teams for the billing period.
Optionally set a Daily Limit to prevent runaway usage from exhausting the monthly budget early. A reasonable starting point is the monthly budget divided by 22, the approximate number of working days in a month.
Choose the Budget Reset Schedule:
- Monthly -- resets on the 1st of each month (most common).
- Weekly -- resets every Monday.
- Custom -- set a specific reset date to align with your billing cycle.
Click Save.
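
As a quick worked example of the daily-limit rule of thumb, the sketch below divides a monthly budget by 22 working days. The function name is hypothetical, and rounding down to a whole token is an assumption made here for simplicity.

```python
def suggested_daily_limit(monthly_budget: int, working_days: int = 22) -> int:
    """Rule-of-thumb daily limit: monthly budget spread over working days."""
    if working_days <= 0:
        raise ValueError("working_days must be positive")
    return monthly_budget // working_days  # round down to a whole token


print(suggested_daily_limit(10_000_000))  # 454545
```

For a 10 million token monthly budget, this suggests a daily limit of roughly 450K tokens.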

Soft Limits vs. Hard Limits

Type	Behavior
Soft limit	Users see a warning banner but can continue sending messages. Usage is flagged for admin review.
Hard limit	Users are blocked from sending new messages once the limit is reached.

You can configure a soft limit threshold as a percentage of the hard limit. For example, set a soft limit at 80% so users are warned before they hit the cap.

Under Organization Budget, enable Soft Limit Alerts.
Set the Alert Threshold (e.g., 80%).
Choose notification targets -- users themselves, their team leads, or both.
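
The relationship between the two limit types can be sketched as a simple classification. This is an illustration under stated assumptions rather than SecureAI's actual logic: `limit_status` and its return values are hypothetical, and the soft threshold is expressed as a whole-number percentage of the hard limit.

```python
def limit_status(used: int, hard_limit: int, soft_percent: int = 80) -> str:
    """Classify consumption against a hard limit and a percentage-based soft limit."""
    if used >= hard_limit:
        return "blocked"   # hard limit reached: no new messages
    if used * 100 >= hard_limit * soft_percent:
        return "warning"   # soft limit reached: warn, but allow messages
    return "ok"


print(limit_status(300_000, 500_000))  # ok
print(limit_status(410_000, 500_000))  # warning
print(limit_status(500_000, 500_000))  # blocked
```

Integer arithmetic is used for the percentage comparison so that a user sitting exactly on the threshold is classified consistently.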

Allocating Budgets to Teams

Team budgets let you distribute the organization's total capacity across departments.

Open the Team Allocations tab.
Click Add Team Allocation.
Select a team from the dropdown (teams are managed in the User Management settings).
Set the team's Monthly Token Budget.
Optionally enable Rollover -- unused tokens carry over to the next period (up to a configurable maximum).
Click Save.

Important: The sum of all team allocations cannot exceed the organization budget. SecureAI warns you if the allocations you enter exceed the available capacity.
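
The Rollover option in the steps above can be illustrated with a small calculation. This sketch assumes that unused tokens are added to the next period's base budget and capped at the configured maximum. Whether rolled-over tokens can accumulate across several periods is not covered here, and `next_period_budget` is a hypothetical name.

```python
def next_period_budget(base_budget: int, used: int, rollover_cap: int) -> int:
    """Next period's budget: the base allocation plus capped unused tokens."""
    unused = max(base_budget - used, 0)
    return base_budget + min(unused, rollover_cap)


# A team with a 4M budget used 3.2M; up to 500K may roll over.
print(next_period_budget(4_000_000, 3_200_000, 500_000))  # 4500000
```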

Unallocated Pool

Tokens not assigned to any team form the Unallocated Pool. Users who are not assigned to a team draw from this pool. You can set a cap on the unallocated pool or leave it as the remainder of the organization budget.

Example Allocation

Team	Monthly Budget	Rationale
Parts Lookup	4,000,000	High-volume daily queries for part numbers and compatibility
Technical Support	3,000,000	Diagnostic conversations tend to be longer
Sales	1,500,000	Moderate usage for customer-facing product info
Training	500,000	Periodic use for onboarding and learning
Unallocated	1,000,000	Buffer for ad-hoc users and overflow
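
The example allocation can be checked with a few lines of arithmetic. The sketch below uses the figures from the table and assumes the unallocated pool is simply the remainder of the organization budget. The function name is hypothetical.

```python
from typing import Dict

ORG_BUDGET = 10_000_000

team_allocations: Dict[str, int] = {
    "Parts Lookup": 4_000_000,
    "Technical Support": 3_000_000,
    "Sales": 1_500_000,
    "Training": 500_000,
}


def unallocated_pool(org_budget: int, allocations: Dict[str, int]) -> int:
    """Return the tokens left over after team allocations, or fail if over budget."""
    allocated = sum(allocations.values())
    if allocated > org_budget:
        raise ValueError(
            f"Team allocations ({allocated:,}) exceed the organization budget ({org_budget:,})"
        )
    return org_budget - allocated


print(unallocated_pool(ORG_BUDGET, team_allocations))  # 1000000
```

The four team budgets sum to 9 million tokens, which leaves the 1 million token buffer shown in the table.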

Setting Per-User Limits

Per-user limits prevent any single user from consuming a disproportionate share of their team's budget.

Open the User Limits tab.
Choose how to set limits:
- Uniform -- all users get the same cap (e.g., 500K tokens/month).
- Role-based -- different caps for different roles (e.g., admins get unlimited, standard users get 500K).
- Custom -- set individual limits per user.
Set the Default User Limit for new users.
Click Save.

Overriding Limits for Specific Users

To raise or lower the limit for a specific user:

Go to Admin Panel > Users.
Select the user.
Under Usage Limits, toggle Custom Limit.
Enter the new monthly token cap.
Click Save.

Custom limits override both the team default and the uniform setting.
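
One way to picture the override order is a lookup that falls through from the most specific setting to the most general. The sketch is illustrative only: the function and parameter names are hypothetical, and placing a role-based limit ahead of the default is an assumption, since the text above only states that custom limits take precedence.

```python
from typing import Dict, Optional


def effective_user_limit(
    default_limit: int,
    role: Optional[str] = None,
    role_limits: Optional[Dict[str, int]] = None,
    custom_limit: Optional[int] = None,
) -> int:
    """Resolve a user's monthly cap: custom limit, then role limit, then default."""
    if custom_limit is not None:
        return custom_limit            # per-user override wins
    if role_limits and role in role_limits:
        return role_limits[role]       # assumed: role-based cap comes next
    return default_limit               # uniform/default cap otherwise


role_limits = {"analyst": 500_000, "support": 750_000}
print(effective_user_limit(300_000))                                  # 300000
print(effective_user_limit(300_000, "support", role_limits))          # 750000
print(effective_user_limit(300_000, "support", role_limits, 900_000)) # 900000
```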

Model-Specific Limits

Different models have different costs. You may want to restrict access to expensive models while keeping cheaper models freely available.

Navigate to Admin Panel > Settings > Usage & Budgets > Model Controls.
For each model tier, configure:
- Access -- who can use this model (all users, specific teams, specific roles).
- Per-User Token Limit -- a separate cap for this model tier specifically.
- Daily Request Limit -- maximum number of conversations per day using this model.

Example Model Controls

Model Tier	Access	Per-User Monthly Limit	Daily Request Limit
Fast	All users	No limit	No limit
Balanced	All users	300,000	100
Advanced	Technical Support, Admins	100,000	20

This setup lets everyone use fast models freely while reserving expensive models for teams that need them.
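
The example table can be expressed as data plus a single check. This is a sketch, not SecureAI's configuration format: `ModelControl`, `CONTROLS`, and `may_use_model` are hypothetical, teams and roles are lumped together as "groups" for brevity, and `None` stands for "all users" or "no limit".

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class ModelControl:
    allowed_groups: Optional[FrozenSet[str]]  # None means all users
    monthly_token_limit: Optional[int]        # None means no limit
    daily_request_limit: Optional[int]        # None means no limit


CONTROLS = {
    "Fast": ModelControl(None, None, None),
    "Balanced": ModelControl(None, 300_000, 100),
    "Advanced": ModelControl(frozenset({"Technical Support", "Admins"}), 100_000, 20),
}


def may_use_model(tier: str, user_groups: Iterable[str],
                  tokens_used_this_month: int, requests_today: int) -> bool:
    """Check access first, then the per-tier token cap and daily request cap."""
    control = CONTROLS[tier]
    if control.allowed_groups is not None and not (control.allowed_groups & set(user_groups)):
        return False
    if (control.monthly_token_limit is not None
            and tokens_used_this_month >= control.monthly_token_limit):
        return False
    if (control.daily_request_limit is not None
            and requests_today >= control.daily_request_limit):
        return False
    return True


print(may_use_model("Advanced", ["Sales"], 0, 0))                   # False: no access
print(may_use_model("Advanced", ["Technical Support"], 50_000, 5))  # True
print(may_use_model("Balanced", ["Sales"], 10_000, 100))            # False: daily cap reached
```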

Monitoring Usage

Usage Dashboard

The admin usage dashboard provides real-time visibility into token consumption.

Navigate to Admin Panel > Analytics > Usage Dashboard.
View usage by:
- Organization total -- overall consumption against the monthly budget.
- Team breakdown -- which teams are consuming the most.
- User breakdown -- individual user consumption rankings.
- Model breakdown -- consumption by model tier.
- Time series -- daily and weekly trends.

Setting Up Alerts

Configure alerts to be notified before limits are reached.

Go to Admin Panel > Settings > Usage & Budgets > Alerts.
Click Add Alert.
Configure:
- Scope -- organization, specific team, or specific user.
- Threshold -- percentage of budget consumed (e.g., 75%, 90%, 100%).
- Channel -- email, in-app notification, or both.
- Recipients -- the alert target (admin, team lead, user, or a custom email list).
Click Save.

Recommended alert setup:

Threshold	Scope	Recipients	Purpose
75%	Organization	All admins	Early warning to review consumption trends
90%	Each team	Team lead + admins	Team approaching limit, time to adjust if needed
100%	Each user	The user + their team lead	User has been blocked, may need limit raised
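
To show how percentage thresholds like those above might translate into notifications, the sketch below reports which thresholds were crossed between two usage readings, so each alert fires once rather than on every message. It is a hypothetical illustration. How SecureAI evaluates alerts internally is not described in this guide.

```python
from typing import List, Sequence


def crossed_thresholds(previous_used: int, current_used: int, budget: int,
                       thresholds: Sequence[int] = (75, 90, 100)) -> List[int]:
    """Return the percentage thresholds passed between two usage readings."""
    return [
        t for t in thresholds
        if previous_used * 100 < budget * t <= current_used * 100
    ]


# A team with a 4M budget moves from 2.9M to 3.7M tokens used.
print(crossed_thresholds(2_900_000, 3_700_000, 4_000_000))  # [75, 90]
```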

Usage Reports

Export usage data for accounting, chargeback, or vendor reconciliation.

Go to Admin Panel > Analytics > Usage Reports.
Select the Date Range and Grouping (by team, user, or model).
Click Export CSV or Export PDF.

Reports include: total tokens consumed, estimated cost (based on configured model pricing), number of conversations, and average tokens per conversation.
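
The report metrics are straightforward to reproduce if you want to sanity-check an export. The sketch below is an assumption-laden illustration: the record layout, the `summarize` function, and the per-1,000-token prices are all made up for the example, and real prices come from your provider contract.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical prices per 1,000 tokens, for illustration only.
PRICE_PER_1K = {"Fast": 0.001, "Balanced": 0.005, "Advanced": 0.03}

# Made-up sample data: one record per conversation.
records = [
    {"team": "Sales", "model": "Fast", "tokens": 1200},
    {"team": "Sales", "model": "Balanced", "tokens": 800},
    {"team": "Technical Support", "model": "Advanced", "tokens": 5000},
]


def summarize(rows: List[dict]) -> Dict[str, dict]:
    """Group conversations by team and compute the report metrics."""
    totals: Dict[str, dict] = defaultdict(
        lambda: {"tokens": 0, "conversations": 0, "cost": 0.0}
    )
    for r in rows:
        row = totals[r["team"]]
        row["tokens"] += r["tokens"]
        row["conversations"] += 1
        row["cost"] += r["tokens"] / 1000 * PRICE_PER_1K[r["model"]]
    for row in totals.values():
        row["avg_tokens"] = row["tokens"] / row["conversations"]
    return dict(totals)


for team, row in summarize(records).items():
    print(team, row["tokens"], row["conversations"], row["avg_tokens"])
```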

Rate Limiting

In addition to budget-based limits, you can set rate limits to prevent short-term abuse or accidental flooding.

Setting	Description	Recommended Default
Requests per minute (per user)	Max messages a single user can send per minute	10
Requests per minute (organization)	Max messages across all users per minute	200
Max input tokens per message	Reject messages exceeding this token count	50,000
Max file upload size	Limit file uploads that would generate excessive tokens	25 MB

Configure these under Admin Panel > Settings > Usage & Budgets > Rate Limits.

When a rate limit is hit, the user sees a "Please wait and try again" message. Rate limit events are logged in the admin audit log.
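
For readers unfamiliar with per-minute rate limits, the sketch below shows one common technique, a sliding-window counter, using the recommended default of 10 requests per minute per user. The guide does not say which algorithm SecureAI uses, so treat this as a generic illustration. The class name is hypothetical, and the current time is passed in explicitly to keep the example deterministic.

```python
from collections import deque
from typing import Deque


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds-long window."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 60.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Drop requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False  # the caller would show "Please wait and try again"
        self._timestamps.append(now)
        return True


limiter = SlidingWindowLimiter()
results = [limiter.allow(float(t)) for t in range(11)]  # 11 requests in 11 seconds
print(results.count(True), results[-1])  # 10 False
print(limiter.allow(60.0))               # True: the first request has aged out
```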

Common Scenarios

A team runs out of budget mid-month

Check the Usage Dashboard to see what drove the spike.
If justified (e.g., seasonal demand), increase the team allocation under Team Allocations. To stay within the organization budget, draw the extra tokens from the unallocated pool or reduce another team's allocation.
If caused by a single user, review their usage and consider whether their workflows can be optimized.

A new department needs access

Create the team in User Management if it does not exist.
Add a team allocation under Team Allocations.
Set appropriate per-user defaults.
Notify the team lead about their budget and any model restrictions.

You need to reduce costs across the organization

Review the Model Breakdown on the usage dashboard. If advanced models are being used for simple queries, restrict access to those tiers or give users guidance on which model to choose.
Lower per-user limits to encourage more efficient usage.
Enable soft limit warnings at a lower threshold (e.g., 60%) so users self-moderate earlier.
Share the "Understanding Token Usage and Costs" article with users so they understand how their behavior affects consumption.

Seasonal adjustments

Some months see higher demand (e.g., when new vehicle model years launch and parts catalogs need cross-referencing). Plan for this by:

Setting higher limits for the high-demand months.
Using the Rollover feature so quieter months build buffer for busier ones.
Reviewing and adjusting allocations quarterly based on usage trends.

Best Practices

Start generous, then tighten. Set higher limits initially and monitor actual usage for 2--3 months before optimizing. Overly restrictive limits frustrate users and discourage adoption.
Use soft limits first. Warnings are less disruptive than hard blocks. Most users self-moderate once they see they are approaching a limit.
Align budgets with business value. Teams that drive revenue or reduce costs through AI usage should get larger allocations.
Review monthly. Usage patterns shift as users become more proficient and workflows change. Revisit allocations at least monthly.
Communicate limits to users. Users should know their caps exist, what happens when they are reached, and who to contact for increases. Transparency prevents frustration.
Set model-specific limits before broad rollouts. When enabling a new, expensive model tier, start with tight limits for a pilot group before opening access to everyone.