This guide explains how administrators configure token usage limits per user and team, allocate budgets across the organization, and monitor consumption in SecureAI.
Prerequisites
Before configuring usage limits, ensure you have:
- Admin access to your SecureAI instance.
- An understanding of your organization's LLM provider contract, including pricing tiers and any vendor-imposed rate limits.
- A plan for how you want to distribute capacity -- by team, by role, or evenly across all users.
How Usage Limits Work
SecureAI tracks token consumption at three levels:
| Level | What It Controls | Example |
|---|---|---|
| Organization | Total token budget across your entire SecureAI instance | 10 million tokens per month |
| Team | Budget allocated to a department or group | Parts team gets 4M tokens/month |
| User | Per-user cap within their team's allocation | Each parts analyst gets up to 500K tokens/month |
When a user reaches a hard limit, SecureAI blocks new conversations until the limit resets or an administrator raises the cap. Existing conversations remain visible but cannot accept new messages. Soft limits, covered under Soft Limits vs. Hard Limits, only warn the user.
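The three levels can be pictured as nested checks. The following is a minimal Python sketch, not SecureAI's actual implementation: the `Budget` class and `can_send` function are hypothetical names, and the sketch assumes a message is blocked as soon as any one level has too few tokens left.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical record of a token limit and the tokens used so far."""
    limit: int
    used: int = 0

    def remaining(self) -> int:
        return max(self.limit - self.used, 0)


def can_send(tokens: int, user: Budget, team: Budget, org: Budget) -> bool:
    """Return True only if every level still has room for `tokens`."""
    return all(level.remaining() >= tokens for level in (user, team, org))


# Limits taken from the example column above; the usage figures are made up.
org = Budget(limit=10_000_000, used=9_000_000)
team = Budget(limit=4_000_000, used=3_999_000)
analyst = Budget(limit=500_000, used=100_000)

print(can_send(500, analyst, team, org))    # True: room at every level
print(can_send(2_000, analyst, team, org))  # False: the team has 1,000 left
```

In this sketch the user is far below their own cap, yet the second request fails because the team allocation is nearly exhausted.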
Accessing Usage Settings
- Log in to SecureAI as an administrator.
- Navigate to Admin Panel > Settings > Usage & Budgets.
- You will see three tabs: Organization Budget, Team Allocations, and User Limits.
Setting Organization-Level Budgets
The organization budget is the top-level cap for all token consumption.
- Open the Organization Budget tab.
- Set the Monthly Token Budget -- this is the total number of tokens available across all users and teams for the billing period.
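- Optionally set a Daily Limit to prevent runaway usage from exhausting the monthly budget early. A reasonable default is your monthly budget divided by 22, the approximate number of working days in a month. For a 10 million token budget, that is roughly 455,000 tokens per day.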
- Choose the Budget Reset Schedule:
  - Monthly -- resets on the 1st of each month (most common).
  - Weekly -- resets every Monday.
  - Custom -- set a specific reset date to align with your billing cycle.
- Click Save.
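The daily limit rule of thumb is simple arithmetic. As a small illustration, the helper below (a hypothetical function, not part of SecureAI) divides a monthly budget by an assumed number of working days and rounds down.

```python
def suggested_daily_limit(monthly_budget: int, working_days: int = 22) -> int:
    """Divide the monthly budget evenly across working days, rounding down."""
    if working_days <= 0:
        raise ValueError("working_days must be positive")
    return monthly_budget // working_days


print(suggested_daily_limit(10_000_000))  # 454545
```

Adjust `working_days` if your teams also work weekends or if the budget resets weekly rather than monthly.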
Soft Limits vs. Hard Limits
| Type | Behavior |
|---|---|
| Soft limit | Users see a warning banner but can continue sending messages. Usage is flagged for admin review. |
| Hard limit | Users are blocked from sending new messages once the limit is reached. |
You can configure a soft limit threshold as a percentage of the hard limit. For example, set a soft limit at 80% so users are warned before they hit the cap.
- Under Organization Budget, enable Soft Limit Alerts.
- Set the Alert Threshold (e.g., 80%).
- Choose notification targets -- users themselves, their team leads, or both.
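The relationship between the two limit types can be sketched as a small classification function. This is an illustrative assumption about how the threshold is evaluated, not SecureAI's code; `limit_status` and its return values are hypothetical names. Integer arithmetic is used to avoid floating-point rounding at the boundary.

```python
def limit_status(used: int, hard_limit: int, soft_percent: int = 80) -> str:
    """Classify usage as 'ok', 'warning' (soft limit), or 'blocked' (hard limit)."""
    if used >= hard_limit:
        return "blocked"
    # Compare used/hard_limit against soft_percent/100 without floats.
    if used * 100 >= hard_limit * soft_percent:
        return "warning"
    return "ok"


print(limit_status(399_999, 500_000))  # ok
print(limit_status(400_000, 500_000))  # warning
print(limit_status(500_000, 500_000))  # blocked
```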
Allocating Budgets to Teams
Team budgets let you distribute the organization's total capacity across departments.
- Open the Team Allocations tab.
- Click Add Team Allocation.
- Select a team from the dropdown (teams are managed in the User Management settings).
- Set the team's Monthly Token Budget.
- Optionally enable Rollover -- unused tokens carry over to the next period (up to a configurable maximum).
- Click Save.
Important: The sum of all team allocations must not exceed the organization budget. SecureAI warns you if the allocations exceed the available capacity.
Unallocated Pool
Tokens not assigned to any team form the Unallocated Pool. Users who are not assigned to a team draw from this pool. You can set a cap on the unallocated pool or leave it as the remainder of the organization budget.
Example Allocation
| Team | Monthly Budget | Rationale |
|---|---|---|
| Parts Lookup | 4,000,000 | High-volume daily queries for part numbers and compatibility |
| Technical Support | 3,000,000 | Diagnostic conversations tend to be longer |
| Sales | 1,500,000 | Moderate usage for customer-facing product info |
| Training | 500,000 | Periodic use for onboarding and learning |
| Unallocated | 1,000,000 | Buffer for ad-hoc users and overflow |
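Before entering allocations, it can help to check that they fit inside the organization budget. The sketch below uses the figures from the example table; `unallocated_pool` is a hypothetical helper, and it assumes the unallocated pool is simply the remainder of the organization budget.

```python
ORG_BUDGET = 10_000_000

# Team budgets from the example allocation table.
team_allocations = {
    "Parts Lookup": 4_000_000,
    "Technical Support": 3_000_000,
    "Sales": 1_500_000,
    "Training": 500_000,
}


def unallocated_pool(org_budget: int, allocations: dict) -> int:
    """Return the tokens left over after all team allocations."""
    allocated = sum(allocations.values())
    if allocated > org_budget:
        raise ValueError(
            f"Allocations ({allocated:,}) exceed the organization budget ({org_budget:,})"
        )
    return org_budget - allocated


print(unallocated_pool(ORG_BUDGET, team_allocations))  # 1000000
```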
Setting Per-User Limits
Per-user limits prevent any single user from consuming a disproportionate share of their team's budget.
- Open the User Limits tab.
- Choose how to set limits:
  - Uniform -- all users get the same cap (e.g., 500K tokens/month).
  - Role-based -- different caps for different roles (e.g., admins get unlimited usage, standard users get 500K).
  - Custom -- set individual limits per user.
- Set the Default User Limit for new users.
- Click Save.
Overriding Limits for Specific Users
To raise or lower the limit for a specific user:
- Go to Admin Panel > Users.
- Select the user.
- Under Usage Limits, toggle Custom Limit.
- Enter the new monthly token cap.
- Click Save.
Custom limits override both the team default and the uniform setting.
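The override order can be sketched as a lookup with fallbacks. This is an assumption-laden illustration: `ROLE_LIMITS`, `DEFAULT_USER_LIMIT`, and `effective_limit` are hypothetical names, and the fallback from role-based cap to default is assumed rather than documented. Only the rule that a custom limit wins comes from the text above.

```python
import math
from typing import Optional

# Hypothetical role caps; math.inf stands in for "unlimited".
ROLE_LIMITS = {"admin": math.inf, "standard": 500_000}
DEFAULT_USER_LIMIT = 500_000


def effective_limit(role: str, custom_limit: Optional[int] = None) -> float:
    """Resolve a user's monthly cap: custom limit first, then role, then default."""
    if custom_limit is not None:
        return custom_limit
    return ROLE_LIMITS.get(role, DEFAULT_USER_LIMIT)


print(effective_limit("standard"))           # 500000
print(effective_limit("standard", 750_000))  # 750000
print(effective_limit("admin"))              # inf
```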
Model-Specific Limits
Different models have different costs. You may want to restrict access to expensive models while keeping cheaper models freely available.
- Navigate to Admin Panel > Settings > Usage & Budgets > Model Controls.
- For each model tier, configure:
  - Access -- who can use this model (all users, specific teams, specific roles).
  - Per-User Token Limit -- a separate cap for this model tier specifically.
  - Daily Request Limit -- maximum number of conversations per day using this model.
Example Model Controls
| Model Tier | Access | Per-User Monthly Limit | Daily Request Limit |
|---|---|---|---|
| Fast | All users | No limit | No limit |
| Balanced | All users | 300,000 | 100 |
| Advanced | Technical Support, Admins | 100,000 | 20 |
This setup lets everyone use fast models freely while reserving expensive models for teams that need them.
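To make the example table concrete, the sketch below encodes it as a dictionary and checks whether a user may make another request on a given tier. The data structure and the `may_use_model` function are hypothetical; SecureAI's internal representation is not described in this guide.

```python
# Hypothetical encoding of the example table; None means "no restriction".
MODEL_CONTROLS = {
    "Fast": {"access": None, "monthly_tokens": None, "daily_requests": None},
    "Balanced": {"access": None, "monthly_tokens": 300_000, "daily_requests": 100},
    "Advanced": {
        "access": {"Technical Support", "Admins"},
        "monthly_tokens": 100_000,
        "daily_requests": 20,
    },
}


def may_use_model(tier, groups, tokens_this_month, requests_today):
    """Return True if a user in `groups` may make another request on `tier`.

    `groups` holds the user's team and role names.
    """
    rule = MODEL_CONTROLS[tier]
    if rule["access"] is not None and not (set(groups) & rule["access"]):
        return False  # user's team or role is not on the access list
    if rule["monthly_tokens"] is not None and tokens_this_month >= rule["monthly_tokens"]:
        return False  # per-user monthly cap for this tier is used up
    if rule["daily_requests"] is not None and requests_today >= rule["daily_requests"]:
        return False  # daily request cap for this tier is used up
    return True


print(may_use_model("Advanced", {"Sales"}, 0, 0))                   # False
print(may_use_model("Advanced", {"Technical Support"}, 50_000, 19)) # True
```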
Monitoring Usage
Usage Dashboard
The admin usage dashboard provides real-time visibility into token consumption.
- Navigate to Admin Panel > Analytics > Usage Dashboard.
- View usage by:
  - Organization total -- overall consumption against the monthly budget.
  - Team breakdown -- which teams are consuming the most.
  - User breakdown -- individual user consumption rankings.
  - Model breakdown -- consumption by model tier.
  - Time series -- daily and weekly trends.
Setting Up Alerts
Configure alerts to be notified before limits are reached.
- Go to Admin Panel > Settings > Usage & Budgets > Alerts.
- Click Add Alert.
- Configure:
  - Scope -- organization, specific team, or specific user.
  - Threshold -- percentage of budget consumed (e.g., 75%, 90%, 100%).
  - Channel -- email, in-app notification, or both.
  - Recipients -- the alert target (admin, team lead, user, or a custom email list).
- Click Save.
Recommended alert setup:
| Threshold | Scope | Recipients | Purpose |
|---|---|---|---|
| 75% | Organization | All admins | Early warning to review consumption trends |
| 90% | Each team | Team lead + admins | Team approaching limit, time to adjust if needed |
| 100% | Each user | The user + their team lead | User has been blocked, may need limit raised |
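One way to reason about threshold alerts is that each alert fires once, at the moment usage crosses its percentage. The sketch below illustrates that idea under the assumption that alerts are evaluated whenever usage is updated; `crossed_thresholds` is a hypothetical function, not a SecureAI API.

```python
def crossed_thresholds(previous_used, current_used, budget, thresholds=(75, 90, 100)):
    """Return the percentage thresholds crossed between two usage readings."""
    return [
        t for t in thresholds
        # Crossed means: below the threshold before, at or above it now.
        if previous_used * 100 < budget * t <= current_used * 100
    ]


# Usage jumps from 70% to 91% of a 10M budget: the 75% and 90% alerts fire.
print(crossed_thresholds(7_000_000, 9_100_000, 10_000_000))  # [75, 90]
# A later rise from 91% to 95% crosses nothing new.
print(crossed_thresholds(9_100_000, 9_500_000, 10_000_000))  # []
```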
Usage Reports
Export usage data for accounting, chargeback, or vendor reconciliation.
- Go to Admin Panel > Analytics > Usage Reports.
- Select the Date Range and Grouping (by team, user, or model).
- Click Export CSV or Export PDF.
Reports include: total tokens consumed, estimated cost (based on configured model pricing), number of conversations, and average tokens per conversation.
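An exported CSV can be post-processed for chargeback. The sketch below assumes a hypothetical column layout (`team`, `user`, `tokens`, `conversations`) and uses made-up sample rows; check the header row of your actual export and adjust the column names accordingly.

```python
import csv
import io
from collections import defaultdict

# Made-up sample data in an assumed column layout.
SAMPLE_EXPORT = """team,user,tokens,conversations
Parts Lookup,alice,120000,60
Parts Lookup,bob,80000,40
Sales,carol,45000,15
"""


def summarize_by_team(csv_text):
    """Total tokens and conversations per team, plus average tokens per conversation."""
    totals = defaultdict(lambda: {"tokens": 0, "conversations": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = totals[row["team"]]
        entry["tokens"] += int(row["tokens"])
        entry["conversations"] += int(row["conversations"])
    summary = {}
    for team, entry in totals.items():
        conversations = entry["conversations"]
        average = entry["tokens"] / conversations if conversations else 0.0
        summary[team] = {**entry, "avg_tokens_per_conversation": average}
    return summary


for team, stats in summarize_by_team(SAMPLE_EXPORT).items():
    print(team, stats)
```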
Rate Limiting
In addition to budget-based limits, you can set rate limits to prevent short-term abuse or accidental flooding.
| Setting | Description | Recommended Default |
|---|---|---|
| Requests per minute (per user) | Max messages a single user can send per minute | 10 |
| Requests per minute (organization) | Max messages across all users per minute | 200 |
| Max input tokens per message | Reject messages exceeding this token count | 50,000 |
| Max file upload size | Limit file uploads that would generate excessive tokens | 25 MB |
Configure these under Admin Panel > Settings > Usage & Budgets > Rate Limits.
When a rate limit is hit, the user sees a "Please wait and try again" message. Rate limit events are logged in the admin audit log.
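This guide does not say which algorithm SecureAI uses for rate limiting. As one common approach, a sliding-window limiter keeps the timestamps of recent requests and rejects a new one when the window is full. The class below is a hypothetical sketch of that technique, using the recommended per-user default of 10 requests per minute.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` within any `window_seconds` span."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False
        self._timestamps.append(now)
        return True


limiter = SlidingWindowLimiter()
# Eleven requests, one per second: the first ten pass, the eleventh is rejected.
results = [limiter.allow(float(second)) for second in range(11)]
print(results.count(True), results[-1])  # 10 False
```

Passing the current time as an argument keeps the sketch deterministic; in real use you would pass `time.monotonic()`.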
Common Scenarios
A team runs out of budget mid-month
- Check the Usage Dashboard to see what drove the spike.
- If justified (e.g., seasonal demand), increase the team allocation under Team Allocations. To stay within the organization budget, take the extra tokens from the unallocated pool or reduce another team's allocation.
- If caused by a single user, review their usage and consider whether their workflows can be optimized.
A new department needs access
- Create the team in User Management if it does not exist.
- Add a team allocation under Team Allocations.
- Set appropriate per-user defaults.
- Notify the team lead about their budget and any model restrictions.
You need to reduce costs across the organization
- Review the Model Breakdown on the usage dashboard. If advanced models are being used for simple queries, restrict access to those tiers or give users guidance on which tier to choose.
- Lower per-user limits to encourage more efficient usage.
- Enable soft limit warnings at a lower threshold (e.g., 60%) so users self-moderate earlier.
- Share the "Understanding Token Usage and Costs" article with users so they understand how their behavior affects consumption.
Seasonal adjustments
Some months see higher demand (e.g., when new vehicle model years launch and parts catalogs need cross-referencing). Plan for this by:
- Setting higher limits for the high-demand months.
- Using the Rollover feature so quieter months build buffer for busier ones.
- Reviewing and adjusting allocations quarterly based on usage trends.
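To see how rollover builds a buffer, the sketch below computes the next period's budget from unused tokens. It assumes the configurable maximum caps the carried-over amount; the function name and the 500,000-token cap are hypothetical, so confirm the actual behavior in your instance.

```python
def next_period_budget(base_budget: int, used: int, rollover_cap: int) -> int:
    """Base budget plus unused tokens from the last period, capped at `rollover_cap`."""
    unused = max(base_budget - used, 0)
    return base_budget + min(unused, rollover_cap)


# A quiet month: 800K unused, but only 500K may carry over.
print(next_period_budget(4_000_000, 3_200_000, 500_000))  # 4500000
# A busier month: all 200K unused tokens carry over.
print(next_period_budget(4_000_000, 3_800_000, 500_000))  # 4200000
```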
Best Practices
- Start generous, then tighten. Set higher limits initially and monitor actual usage for 2--3 months before optimizing. Overly restrictive limits frustrate users and discourage adoption.
- Use soft limits first. Warnings are less disruptive than hard blocks. Most users self-moderate once they see they are approaching a limit.
- Align budgets with business value. Teams that drive revenue or reduce costs through AI usage should get larger allocations.
- Review monthly. Usage patterns shift as users become more proficient and workflows change. Revisit allocations at least monthly.
- Communicate limits to users. Users should know their caps exist, what happens when they are reached, and who to contact for increases. Transparency prevents frustration.
- Set model-specific limits before broad rollouts. When enabling a new, expensive model tier, start with tight limits for a pilot group before opening access to everyone.