Each license comes with a monthly allotment of requests. Once you’ve reached your monthly limit, you can continue to use Cursor’s Auto model selection without incurring additional usage-based pricing. If you have reached your monthly limit and want to work with a specific model of your choice, you can do so through usage-based pricing.
Normal
Requests per model/message. Ideal for everyday coding tasks; recommended for most users.
A request represents a single message sent to most models, which includes your message, any relevant context from your codebase, and the model’s response. Thinking models count as two requests. MAX Mode costs are calculated based on tokens, as explained below.
In normal mode, each message costs a fixed number of requests based solely on the model you’re using, regardless of context. We optimize context management without it affecting your request count. For example, let’s look at a conversation using Claude Sonnet 4:
| Role | Message | Cost per message |
|:--------|:--------|:---------|
| User | Create a plan for this change (using a more expensive model) | 0 |
| Cursor | I’ll analyze the requirements and create a detailed implementation plan… | 1 |
| User | Implement the changes with TypeScript and add error handling | 0 |
| Cursor (Thinking) | Here’s the implementation with type safety and error handling… | 2 |
| Total | | 3 requests |
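The request-counting rules above can be sketched in a few lines of Python. This is an illustrative model only, not Cursor’s actual billing code: user messages cost 0 requests, a standard model response costs 1, and a thinking-model response costs 2.

```python
# Illustrative sketch of normal-mode request counting, based on the rules
# described above. Function and variable names are hypothetical.

def requests_for_message(role: str, thinking: bool = False) -> int:
    """Return the request cost of a single message in normal mode."""
    if role == "user":
        return 0                  # user messages don't consume requests
    return 2 if thinking else 1   # thinking models count as two requests

# The conversation from the table above:
conversation = [
    ("user", False),    # "Create a plan for this change"
    ("cursor", False),  # plain response -> 1 request
    ("user", False),    # "Implement the changes with TypeScript..."
    ("cursor", True),   # thinking-model response -> 2 requests
]

total = sum(requests_for_message(role, thinking) for role, thinking in conversation)
print(total)  # 3, matching the table's total
```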
Once you’ve reached your monthly request limit, you can continue to use Cursor at no additional cost by enabling Auto. Auto configures Cursor to select premier models based on available capacity. MAX Mode cannot be enabled for Auto requests.
Once you’ve reached your monthly request limit, you can enable usage-based pricing in your dashboard settings. Any additional usage (normal or MAX Mode) is billed at API rates plus a 20% markup.
In MAX Mode, pricing is calculated based on tokens, with a 20% markup over the model provider’s API price. This includes all tokens from your messages, code files, folders, tool calls, and any other context provided to the model. We use the same tokenizers as the model providers (e.g. OpenAI’s tokenizer for GPT models, Anthropic’s for Claude models) to ensure accurate token counting. You can see an example using OpenAI’s tokenizer demo. Here’s an example of how pricing works in MAX Mode:
| Role | Message | Tokens | Cost per message |
|:--------|:--------|:--------|:---------|
| User | Create a plan for this change (using a more expensive model) | 193k | 2.7 requests |
| Cursor | I’ll analyze the requirements and create a detailed implementation plan… | 62k | 1.23 requests |
| User | Implement the changes with TypeScript and add error handling | 193k | 2.7 requests |
| Cursor | Here’s the implementation with type safety and error handling… | 62k | 1.23 requests |
| Total | | 510k | 7.86 requests |
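The totals in the MAX Mode example are simply the sums of the per-message rows. A minimal sketch, taking the per-message token counts and request costs directly from the table (the conversion from tokens to requests depends on the specific model’s API price and is not derived here):

```python
# Re-adding the MAX Mode example rows: (role, tokens, request cost).
# Values are copied from the table above; totals are their sums.

rows = [
    ("User",   193_000, 2.70),
    ("Cursor",  62_000, 1.23),
    ("User",   193_000, 2.70),
    ("Cursor",  62_000, 1.23),
]

total_tokens = sum(tokens for _, tokens, _ in rows)
total_requests = round(sum(cost for _, _, cost in rows), 2)
print(total_tokens, total_requests)  # 510000 tokens, 7.86 requests
```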