What To Do With Your Extra Tokens

March 29, 2026

When you subscribe to an AI coding tool like Claude Code or Codex, you're allotted a certain number of tokens every few hours. Finish a feature early and those tokens expire unused. That's money on the table.

There are four categories where you can spend your tokens: domain-specific code reviews, UI/UX, product, and marketing. Here is the ultimate checklist for spending them.


Tokens expire unused at the end of the reset window.

1. Domain-specific code reviews

This is where you'll burn the most tokens and get the most value. Have your agent act as a staff engineer reviewing through a specific lens.

The key insight: a targeted review ("find race conditions in my payment handlers") outperforms a generic review ("review this code") because specific vocabulary forces the model's attention toward the exact failure domain you care about. Vague prompts spread attention thin across formatting, naming, and syntax. Specific prompts concentrate it where bugs actually live. (I wrote a full breakdown of why this works in Domain Code Reviews.)


Generic reviews catch surface-level issues. Deeper problems go undetected.

Security & data integrity

Have the AI sweep for input validation gaps, authentication flaws, exposed secrets, and injection vectors. Don't stop at application code. Dump your database security rules - Supabase RLS policies, Postgres row-level security, Firebase rules - and ask it to find tenant isolation gaps. Have the agent review your account deletion logic and check whether orphaned records or PII survive deletion. Check that backend roles (Admin, Editor, Viewer) are enforced at the query level, not just hidden in the UI.

Prompt: Security Sweep

Review this code for security vulnerabilities. Examine:

  • Input validation and sanitization (SQL injection, XSS, command injection)
  • Authentication and authorization flaws
  • Secrets management (hardcoded credentials, API keys, tokens)
  • Data exposure risks (logging sensitive data, error messages leaking internals)
  • CSRF, SSRF, and other request-based vulnerabilities
  • Insecure deserialization and dependency vulnerabilities
  • PII handling (encryption at rest, retention policies, access controls)

For each issue found, explain the attack vector and provide a specific remediation.
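To make the injection risk concrete, here's a minimal sketch of the difference between string-built SQL and a parameterized query. The table and queries are hypothetical, using Python's standard-library `sqlite3` for illustration:

```python
import sqlite3

# Hypothetical users table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

def find_user_unsafe(name: str):
    # VULNERABLE: attacker-controlled input is spliced into the SQL text.
    # name = "' OR '1'='1" turns the WHERE clause into a tautology.
    return conn.execute(f"SELECT id FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(name: str):
    # SAFE: the driver binds the value; input is never parsed as SQL.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()
```

The unsafe version returns every row when fed `"' OR '1'='1"`; the parameterized version returns nothing, because the payload is treated as a literal name.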

Architecture & code quality

Have the AI evaluate structural decisions: separation of concerns, dependency injection, composition vs. inheritance, and whether any god classes are doing too much. This is also the time to ask for code simplification - eliminate redundant logic, flatten deep nesting, replace nested ternaries with explicit statements, and consolidate related code that has drifted apart.

Prompt: Architecture Review

Evaluate the architecture and code quality:

  • Is there clear separation between UI, business logic, and data layers?
  • Are dependencies injected rather than instantiated directly?
  • Are there classes or components doing too much (fetching, transforming, rendering, navigating)?
  • Is composition favored over deep inheritance?
  • Is naming consistent and descriptive across files, folders, and modules?
  • Does the folder structure communicate system boundaries and scale well?
  • Are there repeated patterns that should be shared utilities?
  • How difficult would it be for a new developer to know where to add a feature?

Propose specific structural improvements with rationale.
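The "flatten deep nesting" suggestion is easiest to see side by side. Here's a hypothetical discount rule, before and after the refactor the review would propose:

```python
def discount_nested(user, total):
    # Before: nested conditionals bury the actual rules.
    if user is not None:
        if user.get("active"):
            if total > 100:
                return 0.15
            else:
                return 0.05
        else:
            return 0.0
    else:
        return 0.0

def discount_flat(user, total):
    # After: guard clauses surface the early exits, then state the rules.
    if user is None or not user.get("active"):
        return 0.0
    if total > 100:
        return 0.15
    return 0.05
```

Both versions compute the same result; the flat one reads top to bottom and has an obvious place to add the next rule.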

Performance & scalability

Analyze algorithmic complexity, memory allocation patterns, and database access. The targets: N+1 queries, missing indexes, blocking async operations, unnecessary computations inside loops, and absent caching. Then zoom out and ask what breaks first under 10x load. Also review any Redis or Memcached logic for stale data scenarios, and check that third-party API calls implement exponential backoff with jitter.

Prompt: Performance Audit

Analyze this code for performance and scalability issues:

  • Algorithmic complexity (Big O) - are there unnecessary O(n²) or worse operations?
  • Memory leaks or excessive allocation
  • N+1 query problems and inefficient database access patterns
  • Missing or improper caching (and cache invalidation gaps)
  • Blocking operations that should be async
  • Missing database indexes or inefficient data structures
  • Rate limiting and backoff strategies for external API calls
  • What are the bottlenecks as load increases 10x? 100x?

Quantify the impact where possible and suggest specific optimizations.
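The "exponential backoff with jitter" check has a simple shape. Here's a minimal sketch of full-jitter backoff (parameter names and defaults are illustrative, not from any particular library):

```python
import random

def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0):
    """Full-jitter exponential backoff: each delay is drawn uniformly from
    [0, min(cap, base * 2**attempt)]. The randomness spreads retries out so
    that many clients failing at once don't all hammer the API on the same
    schedule (the thundering-herd problem)."""
    return [random.uniform(0, min(cap, base * 2 ** attempt))
            for attempt in range(retries)]
```

In practice you'd `time.sleep(delay)` between attempts and stop retrying on non-transient errors (4xx responses other than 429).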

Edge cases & resilience

Challenge the code's assumptions. What happens with null values, unicode boundaries, empty collections, zero-length strings? What if the network is hostile - calls time out, fail silently, or return unexpected shapes? This is where you find the bugs that only surface in production at 2am.

Prompt: Edge Cases & Resilience

Challenge every assumption in this code:

  • What happens with empty collections, null values, zero values?
  • How are boundary conditions handled (max int, empty string, unicode)?
  • What timezone, locale, or encoding assumptions exist?
  • What if network calls are slow, fail, or return unexpected data?
  • What if a dependency is completely unavailable?
  • Are errors caught at appropriate levels or swallowed silently?
  • Is there a consistent error handling strategy (error types, codes, formats)?
  • Are external service failures handled with timeouts, retries, and backoff?
  • Is data validated at system boundaries?
  • Are database transactions used where needed for consistency?

List every assumption and evaluate whether it's safe.
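"Is data validated at system boundaries?" usually means a function like this at the edge of the system. A minimal sketch, with a hypothetical `parse_quantity` and limits chosen for illustration:

```python
def parse_quantity(raw) -> int:
    """Validate untrusted input at the boundary instead of letting bad
    values propagate. Rejects None, non-numeric strings, and out-of-range
    values with explicit errors rather than crashing deeper in the stack."""
    if raw is None:
        raise ValueError("quantity is required")
    try:
        value = int(str(raw).strip())
    except ValueError:
        raise ValueError(f"quantity must be an integer, got {raw!r}")
    if not 1 <= value <= 10_000:
        raise ValueError(f"quantity must be between 1 and 10000, got {value}")
    return value
```

Everything past this function can then assume a sane integer, which is exactly the kind of assumption the prompt asks the AI to hunt for when it's made implicitly instead.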

Concurrency & state safety

If your application handles concurrent operations - webhooks, background jobs, parallel requests - this review pays for itself. Feed the AI your payment webhook handlers and ask: "What happens if the provider sends the invoice.paid event twice in 50ms? Is this truly idempotent?" Review subscription lifecycle logic for edge cases: what happens if a user downgrades mid-cycle but the payment fails on the next billing attempt?

Prompt: Concurrency & Idempotency

Analyze this code for concurrency and state management issues:

  • Are there race conditions in shared state access?
  • Is mutable state properly synchronized?
  • Are there potential deadlocks in lock ordering?
  • Is async/await used correctly without blocking?
  • Are webhook and event handlers truly idempotent?
  • What happens if this operation is called twice simultaneously?
  • Are there thread pool exhaustion risks under high concurrency?
  • Could concurrent operations corrupt data across related tables?

Identify specific scenarios that could cause concurrency bugs.
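The duplicate-webhook question has a standard answer: record each event ID exactly once before running side effects. A minimal in-memory sketch (in production the seen-set would be a database table with a unique constraint on the event ID, not a Python set):

```python
import threading

class WebhookProcessor:
    """Idempotent event handling: the dedupe decision happens under a lock,
    so the same invoice.paid event delivered twice in 50 ms is acknowledged
    both times but only processed once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen = set()
        self.processed = 0

    def handle(self, event_id: str) -> bool:
        with self._lock:
            if event_id in self._seen:
                return False          # duplicate: ack without reprocessing
            self._seen.add(event_id)
        self.processed += 1           # side effect runs at most once per ID
        return True
```

Ten concurrent deliveries of the same event ID result in exactly one side effect; that's the property "truly idempotent" is asking about.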

Technical debt & refactoring

This is strategic work, not tactical. First, have the AI map your codebase: what are the main entry points, which modules are highly coupled, where does authentication start and end? Then have it rank technical debt by impact and effort. The output is a prioritized remediation plan you can feed directly into sprint planning.

Prompt: Tech Debt Assessment

Analyze this project for technical debt and refactoring opportunities:

  • What repeated patterns should be abstracted into shared utilities?
  • Is there copy-pasted code that has drifted apart and should be unified?
  • Are there classes or modules with too many responsibilities?
  • What shortcuts are now causing friction?
  • Are there outdated patterns that predate better solutions in the language or framework?
  • Is there dead code, unused parameters, or vestigial logic from removed features?
  • What areas are hard to test because of tight coupling?
  • What are the main entry points and which modules are highly coupled?

Rank every issue by impact (bugs, slow development, confusion) and effort. Present as a prioritized table with a phased remediation plan.

Testing

Dump your most complex files and ask for comprehensive test suites. Focus the AI on testing behavior, not implementation details. The highest-value tests cover critical paths and edge cases that are currently untested. Ask the AI to generate the actual test code, not just descriptions.

Prompt: Test Generation

Evaluate testing coverage and generate missing tests:

  • What critical paths are currently untested?
  • Are existing tests testing behavior or implementation details?
  • What edge cases and error conditions need coverage?
  • Are there integration tests for critical workflows?
  • Are tests deterministic or do they depend on time, randomness, or network?
  • Generate a complete test suite for the untested critical paths, including edge cases.

Focus on tests that catch real bugs, not tests that pad coverage numbers.
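The behavior-vs-implementation distinction is worth one concrete example. Assuming a hypothetical `slugify` function:

```python
def slugify(title: str) -> str:
    # Hypothetical function under test.
    return "-".join(title.lower().split())

def test_slugify_behavior():
    # Behavior test: asserts on the observable contract. Survives any
    # refactor that preserves the output.
    assert slugify("Hello World") == "hello-world"
    assert slugify("  spaced   out  ") == "spaced-out"

def test_slugify_implementation():
    # Implementation-coupled test (the kind to avoid): it inspects which
    # methods the function calls, so it breaks if slugify switches to a
    # regex even though the behavior is identical.
    assert "split" in slugify.__code__.co_names
```

Ask the AI to generate the first kind and to flag any existing tests that look like the second.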

DevOps & infrastructure

Use remaining tokens for operational work. Draft CI/CD pipelines, optimize Dockerfiles, write Terraform or infrastructure-as-code scripts, or build quality-of-life bash scripts that automate repetitive tasks. Generate massive, realistic SQL seed files or JSON fixtures for local testing. Run a dependency audit and flag outdated or vulnerable packages. Draft or update your OpenAPI specs to match actual implementation.
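The fixture-generation task is a good one to seed with a deterministic sketch. Field names and plan tiers here are illustrative; swap in your own schema:

```python
import json
import random

def make_fixtures(n: int, seed: int = 42) -> list:
    """Generate deterministic, realistic-looking user fixtures for local
    testing. Seeding the RNG means every developer gets identical data."""
    rng = random.Random(seed)
    first = ["ada", "grace", "alan", "edsger", "barbara"]
    plans = ["free", "pro", "enterprise"]
    return [
        {
            "id": i + 1,
            "email": f"{rng.choice(first)}{i}@example.com",
            "plan": rng.choice(plans),
            "active": rng.random() > 0.1,
        }
        for i in range(n)
    ]

if __name__ == "__main__":
    # Write 1,000 rows as a JSON fixture file.
    print(json.dumps(make_fixtures(1000)[:2], indent=2))
```

The same pattern emits SQL `INSERT` statements instead of JSON if your seed files live in migrations.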


2. UI/UX

Switch the AI's persona to a frontend performance and accessibility expert.

Render optimization. Analyze components for expensive operations inside render cycles, unnecessary re-renders, and missing memoization. Check that async operations are properly cleaned up on component unmount.

Accessibility. This is consistently under-invested and high-impact. Feed frontend code and demand a strict audit: missing ARIA labels, semantic HTML gaps, keyboard navigation failures, color contrast violations.

Error UX. Add error boundaries, user-facing error messages, and graceful degradation for failure modes. The goal is that no user ever sees a white screen or a raw stack trace.

Design variations. Ask the AI to generate 3 distinct layout variations for a screen you're not satisfied with. Even if you don't use them directly, they break you out of design fixation.

Prompt: Accessibility & Frontend Audit

Perform a strict accessibility and frontend performance audit:

  • Are ARIA labels, roles, and live regions used correctly?
  • Is semantic HTML used throughout (nav, main, article, section)?
  • Does keyboard navigation work for all interactive elements?
  • Are focus states visible and in logical order?
  • Are there color contrast failures (WCAG AA minimum)?
  • Are expensive computations happening inside render methods?
  • Are components unnecessarily re-rendered?
  • Are event listeners and subscriptions cleaned up to prevent memory leaks?

For each issue, explain the impact on users and provide the fix.

3. Product

Leverage the AI's holistic view of your codebase to think strategically about what to build next.

Gap analysis. Have the AI evaluate your app and identify standard features, user flows, or admin tools that are missing compared to industry standards. The model has seen thousands of similar products in its training data. Use that breadth.

Map of the unknown. Ask it to find areas where user intent might be failing, or where a small addition could unlock a disproportionately large new use case.

Open source contributions. Run code quality or security audits on open-source libraries you depend on, then open thoughtful GitHub issues or pull requests. It's a low-stakes contribution that improves the ecosystem your product relies on.

Prompt: Product Gap Analysis

Evaluate this application holistically:

  • What standard features or user flows are missing compared to similar products?
  • Where might users get stuck, confused, or frustrated?
  • What administrative or operational tools are absent?
  • Where could a small feature addition unlock a disproportionately large new use case?
  • What are the most common support requests for apps like this, and does this app handle them?

Prioritize by user impact and implementation effort.

4. Marketing

Turn dev tokens into growth leverage.

Deep research. Set up research agents to build foundational company knowledge docs - keyword gaps, competitor deep-dives, buyer personas. Stack specialized research on top of this foundation so each session builds on the last.

Content that compounds. Use deep research to brainstorm content that serves genuine searcher intent. Every piece should be substantially better than what currently ranks for that topic. If it's not meaningfully better, don't publish it. Avoid SEO fluff.

Automated pipelines. Generate full email sequences, social media threads, newsletter outlines, or a 30-day content calendar tied to recent feature releases.

Technical SEO. Audit your marketing site's meta tags, structured data, semantic HTML, and sitemap. This is mechanical, detail-oriented work that AI handles well and humans find tedious.

The pattern across all of this is the same: instead of asking the AI to do something broad, give it a specific lens and a specific domain. The more targeted the prompt, the deeper the analysis.

Chart: codebase health across sprints S1-S8. "Ship and move on" degrades (prod incident, security patch, rewrite needed); "invest surplus" compounds (caught N+1 query, fixed auth bypass, eliminated tech debt).

Don't let your tokens expire. Don't start work you're not ready for. Use the surplus to make what you've already built stronger, safer, and faster. That's not idle time. That's compounding.