Token Optimization Folklore

As a tester, I aspire to more empirical evidence than the anecdotal evidence I accrue of what seems to work for me and my colleagues. Yet the folklore I am capable of distributing is a first step to getting corrected and learning better ways. And as much as I approach this *tongue in cheek*, I've been on this train of learning since 2021, collecting and correcting myself.

With prices going up, skills need to go up as well.

Maaret Pyhäjärvi

June 04, 2026

Transcript

  1. Token optimization folklore

    How to survive with the tokens you have, or get more
    Maaret Pyhäjärvi, June 2026
    © 2026 CGI Inc.
  2. This lottery runs on tokens: focus on optimizing quality

    AI autocomplete: Basic use in editor. Light on tokens.
    Prompting agentic: Asking initiates doing. One-shot tasks. Individual.
    Reactive agentic: Agent/skills for specific repeatable task. Shared for team.
    Proactive agentic: Automate tasks on triggers.
    Dark factories: Work initiated to go away for something else for full flows of value.
    Agent gambling (lack of context engineering) is not sustainable with scale.
    AI ASSISTED ENGINEER: Works with one agent, mostly sync.
    AI ENGINEER: Orchestrator of multiple async agents.
  3. Experimentation vs. Building with intent

    We do the best work within known boundaries we can renegotiate on value.
    1. Learning
    2. Building solo
    3. Building scaled
    Default boundaries:
    GitHub Copilot: 3000 + 5000 AI credits
    ChatGPT+Codex: 3000 credits, 700 per week
    Model sufficiency is experimentation.
  4. 5 things you can start doing today

    SUMMARY
    1. Choose the right model for the right task.
    2. Clear guidance on your intent with stop signals in your prompt.
    3. Research – Plan – Implement.
    4. Provide deterministic controls (tests, linters, audits).
    5. Maintain concise, human-worthy instructions hierarchy. Improve as agent-miss log and trim outputs.
  5. Agentic frameworks expose what was always true

    ARCHITECTURE-AWARE AI
    Structured context is the key to AI success. Without it, agents fail.
    Anatomy of an AI agent:
    Language model: Models reason over context.
    Instructions: Goals, constraints, what 'good' looks like.
    Memory: Short-term and long-term state across turns.
    Tools: MCP servers, APIs, CMDs, files, stores, other agents.
    Oversight and harness: Humans-in-the-loop, policy gates, harness.
    THE LESSON
    Every agent that 'just works' is running on curated context. Agents don't usually fail because the model is weak. They fail because context is stale, thin or contradictory. Context is not documentation for the model; it is the environment the model runs in.
  6. Signal beats volume. More tokens is not more context.

    CONTEXT CURATION: LESS INFORMATION, MORE MEANING
    Curate, don't dump: Give the agent the 3 things it needs, not the 300 things that exist. Surgical relevance is a design choice. Filter before you feed.
    Structure beats prose: Schemas, specs, tables and examples outperform long paragraphs. Structure makes meaning machine-readable. If a human has to skim it, the agent will miss it.
    Freshness matters: Stale content silently poisons the output. Treat context like a running system, not a one-time artifact. Refresh context intentionally.
    Context is a living asset, not a file drop.
  7. Optimizing token usage

    TIPS
    Right model for task. GitHub Copilot automode tries to automate this for you since 05/26.
    Thinking intensity. Control over the effort means control over cost. Plan is most intense.
    Scripts and tools. Your assets to support AI use.
    Solve AI with adjacent technologies. Preprocessing, deterministic controls. 3rd party memory, context optimization. Tests, audits.
    Context engineering. Agents/skills and files. Minimize. Guide choices. Task-specific context.
    Research – Plan – Fleet. Enable separation of tasks. Gemini 2.5 – Opus 4.7 – GPT 5.4
    Exchanging developer time to tokens is a layered problem we are solving.
  8. Context window and tokens

    AGENTS WORKING WITH LLMS
    [Diagram: 1st loop and 2nd loop within the LLM-specific context limit. Each loop stacks system prompt & tools, file, prompt and response. Labels: input tokens, output tokens, cache input tokens (not guaranteed).]
    Context rot:
    1. Lost in the middle <50%: Models favor the beginning and end of the context.
    2. Recency bias 50%: Models favor the end of the context.
    New context window when you switch tasks. Plans drive switching tasks.
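
The loop picture on slide 8 can be read as arithmetic. The sketch below assumes that every loop resends the system prompt, tools, file and all earlier prompts and responses as input tokens, and it ignores caching, which the slide notes is not guaranteed. The function name and every token count are invented for illustration; real agents differ in what they resend.

```python
def cumulative_input_tokens(system_and_tools, file_tokens, turns):
    """Return the input tokens sent in each loop.

    Assumption: each loop resends the system prompt, tools, file and
    every earlier prompt and response. `turns` is a list of
    (prompt_tokens, response_tokens) pairs. All numbers are hypothetical.
    """
    history = 0
    per_loop = []
    for prompt_tokens, response_tokens in turns:
        input_tokens = system_and_tools + file_tokens + history + prompt_tokens
        per_loop.append(input_tokens)
        # The prompt and the response both become history for the next loop.
        history += prompt_tokens + response_tokens
    return per_loop


loops = cumulative_input_tokens(
    system_and_tools=2000,
    file_tokens=3000,
    turns=[(100, 500), (100, 500), (100, 500)],
)
print(loops)       # [5100, 5700, 6300]
print(sum(loops))  # 17100
```

Under these assumptions, three short prompts of 100 tokens each cost 17,100 input tokens, which is one way to see why a new context window per task keeps the bill down.
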
  9. Configs as mechanism to manage context

    TIPS
    Persistent instructions: agent.md, copilot-instructions.md
    • Non-negotiables of your project
    • Log recurring agent misses
    • Statement to trim output: "Be concise"
    • Iterate, maintain, recreate.
    • Consider writing these yourself.
    Add conditional capabilities: Skills
    Design workflows and behaviors: Custom agents
    CONFIGS
    Persistent instructions: copilot-instructions.md
    Custom agents: .github/agents/*.agent.md
    Skills: .github/skills/*/SKILL.md
    MCP
    Subagents
    Scoped instructions: .github/instructions/*.instructions.md
    Prompt files: .github/prompts/*.prompt.md
    Copilot memory
  10. Avoid these

    TIPS
    Replacing all traditional IDE features of refactoring with AI requests.
    Giving tasks that should be a deterministic script to AI.
    Just prompting, so that your team context does not improve.
    Running everything on Claude Sonnet / Opus if on GitHub Copilot.
    Using MCPs (verbose) when CLI (terse) is available.
    Just-in-case context that is not required or necessary.
  11. Power user guidance

    TIPS
    Think in code - prefer creating scripts to analyze code
    Consider CLI vs. MCP – scenario specific but famously known for Playwright
    Optimize with 3rd party components
    • Shell outputs can be very long, e.g. https://github.com/rtk.ai/rtk
    • Context can be locally optimized before LLM, e.g. https://github.com/chopratejas/headroom
    • Team repeats same things, e.g. https://github.com/ruvnet/ruvector https://github.com/RBKunnela/ALMA-memory
    • Tool calls are verbose and can be collapsed, e.g. https://github.com/jsturtevenat/copilot-codeact-plugin
    • Token profiling locally: https://github.com/getagentseal/codeburn
    GitHub Copilot CLI "/chronicle tips" to learn what to change in your configs based on past
    Model specific context optimization – models behave differently and can be tweaked
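
To make the "shell outputs can be very long" point concrete, here is a minimal hand-rolled sketch of trimming output before it reaches the model. It is not how any of the linked tools work; the function name `trim_output` and the head/tail defaults are assumptions chosen for illustration.

```python
def trim_output(text, head=20, tail=20):
    """Keep the first `head` and last `tail` lines of long output.

    The dropped middle is replaced by a one-line marker so the agent
    can tell that content was omitted. Defaults are arbitrary.
    """
    lines = text.splitlines()
    if len(lines) <= head + tail:
        return text  # Short enough already; pass through unchanged.
    omitted = len(lines) - head - tail
    marker = f"... [{omitted} lines omitted] ..."
    return "\n".join(lines[:head] + [marker] + lines[len(lines) - tail:])


# Hypothetical 100-line command output.
sample = "\n".join(f"line {i}" for i in range(1, 101))
short = trim_output(sample, head=2, tail=2)
print(short)
```

Keeping both ends matters because build and test logs tend to put the command context at the top and the failure summary at the bottom.
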
  12. Using AI better

    FIVE MOVES. NO NEW TOOLS. MEASURABLY BETTER RESULTS.
    1. Audit your prompts. Pick three prompts you use often. Where is the context leaking? What is the model having to guess?
    2. Write intent brief. For each recurring task, state audience, output format, constraints and what good looks like. Reuse it.
    3. Measure before / after. Track rework cycles and time-to-good-draft. Context improvements show up fast in these two numbers.
    4. Make context portable. Put reusable context into files you can hand to any agent tool.
    5. Curate, don't dump. Swap long docs for structured notes. Give the agent the 3 things it needs, not the 300 it might want.
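
The "measure before / after" move needs nothing more than a small log. The sketch below assumes you record, per task, the number of rework cycles and the minutes until a good draft; the record shape, the `summarize` helper and all sample values are made up for illustration.

```python
from statistics import mean


def summarize(tasks):
    """Average the two numbers from slide 12 over a list of
    (rework_cycles, minutes_to_good_draft) records."""
    return {
        "rework_cycles": mean(cycles for cycles, _ in tasks),
        "minutes_to_good_draft": mean(minutes for _, minutes in tasks),
    }


# Hypothetical logs for the same recurring tasks, before and after
# a context improvement such as a reusable intent brief.
before = [(3, 40), (4, 55), (2, 25)]
after = [(1, 20), (2, 30), (0, 10)]

b, a = summarize(before), summarize(after)
change = {key: a[key] - b[key] for key in b}
print(change)
```

A negative value in `change` means fewer rework cycles or a faster draft. With only a handful of tasks this is still anecdote, but it is anecdote with numbers attached.
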
  13. Insights you can act on

    Founded in 1976, CGI is among the largest IT and business consulting services firms in the world. We are insights-driven and outcomes-focused to help accelerate returns on your investments. Across hundreds of locations worldwide, we provide comprehensive, scalable and sustainable IT and business consulting services that are informed globally and delivered locally. cgi.com