Feature: Token reporting usage #46

Closed
fmunteanu opened this issue Mar 28, 2025 · 5 comments

@fmunteanu commented Mar 28, 2025

First, thank you for the great product; it works wonders. I was wondering if it is possible to implement token usage reporting on each prompt, something like Zed does (see the 13/200k screenshot).

Image

Claude's Suggested Implementation

My ask:

Please review https://github.com/rusiaaman/wcgw and determine whether it is possible to implement token usage reporting within Claude Desktop:
* Add logging within the wcgw Python code (e.g., in `/src/wcgw/client/mcp_server`)
* Estimate token counts locally using a tokenizer (e.g., Anthropic’s tokenizer or a compatible one) for inputs and outputs processed through the MCP
* Output the usage result in each Claude Desktop conversation prompt
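The estimation step above could be sketched roughly as follows (a hypothetical sketch, not wcgw code; tiktoken's `cl100k_base` encoding only approximates Anthropic's tokenizer, and a crude chars/4 heuristic is assumed when tiktoken is absent):

```python
# Hypothetical sketch: local token estimation for MCP inputs/outputs.
try:
    import tiktoken

    _enc = tiktoken.get_encoding("cl100k_base")

    def estimate_tokens(text: str) -> int:
        """Approximate token count via tiktoken (cl100k_base)."""
        return len(_enc.encode(text))
except ImportError:
    def estimate_tokens(text: str) -> int:
        """Very rough fallback: assume ~4 characters per token."""
        return (len(text) + 3) // 4

def log_usage(prompt: str, output: str) -> dict:
    """Return an estimated usage record for one MCP tool call."""
    return {
        "input_tokens": estimate_tokens(prompt),
        "output_tokens": estimate_tokens(output),
    }
```

Either way these are estimates; the real counts only Anthropic's API can report.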

Image

As a side note, it might be beneficial to enable Discussions in your repository, for general questions.

@rusiaaman (Owner)

I'm glad you like the project.

Token counting is possible and will be helpful.

There are two ways we can report the results back:

  1. As tool output that Claude can also read
  2. As a resource that only the user can open

I like the second approach better as it doesn't hinder the user's conversation flow with the model. What is your opinion? I'm not sure if by "Claude Desktop conversation prompt" you mean the tool output or the resource.

@rusiaaman (Owner)

It's an interesting idea. Also, I've enabled Discussions.

@fmunteanu (Author) commented Mar 28, 2025

> I'm glad you like the project.
>
> Token counting is possible and will be helpful.
>
> There are two ways we can report the results back:
>
>   1. As tool output that Claude can also read
>   2. As a resource that only user can open.
>
> I like the second approach better as it doesn't hinder the user conversation flow with the model. What is your opinion? I'm not sure if by "Claude Desktop conversation prompt" you mean the tool output or resource?

I updated the OP with additional details and findings. I'm also inclined to select the second approach. Related to your question about the conversation prompt: what I like in the Zed editor is that you can see the token usage in the upper-right corner, so it does not hinder the conversation. We could have a switch like `token_usage=false` to disable it; it should be enabled by default. I'll trust your judgment, since you have way more experience with MCPs. 😊

Edit: @rusiaaman Note Claude's suggestion for implementation: Add token usage display in Claude Desktop interface.

That means it is possible to display the token usage in the Claude Desktop interface, like Zed does. See below what Claude suggests. We might need to use the first approach (tool output that Claude can also read) in order to display it in the Claude Desktop interface.

Second conversation ask:

Related to: Add token usage display in Claude Desktop interface
Where would be a possible place to display the token usage in Claude Desktop, next to the other buttons in the chat text input area?

Image

Which option would you pick?

  1. Bottom-right corner of chat input area, next to send button
  2. Status bar beneath text input field, alongside character count
  3. As a small badge in the right sidebar

Image

About the suggested format `123 in | 456 out | 579 total`: we should use Zed's approach, `579 / 200k`, since we only care about the total, not the in/out numbers.
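For illustration, a tiny helper that renders the Zed-style total (hypothetical; `format_usage` is not part of wcgw, and the 200k context limit is an assumption):

```python
def format_usage(total_tokens: int, context_limit: int = 200_000) -> str:
    """Render token usage Zed-style, e.g. '579 / 200k'.

    Shows only the running total against the context limit, abbreviating
    round-thousand limits with a 'k' suffix.
    """
    if context_limit % 1000 == 0:
        limit_str = f"{context_limit // 1000}k"
    else:
        limit_str = str(context_limit)
    return f"{total_tokens} / {limit_str}"
```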

@fmunteanu (Author) commented Mar 29, 2025

@rusiaaman I'm almost done with a PR; I will finish it tomorrow. This is what the implementation will look like (Claude does not render its Desktop UI well, so the implementation will look much better live):

Image

IMO this is the best location for the counter (next to the Send button), as it is non-invasive for the end user.

I also added a `"continue": False` key/value to the Initialize MCP tool, defaulting to `False`. If `True` is passed into Initialize, the MCP server will automatically submit a Continue response when Claude hits the max length for a message and the paused-response message is detected in the Claude Desktop UI:

Image

The MCP server stops sending the Continue response when the token usage reaches or exceeds the `token_threshold`. By default, this threshold is set to 90% (0.9). All implementation details will be covered in the PR.
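The pause/threshold logic described above might look roughly like this (a hypothetical sketch; the name `should_auto_continue` and the 200k context limit are assumptions, not the PR's actual code):

```python
def should_auto_continue(total_tokens: int,
                         context_limit: int = 200_000,
                         token_threshold: float = 0.9,
                         auto_continue: bool = False) -> bool:
    """Decide whether the server should submit an automatic Continue.

    Continues only while the feature was enabled via the Initialize
    `continue` flag AND usage is still below the threshold fraction of
    the context limit (90% by default).
    """
    if not auto_continue:
        return False
    return total_tokens < token_threshold * context_limit
```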

For dependencies, we use `anthropic_tokenizer` with a fallback to `tiktoken`:

# Prefer anthropic_tokenizer; fall back to tiktoken if it is not available
try:
    from anthropic_tokenizer import count_tokens
except ImportError:
    import tiktoken

    def count_tokens(text: str) -> int:
        """Count tokens using the tiktoken fallback (cl100k_base encoding)"""
        enc = tiktoken.get_encoding("cl100k_base")
        return len(enc.encode(text))

@rusiaaman (Owner)

It's quite a lot for me to review.

On your suggestions

First things first: there's no way for an MCP server like wcgw to control or change the UI of an MCP client like Claude Desktop.

Claude doesn't have innate knowledge of MCP, and that's why it has given you wrong information about what's possible.

What's possible

As an MCP server, we can return output in a tool call, or we can attach resources. The tool output is meant for Claude to know the result of a tool execution, and it isn't readily consumable by the user. It's also not obvious what its impact will be on Claude's subsequent response. In Zed, the LLMs aren't shown the token count information.

Image

We could use resources, which are also meant to be shared with the user. Using a resource, it would look like the following (see the counter):

Image

In order to read the token count, we'll have to click it. It then gets attached as pasted text, which you can open and read.

Image

Fresh thoughts on this feature

None of the UI decisions matter, because of the following things I realised.

Unfortunately, an MCP server has no way to know when a new conversation starts, so the counter will carry over into a new conversation.

We also don't know if the server has restarted, so in the middle of a conversation the token counter can reset.

In both of these cases our token counting will be way off.

Since we also don't know about the tokens outside the MCP server's calls, like what the user has pasted or written, or what Claude has shared in the form of artifacts or other MCP calls, the counts we show won't be representative of the real conversation size.

Current outlook

Given these important limitations, I don't think it's worth adding this feature to an MCP server. This has to be handled by the client. MCP clients like Zed and Cline already show this information, and Claude should also show it.

In its current form, it introduces complexity without proportional benefit.
