The server returns the token usage when asked for it (with some JSON in the request). IMO, it would be very useful to show the user the cost of each LLM operation.
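
A rough sketch of what the cost estimate could look like, in Python. Everything here is an assumption for illustration: the `usage` object with `prompt_tokens` and `completion_tokens` fields follows a common convention and may not match what this server actually returns, and `parse_usage`, `estimate_cost`, and the per-1k-token prices are hypothetical names and values.

```python
import json
from dataclasses import dataclass


@dataclass
class Usage:
    # Assumed field names; adjust to whatever the server really reports.
    prompt_tokens: int
    completion_tokens: int


def parse_usage(response_body: str) -> Usage:
    """Extract token counts from a JSON response body (assumed layout)."""
    usage = json.loads(response_body)["usage"]
    return Usage(
        prompt_tokens=int(usage["prompt_tokens"]),
        completion_tokens=int(usage["completion_tokens"]),
    )


def estimate_cost(usage: Usage, price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Cost = input tokens and output tokens, each billed per 1000 tokens."""
    return (
        usage.prompt_tokens / 1000 * price_in_per_1k
        + usage.completion_tokens / 1000 * price_out_per_1k
    )


if __name__ == "__main__":
    # Made-up response body and made-up prices, just to show the flow.
    body = '{"usage": {"prompt_tokens": 1000, "completion_tokens": 500}}'
    cost = estimate_cost(parse_usage(body), price_in_per_1k=0.5, price_out_per_1k=1.5)
    print(f"Estimated cost: {cost:.4f}")
```

The prices would have to come from configuration, since they differ per model and provider; the estimate could then be shown to the user after each operation.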