Debugging Record: OpenRouter gpt-oss-120b Model Does Not Support Chinese Requests
While using the free model API provided by OpenRouter, I encountered a confusing issue. With the exact same request structure, simply changing the language of the prompt resulted in completely different outcomes.
Problem Reproduction
I used the openai/gpt-oss-120b:free model for testing. The only difference between the two requests was the language of the prompt. The first request used a Chinese prompt:
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-or-v1-xxxxxxxxxxxxxxxxxxxxxx" \
  -d '{
    "model": "openai/gpt-oss-120b:free",
    "messages": [
      {
        "role": "user",
        "content": "你是一个专业的本地化翻译专家"
      }
    ]
  }'
This request always returned HTTP 429 (Too Many Requests), which normally indicates that the request rate is too high or a quota has been exceeded. However, when I used an English prompt:
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-or-v1-xxxxxxxxxxxxxxxxxxxxxx" \
  -d '{
    "model": "openai/gpt-oss-120b:free",
    "messages": [
      {
        "role": "user",
        "content": "You are a professional localization translation expert"
      }
    ]
  }'
The request responded normally and returned the expected model output.
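One way to guarantee that the prompt text is the only variable between the two requests above is to generate both from a single template. The following is a minimal sketch rather than the exact commands used in this post: `build_payload` and `send_prompt` are hypothetical helper names, `OPENROUTER_API_KEY` is an assumed environment variable, and the escaping only covers backslashes and double quotes, so the prompt must be a single line without control characters.

```shell
#!/usr/bin/env bash
# Sketch: send the same request body with only the prompt text changed.
# build_payload and send_prompt are hypothetical helper names.

# Build the JSON body for a single user message.
# Only backslashes and double quotes are escaped (assumption: the prompt
# is one line and contains no control characters).
build_payload() {
  local prompt escaped
  prompt="$1"
  escaped=$(printf '%s' "$prompt" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"model":"openai/gpt-oss-120b:free","messages":[{"role":"user","content":"%s"}]}' "$escaped"
}

# Send one prompt. Prints the HTTP status on the first line, then the
# response body, so that a 429 can be read together with its error text.
send_prompt() {
  local body_file status
  body_file=$(mktemp)
  status=$(curl -sS -o "$body_file" -w '%{http_code}' \
    https://openrouter.ai/api/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer ${OPENROUTER_API_KEY}" \
    -d "$(build_payload "$1")")
  printf '%s\n' "$status"
  cat "$body_file"
  rm -f "$body_file"
}

# Usage (performs real network requests):
#   send_prompt "你是一个专业的本地化翻译专家"
#   send_prompt "You are a professional localization translation expert"
```

Printing the response body next to the status code is a deliberate choice: the body of an error response may carry more detail than the bare 429.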
Debugging Process
This inconsistent behavior was puzzling. A 429 error usually means a rate limit, but the two requests were sent almost simultaneously with the same key and model, so a rate limit should have affected both of them rather than only the Chinese one. I therefore started systematically investigating possible causes.
I first checked the API key’s quota limits and confirmed that I hadn’t exceeded them. Then I verified the request frequency and found that only a small number of requests were sent within a short period, which shouldn’t trigger any rate limiting mechanism. After ruling out these common causes, I noticed that the only variable was the language of the prompt.
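A further check that fits this step: a 429 caused by genuine rate limiting is often accompanied by a `Retry-After` response header, which is defined in the HTTP specification. Servers are not required to send it, and whether OpenRouter does so for this model is an assumption, not something the tests in this post establish. A sketch for extracting the header from a saved header dump (for example one written with `curl -D headers.txt`), where `retry_after_from_headers` is a hypothetical helper name:

```shell
# Print the value of the Retry-After header from a raw header dump on stdin.
# Prints nothing if the header is absent. Header names are matched
# case-insensitively, and CRLF line endings are tolerated.
retry_after_from_headers() {
  tr -d '\r' | awk '
    {
      colon = index($0, ":")
      if (colon == 0) next                      # status line or blank line
      name = tolower(substr($0, 1, colon - 1))
      if (name == "retry-after") {
        value = substr($0, colon + 1)
        sub(/^[ \t]+/, "", value)               # trim leading whitespace
        sub(/[ \t]+$/, "", value)               # trim trailing whitespace
        print value
        exit
      }
    }'
}

# Usage:
#   retry_after_from_headers < headers.txt
```

An empty result does not prove the 429 is unrelated to rate limiting, but a missing header on every failing request is one more hint that something else may be going on.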
I also asked more powerful AI models for help, consulting Opus 4.6 Max and GPT-5.2 Extra High. Although they are among the most advanced language models currently available, neither could identify the root cause of this bug. This suggests that some edge cases or model-specific restrictions can only be discovered through actual testing.
Manual Verification
Since automated debugging tools failed to provide an answer, I decided to manually verify various hypotheses. I tested different Chinese content, including simple greetings, technical questions, and long texts. All Chinese requests returned 429 errors. Conversely, English requests of the same length responded normally.
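The comparison described above can be tallied mechanically. The sketch below assumes each test was logged as one tab-separated line containing a language label and the HTTP status; that log format and the name `summarize_by_language` are illustrative assumptions, not the procedure actually used for this post.

```shell
# Summarize a log of "<language><TAB><http_status>" lines read from stdin.
# Output: one line per language/status pair followed by its count, sorted.
summarize_by_language() {
  awk -F '\t' '
    NF >= 2 { count[$1 "\t" $2]++ }             # skip malformed lines
    END {
      for (key in count) printf "%s\t%d\n", key, count[key]
    }' | sort
}

# Producing one log line per request (performs a real request; the
# elided arguments are the same headers and body as in the curl
# commands shown earlier):
#   printf 'zh\t%s\n' "$(curl -sS -o /dev/null -w '%{http_code}' ...)" >> results.tsv
#   summarize_by_language < results.tsv
```

If every status for one language is 429 while the other language shows only 200 in the same time window, the language variable is isolated far more convincingly than by a single pair of requests.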
These results point to a clear conclusion: in my tests, the openai/gpt-oss-120b:free model does not accept Chinese requests. Chinese content likely triggers an undocumented restriction that makes the API return a 429 error directly instead of a more descriptive error message.
Experience Summary
There are several noteworthy points from this debugging experience. First, API error messages can be misleading. A 429 error usually indicates a rate limit, but in some cases, it may hide other restrictions. Second, while automated debugging tools are powerful, they are not omnipotent. Some restrictions specific to models or platforms can only be discovered through actual testing.
Another important lesson is the necessity of verifying hypotheses. When multiple advanced AI models fail to find the problem, systematic manual testing remains the most reliable method. By controlling variables and checking them one at a time, we can ultimately locate the root cause of the problem.
For applications that need to handle multilingual content, this also reminds us to carefully review documentation or conduct thorough testing when selecting models. Free models often have various restrictions, which may not be explicitly stated in the main documentation.
Related Tools
To handle multilingual content translation, I developed a VS Code extension, Project Translator, designed specifically for project localization workflows. It can automatically identify files that need translation, integrate multiple translation services, and keep translation context consistent.
The extension was originally designed to solve the pain points of multilingual processing that I ran into in real projects: it uses automation to reduce the manual translation workload while preserving translation quality. During development, I also encountered various API limitations and edge cases, each of which required careful debugging and verification.
Conclusion
Unexpected problems are a constant in technical debugging. The key is to remain patient, systematically investigate possible causes, and not overlook any details. Sometimes even the most advanced tools can’t help, and the most basic verification methods are what is needed.
OpenRouter provides a rich selection of models and flexible APIs, which is its advantage. However, it is also important to note that different models may have different restrictions and characteristics, so it is best to conduct thorough testing before use. This is especially true for free models, whose restrictions are often stricter and less transparent.