narrow batch-size benchmark except to (RuntimeError, ValueError, TypeError, OSError)#396
narrow batch-size benchmark except to (RuntimeError, ValueError, TypeError, OSError)#396HrachShah wants to merge 2 commits into
Conversation
…lueError, TypeError) — self.generate raises RuntimeError on dtype/memory issues, ValueError on invalid config, TypeError on type mismatches; catching all Exception masked these specific failures
…Error, OSError) model.get_responses() walks the same generate() path that's already guarded by the narrower except (RuntimeError, ValueError, TypeError) clause in model.py:137 — CUDA OOM, dtype mismatches, bad config, and type errors are exactly what surfaces from self.model.generate(), not arbitrary Exception. Catching all Exception here also masked KeyboardInterrupt-style failures (the existing test loop relies on the benchmark break-out path).
There was a problem hiding this comment.
Code Review
This pull request refactors exception handling in main.py and model.py by replacing broad except Exception blocks with specific exception types. The reviewer suggests also catching OSError in model.py to ensure the robustness of the dtype fallback mechanism when loading models.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| max_new_tokens=1, | ||
| ) | ||
| except Exception as error: | ||
| except (RuntimeError, ValueError, TypeError) as error: |
There was a problem hiding this comment.
In src/heretic/model.py, the from_pretrained call can raise an OSError (for example, due to a network timeout, DNS failure, or a missing/corrupted file in the local cache). If the user has configured multiple dtypes (e.g., ["bfloat16", "float16"]) and the first one fails to load due to an OSError (e.g., because its files are not cached and the system is offline), the program will crash immediately instead of falling back to the next dtype (which might be fully cached and working). To ensure the robustness of the dtype fallback mechanism, OSError should also be caught here, similar to how it was added to the exception handler in src/heretic/main.py.
| except (RuntimeError, ValueError, TypeError) as error: | |
| except (RuntimeError, ValueError, TypeError, OSError) as error: |
How do you know this, given the enormous complexity of inference, and that the API makes no guarantee whatsoever regarding the exceptions that can be raised? This is certainly false for models requiring remote code, which are free to raise whatever errors they want in their custom code paths, and there is nothing you can do to narrow it down. |
model.get_responses() walks the same generate() path that's already guarded by the narrower
except (RuntimeError, ValueError, TypeError)clause in model.py:137 — CUDA OOM, dtype mismatches, bad config, and type errors are exactly what surfaces from self.model.generate(), not arbitrary Exception. Catching all Exception here also masked KeyboardInterrupt-style failures (the existing test loop relies on the benchmark break-out path).