Skip to content

narrow batch-size benchmark except to (RuntimeError, ValueError, TypeError, OSError)#396

Open
HrachShah wants to merge 2 commits into
p-e-w:masterfrom
HrachShah:fix/narrow-batch-size-exception
Open

narrow batch-size benchmark except to (RuntimeError, ValueError, TypeError, OSError)#396
HrachShah wants to merge 2 commits into
p-e-w:masterfrom
HrachShah:fix/narrow-batch-size-exception

Conversation

@HrachShah

Copy link
Copy Markdown

model.get_responses() walks the same generate() path that's already guarded by the narrower except (RuntimeError, ValueError, TypeError) clause in model.py:137 — CUDA OOM, dtype mismatches, bad config, and type errors are exactly what surfaces from self.model.generate(), not arbitrary Exception. Catching all Exception here also masked KeyboardInterrupt-style failures (the existing test loop relies on the benchmark break-out path).

Zo Bot added 2 commits June 23, 2026 23:41
…lueError, TypeError) — self.generate raises RuntimeError on dtype/memory issues, ValueError on invalid config, TypeError on type mismatches; catching all Exception masked these specific failures
…Error, OSError)

model.get_responses() walks the same generate() path that's already guarded by the narrower except (RuntimeError, ValueError, TypeError) clause in model.py:137 — CUDA OOM, dtype mismatches, bad config, and type errors are exactly what surfaces from self.model.generate(), not arbitrary Exception. Catching all Exception here also masked KeyboardInterrupt-style failures (the existing test loop relies on the benchmark break-out path).

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors exception handling in main.py and model.py by replacing broad except Exception blocks with specific exception types. The reviewer suggests also catching OSError in model.py to ensure the robustness of the dtype fallback mechanism when loading models.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread src/heretic/model.py
max_new_tokens=1,
)
except Exception as error:
except (RuntimeError, ValueError, TypeError) as error:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In src/heretic/model.py, the from_pretrained call can raise an OSError (for example, due to a network timeout, DNS failure, or a missing/corrupted file in the local cache). If the user has configured multiple dtypes (e.g., ["bfloat16", "float16"]) and the first one fails to load due to an OSError (e.g., because its files are not cached and the system is offline), the program will crash immediately instead of falling back to the next dtype (which might be fully cached and working). To ensure the robustness of the dtype fallback mechanism, OSError should also be caught here, similar to how it was added to the exception handler in src/heretic/main.py.

Suggested change
except (RuntimeError, ValueError, TypeError) as error:
except (RuntimeError, ValueError, TypeError, OSError) as error:

@p-e-w

p-e-w commented Jun 25, 2026

Copy link
Copy Markdown
Owner

CUDA OOM, dtype mismatches, bad config, and type errors are exactly what surfaces from self.model.generate()

How do you know this, given the enormous complexity of inference, and that the API makes no guarantee whatsoever regarding the exceptions that can be raised?

This is certainly false for models requiring remote code, which are free to raise whatever errors they want in their custom code paths, and there is nothing you can do to narrow it down.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants