Upgrading transformers version to 4.51.3 to support recent models#175
Conversation
make UnlearnTrainer implementation more future proof make other necessary changes to be compatible with new transformers version
fix leaderboard docs wider gitignore
|
I was actually planning to upgrade transformers to But I'd rather wait until you merge or at least review this upgrade to |
|
Looks great!! Can you clean up the linter errors? |
|
Ah, right, fixed now. |
|
Seems the tests are still failing -- maybe a version mismatch? Our instructions for formatting are here. Let me know if you are still stuck! |
|
Ah I see locally it passed for me because I had a newer ruff version. I reformatted to the one that the github action uses, so should be good now |
|
Hey @filyp Really appreciate all the work you put into upgrading the Transformers version. This is a pretty substantial change. I left a few comments; most are minor, except for one issue I called out here that I think we should address before merging. Once that’s resolved, I’m happy to merge. This upgrade will make it much easier for folks to try the latest models and benefit from the newer bug fixes. On a side note: I’d love to hear your thoughts on #145 and if it’s easy to incorporate the fix into this PR, that would be great. |
Dornavineeth
left a comment
There was a problem hiding this comment.
Few changes. Overall, pretty impressive work! Thank you so much.
| self.log(eval_metrics) | ||
| return eval_metrics | ||
|
|
||
| if eval_dataset is None or eval_dataset == "dummy": |
There was a problem hiding this comment.
Why is this hardcoded? Should we remove this?
eval_dataset == "dummy"
There was a problem hiding this comment.
I'm not fully happy with that fix, but that's related to this line in train.py:
eval_dataset=data.get("eval", "dummy"), # None would trigger Trainer exceptionIt's just that in the new transformers version, the trainer asserts that eval is not None, if we tell it to do evaluations, which breaks our custom evaluators setup. LMK if you see a better solution.
There was a problem hiding this comment.
Okay. Seems reasonable.
Can you just define a variable at the top _EVAL_PLACEHOLDER = "_EVAL_PLACEHOLDER" and use this variable at all places?
| labels = nested_detach(tuple(inputs.get(name) for name in self.label_names)) | ||
| if len(labels) == 1: | ||
| labels = labels[0] | ||
| def compute_loss(self, model, inputs, **kwargs): |
There was a problem hiding this comment.
I see why you renamed compute_loss to compute_unlearn_loss, but this could be a breaking change for anyone pulling the update into their forks; it could lead to runtime errors like compute_unlearn_loss is not defined.
I’d prefer to keep the backward compatibility. Can we keep the previous pattern (per the earlier docstrings): have prediction_step and then call super().compute_loss() for the actual loss computation.
NOTE: You need to copy the prediction_step from the transformers 4.51.3
There was a problem hiding this comment.
Yeah, good point. I committed a 3rd alternative now, take a look at current base.py. It preserves backwards compatibility and also is interoperable with other transformers versions.
I regression tested on both tofu unlearning, and some prediction, and it works the same.
| else: | ||
| if has_labels or loss_without_labels: | ||
| with self.compute_loss_context_manager(): | ||
| ### Call compute_loss of super class since overridden compute_loss is not be applicable to eval_dataset. |
There was a problem hiding this comment.
This is the line I am referring here:
https://github.com/locuslab/open-unlearning/pull/175/changes#r2837139317
| ignore_keys: Optional[List[str]] = None, | ||
| ) -> Tuple[Optional[torch.Tensor], Optional[torch.Tensor], Optional[torch.Tensor]]: | ||
| """ | ||
| The only change to this function is calling the Trainer's compute_loss, as it's often overridden by unlearning methods, and we want to maintain the Trainer's evaluation setup. |
There was a problem hiding this comment.
Can you also include these docstrings in the updated code including prediction_step
| ... | ||
|
|
||
| def compute_loss(self, model, inputs, return_outputs=False): | ||
| def compute_unlearn_loss(self, model, inputs, return_outputs=False, num_items_in_batch=None): |
There was a problem hiding this comment.
I would like to retain it as compute_loss
see https://github.com/locuslab/open-unlearning/pull/175/changes#r2837139317
| train_dataset=data.get("train", None), | ||
| eval_dataset=data.get("eval", None), | ||
| tokenizer=tokenizer, | ||
| eval_dataset=data.get("eval", "dummy"), # None would trigger Trainer exception |
There was a problem hiding this comment.
Remove the "dummy" and make it
eval_dataset=data.get("eval", None),
There was a problem hiding this comment.
The thing is that using None triggers a Trainer exception (see #175 (comment))
There was a problem hiding this comment.
FYI, I moved this hacky fix into the FinetuneTrainer to hide it more. I don't see any better way to fix it.
There was a problem hiding this comment.
Looks like these additions in gitignore are specific to your usecases.
can you revert this?
| conda create -n unlearning python=3.11 | ||
| conda activate unlearning | ||
| pip install .[lm_eval] | ||
| pip install . |
There was a problem hiding this comment.
Any reason you removed this?
I would like to give this as an option instead of keeping it requirements.txt because the lm_eval harness is a more involved build.
There was a problem hiding this comment.
Ah, I see why I didn't notice lm_eval heaviness, I think I was using version 0.4.11 which installs in 20s (I tested it now in a fresh venv), while the older 0.4.8 in 70s. Should I bump it then? Or revert?
(And the reason I moved it is to simplify the installation.)
There was a problem hiding this comment.
You can bump the version and still keep it optional for people having
pip install .[lm_eval]
| huggingface-hub==0.36.0 | ||
| transformers==4.51.3 | ||
| hf-xet==1.2.0 | ||
| lm-eval==0.4.8 |
There was a problem hiding this comment.
| @@ -17,13 +17,10 @@ | |||
| packages=find_packages(), | |||
| install_requires=requirements, # Uses requirements.txt | |||
| extras_require={ | |||
There was a problem hiding this comment.
|
Thanks! I'll go through the review soon. First, about the #145 you mentioned: It looks pretty complicated, I left a comment there with some alternative simpler fix. But I'd rather keep it separate from this PR, because in my setup I can't test #145 easily. (Unless you want to go with the simple fix, then I can add it, but I still can't test it fully.) |
|
I also committed now Besides that, I addressed all your comments. |
|
All good! Left comments for 2 minor nits. Once fixed, I will merge it. |
|
Hey, somehow I can't see these 2 new comments. Could you link them? (Or maybe you have a "review" started and didn't "submit" the comments, and then only you see them. I remember once having this problem because the UI is quite confusing.) |
|
Here are the comments: |
|
Hm, I'm still having problems accessing these comments for some reason :/ |
|
@molereddy yes I see the same ones. I'm understanding that @Dornavineeth added some new ones that we both don't see? And these unresolved ones are either outdated or I left a question in them. |
|
Yes, the links @Dornavineeth shard don't open up any comment for me. |
Dornavineeth
left a comment
There was a problem hiding this comment.
Please see the comments.
There was a problem hiding this comment.
Can we remove this just for the sake of simplicity?
There was a problem hiding this comment.
Done, removed. (Let me know if you'd like me to include that runner someplace else, but ok if not.)
|
|
||
|
|
||
| class UnlearnTrainer(FinetuneTrainer): | ||
| def prediction_step(self, *args, **kwargs): |
There was a problem hiding this comment.
Can we just copy the orginal transformers code of prediction_step and make edits just wherever required?
There was a problem hiding this comment.
Just to not accidentally break any other functionalities included in the hf prediction_step.
There was a problem hiding this comment.
Done — copied prediction_step from transformers 4.51.3 with the one change of calling super().compute_loss() instead of self.compute_loss().
(Note though that this implementation could potentially break at some future bump because of these cherrypicked imports at the top. But for the current bump, I regression tested and it's fine.)
| self.log(eval_metrics) | ||
| return eval_metrics | ||
|
|
||
| if eval_dataset is None or eval_dataset == "dummy": |
There was a problem hiding this comment.
Okay. Seems reasonable.
Can you just define a variable at the top _EVAL_PLACEHOLDER = "_EVAL_PLACEHOLDER" and use this variable at all places?
| conda create -n unlearning python=3.11 | ||
| conda activate unlearning | ||
| pip install .[lm_eval] | ||
| pip install . |
There was a problem hiding this comment.
You can bump the version and still keep it optional for people having
pip install .[lm_eval]
|
Ok, I applied all these changes. I also regression tested on tofu unlearning and on some code that uses prediction_step and it's unchanged. So I think it's all done now. |
|
Great Work. Thank you so much for this PR. |

What does this PR do?
num_items_in_batchtocompute_losssignatureeval_dataset=Nonetrainer.processing_classinstead oftrainer.tokenizer(it's depracated andtransformers==5removes it)Additionally it:
Related issues: #173 and #155
Before submitting
Tests
I manually tested the new evaluation, with removed prediction_step from UnlearninTrainer, and it works the same.
When I tested unlearning, the unlearning trajectories are exactly the same when using gradient_accumulation_steps=1. But when it's =/=1, the upgrade changes the scale of the logged training/loss (it's 4x higher.), and subtly changes the unlearning trajectory. This most likely comes from the gradient accumulation fix in transformers=4.46.
(Tested unlearning with this command:
Where tofu_simple is a config with fewer eval metrics.)
compute_lossdocstring now states:When I tried setting
trainer.model_accepts_loss_kwargs = Falseit restores the previous scale of the training loss, but it doesn't affect the unlearning trajectory at all, only that logged loss scale.