fix(CubeMaster): preserve template network rules and resource defaults#581
fix(CubeMaster): preserve template network rules and resource defaults#581zyl1121 wants to merge 1 commit into
Conversation
| if resourceIn.Limit != nil { | ||
| resourceOut.Limit = resourceIn.Limit | ||
| } |
There was a problem hiding this comment.
Shallow copy of Resource.Limit — pointer aliasing with template
This assigns the pointer directly (line 198), not a deep copy:
if resourceIn.Limit != nil {
resourceOut.Limit = resourceIn.Limit
}After this, the container's Resources.Limit and the template's Resources.Limit point to the same RequestLimit struct. If the template's Limit is ever mutated downstream (even unintentionally), the container's Limit would change too.
The rest of this PR carefully deep-copies egress rule fields (using cloneStringPtr, struct-value copies, etc.), so this inconsistency stands out. A deep copy would be:
if resourceIn.Limit != nil {
resourceOut.Limit = &types.RequestLimit{
Cpu: resourceIn.Limit.Cpu,
Mem: resourceIn.Limit.Mem,
}
}How to fix: Replace the pointer assignment with a struct copy on line 198. This is a pre-existing issue, but since applyTemplateResources is being refactored in this PR anyway, it's the right time to close the gap.
| if len(in.Action.Inject) > 0 { | ||
| action.Inject = make([]*types.EgressRuleInject, 0, len(in.Action.Inject)) | ||
| for _, inj := range in.Action.Inject { | ||
| if inj == nil { | ||
| continue | ||
| } | ||
| cloned := &types.EgressRuleInject{ | ||
| Header: inj.Header, | ||
| Secret: inj.Secret, | ||
| Format: cloneStringPtr(inj.Format), | ||
| } | ||
| action.Inject = append(action.Inject, cloned) | ||
| } |
There was a problem hiding this comment.
Field-by-field construction vs struct-copy pattern diverges from the sibling implementation
This clones EgressRuleInject fields one by one:
cloned := &types.EgressRuleInject{
Header: inj.Header,
Secret: inj.Secret,
Format: cloneStringPtr(inj.Format),
}The identical function in cubeboxutil.go:355-367 uses a struct-value copy instead:
cp := *inj
if inj.Format != nil {
format := *inj.Format
cp.Format = &format
}
action.Inject = append(action.Inject, &cp)The struct-copy pattern is more resilient: if a new field is added to EgressRuleInject (which carries credentials), the struct-copy version would automatically pick it up, while the field-by-field version would silently drop it. This is a maintenance hazard — someone adding a field to the type might not know to update all the clone functions.
How to fix: Align this with the cubeboxutil.go pattern by using a struct-value copy instead of field-by-field assignment.
Review: PR #581SummaryThree correctness fixes in the template/sandbox materialization path. Deep-copy helpers, regression tests, and validation-order fix are all appropriate. Inline comment items
Additional
|
| } | ||
| if resourceIn.Limit != nil { | ||
| resourceOut.Limit = resourceIn.Limit | ||
| } |
There was a problem hiding this comment.
Shallow copy of Resource.Limit — pointer aliasing with template
This assigns the pointer directly (line 198), not a deep copy:
if resourceIn.Limit != nil {
resourceOut.Limit = resourceIn.Limit
}After this, the container's Resources.Limit and the template's Resources.Limit point to the same RequestLimit struct. If the template's Limit is ever mutated downstream (even unintentionally), the container's Limit would change too.
The rest of this PR carefully deep-copies egress rule fields (using cloneStringPtr, struct-value copies, etc.), so this inconsistency stands out. A deep copy would be:
if resourceIn.Limit != nil {
resourceOut.Limit = &types.RequestLimit{
Cpu: resourceIn.Limit.Cpu,
Mem: resourceIn.Limit.Mem,
}
}How to fix: Replace the pointer assignment with a struct copy on line 198. This is a pre-existing issue, but since applyTemplateResources is being refactored in this PR anyway, it's the right time to close the gap.
| cpu.MilliValue()) | ||
| } | ||
| if mem.Cmp(config.GetConfig().Scheduler.MaxMvmMemoryRes()) >= 0 { | ||
| if err == nil && mem.Cmp(config.GetConfig().Scheduler.MaxMvmMemoryRes()) >= 0 { |
There was a problem hiding this comment.
Let's return early instead of nil check here.
There was a problem hiding this comment.
Let's return early instead of nil check here.
Done, I updated this in a new commit to return early.
| ctr.Resources = &types.Resource{} | ||
| } | ||
| applyTemplateResources(templateCtr.Resources, ctr.Resources) | ||
| ctr.Resources = applyTemplateResources(templateCtr.Resources, ctr.Resources) |
There was a problem hiding this comment.
It seems like changes to this file is unnecessary.
There was a problem hiding this comment.
It seems like changes to this file is unnecessary.
Good point. I added this as defensive hardening for the case where the destination Resources is nil, but the current call sites already handle allocation. To keep this PR focused, I will drop this change.
| out := &types.CubeNetworkConfig{ | ||
| AllowOut: append([]string(nil), in.AllowOut...), | ||
| DenyOut: append([]string(nil), in.DenyOut...), | ||
| Rules: cloneEgressRules(in.Rules), |
There was a problem hiding this comment.
Currently, there is no way to pass egress rules when create template.
There was a problem hiding this comment.
Currently, there is no way to pass egress rules when create template.
I think this is true for cubemastercli today, but not for the HTTP API.
CreateTemplateFromImageReq already includes cube_network_config.rules, and the /cube/template/from-image handler unmarshals the request body directly into that type.
I reproduced this on a v0.4.0 one-click deployment by sending cube_network_config.rules in the raw POST body. The request was accepted, allowInternetAccess was preserved, but rules was missing from the stored template create_request.
So the issue here seems to be that raw API callers can already pass rules, but the template request clone path drops them.
If the intended scope is to only support fields currently exposed by cubemastercli, I can split the rules preservation into a follow-up and keep this PR focused on other fixes.
There was a problem hiding this comment.
Good point. I think we should also revise mergeEgressRules(). The per sandbox rules should prepend to templates rules.
There was a problem hiding this comment.
Good point. I think we should also revise
mergeEgressRules(). The per sandbox rules should prepend to templates rules.
Updated this part.
mergeEgressRules() now prepends per-sandbox/request rules before template rules, so first-match-wins evaluation gives request-side rules precedence.
I also added focused tests for name-based overrides, no-conflict merge order, empty-side handling, and nil rule entries.
22e1921 to
9bc295d
Compare
| // first-match-wins, per-sandbox/request rules must come before template rules. | ||
| // Template rules whose Name is overridden by the request side are skipped. | ||
| func mergeEgressRules(base []*types.EgressRule, extra []*types.EgressRule) []*types.EgressRule { | ||
| if len(extra) == 0 { |
There was a problem hiding this comment.
Minor: mergeEgressRules now always clones (defensive, correct)
The new code on both early-return paths (len(extra)==0 → returns cloneEgressRules(base), and len(base)==0 → returns cloneEgressRules(extra)) always returns a deep copy. The old code returned base directly when no extra rules were provided.
This pairs with mergeCubeNetworkConfigs (line 267+279): out is already a clone of templateCfg, but mergeEgressRules re-clones it. When both sides have rules, template rules are cloned twice — once in cloneCubeNetworkConfig and again inside mergeEgressRules. Consider passing templateCfg.Rules directly to mergeEgressRules to avoid the redundant clone.
| // first-match-wins, per-sandbox/request rules must come before template rules. | ||
| // Template rules whose Name is overridden by the request side are skipped. | ||
| func mergeEgressRules(base []*types.EgressRule, extra []*types.EgressRule) []*types.EgressRule { | ||
| if len(extra) == 0 { |
There was a problem hiding this comment.
Review Summary
This PR fixes three data-loss bugs in the template/sandbox request materialization path. The changes are well-scoped and the reasoning is clearly articulated. Below are the findings I consider noteworthy.
Pre-existing bug: parse errors silently dropped in getReqResource (Not PR-introduced)
File: CubeMaster/pkg/service/sandbox/util.go:90-98
The resource.ParseQuantity calls use := which declares a new for-loop-scoped err, shadowing the named return err from the function signature. After break, the for-loop-scoped err goes out of scope and line 104 reads the named return err which is still nil.
Suggestion: Use = instead of := to write to the named return err.
Docs describe wrong rule merge ordering
Files:
- docs/guide/egress-network-policy.md (English, line 197)
- docs/zh/guide/egress-network-policy.md (Chinese, line 197)
Both state that request rules with new names are appended after template rules. The mergeEgressRules implementation puts all request rules first, then template rules minus overridden ones. The code is correct; the docs should match.
Minor: redundant clone when both sides have egress rules
When both templateCfg and requestCfg contain rules (cubeboxutil.go:267,279), template rules are deep-cloned twice — once in cloneCubeNetworkConfig and again in mergeEgressRules.
Positive observations
- The getReqResource early-return fix is correct and unambiguous.
- Test coverage verifies pointer independence at all nesting levels.
- Nil entry handling in mergeEgressRules is robust.
152d6af to
f632d73
Compare
|
@zyl1121 Please resolve conflicts and squash your commits. |
Preserve egress rules when cloning template create requests so accepted network policy is not lost during template creation from image. Also preserve the first resource validation error when both CPU and memory exceed configured limits. Update egress rule merge order to honor request-side overrides under first-match-wins semantics, and keep the English and Chinese network policy docs aligned with the implemented behavior. Signed-off-by: zhengyilei <zheng_yilei@qq.com>
f632d73 to
d4d91c4
Compare
|
@chenhengqi Done. The merge conflict is resolved, and the commits have been squashed into one. |
| // mergeEgressRules combines template + request rules. Because egress rules are | ||
| // first-match-wins, per-sandbox/request rules must come before template rules. | ||
| // Template rules whose Name is overridden by the request side are skipped. | ||
| func mergeEgressRules(base []*types.EgressRule, extra []*types.EgressRule) []*types.EgressRule { |
There was a problem hiding this comment.
Design consideration: name-based dedup removes template rules entirely, not just deprioritizes them
Request rules are placed before template rules, which is correct for first-match-wins precedence. However, the additional name-based dedup (overridden map, line ~341) removes template rules from the output entirely when their name matches a request rule, rather than just letting them be deprioritized by position.
This means a sandbox user who knows a template rule's name (e.g., "block-metadata") can craft a request rule with the same name and a permissive action to bypass that template security rule. The template rule would be gone from the evaluated set — not just out-prioritized.
The old code (in-place overwrite in the base list) had the same concern, and at minimum the behavior should be documented as a known trait. If template rules are meant to be mandatory security boundaries, consider using position-only precedence and removing the name-based dedup.
// Option: position-only precedence, no name-based removal
out := make([]*types.EgressRule, 0, len(extra)+len(base))
for _, r := range extra {
out = append(out, cloneEgressRule(r))
}
for _, r := range base {
out = append(out, cloneEgressRule(r))
}| @@ -134,6 +134,7 @@ func cloneCubeNetworkConfig(in *types.CubeNetworkConfig) *types.CubeNetworkConfi | |||
| out := &types.CubeNetworkConfig{ | |||
There was a problem hiding this comment.
Missing AllowPublicTraffic field — silent data loss risk
This cloneCubeNetworkConfig deep-copies AllowOut, DenyOut, and Rules, plus AllowInternetAccess, but does not clone AllowPublicTraffic. The equivalent code in cubeboxutil.go:cloneCubeNetworkConfigBase (lines 316-319) does copy it:
if in.AllowPublicTraffic != nil {
allowPublicTraffic := *in.AllowPublicTraffic
out.AllowPublicTraffic = &allowPublicTraffic
}If template creation from image ever starts using AllowPublicTraffic, this omission would silently drop that field. Either copy AllowPublicTraffic too, or add a comment explaining why it's intentionally excluded.
| } | ||
| if mem.Cmp(config.GetConfig().Scheduler.MaxMvmMemoryRes()) >= 0 { | ||
| err = ret.Errorf(errorcode.ErrorCode_MasterParamsError, "request Resources mem[%dKB] is invalid", | ||
| return cpu, mem, ret.Errorf(errorcode.ErrorCode_MasterParamsError, "request Resources mem[%dKB] is invalid", |
There was a problem hiding this comment.
Minor: double space in error message
"request Resources mem[%dKB] is invalid" has an extra space between "Resources" and "mem". The CPU message above (line 110) uses a single space: "request Resources cpu[%dm] is invalid".
Motivation
The template / sandbox request materialization path currently has a few correctness gaps where accepted or configured fields are not preserved consistently.
The most user-visible issue is in template creation from image:
cube_network_config.rulescan be submitted successfully, but the stored templatecreate_requestdrops therulesarray while keeping the rest ofcube_network_config.This PR fixes that confirmed regression and also tightens two closely related resource-handling cases in the same request materialization path:
Resourcesfield is initially nilReproduction
I first observed the network rules regression on a one-click deployment running
v0.4.0.Steps:
allowInternetAccessand an egressrulesentry insidecube_network_config.Before this fix, the stored
create_request.cube_network_configkeeps:{ "allowInternetAccess": true }but silently drops the submitted
rulesarray.That means the request is accepted, but the resulting template no longer carries the intended egress rule set forward into later sandbox creation.
What This PR Changes
Preserve
cube_network_config.ruleswhen cloning a template create request fromCreateTemplateFromImageReqintoCreateCubeSandboxReq.pkg/templatecenter/template_request.goPreserve the first resource validation error when both CPU and memory exceed configured limits.
pkg/service/sandbox/util.goPreserve template CPU / memory defaults when merging template resources into a request with
Resources == nil.pkg/service/httpservice/cube/cubeboxutil.goapplyTemplateResources()return the effective*types.Resource, and callers keep the returned pointerAdd focused regression tests for all three cases.
Validation
Manual validation on the deployed cluster:
POST /cube/template/from-imageacceptscube_network_config.rules, butGET /cube/template?...&include_request=truedropsruleswhile keepingallowInternetAccess.POST /cube/sandboxwithcpu=101, mem=2000Miis rejected with a CPU validation error.POST /cube/sandboxwith bothcpu=101andmem=301Gireturns the later memory validation error, which matches the validation-order issue fixed by this PR.Focused regression tests on this branch:
The focused
pkg/service/sandboxtests cover both thecpu overflow + mem validcase and thecpu overflow + mem overflowcase.Scope
This PR is limited to request correctness in the template / sandbox materialization path.
It does not change scheduler limits, template policy semantics, or introduce any new API fields.
The main behavior change is that already accepted request fields and already configured template resource defaults are now preserved consistently when requests are cloned, validated, and merged.