🐛 ensure mongodb secondaries always get votes:1 and priority:1 by DarkIsDude · Pull Request #2449 · scality/Zenko

DarkIsDude · 2026-06-26T08:25:01Z

What does this PR do, and why do we need it?

It fixes a bug in the MongoDB sharded image bootstrap (libmongodb.sh) that can leave every replica set with only one voting member, turning that single node into a hard single point of failure: if it goes down, the shard (and therefore the datastore) goes read‑only and Zenko is down.

A 30‑second MongoDB primer (for non‑Mongo readers)

Our data lives in replica sets: groups of MongoDB nodes (here, 3 per shard and 3 for the config server) that each hold a copy of the data.

One node is the PRIMARY (takes writes); the others are SECONDARIES (copies).
To stay alive, a replica set must keep a majority of votes online and be able to elect a PRIMARY. Two settings per member matter:
- votes (0 or 1): can this member vote in an election? Majority is counted over the sum of votes.
- priority (0+): can this member be elected PRIMARY? priority: 0 means "never become PRIMARY".
A healthy 3‑node set has 3 members with votes: 1, priority: 1. It survives losing any one node: the remaining 2 votes are still a majority, and a surviving member can be elected PRIMARY.

A member that is only partially configured — votes: 1, priority: 0, or worse votes: 0, priority: 0 — still holds data but cannot help keep the set alive.

The bug

When a SECONDARY first joins, the bootstrap script does this (mongodb_configure_secondary):

rs.add(... votes: 0, priority: 0) — add the node without voting power so it doesn't disturb the existing majority while it copies data (this is MongoDB's recommended safe procedure).
Confirm the node is now listed in the replica set (mongodb_node_currently_in_cluster).
Wait for it to finish its initial sync (reach SECONDARY state).
Grant voting rights: reconfigure it to votes: 1, priority: 1.

The confirmation in step 2 reads rs.status() and greps the output for the node:

result=$(mongodb_execute ... <<< "rs.status().members")
grep -q "'$node:$port'" <<<"$result"

The problem: with the broken image, mongodb_execute is a thin wrapper around debug_execute, which throws away stdout unless BITNAMI_DEBUG=true (it isn't, by default):

debug_execute() { if is_boolean_yes "${BITNAMI_DEBUG:-false}"; then "$@"; else "$@" >/dev/null 2>&1; fi; }

So result is always empty, the grep always fails, and mongodb_node_currently_in_cluster always returns false. That makes step 2 (mongodb_wait_confirmation) time out and the bootstrap aborts with:

ERROR ==> Unable to confirm that <node> has been added to the replica set!

The container exits, Kubernetes restarts the pod, and on the second boot the data directory already exists, so the bootstrap takes the "deploy with persisted data" path and skips replica set configuration entirely. The node is left frozen at votes: 0, priority: 0. Steps 3–4 never run, so the SECONDARY is never promoted.

Net result: only the bootstrap PRIMARY (*-0) ends up with a vote → 1 voter per replica set → single point of failure.

Root cause: an incomplete fork of the Bitnami image

Bitnami stopped publishing the mongodb-sharded image, so we vendored its scripts into the repo (ZENKO-5110, #2366). The vendoring happened in two commits, and they are not equal:

Commit	Branch	`libmongodb.sh`	`mongodb_execute()` defs	Result
`0ae8c8a3`	development/2.14	1712 lines	2	✅ works (→ 2.14.5)
`2f6c7e42`	development/2.15	1669 lines	1	❌ broken (→ 2.15.1)

Compare: the only relevant difference is the last 43 lines of libmongodb.sh.

Upstream's libmongodb.sh is assembled by concatenating script fragments, and it deliberately defines mongodb_execute twice:

Early in the file — the legacy wrapper that discards stdout (debug_execute mongodb_execute_print_output "$@").
As a final appended fragment (it carries its own # Copyright … header and # shellcheck disable=SC2148 — the tell‑tale "no shebang" marker of a separate concatenated file) — the real version that calls mongosh directly and returns output.

In bash the last definition wins, so upstream's effective mongodb_execute is #2 (returns output) — which is exactly what mongodb_node_currently_in_cluster needs.

The 2.15 re‑vendoring (2f6c7e42) truncated the file at 1669 lines and dropped that final fragment. Only the output‑discarding wrapper was left, silently reverting mongodb_execute and breaking the confirmation check. The 2.14 vendoring had copied the whole file, so 2.14.5 worked. We didn't add a fix in 2.14 — we just vendored completely there, and lost it in 2.15.

It was easy to miss because dropping a duplicate function definition leaves valid bash that runs fine; the only signal was the line count (1712 vs 1669), and the failure only surfaces as a silent bootstrap race that is invisible until a node dies.

How we confirmed it (two clusters, same Mongo version)

A cluster on image base 2.14.5 was healthy (3 voters); a cluster on 2.15.1 was broken (1 voter). We confirmed the chain end‑to‑end from the live rs.conf() (broken cluster: secondaries at votes: 0, priority: 0; healthy cluster: votes: 1) and the pod boot logs (broken cluster fails at "Unable to confirm…"; healthy one gets past it). The diff between the two images' libmongodb.sh was exactly the 43‑line fragment above.

The fix

Two commits:

🐛 add missing libmongo.sh fork — restores the dropped 43‑line fragment, so the file matches upstream again. mongodb_execute is once more the output‑returning definition, mongodb_node_currently_in_cluster can read rs.status(), and the bootstrap no longer aborts before granting voting rights. This is the root‑cause fix (faithful re‑sync with upstream).
🐛 ensure secondary keeps votes and priority after restart — makes the voting‑rights grant idempotent and re‑runnable, so a node can no longer be left stranded without votes:
- new helper mongodb_secondary_node_has_voting_rights checks whether the member already has votes > 0 && priority > 0;
- mongodb_configure_secondary now grants voting rights whenever they are missing — even if the node is already in the cluster, instead of only on the freshly‑added path. So if a previous attempt added the node at votes/priority 0 and then failed (or was restarted) before promotion, the next run finishes the job and converges it to votes: 1, priority: 1;
- the misleading error message ("did not get marked as secondary" printed for the voting step) is corrected to "did not get granted voting rights".

After this change a fresh deployment reliably ends with all members at votes: 1, priority: 1 (true HA), and a member that is already in the replica set but under‑privileged is repaired rather than silently left non‑voting.

Which issue does this PR fix?

Fixes ZENKO-5302.

Special notes for your reviewers:

Already‑broken clusters won't fully self‑heal from the image alone. On a pod restart the bootstrap takes the persisted‑data path, which skips replica‑set (re)configuration entirely, so an existing cluster already stranded at votes: 0 still needs a one‑time manual rs.reconfig() on each replica set (shard-N and configsvr) to set members 1 and 2 to votes: 1, priority: 1. The script change guarantees correct behaviour for new bootstraps and for any path where mongodb_configure_secondary runs against a member that lacks voting rights.
Should we keep the second fix ? We can use a dedicated PR for that as we already have this issue on older cluster (4.2.3 have it).

bert-e · 2026-06-26T08:25:05Z

Hello darkisdude,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options

name	description	privileged	authored
`/after_pull_request`	Wait for the given pull request id to be merged before continuing with the current one.
`/bypass_author_approval`	Bypass the pull request author's approval	⭐
`/bypass_build_status`	Bypass the build and test status	⭐
`/bypass_commit_size`	Bypass the check on the size of the changeset `TBA`	⭐
`/bypass_incompatible_branch`	Bypass the check on the source branch prefix	⭐
`/bypass_jira_check`	Bypass the Jira issue check	⭐
`/bypass_peer_approval`	Bypass the pull request peers' approval	⭐
`/bypass_leader_approval`	Bypass the pull request leaders' approval	⭐
`/approve`	Instruct Bert-E that the author has approved the pull request.		✍️
`/create_pull_requests`	Allow the creation of integration pull requests.
`/create_integration_branches`	Allow the creation of integration branches.
`/no_octopus`	Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
`/unanimity`	Change review acceptance criteria from `one reviewer at least` to `all reviewers`
`/wait`	Instruct Bert-E not to run until further notice.

Available commands

name	description	privileged
`/help`	Print Bert-E's manual in the pull request.
`/status`	Print Bert-E's current status in the pull request `TBA`
`/clear`	Remove all comments from Bert-E from the history `TBA`
`/retry`	Re-start a fresh build `TBA`
`/build`	Re-start a fresh build `TBA`
`/force_reset`	Delete integration branches & pull requests, and restart merge process from the beginning.
`/reset`	Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

bert-e · 2026-06-26T08:25:10Z

Incorrect fix version

The Fix Version/s in issue ZENKO-5302 contains:

None

Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:

2.15.2

Please check the Fix Version/s of ZENKO-5302, or the target
branch of this pull request.

Issue: ZENKO-5302

delthas

LGTM, very clear writeup.

bert-e · 2026-06-26T13:22:00Z

Waiting for approval

The following approvals are needed before I can proceed with the merge:

the author
2 peers

francoisferrand

The "fix partial import" LGTM : this is upstream, this is how it was already running → not much risk there, and needs to be merged ASAP.

On the "secondary node voting rights" fix, I did not fully review but I am however much less confident: it is not upstream (hence not "proven" by experience), changes the whole flow of secondary process, and not tested... Also we did not see it in the field AFAIK, so no pressure : thus I would rather we move that improvement/fix to a separate PR, ideally upstreaming the change first, so we get an acknowledgment from bitnami, and can take the time to fully test this and ensure it won't create issue before we ship it.

bert-e · 2026-06-26T14:05:09Z

Waiting for approval

The following approvals are needed before I can proceed with the merge:

the author
2 peers

The following reviewers are expecting changes from the author, or must review again:

@francoisferrand

DarkIsDude self-assigned this Jun 26, 2026

🐛 add missing libmongo.sh fork

528728f

Issue: ZENKO-5302

DarkIsDude force-pushed the bugfix/ZENKO-5302/mongodb-secondary-voting-rights branch from 026ff16 to 528728f Compare June 26, 2026 09:30

claude Bot reviewed Jun 26, 2026

View reviewed changes

Comment thread solution-base/images/mongodb-sharded/debian-12/rootfs/opt/bitnami/scripts/libmongodb.sh

claude Bot reviewed Jun 26, 2026

View reviewed changes

Comment thread solution-base/images/mongodb-sharded/debian-12/rootfs/opt/bitnami/scripts/libmongodb.sh

🐛 ensure secondary keeps votes and priority after restart

8993623

Issue: ZENKO-5302

DarkIsDude requested review from a team, benzekrimaha and maeldonn June 26, 2026 09:52

DarkIsDude marked this pull request as ready for review June 26, 2026 09:53

delthas approved these changes Jun 26, 2026

View reviewed changes

francoisferrand requested changes Jun 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

🐛 ensure mongodb secondaries always get votes:1 and priority:1#2449

🐛 ensure mongodb secondaries always get votes:1 and priority:1#2449
DarkIsDude wants to merge 2 commits into
development/2.15from
bugfix/ZENKO-5302/mongodb-secondary-voting-rights

DarkIsDude commented Jun 26, 2026 •

edited

Loading

Uh oh!

bert-e commented Jun 26, 2026

Uh oh!

bert-e commented Jun 26, 2026

Uh oh!

Uh oh!

Uh oh!

delthas left a comment

Uh oh!

bert-e commented Jun 26, 2026

Uh oh!

francoisferrand left a comment •

edited

Loading

Uh oh!

bert-e commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

DarkIsDude commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

A 30‑second MongoDB primer (for non‑Mongo readers)

The bug

Root cause: an incomplete fork of the Bitnami image

How we confirmed it (two clusters, same Mongo version)

The fix

Uh oh!

bert-e commented Jun 26, 2026

Hello darkisdude,

Uh oh!

bert-e commented Jun 26, 2026

Incorrect fix version

Uh oh!

Uh oh!

Uh oh!

delthas left a comment

Choose a reason for hiding this comment

Uh oh!

bert-e commented Jun 26, 2026

Waiting for approval

Uh oh!

francoisferrand left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bert-e commented Jun 26, 2026

Waiting for approval

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

DarkIsDude commented Jun 26, 2026 •

edited

Loading

francoisferrand left a comment •

edited

Loading