Vendor EPEL: add update_epel_pes.py, refresh PES data #196

Open
yuravk wants to merge 6 commits into AlmaLinux:devel-ng-0.24.0 from yuravk:devel-ng-0.24.0

@yuravk yuravk commented Apr 29, 2026

Add update_epel_pes.py to refresh the EPEL Vendor PES template.
Refresh epel_pes.json_template for 8to9 and 9to10 from current EPEL repodata.
Fix path to epel.gpg for AlmaLinux 10 and Kitten x86_64_v2 repos.
CI: upload leapp-data SRPM and noarch RPM as workflow artifact.

yuravk added 5 commits April 29, 2026 15:28
…mplate

Until now vendors.d/epel_pes.json_template (the source of EPEL Vendor PES
data shipped to leapp during ELevate upgrades) had to be curated by hand.
This change introduces a script that refreshes it from the live EPEL
repodata, so a single command can keep the template aligned with what
maintainers actually publish on dl.fedoraproject.org.

How it works
------------
For each requested upgrade path (7to8, 8to9, 9to10) the script:

* Downloads (and caches under tools/.cache/repodata/) the source-side
  and target-side primary.xml feeds for every requested arch. Handles
  gzip, xz, and zstd compression (the latter via python zstandard /
  pyzstd, falling back to the system zstd CLI - EPEL 10 ships .zst).
  Uses requests so SSL verification works out of the box on macOS.
* Builds per-PESID indexes of binary package names plus a reverse
  Obsoletes graph from the target side.
* Synthesizes PES events from the diff between source and target,
  picking the source PESID via a small preference list (epel >
  extras > base) so EL7 base/extras content is attributed correctly
  on the 7to8 path.
* Merges the synthesized events into the existing template by event
  signature: matching events only have their architectures list
  refreshed; new events get globally unique id/set_id values
  allocated above the max found across files/<distro>/pes-events.json
  and every vendors.d/*pes*.json*. The temporary backup is written to
  $TMPDIR so the validators do not pick it up as a duplicate source.
* Refreshes the file's timestamp and provided_data_streams (mirrored
  from files/almalinux/pes-events.json), then runs validate_json.py
  and validate_ids.py against the same set of files check.sh exercises;
  on failure the previous file is restored unless --force is given.
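
The merge-by-signature step above can be illustrated with a minimal sketch. It is not the code from update_epel_pes.py: the helper names (`event_signature`, `merge_events`), the choice of signature fields, and the simplified event layout are assumptions made for illustration, and only `id` allocation is shown (`set_id` would presumably follow the same pattern).

```python
def event_signature(event):
    """Hypothetical signature: the action plus the sorted in/out package names."""
    def names(packageset):
        pkgs = (packageset or {}).get("package", [])
        return tuple(sorted(p["name"] for p in pkgs))
    return (
        event["action"],
        names(event.get("in_packageset")),
        names(event.get("out_packageset")),
    )


def merge_events(existing, synthesized, max_foreign_id=0):
    """Merge synthesized events into existing ones.

    Events with a matching signature only get their architectures refreshed.
    New events get ids above both the existing ids and max_foreign_id, which
    stands in for the highest id found in the other PES files.
    """
    merged = [dict(e) for e in existing]  # shallow copies; inputs stay untouched
    by_sig = {event_signature(e): e for e in merged}
    next_id = max([max_foreign_id] + [e["id"] for e in merged]) + 1
    for ev in synthesized:
        sig = event_signature(ev)
        if sig in by_sig:
            by_sig[sig]["architectures"] = sorted(ev["architectures"])
        else:
            new = dict(ev, id=next_id)
            next_id += 1
            merged.append(new)
            by_sig[sig] = new
    return merged
```

Keying on a signature rather than on `id` is what lets a re-run refresh architectures without renumbering events that are already published.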

Action emission policy
----------------------
The default policy is conservative on purpose - vendor PES data should
not over-aggressively rewrite the user's package set:

* REPLACED (3), SPLIT (4) and MERGED (5) - emitted when the target
  side Obsoletes a source-side name AND the source-side name is
  itself absent on the target side. The Obsoletes-driven candidates
  are grouped by their target packageset, then the action is chosen:
  - 1 source -> 1 target  =>  REPLACED
  - 1 source -> N targets =>  SPLIT
  - N >= 2 sources -> any =>  MERGED  (e.g. EPEL 9 `tmt` Obsoletes
    its four `tmt-report-*` subpackages, which collapse to one
    MERGED instead of four separate REPLACED).
  When the source name still exists on the target side, dnf will
  upgrade it in place and any sibling Obsoletes is almost always a
  soft hygiene marker (e.g. NetworkManager-openconnect-gnome carrying
  `Obsoletes: NetworkManager-openconnect < 1.2.3-0`, or a versioned
  cleanup left from a prior in-distro bump). Treating those as
  REPLACED would force-pull the obsoleter (e.g. the GUI flavour onto
  a server that only had the base package), so they are deliberately
  not turned into PES events.
* MOVED (6) - skipped by default. When a package keeps the same name
  across the two EPEL majors, dnf already resolves the repo move on
  its own (verified empirically with the 3cpio package on AlmaLinux
  9->10), so a PES event would only bloat the template.
* REMOVED (1) - skipped by default. EPEL repos, especially EL10, are
  still being populated by maintainers; a name that is missing from
  the target today is more often "not yet rebuilt" than "permanently
  dropped". Emitting REMOVED would have leapp uninstall those
  packages during the upgrade, which is too aggressive for vendor PES
  data.
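
As a rough sketch of the REPLACED / SPLIT / MERGED selection described above (not the script's actual code): the function name `classify`, the shape of its inputs, and the placeholder package names in the usage note are assumptions. Only the action numbers and the grouping rule come from the description.

```python
from collections import defaultdict

REPLACED, SPLIT, MERGED = 3, 4, 5


def classify(obsoletes, target_names):
    """Pick an action for each group of Obsoletes-driven candidates.

    obsoletes:    mapping of target package name -> set of source names it Obsoletes
    target_names: set of all binary package names present on the target side
    Returns a sorted list of (action, sources, targets) tuples.
    """
    # Reverse the graph, keeping only sources that are absent on the target side.
    targets_by_source = defaultdict(set)
    for target, sources in obsoletes.items():
        for src in sources:
            if src not in target_names:
                targets_by_source[src].add(target)

    # Group the remaining sources by their target packageset.
    sources_by_targets = defaultdict(set)
    for src, targets in targets_by_source.items():
        sources_by_targets[frozenset(targets)].add(src)

    events = []
    for targets, sources in sources_by_targets.items():
        if len(sources) >= 2:
            action = MERGED
        elif len(targets) == 1:
            action = REPLACED
        else:
            action = SPLIT
        events.append((action, sorted(sources), sorted(targets)))
    return sorted(events)
```

With a hypothetical input where `tmt` obsoletes two placeholder names `tmt-report-a` and `tmt-report-b`, and `NetworkManager-openconnect` still exists on the target, the sketch yields a single MERGED event for `tmt` and nothing for the NetworkManager pair.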

Both skip behaviours are governed by module-level toggles
(DEFAULT_INCLUDE_MOVED, DEFAULT_INCLUDE_REMOVED) and matching CLI flags
(--include-moved, --include-removed) for one-off audits.

CLI:
    update_epel_pes.py [--paths 7to8,8to9,9to10] [--archs ...]
                       [--cache-dir DIR] [--dry-run] [--force]
                       [--include-moved] [--include-removed]
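
For orientation, the usage line above could be wired up with argparse roughly as follows. This is a sketch, not the script's parser: the flag names and the two module-level toggles are taken from this message, while the default values for `--paths`, `--archs` and `--cache-dir`, the help strings, and the `split_csv` helper are assumptions.

```python
import argparse

# Module-level toggles named in the commit message; False matches "skipped by default".
DEFAULT_INCLUDE_MOVED = False
DEFAULT_INCLUDE_REMOVED = False


def split_csv(value):
    """Turn 'a,b,c' into ['a', 'b', 'c'] (hypothetical helper)."""
    return [item for item in value.split(",") if item]


def build_parser():
    parser = argparse.ArgumentParser(prog="update_epel_pes.py")
    # Defaults below are assumptions; argparse runs string defaults through `type`.
    parser.add_argument("--paths", type=split_csv, default="7to8,8to9,9to10")
    parser.add_argument("--archs", type=split_csv,
                        default="x86_64,aarch64,ppc64le,s390x")
    parser.add_argument("--cache-dir", default="tools/.cache/repodata")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--force", action="store_true",
                        help="keep the new file even if validation fails")
    parser.add_argument("--include-moved", action="store_true",
                        default=DEFAULT_INCLUDE_MOVED)
    parser.add_argument("--include-removed", action="store_true",
                        default=DEFAULT_INCLUDE_REMOVED)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Binding the flag defaults to the module-level constants keeps a one-off audit (`--include-moved`) from changing the conservative behaviour of routine runs.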

Also adds tools/.cache/repodata/.gitignore so cached repodata stays out
of the working tree.

Paths refreshed: 8to9
Architectures: x86_64, aarch64, ppc64le, s390x
Data timestamp 202604290842Z
Data stream version ['4.3']

Bump the package release.

Paths refreshed: 9to10
Architectures: x86_64, aarch64, ppc64le, s390x
Data timestamp 202604290845Z
Data stream version ['4.3']

Bump the package release.

Switch the gpgkey path in the el10 almalinux and almalinux-kitten
x86_64_v2 EPEL repo files to the standard epel.gpg key shipped under
/etc/leapp/files/vendors.d/rpm-gpg/, replacing the
RPM-GPG-KEY-AlmaLinux-$releasever_major-EPEL-AltArch reference.

When `leapp-data-git: true`, build the SRPM alongside binary RPMs (-ba),
marshal them under /vagrant/ inside the guest, then pull them out to
the runner workspace with vagrant scp (mirroring the leapp-logs pattern,
since the /vagrant share is not reliably visible host-side).

Add a new `${variant_long}-leapp-data-rpms` artifact upload step
containing `leapp-data-*.src.rpm` and the
`leapp-data-${target_distro}-*.noarch.rpm` produced by `leapp-data-rpm.sh`
during the "Build leapp-data package from Git, and install it" step.

Beautify the artifact name prefix: dash-separate the release number from
the distro in `${variant_long}` so the x86_64_v2 variants no longer
glue `_v2` to the trailing `10` (e.g. `almalinux-x86_64_v2-10` instead
of `almalinux-x86_64_v210`).

Also document all three workflow artifacts (leapp logs, serial console
log, leapp-data RPMs) in README.md under the CI section.

Re-enable the Microsoft vendor (including the CI), and bundle
both Microsoft signing keys in vendors.d/rpm-gpg/microsoft.gpg:
BE1229CF ("Microsoft (Release signing)", used for el8/el9 packages)
and F748182B ("Microsoft Corporation - General GPG Signer", used for el10 packages).

Also switch microsoft.repo.el{8,9,10} to reference the bundled
key via a local file:// URL instead of pulling it from
packages.microsoft.com at upgrade time.