Releases: patcon/valency-anndata
v0.3.0
Added
- val.pp.filter_participants(): filter participants (rows) by minimum number of statements voted on. Counts non-NaN entries (real votes), correctly treating -1, 0, and +1 as votes.
- val.pp.filter_statements(): filter statements (columns) by minimum number of participants who voted. Counts non-NaN entries (real votes), correctly treating -1, 0, and +1 as votes.
- val.datasets.vtaiwan(): load any of four Polis conversations from Taiwan's vTaiwan civic policymaking process, selectable by topic= keyword ("uber", "airbnb", "online_alcohol", "caning"). Topic parameter uses Literal type hint for IDE/notebook autocomplete.
- val.datasets.american_assembly(): load Polis conversations run by the American Assembly in Kentucky cities, selectable by city= keyword ("bowling_green", "louisville"). City parameter uses Literal type hint for IDE/notebook autocomplete.
- val.datasets.bg2050(): load the BG 2050 community visioning conversation from Bowling Green and Warren County, Kentucky (~7,900 participants).
- val.datasets.cuba_protest(): load any of three Polis conversations run around Cuba's planned 15N march (November 2021), selectable by period= keyword ("before_1", "before_2", "after"). Period parameter uses Literal type hint for IDE/notebook autocomplete.
- val.datasets.japanchoice(): load any of eight Polis conversations from Japan Choice (four policy topics × two election years: 2025 and 2026), selectable by positional topic argument.
- val.datasets.klimarat(): load any of the five Polis conversations from Austria's Citizens' Climate Council (Klimarat), selectable by topic= keyword.
- Five Klimarat datasets added to the docs overview table with fingerprints.
- scripts/generate_fingerprint_heatmap.py: generates a square RdYlGn vote-matrix heatmap from any Polis report URL.
- docs/api/datasets.yml: machine-readable registry of reference datasets rendered into an overview table.
- Two reference datasets added to the docs table: Aufstehen and Chile Protests.
- mkdocs-glightbox: clicking a fingerprint thumbnail opens the full-size image in a lightbox popup.
- New Labs page in docs.
- make strip-notebook-widgets: strips ipywidget metadata from notebooks so they render correctly on GitHub.
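The vote-counting rule behind the two filter functions can be illustrated with a small standard-library sketch. This is not the library's implementation: the helper names below are hypothetical, and the matrix is a plain list of lists rather than an AnnData object. The point it shows is that a 0 entry counts as a real vote, while NaN means no vote was cast.

```python
import math

NAN = float("nan")

def count_votes(values):
    """Count real votes: every non-NaN entry, including 0."""
    return sum(1 for v in values if not math.isnan(v))

def filter_rows_by_min_votes(matrix, min_votes):
    """Keep participants (rows) with at least `min_votes` non-NaN entries."""
    return [row for row in matrix if count_votes(row) >= min_votes]

def filter_columns_by_min_votes(matrix, min_votes):
    """Keep statements (columns) with at least `min_votes` non-NaN entries."""
    keep = [j for j, col in enumerate(zip(*matrix)) if count_votes(col) >= min_votes]
    return [[row[j] for j in keep] for row in matrix]

# Rows are participants, columns are statements.
votes = [
    [1.0, 0.0, -1.0],  # three votes, one of them a 0
    [NAN, 0.0, NAN],   # one vote: the 0 still counts
    [NAN, NAN, NAN],   # no votes at all
]

kept_rows = filter_rows_by_min_votes(votes, min_votes=1)
kept_cols = filter_columns_by_min_votes(votes, min_votes=2)
```

Here the second participant survives a min_votes=1 filter only because the 0 is treated as a vote; a check such as `value != 0` would have dropped that row.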
Changed
- val.preprocessing.impute() now uses sklearn.impute.SimpleImputer for "zero", "mean", and "median" strategies. Adds strategy="knn" backed by sklearn.impute.KNNImputer.
- val.preprocessing.highly_variable_statements() defaults changed: variance_mode is now "valence", bin_by is now "p_engaged", and n_bins is now 10.
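For readers unfamiliar with these strategies, the sketch below shows what column-wise "zero", "mean", and "median" imputation amounts to, using only the standard library. It is an illustration rather than the code in val.preprocessing.impute(), which delegates to scikit-learn; the function name impute_column and the all-NaN fallback are assumptions made for this example. The "knn" strategy is not covered here.

```python
import math
import statistics

NAN = float("nan")

def impute_column(column, strategy="mean"):
    """Fill NaN entries of one statement column with a single fill value."""
    if strategy not in ("zero", "mean", "median"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    observed = [v for v in column if not math.isnan(v)]
    if strategy == "zero" or not observed:
        fill = 0.0  # sketch choice: an all-NaN column also falls back to 0
    elif strategy == "mean":
        fill = statistics.fmean(observed)
    else:
        fill = statistics.median(observed)
    return [fill if math.isnan(v) else v for v in column]

column = [1.0, NAN, -1.0, 1.0]
zero_filled = impute_column(column, "zero")
mean_filled = impute_column(column, "mean")
median_filled = impute_column(column, "median")
```

In scikit-learn itself, SimpleImputer offers "mean" and "median" directly, and a constant fill through strategy="constant" with fill_value.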
Fixes
- val.preprocessing.highly_variable_statements() no longer emits RuntimeWarning: Degrees of freedom <= 0 for slice when a statement column has fewer than 2 non-NaN votes.
- Bugfix: scaling factors in recipe_polis were dividing instead of multiplying.
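The RuntimeWarning in the first fix is the one NumPy raises when a NaN-aware variance is asked for more degrees of freedom than there are observations. The release notes do not show how the library avoids it; one plausible guard, sketched here in plain Python with a hypothetical helper name and an assumed sample variance (ddof=1), is to return NaN early for columns with fewer than two votes.

```python
import math

NAN = float("nan")

def sample_variance_or_nan(column):
    """Sample variance (ddof=1) over non-NaN votes, or NaN if undefined."""
    observed = [v for v in column if not math.isnan(v)]
    n = len(observed)
    if n < 2:
        # With 0 or 1 votes the denominator n - 1 is <= 0, so skip the math.
        return NAN
    mean = sum(observed) / n
    return sum((v - mean) ** 2 for v in observed) / (n - 1)

sparse_column = [NAN, 1.0, NAN]    # a single vote: variance is undefined
split_column = [1.0, -1.0, NAN]    # one agree, one disagree

sparse_variance = sample_variance_or_nan(sparse_column)
split_variance = sample_variance_or_nan(split_column)
```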
v0.2.0
Added
- hf: and huggingface: source prefixes for val.datasets.polis.load(): load any HuggingFace-hosted Polis export as a one-liner, e.g. load("hf:patcon/polis-aufstehen-2018") (#81).
- CLAUDE.md guidance file for Claude Code contributors (#58).
- Pytest infrastructure and test suite for datasets.polis.load (#59).
  - 29 unit + local-fixture tests; 4 opt-in live network tests (make test-live).
  - Synthetic and real CSV fixtures checked in under tests/fixtures/.
  - make test and make test-live targets added to Makefile.
- Unit and integration tests for tools.kmeans (#63).
  - 22 mocked unit tests + 1 real-clustering integration test.
  - 3 k-means++ smoke tests.
- val.tl.recipe_polis2_statements(): embeds and clusters statements (var axis) via polismath (#44).
  - New polis2 optional-dependency group (pip install valency-anndata[polis2]).
  - 13 unit tests with all polismath helpers mocked.
  - Noise/unassigned cluster labels (-1) in evoc_polis2_top are stored as NA so scanpy renders them as lightgray by default.
  - show_progress=False (the default) now fully silences HF download progress bars and mlx model-load stdout.
  - "Polis 2.0 Pipeline" tutorial added to docs nav.
- val.preprocessing.highly_variable_statements(): identify highly variable statements in vote matrices (#52).
  - Analogous to scanpy's highly_variable_genes for single-cell data.
  - Supports multiple variance modes (overall, valence, engagement) and binning strategies.
  - key_added parameter allows running multiple times with different settings.
  - val.viz.highly_variable_statements() plotting function for visualizing dispersion metrics.
  - mask_var parameter added to val.tools.recipe_polis(), val.tools.pacmap(), and val.tools.localmap() for filtering statements before dimensionality reduction.
- val.write(): export AnnData to h5ad with automatic sanitization for webapp compatibility (#57).
  - include parameter for selective export using glob-style "namespace/key" paths (e.g. "obsm/X_*").
- make lint and make fmt targets for ruff.
- Claude Code skill for guided Polis conversation exploration (#42).
  - Interactive prompts for projection selection (PaCMAP, LocalMAP, UMAP, t-SNE) and QC annotation selection.
  - Fixed CLI plotting to support multi-color val.viz.embedding() calls.
- Cache downloaded Polis report files locally for 24 hours using platformdirs (#70).
  - skip_cache parameter on val.datasets.polis.load() to bypass the cache.
  - Smart cache revalidation using last_vote_timestamp from the Polis math endpoint: stale cache is reused without re-fetching when no new votes have been cast (#78).
- mask_obs parameter on val.tools.kmeans() for clustering a subset of participants (#77).
- val.datasets.polis.export_csv(): export an AnnData object to Polis CSV format (votes.csv + comments.csv).
- include_huggingface_metadata parameter on val.datasets.polis.export_csv(): opt-in generation of a HuggingFace dataset card (README.md with YAML frontmatter) alongside the CSV export.
- show_progress parameter on val.datasets.polis.load(): displays a tqdm progress bar when fetching votes per-participant from the API; auto-detects notebooks vs terminal (#79).
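The caching behaviour described in the list above (24-hour lifetime, skip_cache bypass, and revalidation against last_vote_timestamp) can be sketched as a small decision function. Everything in this block is hypothetical: CacheEntry, can_reuse_cache, and the callable that stands in for a request to the Polis math endpoint are illustrative names, not part of the valency-anndata API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

CACHE_TTL_SECONDS = 24 * 60 * 60  # the 24-hour lifetime from the notes

@dataclass
class CacheEntry:
    fetched_at: float         # Unix time when the report files were cached
    last_vote_timestamp: int  # value reported when the cache was written

def can_reuse_cache(
    entry: Optional[CacheEntry],
    now: float,
    fetch_remote_timestamp: Callable[[], int],
    skip_cache: bool = False,
) -> bool:
    """Decide whether cached files can be used instead of re-downloading."""
    if skip_cache or entry is None:
        return False
    if now - entry.fetched_at < CACHE_TTL_SECONDS:
        return True  # still fresh: no network call needed
    # Stale: one cheap lookup tells us whether any new votes were cast.
    return fetch_remote_timestamp() == entry.last_vote_timestamp

entry = CacheEntry(fetched_at=0.0, last_vote_timestamp=100)
two_days = 2 * CACHE_TTL_SECONDS

fresh_hit = can_reuse_cache(entry, 3600.0, lambda: 999)
stale_but_unchanged = can_reuse_cache(entry, two_days, lambda: 100)
stale_and_changed = can_reuse_cache(entry, two_days, lambda: 101)
bypassed = can_reuse_cache(entry, 3600.0, lambda: 100, skip_cache=True)
```

Passing the timestamp lookup as a callable keeps the fresh-cache path free of network traffic, which is the design choice this sketch assumes.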
Fixes
- Fixed uns["statements"] having comment-id as both index and column, which prevented h5ad serialization (#57).
- Fixed API vote sign inversion: the Polis API returns inverted vote signs vs the CSV export convention; votes are now negated on ingest so +1 = agree and -1 = disagree everywhere.
- Replaced deprecated use_highly_variable=False with mask_var=None in recipe_polis PCA call to eliminate FutureWarning from scanpy (#82).
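The vote sign fix amounts to a one-line normalization at ingest time. The sketch below assumes, as "inverted" implies, that the API encodes agree as -1 and disagree as +1; the function name and the use of None for a missing vote are illustrative only.

```python
def normalize_api_votes(api_votes):
    """Flip API vote signs so +1 = agree and -1 = disagree.

    None (no vote recorded) passes through untouched.
    """
    return [None if v is None else -v for v in api_votes]

# API convention assumed here: -1 = agree, +1 = disagree, 0 unchanged.
raw_from_api = [-1, 0, 1, None]
normalized = normalize_api_votes(raw_from_api)
```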
v0.1.1
v0.1.0
Initial release includes:
- val.viz.schematic_diagram() helper for showing data structure and visual diffs.
- Helper methods for PaCMAP and LocalMAP dimensionality reduction.
- Langevitour visualisation for exploring high-dimensional space.
- Basic Jupyter Scatter support for up to 1M participants, including animations.
- Import of Polis conversation data.
- Basic Polis v1 pipeline support.
- Added val.tools.kmeans().
- Large reference datasets for Aufstehen political party consultation (33k participants) and the #ChileDesperto protest (3k).
- val.viz.voter_vignette_widget() for exploring data stories of random individuals.
- Comprehensive documentation website.
- Wrappers for various scanpy methods.