v0.2.0

patcon released this 17 Feb 04:38

Added

  • hf: and huggingface: source prefixes for val.datasets.polis.load() — load any HuggingFace-hosted Polis export as a one-liner, e.g. load("hf:patcon/polis-aufstehen-2018") (#81).
  • CLAUDE.md guidance file for Claude Code contributors (#58).
  • Pytest infrastructure and test suite for datasets.polis.load (#59).
    • 29 unit + local-fixture tests; 4 opt-in live network tests (make test-live).
    • Synthetic and real CSV fixtures checked in under tests/fixtures/.
    • make test and make test-live targets added to Makefile.
  • Unit and integration tests for tools.kmeans (#63).
    • 22 mocked unit tests + 1 real-clustering integration test.
    • 3 k-means++ smoke tests.
  • val.tl.recipe_polis2_statements() — embeds and clusters statements (var axis) via polismath (#44).
    • New polis2 optional-dependency group (pip install valency-anndata[polis2]).
    • 13 unit tests with all polismath helpers mocked.
    • Noise/unassigned cluster labels (-1) in evoc_polis2_top are stored as NA so scanpy renders them as lightgray by default.
    • show_progress=False (the default) now fully silences HF download progress bars and mlx model-load stdout.
    • "Polis 2.0 Pipeline" tutorial added to docs nav.
  • val.preprocessing.highly_variable_statements() — identify highly variable statements in vote matrices (#52).
    • Analogous to scanpy's highly_variable_genes for single-cell data.
    • Supports multiple variance modes (overall, valence, engagement) and binning strategies.
    • key_added parameter allows running multiple times with different settings.
    • val.viz.highly_variable_statements() plotting function for visualizing dispersion metrics.
    • mask_var parameter added to val.tools.recipe_polis(), val.tools.pacmap(), and val.tools.localmap() for filtering statements before dimensionality reduction.
  • val.write() — export AnnData to h5ad with automatic sanitization for webapp compatibility (#57).
    • include parameter for selective export using glob-style "namespace/key" paths (e.g. "obsm/X_*").
  • make lint and make fmt targets for ruff.
  • Claude Code skill for guided Polis conversation exploration (#42).
    • Interactive prompts for projection selection (PaCMAP, LocalMAP, UMAP, t-SNE) and QC annotation selection.
    • Fixed CLI plotting to support multi-color val.viz.embedding() calls.
  • Cache downloaded Polis report files locally for 24 hours using platformdirs (#70).
    • skip_cache parameter on val.datasets.polis.load() to bypass the cache.
    • Smart cache revalidation using last_vote_timestamp from the Polis math endpoint — stale cache is reused without re-fetching when no new votes have been cast (#78).
  • mask_obs parameter on val.tools.kmeans() for clustering a subset of participants (#77).
  • val.datasets.polis.export_csv() — export an AnnData object to Polis CSV format (votes.csv + comments.csv).
  • include_huggingface_metadata parameter on val.datasets.polis.export_csv() — opt-in generation of a HuggingFace dataset card (README.md with YAML frontmatter) alongside the CSV export.
  • show_progress parameter on val.datasets.polis.load() — displays a tqdm progress bar when fetching votes per-participant from the API; auto-detects notebooks vs terminal (#79).
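
As a rough illustration of the caching behaviour described above (#70, #78), the decision of whether to re-download report files might look like the sketch below. This is not the library's actual code: `should_refetch`, its arguments, and the integer timestamps are hypothetical names chosen for this example. Only the 24-hour lifetime, the `skip_cache` flag, and the comparison against `last_vote_timestamp` come from the notes above.

```python
from datetime import datetime, timedelta, timezone

# Cache lifetime stated in the release notes.
CACHE_TTL = timedelta(hours=24)


def should_refetch(cached_at, cached_last_vote_ts, remote_last_vote_ts,
                   now=None, skip_cache=False):
    """Return True when the report files should be downloaded again.

    Hypothetical helper: argument names and types are assumptions.
    """
    if skip_cache:
        # The caller explicitly asked to bypass the cache.
        return True
    if now is None:
        now = datetime.now(timezone.utc)
    if now - cached_at < CACHE_TTL:
        # Cache is still fresh; reuse it without contacting the server.
        return False
    # Cache has expired: re-download only if votes were cast since caching.
    return remote_last_vote_ts > cached_last_vote_ts


now = datetime(2025, 2, 17, tzinfo=timezone.utc)
fresh = should_refetch(now - timedelta(hours=1), 100, 200, now=now)
stale_unchanged = should_refetch(now - timedelta(hours=25), 100, 100, now=now)
stale_changed = should_refetch(now - timedelta(hours=25), 100, 200, now=now)
forced = should_refetch(now - timedelta(hours=1), 100, 100, now=now, skip_cache=True)
print(fresh, stale_unchanged, stale_changed, forced)
```

In this sketch an expired cache costs one lightweight timestamp lookup rather than a full re-download, which is the saving the revalidation step is meant to provide.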

Fixes

  • Fixed uns["statements"] having comment-id as both index and column, which prevented h5ad serialization (#57).
  • Fixed API vote sign inversion — the Polis API returns inverted vote signs vs the CSV export convention; votes are now negated on ingest so +1 = agree and -1 = disagree everywhere.
  • Replaced deprecated use_highly_variable=False with mask_var=None in recipe_polis PCA call to eliminate FutureWarning from scanpy (#82).
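
The vote sign fix amounts to a negation at ingest time. A minimal sketch, assuming API votes arrive as integers with `None` marking a missing vote; `normalize_api_votes` is a hypothetical helper for illustration, not a function from the package.

```python
def normalize_api_votes(api_votes):
    """Flip API vote signs to the CSV convention (+1 = agree, -1 = disagree).

    Hypothetical helper: missing votes (None) are passed through unchanged,
    and a pass/neutral vote (0) is unaffected by negation.
    """
    return [None if vote is None else -vote for vote in api_votes]


# Per the note above, the API reports signs inverted relative to the CSV export.
api_votes = [-1, 1, 0, None]
normalized = normalize_api_votes(api_votes)
print(normalized)  # [1, -1, 0, None]
```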