# Architecture Overview

This document describes how the Agency system works, why it is structured the way it is,
and how to reason about it. It is updated with each meaningful change to the codebase.

---

## What this system does

Agency is a participation signal system for open source communities.

It was built with OSArch as the first community, but is designed from the start to be
adopted, forked, and adapted by any community that wants the same thing:
a visible, legible signal of who is contributing and where.

It observes where community members are active (forum, code, wiki, chat, funding) and
produces a ranked participation signal. Each community controls its own weights, data
sources, and interpretation of the output.

It is not a governance engine. It does not make decisions. It makes existing human activity
legible so that people can act with more confidence, knowing their contributions are seen.

The core idea: **legitimacy comes from participation, not structure.**

> OSArch is the reference implementation. If you are from another community reading this,
> everything here is designed to be forked. Start with `config.yaml`.

---

## Data flow

```
data/*.json         →   src/scoring/score.py   →   src/outputs/table.py
(participation data)    (weighted formula)         (ranked table)
                                 ↑
                            config.yaml
                        (adjustable weights)
```

1. A JSON file holds raw participation counts per user per platform
2. `config.yaml` defines how much each platform counts toward the score
3. `score.py` applies the weights and returns a numeric signal per user (see the sketch below)
4. `aggregate.py` collects all users and sorts them by score
5. `table.py` renders the result as a human-readable ranked list
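
A minimal sketch of steps 3 and 4, assuming PyYAML is available and that `data/sample.json`
maps usernames to per-platform counts while `config.yaml` holds a top-level `weights:`
mapping; both file shapes are assumptions here, not confirmed formats:

```python
import json

import yaml  # PyYAML, assumed to be listed in requirements.txt


def score(counts: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted sum of one user's per-platform activity counts."""
    return sum(counts.get(platform, 0) * weight
               for platform, weight in weights.items())


def ranked(data_path: str = "data/sample.json",
           config_path: str = "config.yaml") -> list[tuple[str, float]]:
    """Score every user and return (username, score) pairs, highest first."""
    with open(data_path) as f:
        users = json.load(f)  # assumed shape: {"alice": {"forum_posts": 12, ...}, ...}
    with open(config_path) as f:
        weights = yaml.safe_load(f)["weights"]
    return sorted(((name, score(counts, weights)) for name, counts in users.items()),
                  key=lambda pair: pair[1], reverse=True)
```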

---

## Key design principles

### Correctability over precision

Weights are in `config.yaml`, not hardcoded. Anyone can fork the repo, change the weights,
and run a different interpretation. Disagreement is a feature, not a bug.

### Fake data first, real integrations later

The system starts with hand-crafted sample data in `data/sample.json`. Real API integrations
come one at a time, after the model is stable. This prevents over-engineering before the
signal itself is validated.
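
A hypothetical illustration of what the hand-crafted data in `data/sample.json` might look
like; the field names and overall shape are assumptions, not the actual file:

```json
{
  "alice": {"forum_posts": 12, "code_commits": 4, "wiki_edits": 7},
  "bjorn": {"forum_posts": 3, "code_commits": 19, "funding_activity": 1},
  "chen": {"wiki_edits": 22, "chat_messages": 140}
}
```
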
### Git as the distributed database

All participation data lives in version-controlled JSON files. The entire history of the
system — data, weights, and scoring logic — is visible in git. Forking the repo forks the
system. This is also how another community adopts it: fork, replace the data, adjust the
weights, run it for their context.

### Public data only

Collectors may only use publicly accessible data — no API keys, authentication, or
platform permission required. If the data isn't public, it isn't in scope. This keeps
the system independent, immediately forkable, and auditable: anyone can verify what
is being collected by visiting the same public URLs the collectors use.

See [ADR 003](decisions/003-public-data-only.md) for the full reasoning.
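
A hypothetical shape for a future collector under this constraint. The URL and response
fields are invented placeholders; the point is the absence of any credential:

```python
import json
import urllib.request


def collect_forum_counts(url: str = "https://forum.example.org/directory.json") -> dict[str, int]:
    """Fetch activity counts from a public endpoint: a plain GET, no auth header.

    Anyone can audit the collector by visiting the same URL in a browser.
    """
    with urllib.request.urlopen(url) as resp:
        payload = json.load(resp)
    return {u["username"]: u["post_count"] for u in payload["users"]}
```
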
### No black boxes

Every scoring decision is visible in plain text. A newcomer (human or AI) should be able
to read `config.yaml` and `score.py` and understand exactly how a score is produced.

### OSArch is the reference implementation, not the intended audience

All documentation — ADRs, architecture notes, templates — is written for any community
adopting this system, not specifically for OSArch. OSArch examples are framed explicitly
as examples. A contributor from a community that has never heard of OSArch should be
able to read any document in this repo and act on it. See [docs/STYLE.md](STYLE.md) for
the full documentation conventions.

---

## Directory structure

```
agency/
  main.py               entry point, CLI
  config.yaml           scoring weights (community-adjustable)
  requirements.txt      Python dependencies
  data/
    sample.json         mock participation data (starting point)
  src/
    collectors/         future: one file per data source (forum, github, wiki, etc.)
    scoring/
      score.py          weighted scoring formula for a single user
      aggregate.py      applies score() across all users, returns ranked dict
    outputs/
      table.py          renders ranked scores as a CLI table
  docs/
    ARCHITECTURE.md     this file
    decisions/          one file per significant decision (ADR format)
```

---

## Future directions (not yet built)

### Distributed database / tamper-evident records

The long-term goal is a data layer where no single party can retroactively alter participation
history. Blockchain is one path to this. The challenge with public chains (Ethereum, etc.) is
gas costs per write and the wallet/token barrier, which would exclude most OSArch contributors.

More accessible alternatives to evaluate when the time comes:

- **IPFS + content-addressed JSON** — immutable, distributed, no fees, no wallets required
- **Hypercore / Dat protocol** — append-only logs with cryptographic integrity, peer-to-peer
- **Signed append-only log** — GPG-signed JSON commits; tamper-evident without any chain
- **Private/consortium blockchain** — full blockchain properties without public gas costs,
  but reintroduces a trust question about who runs the nodes

The git-tracked JSON approach used today already provides a weak form of this: history is
visible and forks are public. The upgrade path is additive, not a rewrite.

See [ADR 001](decisions/001-python-and-json.md) for why git-tracked JSON was chosen to start.
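
To make the content-addressing idea concrete, a minimal sketch of hashing a participation
snapshot so any retroactive edit is detectable; the record fields and the chaining scheme
are assumptions for illustration, not a committed design:

```python
import hashlib
import json


def content_address(record: dict) -> str:
    """Hash a record in canonical form (sorted keys, fixed separators).

    The same record always yields the same digest, so editing history
    changes the address and is immediately visible.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Each snapshot can embed the address of the previous one, forming a
# simple tamper-evident chain with no blockchain involved:
previous = {"period": "2026-04", "users": {}}
snapshot = {"period": "2026-05",
            "users": {"alice": {"forum_posts": 12}},
            "prev": content_address(previous)}
print(content_address(snapshot))
```
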
### Funding signals: merged or separate?

Currently all funding activity is collapsed into a single low-weighted signal (`funding_activity: 0.1`).
An open question for when real funding collectors are built:

- Should all funding platforms (Open Collective, GitHub Sponsors, Patreon, etc.) roll into
  one combined `funding_activity` score — money is money regardless of platform?
- Or should they be distinct signals, because the *transparency* of the funding act matters
  as much as the act itself? Open Collective is fully public; other platforms are not.
  A public, traceable funding contribution may be meaningfully different from a private one.
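
The two shapes in `config.yaml` terms, purely as a hypothetical illustration; only
`funding_activity: 0.1` is real today, and the split keys and values are invented:

```yaml
# Option A: merged (money is money)
weights:
  funding_activity: 0.1

# Option B: split (the platform's transparency matters)
weights:
  funding_open_collective: 0.15   # fully public ledger
  funding_github_sponsors: 0.05   # partially visible
  funding_patreon: 0.05           # private
```
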
This is unresolved. The decision should be made when the first funding collector is built,
not before.

### Should funding have representation beyond a score weight?

Funding currently contributes to agency scores at a low weight (`0.1`). An open question
is whether funders — particularly sustained, transparent funders — should have a distinct
form of representation beyond that score contribution.

The case for: sustained funding is a form of commitment. Excluding it from representation
entirely may signal that financial support is unwelcome, which could affect the project's
long-term sustainability.

The case against: the moment funding buys representation, a wealthy actor who never
participates could outweigh someone who has contributed for years. That is the exact
capture problem this system is designed to avoid.

A possible middle path: weight transparent, sustained funding more generously than
one-off donations within the existing score — giving it more voice without creating a
separate governance track. But whether that is sufficient, or whether funders deserve
distinct representation of some kind, is an open question.

This should be a community decision before any funding collector is built.

### Community weight governance (resolved)

Platform weights are determined by community vote, not by maintainers or direct
`config.yaml` edits. The mechanism — a founding vote using averaged proposals, an annual
community vote across all platforms simultaneously using the median of submitted
distributions, and a structural change tier — is defined in
[ADR 008](decisions/008-platform-weight-governance.md). The goal is to keep
meta-governance from being captured by whoever scores highest.
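
A sketch of the annual vote's "median of submitted distributions" step. Taking the
per-platform median and renormalizing so weights sum to 1 is an assumption about how
ADR 008 combines ballots, not a quotation of it:

```python
from statistics import median


def combine_ballots(ballots: list[dict[str, float]]) -> dict[str, float]:
    """Per-platform median across submitted weight distributions, renormalized.

    Assumes every ballot proposes a weight for the same set of platforms.
    """
    platforms = ballots[0].keys()
    medians = {p: median(b[p] for b in ballots) for p in platforms}
    total = sum(medians.values())
    return {p: w / total for p, w in medians.items()}


ballots = [
    {"forum": 0.5, "code": 0.4, "funding": 0.1},
    {"forum": 0.3, "code": 0.6, "funding": 0.1},
    {"forum": 0.4, "code": 0.4, "funding": 0.2},
]
print(combine_ballots(ballots))  # medians (0.4, 0.4, 0.1) scaled to sum to 1
```
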
### How should the community decide which sites to collect from?

The list of platforms the system collects from is not a technical decision — it defines
what "participation" means. Adding a platform amplifies activity there; removing one
diminishes it. This is a form of power that should be community-governed.

This is an open question. One possible approach:

*Criteria-first, then process.* Before debating any specific platform, the community
agrees on a set of inclusion criteria. A candidate site would need to meet all of them:

- Data is publicly accessible (required — see ADR 003)
- Platform is actively used by a meaningful portion of the community
- Activity on the platform represents genuine effort, not easily gamed
- Platform is stable enough to depend on
- Platform is relevant to the community's actual work, not peripheral activity

With clear criteria, the process becomes more mechanical: open a proposal using
[PROPOSAL_TEMPLATE.md](sites/PROPOSAL_TEMPLATE.md) (a file in `docs/sites/proposed/`),
hold a defined discussion period, then vote. Voting would require a minimum agency score
threshold — proving you are an active participant before having a say in what participation
means. Above that threshold, every vote counts equally regardless of rank (sketched below).

Approved sites move to `docs/sites/active/`. Modifications to existing sites
(scope changes, weight adjustments) use [CHANGE_TEMPLATE.md](sites/CHANGE_TEMPLATE.md).
Removed sites move to `docs/sites/retired/`, also via
[CHANGE_TEMPLATE.md](sites/CHANGE_TEMPLATE.md), with a reason recorded.
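
To make the threshold rule concrete, a hypothetical sketch; the threshold value and the
function name are invented, and the real number would be community-set:

```python
MIN_AGENCY_SCORE = 5.0  # hypothetical threshold


def eligible_voters(ranked_scores: dict[str, float],
                    threshold: float = MIN_AGENCY_SCORE) -> set[str]:
    """Everyone at or above the threshold gets exactly one, equal vote.

    Scoring higher than the threshold confers no extra voting power.
    """
    return {user for user, s in ranked_scores.items() if s >= threshold}
```
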
Retired sites stay in the record permanently. If a platform was removed because it was
being gamed or because the community migrated away, that history matters for future decisions.

This is one possible path forward. The right process should be decided by the community
before the first real collector is built and the list becomes consequential.

---

## Current limitations (known, intentional)

- Scores are not normalized — raw weighted sums, not percentages
- No deduplication across platforms (same person with different usernames counts separately)
- Data is hand-entered, not yet pulled from live APIs
- No time windowing — all activity is treated as equally recent

These are not oversights. They are deliberate starting points. See [docs/decisions/](decisions/)
for the reasoning behind each.