Protocol5 Local Memory Export For WordPress
Metadata
| Field | Value |
|---|---|
| Source site | ɩ.com / JustAnIota.com |
| Source URL | https://justaniota.com/ |
| Canonical AIWikis URL | https://aiwikis.org/justaniota/uai-system/files/raw-system-archives-justaniota-intake-processing-2026-05-04-protocol5-wo-c45cab80/ |
| Source reference | raw/system-archives/justaniota/intake-processing/2026-05-04-protocol5-wordpress-memory-export/agent-file-handoff/Improvement/Protocol5 Local Memory Export for WordPress.md |
| File type | md |
| Content category | memory-file |
| Last fetched | 2026-05-15T00:23:56.0837262Z |
| Last changed | 2026-05-04T23:19:24.4358637Z |
| Content hash | sha256:c45cab8047c2a29ff5a2597814997f401b20d6cc8cc920e4967de94c5cd70c67 |
| Import status | unchanged |
| Raw source layer | data/sources/justaniota/raw-system-archives-justaniota-intake-processing-2026-05-04-protocol5-wordpress-memory-export-ag-c45cab8047c2.md |
| Normalized source layer | data/normalized/justaniota/raw-system-archives-justaniota-intake-processing-2026-05-04-protocol5-wordpress-memory-export-ag-c45cab8047c2.txt |
Current File Content
Structure Preview
- Protocol5 Local Memory Export for WordPress
- Scope and recommended architecture
- Flat-file format decision
- Why the canonical export should be NDJSON or JSON Lines
- Why SQLite is a good secondary artifact, not the primary contract
- Why CSV plus sidecar metadata should not be the main format
- Why compressed JSON arrays are acceptable but still worse than NDJSON
- Flat-file export contract
- Contract principles
- Recommended record shape
- Field-by-field recommendations
- Recommended vector encoding
- Recommended manifest shape
- Optional sidecar for precomputed relatedness
- WordPress storage and query strategy
- Files under uploads or a protected plugin data directory
- Custom WordPress database tables
- Options and transients
- Static JSON endpoints
- Recommended hybrid strategy
- Vector search realities and REST surface
- The most important practical reality
- Can WordPress and PHP do vector similarity locally
- Best fallback strategies
Raw Version
This public page shows a bounded preview of a large source file. The complete source remains in the raw and normalized source layers named in metadata, with the SHA-256 hash above for verification.
- Source characters: 45000
- Preview characters: 11999
# Protocol5 Local Memory Export for WordPress
## Scope and recommended architecture
I do **not** have access to the Protocol5 SQL schema, the actual embedding tables, or the field definitions behind `Category.IotaEmbeddingRecords`, `Categories`, `Words`, and `ISO10646`. This report therefore treats the field list in your prompt as the contract boundary and recommends an exporter that flattens the real joins into a single, explicit export rowset. That constraint is important: the safest design is one that depends on a **stable export contract**, not on WordPress guessing undocumented SQL Server structure.
The public Protocol5 site positions Protocol5 as the .NET implementation and distribution hub, while UAIX remains the normative standards authority. UAIX’s public changelog also distinguishes reviewable conformance evidence from certification or endorsement. So the WordPress package described here should be presented as an **implementation-level memory/export mechanism**, not as official UAI conformance, certification, registry status, or hosted validation.
The best first-version architecture is a **hybrid package-and-import model**:
1. A local C# exporter reads the Protocol5 SQL Server database and emits a **portable package** consisting of a manifest plus gzipped NDJSON/JSONL records.
2. A WordPress plugin accepts a **manual admin upload** of that package.
3. The plugin validates schema version, checksums, record counts, vector dimensions, and public-safety rules.
4. The plugin imports **searchable public metadata** into one or more **custom WordPress database tables**.
5. The plugin keeps the original package file as the portable source artifact, but serves search queries from the indexed WordPress tables so it never has to scan the full flat file on every request.
6. Public REST endpoints return only a **whitelisted safe projection** of each record, while vectors and raw package files remain private by default.
That architecture is stronger than “flat-file only” for WordPress because it preserves portability **without** forcing every search request to parse large files, and it avoids live SQL Server access from the public site. It also fits ordinary WordPress hosting because it relies on standard WordPress plugin APIs, the site’s existing MySQL/MariaDB database, and ordinary file uploads.
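The validation in step 3 can be sketched as follows. This is a minimal illustration in Python rather than the plugin's PHP, and it rests on assumptions that are not confirmed by the source: the manifest checksum is taken over the compressed shard bytes, each manifest file entry carries `path`, `record_count`, and `sha256`, and the helper names `validate_shard` and `make_shard` are hypothetical.

```python
import gzip
import hashlib
import io
import json


def validate_shard(shard_bytes: bytes, entry: dict, expected_dims: int) -> list:
    """Check one gzipped NDJSON shard against its manifest entry.

    Returns a list of human-readable problems; an empty list means the shard passed.
    """
    # Assumption: the manifest checksum covers the compressed bytes as shipped.
    if hashlib.sha256(shard_bytes).hexdigest() != entry["sha256"]:
        return [f"{entry['path']}: checksum mismatch"]  # do not parse untrusted content

    problems = []
    count = 0
    with gzip.open(io.BytesIO(shard_bytes), "rt", encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, 1):
            if not line.strip():
                continue
            record = json.loads(line)
            count += 1
            if record.get("embedding_dimensions") != expected_dims:
                problems.append(f"line {line_no}: unexpected vector dimensions")
            if record.get("visibility") != "public":
                problems.append(f"line {line_no}: record is not marked public")
    if count != entry["record_count"]:
        problems.append(
            f"{entry['path']}: expected {entry['record_count']} records, found {count}"
        )
    return problems


def make_shard(records: list) -> bytes:
    """Build an in-memory shard for the demo below (illustrative only)."""
    text = "".join(json.dumps(r) + "\n" for r in records)
    return gzip.compress(text.encode("utf-8"))


shard = make_shard([
    {"id": "a", "embedding_dimensions": 768, "visibility": "public"},
    {"id": "b", "embedding_dimensions": 768, "visibility": "public"},
])
entry = {
    "path": "records-0001.ndjson.gz",
    "record_count": 2,
    "sha256": hashlib.sha256(shard).hexdigest(),
}
print(validate_shard(shard, entry, expected_dims=768))  # prints []
```

Returning a list of problems instead of raising on the first one lets an admin screen show everything wrong with an upload at once; a stricter importer could just as reasonably stop at the first failure.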
## Flat-file format decision
### Why the canonical export should be NDJSON or JSON Lines
For the **canonical transport format**, I recommend **gzipped NDJSON/JSONL**. JSON Lines defines UTF-8 text, one valid JSON value per line, and disallows a BOM. The NDJSON spec describes essentially the same line-delimited JSON model and recommends newline-separated UTF-8 JSON texts. That makes the format easy to stream from C#, easy to validate one record at a time, easy to diff during development, and easy to import incrementally in PHP. PHP’s zlib support and gzip functions make `.gz` packaging straightforward on the WordPress side as well.
For this use case, NDJSON/JSONL is better than a single giant JSON array because standard JSON is one serialized value, which encourages whole-document parsing, while line-delimited JSON is naturally record-oriented. When a package grows, WordPress can process a `.ndjson.gz` file line by line and fail fast on a bad row instead of loading the entire corpus into memory first.
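The line-by-line, fail-fast behavior described above might look like the following. It is a sketch in Python for illustration only (the importer itself would be PHP), and `iter_records` is a hypothetical helper name.

```python
import gzip
import io
import json
from typing import BinaryIO, Iterator


def iter_records(stream: BinaryIO) -> Iterator[dict]:
    """Yield one record per line from a gzipped NDJSON stream.

    Raises ValueError naming the offending line as soon as a bad row is seen,
    so nothing after it is parsed and the whole file is never held in memory.
    """
    with gzip.open(stream, "rt", encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, 1):
            if not line.strip():
                continue  # tolerate a trailing blank line
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"line {line_no}: invalid JSON ({exc.msg})") from exc
            if not isinstance(record, dict):
                raise ValueError(f"line {line_no}: expected a JSON object")
            yield record


# Demo: the third line is malformed, so the fourth is never reached.
payload = gzip.compress(b'{"id": "a"}\n{"id": "b"}\n{broken\n{"id": "d"}\n')
imported = []
try:
    for record in iter_records(io.BytesIO(payload)):
        imported.append(record["id"])
except ValueError as err:
    print(f"import stopped: {err}")
print(imported)  # prints ['a', 'b']
```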
My recommendation is to standardize the package around:
- `manifest.json`
- one or more `records-*.ndjson.gz` shards
- optional `neighbors-*.ndjson.gz`
- optional `checksums.json`
That gives you a durable interchange surface while leaving room for future shards, derived indexes, or rollbacks.
### Why SQLite is a good secondary artifact, not the primary contract
A **SQLite file** is the best **secondary artifact**, not the primary interchange contract. On the plus side, PHP supports SQLite through both `PDO_SQLITE` and the `SQLite3` class, and the PDO SQLite driver is enabled by default in PHP builds according to the manual. A plain SQLite file is therefore a credible option for a more capable hosting environment or for a private/staged deployment that wants one self-contained queryable file.
The problem is portability at the edge of the feature set. As soon as you rely on FTS or vector extensions, you move from “ordinary file” to “server capabilities question.” SQLite supports run-time loadable extensions, and PHP’s `SQLite3::loadExtension()` expects an extension library located in the configured extension directory. That is much less universal than plain WordPress + MySQL/MariaDB + PHP file I/O. SQLite FTS5 itself can be compiled into SQLite or built as a loadable extension, which again means availability depends on how the environment was built.
So my practical recommendation is:
- **Canonical package**: gzipped NDJSON/JSONL.
- **Optional derived artifact**: SQLite for controlled hosting, staging, or future acceleration.
- **Do not make WordPress depend on SQLite vector extensions for version one.**
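Because FTS5 availability depends on how SQLite was built, any optional SQLite path would need to probe for it rather than assume it. The sketch below shows the idea in Python for illustration; a PHP plugin would run an analogous probe through its own SQLite driver. The function name `fts5_available` is hypothetical.

```python
import sqlite3


def fts5_available() -> bool:
    """Return True if this SQLite build can create an FTS5 virtual table."""
    conn = sqlite3.connect(":memory:")
    try:
        # Creating a throwaway virtual table fails with OperationalError
        # ("no such module: fts5") when FTS5 is not compiled in or loaded.
        conn.execute("CREATE VIRTUAL TABLE probe USING fts5(body)")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()


print("FTS5 available:", fts5_available())
```

The result varies by environment, which is exactly the portability concern: the canonical package should import correctly whether this returns true or false.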
### Why CSV plus sidecar metadata should not be the main format
CSV is fine for diagnostics, not for the canonical package. RFC 4180 documents CSV as `text/csv`, but CSV has no native structure for nested metadata, typed objects, or dense vector payloads. You can force vectors into quoted strings or external sidecar files, but at that point you are rebuilding a weaker JSON package with more parsing edge cases. JSON is explicitly designed for structured objects and arrays, which is exactly what this dataset needs.
I would therefore use CSV only for very small debug exports such as “show me the first hundred public-safe records,” not for the real transport contract.
### Why compressed JSON arrays are acceptable but still worse than NDJSON
A gzip-compressed single JSON file is workable, and PHP can create and consume gzip-compatible data. But a monolithic JSON array is still less operationally friendly than NDJSON because appending, sharding, partial reprocessing, corruption recovery, and per-record validation are all worse. NDJSON keeps the same JSON semantics where they matter while making operational handling much simpler.
## Flat-file export contract
### Contract principles
Because the source schema is not available here, the export contract should be defined as a **flattened, implementation-neutral row contract**. In other words, the SQL Server side can join `Category.IotaEmbeddingRecords` with `Categories`, `Words`, `ISO10646`, or any other local tables however it needs to, but the exported package should hand WordPress a stable, explicit shape that does **not** require any SQL knowledge to consume.
The contract should also be **public-safe by design**. WordPress plugin privacy guidance emphasizes collection limitation, data minimization, retention limits, and carefully controlling what appears in the front end and in REST endpoints. So the exporter should use an **allowlist** for `public_meta`, not a blocklist.
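An allowlist projection is simple to express: only named keys are copied, so a new column added to the source tables stays private until someone approves it. The sketch below is illustrative Python; the key names mirror the `public_meta` example in this report, while the `internal_notes` and `owner_email` source fields and the function name `project_public_meta` are hypothetical.

```python
# Keys explicitly approved for public display (assumed from the sample record).
ALLOWED_PUBLIC_META = ("kind", "title", "slug", "locale", "tags")


def project_public_meta(source_row: dict) -> dict:
    """Copy only allowlisted keys; everything else is dropped by default."""
    return {key: source_row[key] for key in ALLOWED_PUBLIC_META if key in source_row}


row = {
    "kind": "category",
    "title": "AI Memory",
    "slug": "ai-memory",
    "internal_notes": "do not publish",    # hypothetical private field
    "owner_email": "someone@example.com",  # hypothetical private field
}
print(project_public_meta(row))
# prints {'kind': 'category', 'title': 'AI Memory', 'slug': 'ai-memory'}
```

A blocklist would have leaked `owner_email` unless someone remembered to list it; the allowlist fails closed instead.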
### Recommended record shape
I recommend a top-level flattened record with a nested `vector` object and a nested `public_meta` object:
```json
{
  "id": "Categories:42:sha256-5e86c1...",
  "source_table": "Categories",
  "source_key": "42",
  "descriptor_text": "Artificial intelligence memory category",
  "embedding_version": "protocol5-local-v1",
  "embedding_model": "nomic-embed-text-v1.5",
  "embedding_dimensions": 768,
  "text_hash": "sha256:5e86c16f0c0d7c6e3df8...",
  "updated_utc": "2026-05-04T14:25:19.381Z",
  "visibility": "public",
  "public_meta": {
    "kind": "category",
    "title": "AI Memory",
    "slug": "ai-memory",
    "locale": "en-US",
    "tags": ["memory", "search", "protocol5"]
  },
  "vector": {
    "encoding": "f32-base64le",
    "data": "AAAAQJqZmT8AAABAKC4..."
  }
}
```
That shape is intentionally boring. Boring is good. It is easy to serialize from C#, easy to validate in PHP, easy to mirror into MySQL/MariaDB columns, and easy to redact if you later decide to suppress vectors or certain metadata fields.
### Field-by-field recommendations
`id` should be a **stable deterministic identifier**, not an import-time surrogate key. I recommend deriving it from `source_table`, `source_key`, `embedding_version`, and `text_hash`. That makes re-import idempotent and lets WordPress replace or upsert records safely.
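One way to derive such an identifier is sketched below in Python. The exact layout (`table:key:sha256-prefix`), the 16-character digest prefix, and the separator are assumptions made for illustration; the report's sample `id` suggests a similar layout but does not define the derivation. `make_record_id` is a hypothetical name.

```python
import hashlib


def make_record_id(source_table: str, source_key: str,
                   embedding_version: str, text_hash: str) -> str:
    """Derive a stable id from the four fields named in the recommendation."""
    # The unit separator keeps ("ab", "c") and ("a", "bc") from colliding.
    material = "\x1f".join([source_table, source_key, embedding_version, text_hash])
    digest = hashlib.sha256(material.encode("utf-8")).hexdigest()
    # Readable prefix for debugging plus a truncated digest (length is an assumption).
    return f"{source_table}:{source_key}:sha256-{digest[:16]}"


first = make_record_id("Categories", "42", "protocol5-local-v1", "sha256:abc")
again = make_record_id("Categories", "42", "protocol5-local-v1", "sha256:abc")
bumped = make_record_id("Categories", "42", "protocol5-local-v2", "sha256:abc")
print(first == again, first == bumped)  # prints True False
```

Because the same inputs always produce the same id, a re-import can upsert on `id` without creating duplicates, and a pipeline version bump naturally produces a new id.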
`source_table` should stay as a string such as `Categories`, `Words`, or `ISO10646`. This matters for filtering, routing, and debugging.
`source_key` should be a **string**, not necessarily an integer, because the real source keys may be composite, symbolic, or formatted values.
`descriptor_text` should be the exact normalized text that was embedded. If the source row contains HTML or a rich object, the exporter should resolve that into one canonical descriptor string before hashing or packaging.
`embedding_version` should identify the **embedding pipeline version** and not just the model. That lets you change chunking, normalization, or serialization later without pretending nothing changed.
`embedding_model` should preserve the concrete LM Studio model identity that generated the vector.
`embedding_dimensions` should be stored on every record even if the manifest also provides a default. WordPress should reject imports where dimensions do not match the package declaration.
`text_hash` should be a hash of the normalized `descriptor_text`. That gives you a portable integrity signal and an inexpensive way to detect unchanged rows on refresh.
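A sketch of the hash computation follows, in Python for illustration. The normalization rules shown (Unicode NFC plus whitespace collapsing) are assumptions, since the report does not specify them; what matters is that the exporter applies one fixed rule set before both embedding and hashing. The function names are hypothetical, and the `sha256:` prefix follows the sample record.

```python
import hashlib
import re
import unicodedata


def normalize_descriptor(text: str) -> str:
    """Assumed normalization: NFC, collapse whitespace runs, trim the ends."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def text_hash(descriptor_text: str) -> str:
    """Hash the normalized descriptor and label the algorithm in the value."""
    normalized = normalize_descriptor(descriptor_text)
    return "sha256:" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Cosmetic whitespace differences do not change the hash...
print(text_hash("  AI   Memory\n") == text_hash("AI Memory"))  # prints True
# ...but a real text change does.
print(text_hash("AI Memory") == text_hash("AI Memories"))      # prints False
```

On refresh, an importer can compare the incoming `text_hash` with the stored one and skip rows whose text has not changed.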
`updated_utc` should be serialized in UTC using an ISO 8601 / RFC 3339 style timestamp.
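For illustration, a timestamp in the same style as the sample record can be produced like this in Python. Millisecond precision and the trailing `Z` are taken from the sample; rejecting naive datetimes is a design assumption, and `to_utc_timestamp` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone


def to_utc_timestamp(moment: datetime) -> str:
    """Format an aware datetime as an RFC 3339 style UTC string with milliseconds."""
    if moment.tzinfo is None:
        # A naive value has no defined offset, so refuse to guess.
        raise ValueError("timestamp must carry a time zone")
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


local = datetime(2026, 5, 4, 16, 25, 19, 381000, tzinfo=timezone(timedelta(hours=2)))
print(to_utc_timestamp(local))  # prints 2026-05-04T14:25:19.381Z
```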
`vector` should be an object that declares its own encoding, not a raw anonymous field.
`public_meta` should contain only fields explicitly approved for WordPress display or safe API use. The exporter should never dump entire source rows into that object.
### Recommended vector encoding
For version one, I recommend `f32-base64le` as the canonical vector encoding inside the package. It is compact enough, deterministic, and still portable across languages. For example, if a vector has 768 dimensions, raw float32 storage is `768 × 4 = 3,072` bytes; base64 packaging raises that to roughly `4,096` bytes before JSON framing, which is still far smaller than a verbose JSON float array. A JSON float array is still useful as an optional **debug export**, but I would not make it the main transport form.
If you want the most WordPress-friendly first cut, you can support two encodings:
- `f32-base64le` as the canonical export
- `json-f32` as an optional debug mode for test fixtures
That gives you compactness in production and readability in tests.
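The `f32-base64le` encoding and the size arithmetic can be checked with a short sketch. This reads the encoding name as "little-endian IEEE 754 float32 values, concatenated, then standard base64", which is an interpretation of the name rather than a definition from the source; the function names are hypothetical and the code is Python for illustration.

```python
import base64
import struct


def encode_f32_base64le(values: list) -> str:
    """Pack floats as little-endian float32 and base64-encode the bytes."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return base64.b64encode(raw).decode("ascii")


def decode_f32_base64le(data: str, dimensions: int) -> list:
    """Decode and verify that the payload holds exactly `dimensions` floats."""
    raw = base64.b64decode(data, validate=True)
    if len(raw) != dimensions * 4:
        raise ValueError(f"expected {dimensions * 4} bytes, got {len(raw)}")
    return list(struct.unpack(f"<{dimensions}f", raw))


encoded = encode_f32_base64le([0.0] * 768)
print(len(base64.b64decode(encoded)), len(encoded))  # prints 3072 4096

# Values exactly representable in float32 survive the round trip unchanged.
print(decode_f32_base64le(encode_f32_base64le([2.0, 0.5, -1.25]), 3))
# prints [2.0, 0.5, -1.25]
```

Checking the byte length against `embedding_dimensions` at decode time gives the importer a cheap way to reject a record whose vector does not match its declared size.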
### Recommended manifest shape
```json
{
  "schema_version": "p5-memory-package-v1",
  "package_id": "p5mem-2026-05-04T142519Z",
  "source_system": "Protocol5 local SQL Server",
  "exported_utc": "2026-05-04T14:25:19.381Z",
  "authority_note": "UAIX is the normative standards authority; this package is an implementation export.",
  "public_contract_only": true,
  "defaults": {
    "embedding_version": "protocol5-local-v1",
    "embedding_model": "nomic-embed-text-v1.5",
    "embedding_dimensions": 768,
    "vector_encoding": "f32-base64le"
  },
  "files": [
    {
      "path": "records-0001.ndjson.gz",
      "kind": "records",
      "record_count": 18234,
      "sha256": "0c6f9d..."
    },
    {
      "path": "neighbors-0001.ndjson.gz",
      "kind": "neighbors",
      "record_count": 18234,
      "sha256": "a1b1e4..."
```
Why This File Exists
This is a memory-system evidence file from ɩ.com / JustAnIota.com. It is shown here because AIWikis.org is demonstrating the real source files that make the UAIX / LLM Wiki memory system work, not only summarizing those systems after the fact.
Role
This file is memory-system evidence. It records source history, archive transfer, intake disposition, or another piece of provenance that should be retrievable without becoming an unsupported public claim.
Structure
The file is structured around these visible headings: Protocol5 Local Memory Export for WordPress; Scope and recommended architecture; Flat-file format decision; Why the canonical export should be NDJSON or JSON Lines; Why SQLite is a good secondary artifact, not the primary contract; Why CSV plus sidecar metadata should not be the main format; Why compressed JSON arrays are acceptable but still worse than NDJSON; Flat-file export contract. Those headings are retrieval anchors: a crawler or LLM can decide whether the file is relevant before reading every line.
Prompt-Size And Retrieval Benefit
Keeping this material in a separate file reduces prompt pressure because an agent can load this exact unit only when its role, source site, category, or hash is relevant. The surrounding index pages point to it, while this page preserves the full content for audit and exact recall.
How To Use It
- Humans should read the metadata first, then inspect the raw content when they need exact wording or provenance.
- LLMs and agents should use the source site, category, hash, headings, and related files to decide whether this file belongs in the active prompt.
- Crawlers should treat the AIWikis page as transparent evidence and follow the source URL/source reference for authority boundaries.
- Future maintainers should regenerate this page whenever the source hash changes, then review the explanation if the role or structure changed.
Update Requirements
When this source file changes, update the raw source layer, normalized source layer, hash history, this rendered page, generated explanation, source-file inventory, changed-files report, and any source-section index that links to it.
Related Pages
Provenance And History
- Current observation: 2026-05-15T00:23:56.0837262Z
- Source origin: current-source-workspace
- Retrieval method: local-source-workspace
- Duplicate group: sfg-610 (primary)
- Historical hash records are stored in data/hashes/source-file-history.jsonl.
Machine-Readable Metadata
{
  "title": "Protocol5 Local Memory Export For WordPress",
  "source_site": "ɩ.com / JustAnIota.com",
  "source_url": "https://justaniota.com/",
  "canonical_url": "https://aiwikis.org/justaniota/uai-system/files/raw-system-archives-justaniota-intake-processing-2026-05-04-protocol5-wo-c45cab80/",
  "source_reference": "raw/system-archives/justaniota/intake-processing/2026-05-04-protocol5-wordpress-memory-export/agent-file-handoff/Improvement/Protocol5 Local Memory Export for WordPress.md",
  "file_type": "md",
  "content_category": "memory-file",
  "content_hash": "sha256:c45cab8047c2a29ff5a2597814997f401b20d6cc8cc920e4967de94c5cd70c67",
  "last_fetched": "2026-05-15T00:23:56.0837262Z",
  "last_changed": "2026-05-04T23:19:24.4358637Z",
  "import_status": "unchanged",
  "duplicate_group_id": "sfg-610",
  "duplicate_role": "primary",
  "related_files": [],
  "generated_explanation": true,
  "explanation_last_generated": "2026-05-15T00:23:56.0837262Z"
}
Next Useful Routes
- Start Here A task-first reading path for AIWikis.org, separating newcomer learning, source-memory lookup, maintainer workflow, and AI-agent retrieval.
- Topic Index A tag-oriented index for LLM Wiki, AI memory, UAI, source governance, crawling, and retrieval topics.
- Source Map AIWikis source-governed page for durable AI memory, evidence routing, and agent-readable retrieval.
- JustAnIota.com / ɩ.com Source Memory AIWikis source-governed page for durable AI memory, evidence routing, and agent-readable retrieval.
- JustAnIota Source Memory Guide AIWikis source-governed page for durable AI memory, evidence routing, and agent-readable retrieval.
- ɩ.com / JustAnIota.com UAI System Files Real current JustAnIota handoff, LLM Wiki, compact-message tooling, public-content, and source-archive evidence files.