canaad-core

Internal developer documentation for canaad-core v0.3.0, the core library for canonicalizing AAD (additional authenticated data) per RFC 8785 (JCS).


Dependencies

Runtime

| Crate | Version | Role |
| --- | --- | --- |
| serde | workspace | Serialize / Deserialize derives and manual impls |
| serde_json | workspace | JSON parsing, Value type, duplicate-key visitor substrate |
| thiserror | workspace | #[derive(Error)] on AadError |
| serde_json_canonicalizer | 0.2 | RFC 8785 canonical serialization |

Dev-only

| Crate | Version | Role |
| --- | --- | --- |
| sha2 | 0.10 | SHA-256 for spec §10 known-answer vectors |
| hex | 0.4 | Hex encoding in test assertions |
| proptest | 1.4 | Property-based testing |
| serde_json | 1.0 + preserve_order | Enables IndexMap-backed Map so key emission order is observable in canonical-ordering tests |

Note: The crate does not expose a test-helpers feature gate. Test utilities are internal to src/tests/. If downstream crates need shared fixtures, a test-helpers feature exporting builder helpers and known-answer vectors would be the idiomatic path — it does not exist yet.


Module map

canaad-core/src/
├── lib.rs                  # crate root, module declarations, public re-exports
├── api.rs                  # free-standing public functions: parse, validate, canonicalize, canonicalize_string
├── canon.rs                # RFC 8785 canonicalization: canonicalize_value, canonicalize_serializable, canonicalize_context
├── error.rs                # AadError enum, JsonType helper enum
├── context/
│   ├── mod.rs              # AadContext struct, Serialize impl, TryFrom<ParsedAad>, methods
│   ├── builder.rs          # AadContextBuilder with deferred validation
│   └── tests.rs            # unit tests for AadContext and builder
├── parse/
│   ├── mod.rs              # module root, re-exports parse_aad, CURRENT_VERSION, MAX_AAD_SIZE
│   ├── aad.rs              # parse_aad(): field extraction, version/field-name/type validation
│   └── scan.rs             # single-pass serde_json visitor for duplicate-key detection
├── types/
│   ├── mod.rs              # module root, re-exports, integration tests
│   ├── safe_int.rs         # SafeInt newtype (0..2^53-1), TryFrom impls, Serialize/Deserialize
│   ├── string_types.rs     # Tenant (1-256 B), Resource (1-1024 B), Purpose (1+ B) newtypes
│   ├── field_key.rs        # FieldKey newtype ([a-z_]+), RESERVED_KEYS, extension key validation
│   └── extension.rs        # ExtensionValue enum (String | Integer), Extensions type alias
└── tests/
    ├── mod.rs              # test module root
    └── test_vectors/
        ├── mod.rs          # sub-module declarations
        ├── section_10.rs   # spec §10 known-answer vectors (§10.1–10.5)
        ├── negative.rs     # rejection / error-case inputs
        └── edge_cases.rs   # boundary and special-character tests

Constants

| Constant | Value | Defined in | Provenance |
| --- | --- | --- | --- |
| CURRENT_VERSION | 1 | parse/aad.rs | AAD spec schema version; v field must equal this |
| MAX_AAD_SIZE | 16_384 (16 KiB) | parse/aad.rs | Spec-mandated upper bound on serialized AAD size |
| MAX_SAFE_INTEGER | 9_007_199_254_740_991 (2^53 − 1) | types/safe_int.rs | JavaScript Number.MAX_SAFE_INTEGER; cross-platform compatibility bound |
| RESERVED_KEYS | ["v", "tenant", "resource", "purpose", "ts"] | types/field_key.rs | Field names that cannot be used as extension keys |

All four are re-exported from the crate root via lib.rs.
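
As an illustration of how MAX_SAFE_INTEGER bounds integer fields, here is a minimal sketch of a SafeInt-style newtype with TryFrom conversions. It is not the crate's source: the IntError enum is a hypothetical stand-in for the IntegerOutOfRange and NegativeInteger variants of AadError, and the real field types may differ.

```rust
use std::convert::TryFrom;

/// 2^53 - 1, the largest integer a JavaScript Number represents exactly.
const MAX_SAFE_INTEGER: u64 = 9_007_199_254_740_991;

/// Hypothetical error type for this sketch; the real crate reports AadError.
#[derive(Debug, PartialEq)]
enum IntError {
    OutOfRange { value: u64, max: u64 },
    Negative { value: i64 },
}

/// Illustrative stand-in for the SafeInt newtype.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SafeInt(u64);

impl TryFrom<u64> for SafeInt {
    type Error = IntError;

    fn try_from(value: u64) -> Result<Self, Self::Error> {
        if value > MAX_SAFE_INTEGER {
            Err(IntError::OutOfRange { value, max: MAX_SAFE_INTEGER })
        } else {
            Ok(SafeInt(value))
        }
    }
}

impl TryFrom<i64> for SafeInt {
    type Error = IntError;

    fn try_from(value: i64) -> Result<Self, Self::Error> {
        // Reject negatives first, then reuse the unsigned bound check.
        let unsigned = u64::try_from(value).map_err(|_| IntError::Negative { value })?;
        SafeInt::try_from(unsigned)
    }
}
```

With this shape, SafeInt::try_from(MAX_SAFE_INTEGER) succeeds, MAX_SAFE_INTEGER + 1 is rejected as out of range, and -1i64 is rejected as negative.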


AadError variants

Defined in error.rs. Every variant carries structured fields for programmatic matching.

| Variant | Trigger |
| --- | --- |
| IntegerOutOfRange { value, max } | Any integer exceeds 2^53 − 1 (applies to v, ts, extension integers) |
| NegativeInteger { value } | Negative i64 encountered where unsigned expected (e.g. "ts": -1) |
| EmptyFieldKey | A JSON key is an empty string |
| InvalidFieldKey { key, reason } | Key contains characters outside [a-z_] |
| ReservedKeyAsExtension { key } | Caller passes a reserved key (v, tenant, etc.) to with_extension |
| InvalidExtensionKeyFormat { key, expected_pattern } | Extension key does not match x_<app>_<field> pattern |
| FieldTooShort { field, min_bytes, actual_bytes } | String field below minimum length (tenant, resource, purpose) |
| FieldTooLong { field, max_bytes, actual_bytes } | String field exceeds maximum (tenant > 256 B, resource > 1024 B) |
| NulByteInValue { field } | NUL byte (0x00) found in any string value |
| MissingRequiredField { field } | Required field absent from JSON or builder |
| DuplicateKey { key } | Duplicate key detected by the single-pass visitor |
| UnknownField { field, version } | Non-reserved, non-x_ field in the object |
| UnsupportedVersion { version } | v field is not 1 |
| WrongFieldType { field, expected, actual } | JSON value type does not match schema expectation |
| SerializedTooLarge { max_bytes, actual_bytes } | Canonical output exceeds 16 KiB |
| InvalidJson { message } | Malformed JSON syntax or canonicalization failure |

ReservedKeyAsExtension vs InvalidExtensionKeyFormat (0.2.2 fix)

Prior to 0.2.2, with_extension called validate_as_extension() before is_reserved(). Reserved keys like "v" or "ts" would fail the x_<app>_<field> pattern check first, returning InvalidExtensionKeyFormat instead of the semantically correct ReservedKeyAsExtension.

The fix reordered the checks: is_reserved() is now evaluated first. The test test_with_extension_reserved_keys_return_correct_error in context/tests.rs guards this invariant for all five reserved keys.
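
The effect of the check order can be sketched as follows. This is an illustration only: KeyError and check_extension_key are hypothetical names, and validate_as_extension is reduced to a prefix test rather than the full x_<app>_<field> check.

```rust
const RESERVED_KEYS: [&str; 5] = ["v", "tenant", "resource", "purpose", "ts"];

/// Hypothetical error type for this sketch; the real crate reports AadError.
#[derive(Debug, PartialEq)]
enum KeyError {
    ReservedKeyAsExtension,
    InvalidExtensionKeyFormat,
}

fn is_reserved(key: &str) -> bool {
    RESERVED_KEYS.contains(&key)
}

/// Simplified pattern check: only the `x_` prefix is tested here.
fn validate_as_extension(key: &str) -> Result<(), KeyError> {
    if key.starts_with("x_") {
        Ok(())
    } else {
        Err(KeyError::InvalidExtensionKeyFormat)
    }
}

/// Post-0.2.2 ordering: the reserved-key check runs before the pattern check,
/// so "v" or "ts" yields ReservedKeyAsExtension rather than a format error.
fn check_extension_key(key: &str) -> Result<(), KeyError> {
    if is_reserved(key) {
        return Err(KeyError::ReservedKeyAsExtension);
    }
    validate_as_extension(key)
}
```

Swapping the two steps in check_extension_key reproduces the pre-0.2.2 behavior, because no reserved key starts with x_.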


AadContext / AadContextBuilder

AadContext

Fully validated, immutable AAD object. Three construction paths:

  1. AadContext::new(tenant, resource, purpose) — validates required fields immediately, returns Result<Self, AadError>
  2. AadContext::builder() — returns an AadContextBuilder for incremental construction
  3. parse(json) / validate(json) — parse from JSON string, returns Result<AadContext, AadError>

Chaining methods on AadContext

All consume and return self (builder-style chaining on an already-valid context):

| Method | Validates |
| --- | --- |
| .with_timestamp(ts: u64) | ts ≤ 2^53 − 1 |
| .with_extension(key, ExtensionValue) | Key not reserved → key matches x_<app>_<field> → value valid |
| .with_string_extension(key, value) | Delegates to with_extension after ExtensionValue::string() (rejects NUL) |
| .with_int_extension(key, value) | Delegates to with_extension after ExtensionValue::integer() (rejects > 2^53 − 1) |

AadContextBuilder

All setters store raw values with no validation. Validation fires entirely within .build().

| Setter | Field | Required |
| --- | --- | --- |
| .tenant(impl Into<String>) | tenant | yes |
| .resource(impl Into<String>) | resource | yes |
| .purpose(impl Into<String>) | purpose | yes |
| .timestamp(u64) | ts | no |
| .extension_string(key, value) | x_* | no |
| .extension_int(key, value) | x_* | no |

build() checks required fields are present, then delegates to AadContext::new() and the with_* methods, surfacing the first error encountered.
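
The deferred-validation pattern can be sketched with a stripped-down builder. All names here (Context, ContextBuilder, BuildError) are hypothetical stand-ins, extensions are omitted, and the only value check shown is the non-empty rule.

```rust
/// Hypothetical error type for this sketch; the real crate reports AadError.
#[derive(Debug, PartialEq)]
enum BuildError {
    MissingRequiredField { field: &'static str },
    FieldTooShort { field: &'static str },
}

/// Stand-in for a validated context (required fields plus optional ts).
#[derive(Debug, PartialEq)]
struct Context {
    tenant: String,
    resource: String,
    purpose: String,
    ts: Option<u64>,
}

/// Setters only record raw values; nothing is checked until build().
#[derive(Default)]
struct ContextBuilder {
    tenant: Option<String>,
    resource: Option<String>,
    purpose: Option<String>,
    ts: Option<u64>,
}

impl ContextBuilder {
    fn tenant(mut self, value: impl Into<String>) -> Self {
        self.tenant = Some(value.into());
        self
    }

    fn resource(mut self, value: impl Into<String>) -> Self {
        self.resource = Some(value.into());
        self
    }

    fn purpose(mut self, value: impl Into<String>) -> Self {
        self.purpose = Some(value.into());
        self
    }

    fn timestamp(mut self, ts: u64) -> Self {
        self.ts = Some(ts);
        self
    }

    fn build(self) -> Result<Context, BuildError> {
        // Step 1: presence checks for the required fields.
        let tenant = self.tenant.ok_or(BuildError::MissingRequiredField { field: "tenant" })?;
        let resource = self.resource.ok_or(BuildError::MissingRequiredField { field: "resource" })?;
        let purpose = self.purpose.ok_or(BuildError::MissingRequiredField { field: "purpose" })?;

        // Step 2: value validation, surfacing the first error encountered.
        for &(field, value) in [("tenant", &tenant), ("resource", &resource), ("purpose", &purpose)].iter() {
            if value.is_empty() {
                return Err(BuildError::FieldTooShort { field });
            }
        }
        Ok(Context { tenant, resource, purpose, ts: self.ts })
    }
}
```

In this sketch, an empty tenant passed to .tenant("") is accepted silently by the setter and only rejected in build(), and a missing required field is reported before any value check runs.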

Field constraints

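| Field | Type | Min bytes | Max bytes | NUL | Integer bound |
| --- | --- | --- | --- | --- | --- |
| tenant | string | 1 | 256 | forbidden | |
| resource | string | 1 | 1024 | forbidden | |
| purpose | string | 1 | unbounded | forbidden | |
| ts | integer | | | | 0..2^53 − 1 |
| extension string | string | 0 | unbounded | forbidden | |
| extension integer | integer | | | | 0..2^53 − 1 |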
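
The string constraints above can be expressed as one byte-oriented check. The sketch below is an assumption about shape, not the crate's code: FieldError and check_string_field are hypothetical names, and the order in which the three rules are evaluated is chosen for illustration.

```rust
/// Hypothetical error type mirroring the FieldTooShort, FieldTooLong and
/// NulByteInValue variants of AadError.
#[derive(Debug, PartialEq)]
enum FieldError {
    TooShort { field: &'static str, min_bytes: usize, actual_bytes: usize },
    TooLong { field: &'static str, max_bytes: usize, actual_bytes: usize },
    NulByte { field: &'static str },
}

/// Checks a string value against byte-length bounds and the NUL rule.
/// `max_bytes` is None for unbounded fields such as purpose.
fn check_string_field(
    field: &'static str,
    value: &str,
    min_bytes: usize,
    max_bytes: Option<usize>,
) -> Result<(), FieldError> {
    // str::len() counts UTF-8 bytes, not characters, which matches the
    // byte-based limits in the table.
    let actual_bytes = value.len();
    if actual_bytes < min_bytes {
        return Err(FieldError::TooShort { field, min_bytes, actual_bytes });
    }
    if let Some(max_bytes) = max_bytes {
        if actual_bytes > max_bytes {
            return Err(FieldError::TooLong { field, max_bytes, actual_bytes });
        }
    }
    if value.bytes().any(|b| b == 0) {
        return Err(FieldError::NulByte { field });
    }
    Ok(())
}
```

Because the limits are in bytes, a tenant made of 128 two-byte characters such as "é" sits exactly at the 256-byte limit, and a 129th pushes it over.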

Serialization

AadContext implements Serialize manually to emit keys in strict lexicographic order: purpose, resource, tenant, ts (if present), v, then x_* extensions (already sorted via BTreeMap). This matches RFC 8785 requirements and is guarded by test_serialize_key_order.
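
The emission order can be checked with a small std-only sketch. The emission_order function is hypothetical and models only the key sequence, not the actual Serialize impl.

```rust
use std::collections::BTreeMap;

/// Returns the key sequence described above: fixed fields in their
/// hard-coded order, then extension keys in BTreeMap (sorted) order.
fn emission_order(has_ts: bool, extensions: &BTreeMap<String, String>) -> Vec<String> {
    let mut keys = vec![
        "purpose".to_string(),
        "resource".to_string(),
        "tenant".to_string(),
    ];
    if has_ts {
        keys.push("ts".to_string());
    }
    keys.push("v".to_string());
    // BTreeMap iterates in ascending key order, so no extra sort is needed.
    keys.extend(extensions.keys().cloned());
    keys
}
```

Sorting the returned vector leaves it unchanged, which is the property the manual ordering relies on. Rust compares strings by UTF-8 bytes while RFC 8785 sorts by UTF-16 code units; the two orders agree for the ASCII-only keys used here.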


Public functions

All re-exported from the crate root.

parse(json: &str) -> Result<AadContext, AadError>

Parses a JSON string through the duplicate-key scanner, extracts and validates all fields, returns a fully validated AadContext.

validate(json: &str) -> Result<AadContext, AadError>

Semantic alias for parse(). Same implementation, same return type. Use validate at call sites where you only care whether input is valid and want that intent explicit. Documented as an alias since 0.2.3.

canonicalize(json: &str) -> Result<Vec<u8>, AadError>

Parses, validates, then returns the RFC 8785 canonical byte form. Returns an error for invalid JSON, constraint violations, or canonical output larger than 16 KiB (MAX_AAD_SIZE).

canonicalize_string(json: &str) -> Result<String, AadError>

Same as canonicalize, but returns a UTF-8 String. It additionally returns an error if the canonical bytes are not valid UTF-8.
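
The two output-side checks (size limit, then UTF-8 conversion) can be sketched as below. OutputError, check_size and into_string are hypothetical names; which AadError variant the crate uses for a UTF-8 failure is not stated here, so InvalidUtf8 is a placeholder.

```rust
const MAX_AAD_SIZE: usize = 16_384;

/// Hypothetical error type for this sketch; the real crate reports AadError.
#[derive(Debug, PartialEq)]
enum OutputError {
    SerializedTooLarge { max_bytes: usize, actual_bytes: usize },
    InvalidUtf8,
}

/// Rejects canonical output larger than MAX_AAD_SIZE bytes.
fn check_size(bytes: Vec<u8>) -> Result<Vec<u8>, OutputError> {
    if bytes.len() > MAX_AAD_SIZE {
        return Err(OutputError::SerializedTooLarge {
            max_bytes: MAX_AAD_SIZE,
            actual_bytes: bytes.len(),
        });
    }
    Ok(bytes)
}

/// Applies the size check, then converts the bytes to a String.
fn into_string(bytes: Vec<u8>) -> Result<String, OutputError> {
    let bytes = check_size(bytes)?;
    String::from_utf8(bytes).map_err(|_| OutputError::InvalidUtf8)
}
```

Output of exactly 16_384 bytes passes in this sketch; the limit is exceeded only from 16_385 bytes upward.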


Duplicate-key detection

Defined in parse/scan.rs.

Problem

When deserializing into serde_json::Value, serde_json silently accepts duplicate keys and keeps only the last value. The AAD spec requires duplicate keys to be rejected.

Solution

A single-pass custom serde Visitor (DupCheckVisitor) that:

  1. Deserializes the JSON stream into a serde_json::Value as normal
  2. Maintains a HashSet<String> per object level
  3. On each key, checks seen.insert(key) — if it returns false, emits a custom error with the sentinel prefix "duplicate key: "
  4. Recurses into nested values via DupCheckSeed to catch duplicates at any depth
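
The per-level HashSet idea in the steps above can be sketched without serde. The version below walks an already-built tree instead of a token stream, so it is a model of the logic rather than the DupCheckVisitor itself; Node and find_duplicate are hypothetical names.

```rust
use std::collections::HashSet;

/// Minimal stand-in for a parsed JSON tree that preserves duplicate keys.
enum Node {
    Leaf,
    Array(Vec<Node>),
    Object(Vec<(String, Node)>),
}

/// Returns the first duplicated key found, checking each object level
/// with its own HashSet and recursing into nested values.
fn find_duplicate(node: &Node) -> Option<String> {
    match node {
        Node::Leaf => None,
        Node::Array(items) => items.iter().filter_map(find_duplicate).next(),
        Node::Object(entries) => {
            // One set per object: the same key may legally appear at
            // different nesting levels.
            let mut seen: HashSet<&str> = HashSet::new();
            for (key, value) in entries {
                // insert() returns false when the key was already present.
                if !seen.insert(key.as_str()) {
                    return Some(key.clone());
                }
                if let Some(dup) = find_duplicate(value) {
                    return Some(dup);
                }
            }
            None
        }
    }
}
```

A key that repeats across levels, such as an object under "k" that itself contains "k", is not flagged, because each object gets a fresh set.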

The into_aad_error function strips the " at line N column M" suffix that serde_json appends to custom errors, then matches the sentinel prefix to produce AadError::DuplicateKey.
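
The suffix stripping and prefix match can be sketched on plain strings. This is a guess at the approach, not the body of into_aad_error: strip_position_suffix and duplicate_key_from_message are hypothetical helpers, and the sketch returns the key as an Option instead of building an AadError.

```rust
const DUP_PREFIX: &str = "duplicate key: ";
const POSITION_MARKER: &str = " at line ";

fn is_number(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit())
}

/// Removes a trailing " at line N column M" if, and only if, N and M are
/// decimal numbers; otherwise returns the message unchanged.
fn strip_position_suffix(message: &str) -> &str {
    if let Some(idx) = message.rfind(POSITION_MARKER) {
        let tail = &message[idx + POSITION_MARKER.len()..];
        let mut parts = tail.split(" column ");
        let line_ok = parts.next().map_or(false, is_number);
        let column_ok = parts.next().map_or(false, is_number);
        if line_ok && column_ok && parts.next().is_none() {
            return &message[..idx];
        }
    }
    message
}

/// Extracts the offending key when the message carries the sentinel prefix.
fn duplicate_key_from_message(message: &str) -> Option<String> {
    strip_position_suffix(message)
        .strip_prefix(DUP_PREFIX)
        .map(str::to_string)
}
```

Messages without the sentinel prefix, such as an ordinary syntax error, yield None and would map to a different error variant.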

This replaced a prior implementation (pre-0.2.1) that allocated a Vec<char> and traversed the input twice.


test-helpers feature

The crate does not define a test-helpers feature in Cargo.toml. All test utilities are internal to src/tests/. If a test-helpers feature is added in the future, the expected pattern would be:

# In consuming crate's Cargo.toml
[dev-dependencies]
canaad-core = { version = "0.3", features = ["test-helpers"] }

Candidate exports: builder presets for common test contexts, known-answer canonical strings, and SHA-256 digests from the §10 vectors.


Test vectors

Located at src/tests/test_vectors/. Split from a single 448-line file in 0.3.0.

section_10.rs — spec §10 known-answer vectors

Covers spec sections 10.1 through 10.5, each tested via both JSON parse and builder construction:

| Section | Coverage |
| --- | --- |
| §10.1 | Minimal required fields: canonical form, hex encoding, SHA-256 digest |
| §10.2 | All fields including optional ts |
| §10.3 | Unicode in values (CJK, emoji): verifies characters are preserved, not \uXXXX-escaped |
| §10.4 | Extension fields: canonical ordering of x_* keys |
| §10.5 | JCS edge cases: normalization of \u000A to \n, escaped quotes, 2^53 − 1 precision boundary |
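
The string behavior exercised by §10.3 and §10.5 follows the RFC 8785 string serialization rules. The sketch below illustrates those rules only; the crate delegates canonical serialization to serde_json_canonicalizer, and jcs_escape is a hypothetical name.

```rust
/// Serializes a string the way RFC 8785 prescribes: quotes and backslashes
/// are escaped, the five short control escapes are used where available,
/// other control characters become lowercase \u00xx, and everything else
/// (including CJK and emoji) is emitted literally.
fn jcs_escape(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\u{0008}' => out.push_str("\\b"),
            '\t' => out.push_str("\\t"),
            '\n' => out.push_str("\\n"),
            '\u{000C}' => out.push_str("\\f"),
            '\r' => out.push_str("\\r"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}
```

An input written as \u000A in the source JSON decodes to a line feed and is re-emitted as \n, which is the normalization covered by §10.5.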

negative.rs — rejection tests

Inputs that must fail validation. Covers:

  • Integer above MAX_SAFE_INTEGER, negative integer
  • Empty tenant, NUL byte in tenant
  • Unknown field, missing required field
  • Duplicate key
  • Invalid extension keys (single underscore, empty app, empty field)
  • Unsupported version (0, 2)
  • Wrong field types (v as string, tenant as number, ts as string)
  • Invalid JSON syntax, non-object root
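
The invalid extension key cases listed above (single underscore, empty app, empty field) can be modeled with a small predicate. The character classes are assumptions: this sketch takes <app> to be [a-z]+ and <field> to be [a-z_]+, which is consistent with the cases in this document but not confirmed by it, and is_valid_extension_key is a hypothetical name.

```rust
/// Checks the x_<app>_<field> shape under the assumed character classes:
/// <app> is one or more of a-z, <field> is one or more of a-z or '_'.
fn is_valid_extension_key(key: &str) -> bool {
    // The key must start with the literal "x_" prefix.
    let rest = match key.strip_prefix("x_") {
        Some(rest) => rest,
        None => return false,
    };
    // A second underscore must separate <app> from <field>.
    let (app, field) = match rest.split_once('_') {
        Some(parts) => parts,
        None => return false,
    };
    let app_ok = !app.is_empty() && app.bytes().all(|b| b.is_ascii_lowercase());
    let field_ok = !field.is_empty()
        && field.bytes().all(|b| b.is_ascii_lowercase() || b == b'_');
    app_ok && field_ok
}
```

Under these assumptions, x_foo, x__field and x_app_ are rejected, while x_app_field and the multi-underscore x_app_field_name are accepted.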

edge_cases.rs — boundary tests

Covers:

  • Integer extension values, multiple extensions, multi-underscore field segments
  • Tenant and resource at exact max length and one byte over
  • Whitespace handling in input (canonical output must be whitespace-free)
  • Special characters in string values (slashes, backslashes, tabs)
  • Empty resource and empty purpose rejection