hyperion.adapters.serialization.avro
Avro serialization adapter (requires the [catalog] extra -- fastavro).
This is the only module that imports :mod:fastavro. It was extracted out of
:class:hyperion.catalog.catalog.Catalog (DDD refactor F1 / Step 5) so that the
catalog depends on an injected serializer instead of reaching for fastavro at
module scope. When no serializer is supplied, Catalog constructs an
:class:AvroSerializer by default. Importing this module is what triggers the
fastavro requirement, so the import error raised here names the extra to install.
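The import gate described above could be sketched roughly as follows. The helper name `require_extra` and the message wording are hypothetical; the source states only that the error points at the extra to install, and the real module may simply wrap a plain `import fastavro` in a try/except at module scope.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or fail with a hint naming the extra.

    Hypothetical helper, shown for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a message that tells the user which extra to install,
        # keeping the original error as the cause.
        raise ImportError(
            f"{module_name} is required for Avro serialization; "
            f"install the [{extra}] extra."
        ) from exc
```

Under these assumptions, the module would call `fastavro = require_extra("fastavro", "catalog")` once at import time, so code that never imports the adapter never needs fastavro.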
AvroStreamWriter
Bases: Protocol
Incremental Avro writer (records are written one at a time, then flushed).
Structural type for the object :meth:AvroSerializer.streaming_writer
returns, so callers (e.g. AssetRepartitioner) need no fastavro import.
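A minimal sketch of why a structural type removes the fastavro import from callers. Everything here is illustrative: `StreamWriter` stands in for AvroStreamWriter, the `flush` method is assumed from the "then flushed" description, and `ListWriter` and `copy_records` are hypothetical names.

```python
from typing import Any, Mapping, Protocol, runtime_checkable


@runtime_checkable
class StreamWriter(Protocol):
    """Stand-in for AvroStreamWriter; ``flush`` is an assumption."""

    def write(self, record: Mapping[str, Any]) -> None: ...

    def flush(self) -> None: ...


class ListWriter:
    """In-memory writer that satisfies the protocol without inheriting from it."""

    def __init__(self) -> None:
        self.pending: list = []
        self.flushed: list = []

    def write(self, record: Mapping[str, Any]) -> None:
        self.pending.append(record)

    def flush(self) -> None:
        self.flushed.extend(self.pending)
        self.pending.clear()


def copy_records(records, writer: StreamWriter) -> None:
    """Caller code depends only on the protocol, never on fastavro."""
    for record in records:
        writer.write(record)
    writer.flush()
```

Because the check is structural, any object with matching methods is accepted, which also makes it easy to substitute an in-memory writer in tests.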
write
AvroSerializer
Encapsulates the catalog's fastavro encode/decode contract.
The encode parameters (deflate codec, compression level 7, validation,
lenient strictness) are pinned here so byte-level output is identical to the
pre-refactor inline implementation.
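One way to pin such parameters is to keep them in a single constant and pass them through unchanged. This is a sketch, not the module's actual code: the mapping onto `fastavro.writer` keyword arguments, in particular reading "lenient strictness" as `strict=False`, is an assumption, and `PINNED_WRITER_KWARGS` and `write_container` are hypothetical names.

```python
from typing import Any, BinaryIO, Iterable, Mapping

# Assumed mapping of the pinned encode parameters onto fastavro.writer keywords.
PINNED_WRITER_KWARGS = {
    "codec": "deflate",
    "codec_compression_level": 7,
    "validator": True,
    "strict": False,  # assumption: "lenient strictness" means non-strict
}


def write_container(
    fp: BinaryIO,
    schema: Mapping[str, Any],
    records: Iterable[Mapping[str, Any]],
) -> None:
    """Write records to fp as one Avro container file with pinned settings."""
    # Imported lazily so this sketch can be loaded without fastavro installed.
    import fastavro

    fastavro.writer(fp, schema, records, **PINNED_WRITER_KWARGS)
```

Keeping the keywords in one place means a change to the encoding contract shows up as a single reviewable diff.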
write
Write records to fp as a single Avro container file.
read
Yield raw records decoded from the Avro container in fp.
Type/shape validation of each row stays with the caller.
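Since read yields raw records, a caller would typically wrap the iterator with its own checks. The sketch below shows one possible shape for that caller-side step; `validate_rows` and its `required` argument are hypothetical and not part of the module.

```python
from typing import Any, Iterable, Iterator, Mapping


def validate_rows(
    rows: Iterable[Mapping[str, Any]],
    required: Mapping[str, type],
) -> Iterator[Mapping[str, Any]]:
    """Lazily check that each raw row has the required fields and types."""
    for index, row in enumerate(rows):
        for field, expected in required.items():
            if field not in row:
                raise ValueError(f"row {index}: missing field {field!r}")
            if not isinstance(row[field], expected):
                raise TypeError(
                    f"row {index}: field {field!r} should be {expected.__name__}"
                )
        yield row
```

A caller could then iterate over something like `validate_rows(serializer.read(fp), {"id": int})`, which keeps decoding and domain validation in separate layers.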
streaming_writer
Open an incremental Avro writer over fp (for repartitioning).
Mirrors the pre-refactor fastavro.write.Writer configuration (no
explicit compression level / strictness flags).