TOSA Serialization Library
==========================
# Introduction
The *TOSA Serialization library* provides methods to read and write serialized
TOSA graphs (<https://developer.mlplatform.org/w/tosa/>). The library includes
a FlatBuffers schema and a C++ API for reading and writing TOSA graphs.
# Prerequisites
##### The *TOSA Serialization Library* Requires the following
* Python 3.9 or later (tested with 3.12.0)
* GNU Make 4.1 or later
* GCC (tested with 9.4.0) or Clang C++ compiler (tested with clang-10)
with C++17 support
For Microsoft Visual C++ (MSVC)
* MSVC build tools (tested with 19.41.34123.0)
##### Install Additional pip Packages (for unit tests)
* flatbuffers (tested with 24.3.25)
* numpy (tested with 2.1.1)
* ml_dtypes (tested with 0.5.0)
* pytest (tested with 8.3.3)
```bash
pip install flatbuffers==24.3.25 numpy==2.1.1 ml_dtypes==0.5.0 pytest==8.3.3
```
# Compilation
##### The *TOSA Serialization Library* Build can be prepared by the following
```bash
mkdir -p build
cd build
cmake ..
make
```
Windows (r)
```cmd
mkdir build
cd build
cmake -DWARNINGS_AS_ERRORS=ON ..
cmake --build . --config Release
```
This cmake step generates .sln and .vcxproj files, which can be built
outside cmake using MSVC tools.
# Flatbuffer Semantics
The purpose of the TOSA Flatbuffer format is to provide a common storage and sharing
mechanism of TOSA compliant graphs.
While the structure and naming are inspired and influenced by other representations,
such as MLIR, the format does not strictly adhere to them. Instead, it aims to be a solid
common ground that can be ingested by TOSA deserializers and compliant frontends.
> ### **ⓘ NOTE:**
> - It is important to not automatically attribute or assume properties from other
representations based on the naming and structure of the Flatbuffer schema.
>
> - The TOSA spec serves as a guideline but does not describe concrete implementations.
As such TOSA Flatbuffer representation may differ from what the spec outlines
in the interest of simplicity, efficiency or portability.
>
> - At present TOSA serialization library does not provide validation of the semantic and
syntactic validity of the serialized/deserialized graph, only that it is a
valid Flatbuffer, meeting the spec and version expected.
## TOSA Flatbuffer Graph Representation
### Structure and metadata
1. TOSA Flatbuffer representation takes inspiration from MLIR in the usage of Regions,
and BasicBlocks.
- (root)`TosaGraph` -> list[`TosaRegion`] -> list[`TosaBasicBlock`]
- Multiple `TosaRegions` are supported for future extensibility.
2. A `TosaGraph` contains a `version` field encoding the TOSA spec this Flatbuffer
representation is compliant with.
3. Each `TosaBasicBlock` contains all necessary data for its execution independent
of other `TosaBasicBlocks`.
- `TosaBasicBlocks` can be best described as "subgraphs" reachable via control
flow operators.
- There is no implicit capture, closure or a concept of "global" tensors.
- Tensors necessary for the execution of a `TosaBasicBlock` except the "main" one are
passed in as arguments.
4. All tensor metadata and data, where applicable, are stored as a list in a separate field
(`tensors`) inside each `TosaBasicBlock`.
- `TosaBasicBlocks` and `TosaOperators` reference input and output tensors from this
list by their `name`.
- `data` field is present when we know the initial values of a `TosaTensor` e.g. Constants.
5. Presently `TosaShapes` should be static and fully initialized.
- This may change in the future as the TOSA spec and this TOSA serializer library evolve.
6. When an operator requires multiple tensor inputs, they are all flattened into a single
list with tensor names as strings.
- **NOTE:**<br>
At present there is one operator (`COND_IF`) which has to be handled with special care.
Its inputs should be serialized according to the rule above, and deserialization should
treat the first element as the single input tensor and all remaining elements as the
list of tensors outlined by the TOSA spec.
7. All constant tensor and shape values are serialized as `[ubyte]`.
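The `COND_IF` handling described in the note above can be sketched in a few lines of Python. This is only an illustration of the flatten-then-split rule, not code from the library: the function names and the `condition` / `input_list` parameter names are hypothetical and chosen for this example.

```python
from typing import List, Tuple


def flatten_cond_if_inputs(condition: str, input_list: List[str]) -> List[str]:
    """Serialize: the single input tensor name first, then the tensor list,
    all in one flat list of names (hypothetical helper)."""
    return [condition, *input_list]


def split_cond_if_inputs(flat_inputs: List[str]) -> Tuple[str, List[str]]:
    """Deserialize: the first element is the single input tensor, every
    remaining element belongs to the tensor list (hypothetical helper)."""
    if not flat_inputs:
        raise ValueError("COND_IF needs at least one input tensor name")
    return flat_inputs[0], list(flat_inputs[1:])


flat = flatten_cond_if_inputs("cond", ["x", "y"])
condition, branch_inputs = split_cond_if_inputs(flat)
print(flat)           # ['cond', 'x', 'y']
print(condition)      # cond
print(branch_inputs)  # ['x', 'y']
```

The split is purely positional, so a deserializer has to apply it based on the operator being `COND_IF`; nothing in the flat list itself marks where the boundary is.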
### Graph flow
1. The entry point to the graph is represented by a `TosaRegion` named "main" containing a
`TosaBasicBlock` named "main".<br>
- The inputs to the "main" `TosaBasicBlock` are the inputs to the graph as a whole,
excluding any constants.
- The outputs of the "main" `TosaBasicBlock` are the outputs of the graph as a whole.
- Other `TosaBasicBlocks` in the `TosaRegion` should not be implicitly executed after
the "main" one completes.
- Other `TosaRegions` should not be executed implicitly.
2. The only `TosaTensors` available in a block are the ones in the `tensors` section of
the block and all of those are either inputs and/or outputs, or local tensors not
accessible by other blocks.
- Exception to this are variable declarations which should be considered available
across scope.
3. Control flow operators appear as `TosaOperators` with input/output tensors.
The branches/blocks of control flow logic are represented as additional `TosaBasicBlocks`
within the "main" `TosaRegion`.
- The control flow `TosaOperator` attribute references the `TosaBasicBlocks` by their name.
- Control flow `TosaBasicBlocks` DO NOT inherit or capture any `TosaTensors` or
`TosaShapes` from the "calling" `TosaBasicBlock`, this includes constants.
- Tensors necessary for the execution of the control flow `TosaBasicBlock` are
passed in via its input as referenced by name.
- After a control flow operator's logic is complete, execution of the "calling" graph
continues as appropriate.
- Control flow operators: `COND_IF`, `WHILE_LOOP`.
4. The graph can be reconstructed by "linking" `TosaOperators` by their matching
output/input tensors, starting with the inputs to the "main" `TosaBasicBlock`.
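As a rough illustration of the "linking" step, the sketch below matches operator inputs to the operators that produce them, using tensor names only. The `Op` and `Block` classes are simplified, hypothetical stand-ins for `TosaOperator` and `TosaBasicBlock`, not the library's types. The `name` field on `Op` is an assumption added so the result is readable. The sketch also rejects any tensor name that is not declared in the block's own `tensors` list, and it ignores the variable-declaration exception mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Op:  # hypothetical stand-in for a TosaOperator
    name: str
    inputs: List[str]
    outputs: List[str]


@dataclass
class Block:  # hypothetical stand-in for a TosaBasicBlock
    name: str
    inputs: List[str]
    outputs: List[str]
    tensors: List[str]
    operators: List[Op] = field(default_factory=list)


def link_operators(block: Block) -> Dict[str, List[str]]:
    """Return {producer op name: [consumer op names]} for a single block."""
    declared = set(block.tensors)
    producer: Dict[str, str] = {}
    for op in block.operators:
        # No implicit capture: every referenced name must be local to the block.
        for tensor in op.inputs + op.outputs:
            if tensor not in declared:
                raise KeyError(
                    f"{op.name}: tensor '{tensor}' is not declared in block '{block.name}'"
                )
        for tensor in op.outputs:
            producer[tensor] = op.name

    edges: Dict[str, List[str]] = {op.name: [] for op in block.operators}
    for op in block.operators:
        for tensor in op.inputs:
            src = producer.get(tensor)  # None for block inputs and constants
            if src is not None and op.name not in edges[src]:
                edges[src].append(op.name)
    return edges


main = Block(
    name="main",
    inputs=["x"],
    outputs=["z"],
    tensors=["x", "w", "y", "z"],
    operators=[
        Op("conv", inputs=["x", "w"], outputs=["y"]),
        Op("relu", inputs=["y"], outputs=["z"]),
    ],
)
print(link_operators(main))  # {'conv': ['relu'], 'relu': []}
```

Tensors without a producing operator (`x` and `w` here) are either block inputs or constants. A control flow operator would appear as an ordinary `Op` in this model, with its branch blocks looked up separately by name.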
### Operator Attributes
1. `TosaOperators` have an attribute meant to hold constant data needed for an
operator's execution.
2. Operators defined in the TOSA spec which do not have any attributes have an empty table
attribute defined in the Flatbuffer schema.
- This is done in order to facilitate easier schema evolution and compatibility in the
future should it be necessary.
> ### **ⓘ NOTE:**
>
> Variable operators defined in the TOSA spec are currently missing attribute values in the schema.
>
> - The `name`, `var_shape` and `type` attribute fields are redundant as that information
can be encoded into and inferred from the `VariableOp` data itself. And `data` should be
represented as a tensor in the `TosaBasicBlock` `tensors` list.
### Types and enums
1. TOSA spec types are represented as an `enum DType` in Flatbuffers.
- All type information should be serialized and deserialized to and from DType.
- This includes `acc_type_t` enum from the TOSA spec.
- At present it is the responsibility of projects consuming this library to provide
the validation of correct types in accordance with the TOSA spec before serialization
and after deserialization.
2. Other TOSA spec enums have concrete representations in the Flatbuffer schema.
# Usage
The section below describes serialization_lib API usage. For more
details, please refer to `include/tosa_serialization_handler.h`.
## TosaSerializationHandler
This is the top-level class that contains the entire TOSA graph. In
particular, it contains a vector of `TosaSerializationRegion` objects,
and provides API for file IO, region access, and version checking.
a. `LoadFileJson(filename)`:
Loads JSON-formatted file `filename` from disk, and initializes the
internal graph structure.
Requires the schema file to be loaded via `LoadFileSchema()`.
b. `SaveFileJson(filename)`:
Snapshots the internal graph structure and saves out JSON-formatted file
`filename` to disk.
Requires the schema file to be loaded via `LoadFileSchema()`.
c. `LoadFileTosaFlatbuffer(filename)`:
Loads serialized flatbuffer file `filename` from disk, and initializes the
internal graph structure.
d. `SaveFileTosaFlatbuffer(filename)`:
Snapshots the internal graph structure and saves out serialized
flatbuffer file `filename` to disk.
Returns TOSA version implemented by the serialization library.
Returns vector of `TosaSerializationRegion`. A valid graph must have
one `main` region as the first region being traversed.
Returns region whose name is 'name'. A valid graph must have one `main`
region as the first region being traversed.
i. `GetInputs()` / `GetOutputs()`:
Shortcuts for the input/output tensor names of the `main` region. Input tensors of
the main block are usually treated as `tosa.PLACEHOLDER`. Output tensors
are the output of the entire graph and should be evaluated when graph
traversal has finished.
## TosaSerializationRegion
This is the region class. It contains vectors of `TosaSerializationBasicBlock` objects,
and provides API for block access.
a. `GetName()`:
Returns name of the region.
b. `GetBlocks()`:
Returns vector of `TosaSerializationBasicBlock`. A valid region must have
at least one block.
c. `GetBlockByName(name)`:
Returns the `TosaSerializationBasicBlock` with name `name`. Returns `nullptr`
if nothing matches.
## TosaSerializationBasicBlock
This is the basic-block class. It contains vectors of
`TosaSerializationOperator`, `TosaSerializationTensor` and `TosaSerializationShape`.
Returns name of the basic block.
b. `GetRegionName()`:
Returns name of the region containing the basic block.
Returns the `TosaSerializationTensor` with name `name`. Returns `nullptr`
if nothing matches.
Returns the array of input/output tensor names of the basic block.
## TosaSerializationOperator
The operator class contains (1) what TOSA Op, (2) attribute (compile-time-
b. `GetAttribute()` / `GetAttributeType()`:
`GetAttribute()` returns the base object of the attribute.
`GetAttributeType()` returns which type of attribute the base object
needs to be cast to. The attribute types are defined in `tosa.fbs` and
`include/attribute.def`.
Returns the array of input/output tensor names of the operator.
## TosaSerializationTensor
The tensor class contains (1) name, (2) shape, (3) data type, (4) data value, and
properties
a. `GetName()` / `SetName(name)`:
`GetName()` returns the name of the tensor. `SetName()` sets the name
of the tensor.
b. `GetShape()`:
Returns the shape of the tensor as `vector<int32_t>`.
c. `GetDtype()` / `SetDtype(dtype)`:
`GetDtype()` returns the data type of the tensor. `SetDtype()` sets the
data type of the tensor. DType is defined in `tosa.fbs`.
d. `GetData()` / `SetData(data)`:
`GetData()` returns a vector of `uint8_t` values which stores the constant
value for a constant tensor, or the initialization value for a variable tensor.
`SetData()` sets the constant value for a constant tensor, or the initialization
value for a variable tensor.
e. `GetVariable()`:
f. `GetIsUnranked()` / `SetIsUnranked(value)`:
`GetIsUnranked()` returns whether tensor is an unranked tensor.
`SetIsUnranked()` sets whether tensor is an unranked tensor.
When a tensor is unranked, its shape should be ignored.
Returns the variable name for a Tosa Variable tensor.
Returns the empty string "" for tensors that are not Variable tensors.
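Because `GetData()` exposes tensor values as a flat vector of `uint8_t`, a consumer has to reinterpret those bytes according to the tensor's data type. The sketch below shows that round trip for 32-bit integers using only the Python standard library. It assumes the values are packed contiguously in little-endian order with no padding. That layout is an assumption of this example and should be checked against the library's own conversion helpers. The function names are hypothetical.

```python
import struct
from typing import List


def pack_int32(values: List[int]) -> bytes:
    """Pack 32-bit signed integers into little-endian bytes (assumed layout)."""
    return struct.pack(f"<{len(values)}i", *values)


def unpack_int32(data: bytes) -> List[int]:
    """Reinterpret a byte buffer as little-endian 32-bit signed integers."""
    if len(data) % 4 != 0:
        raise ValueError("byte length is not a multiple of 4")
    return list(struct.unpack(f"<{len(data) // 4}i", data))


raw = pack_int32([1, -2, 300])
print(list(raw))          # [1, 0, 0, 0, 254, 255, 255, 255, 44, 1, 0, 0]
print(unpack_int32(raw))  # [1, -2, 300]
```

Other data types would need their own format codes and, for sub-byte or non-standard float types, custom packing that this sketch does not cover.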
## TosaSerializationShape
The shape class contains (1) name, (2) rank and (3) data value.
a. `GetName()`:
`GetName()` returns the name of the shape.
b. `GetRank()`:
Returns the rank of the shape as `uint32_t`.
c. `GetData()` / `SetData(data)`:
`GetData()` returns a vector of `uint8_t` values which stores the constant
value for a constant tensor, or the initialization value for a variable tensor.
`SetData()` sets the constant value for a constant tensor, or the initialization
value for a variable tensor.
The *TOSA Serialization Library*'s C++ and Python versions can be tested with GoogleTest and
PyTest, respectively. After building, unit tests can be run with the following commands.
- `ctest` from the project's build directory
- `pytest` from the project's root directory
- `pytest --leave-tmp` preserves temporary files at `python/pytests/tmp/` for debugging.
# Pre Commit Checks
Before pushing a commit, pre commit checks must be run to ensure conformity.
##### Prerequisites
* Do as instructed in the main [Prerequisites section](#prerequisites) and [Compilation section](#compilation)
##### Install Additional pip Package
* pre-commit (tested with 3.8.0)
* clang-format (tested with 14)
```bash
pip install pre-commit==3.8.0 clang-format==14
```
##### Run Pre Commit Checks
Note: regenerate-headers is currently only supported on POSIX-compliant
platforms.
If changing the schema, regenerate on a POSIX platform to get the new headers
and .py files.
# License
The *TOSA Serialization Library* is licensed under Apache-2.0.
Third party projects are referenced as in the CMakeLists.txt file and as such,
are licensed under the licenses stated in their projects.