TOSA Serialization Library
==========================

# Introduction

The *TOSA Serialization library* provides methods to read and write serialized
TOSA graphs (<https://developer.mlplatform.org/w/tosa/>).  The library includes
a FlatBuffers schema and a C++ API for reading and writing TOSA graphs.

# Prerequisites
##### The *TOSA Serialization Library* requires the following
* Python 3.9 or later (tested with 3.12.0)
* CMake version 3.16 or later

For Linux
* GNU Make 4.1 or later
* GCC (tested with 9.4.0) or Clang C++ compiler (tested with clang-10)
  with C++17 support

For Microsoft Visual C++ (MSVC)
* MSVC build tools (tested with 19.41.34123.0)

##### Install Additional pip Packages (for unit tests)
* flatbuffers (tested with 24.3.25)
* numpy (tested with 2.1.1)
* ml_dtypes (tested with 0.5.0)
* pytest (tested with 8.3.3)

Any platform
```bash
pip install flatbuffers==24.3.25 numpy==2.1.1 ml_dtypes==0.5.0 pytest==8.3.3
```

# Compilation

##### The *TOSA Serialization Library* build can be prepared as follows
Linux
```bash
mkdir -p build
cd build
cmake ..
make
```

Windows®
```cmd
mkdir build
cd build
cmake -DWARNINGS_AS_ERRORS=ON ..
cmake --build . --config Release
```

On Windows, the first `cmake` step generates `.sln` and `.vcxproj` files, which
can also be built outside CMake using the MSVC tools.

# Usage

The section below describes serialization_lib API usage. For more
details, please refer to `include/tosa_serialization_handler.h`.

## TosaSerializationHandler

This is the top-level class that contains the entire TOSA graph.  In
particular, it contains a vector of `TosaSerializationRegion` objects,
and provides an API for file I/O, region access, and version checking.

    a. `LoadFileJson(filename)`:

        Loads the JSON-formatted file `filename` from disk and initializes the
        internal graph structure.

        Requires the schema file to be loaded via `LoadFileSchema()`.

    b. `SaveFileJson(filename)`:

        Snapshots the internal graph structure and saves it to disk as the
        JSON-formatted file `filename`.

        Requires the schema file to be loaded via `LoadFileSchema()`.

    c. `LoadFileTosaFlatbuffer(filename)`:

        Loads the serialized FlatBuffers file `filename` from disk and initializes
        the internal graph structure.

    d. `SaveFileTosaFlatbuffer(filename)`:

        Snapshots the internal graph structure and saves it to disk as the
        serialized FlatBuffers file `filename`.

    e. `GetVersion()`:

        Returns the TOSA version implemented by the serialization library.

    f. `GetRegions()`:

        Returns the vector of `TosaSerializationRegion` objects. A valid graph must
        have one `main` region as the first region being traversed.

    g. `GetMainRegion()`:

        Shortcut for accessing the first region.

    h. `GetRegionByName(name)`:

        Returns the region whose name is `name`. A valid graph must have one `main`
        region as the first region being traversed.

    i. `GetInputs()` / `GetOutputs()`:

        Shortcut for the `main` region's input/output tensor names. Input tensors of
        the main block are usually treated as `tosa.PLACEHOLDER`. Output tensors
        are the outputs of the entire graph and should be evaluated when graph
        traversal has finished.
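As a rough illustration of the region accessors listed above, the sketch below models `GetRegions()`, `GetMainRegion()`, and `GetRegionByName()` with small stand-in classes. `Region`, `Handler`, `AddRegion()`, and `HasValidMainRegion()` are hypothetical names that do not exist in the library, and returning `nullptr` for an empty graph or a missing region is an assumption. Refer to `include/tosa_serialization_handler.h` for the real signatures.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for TosaSerializationRegion; only GetName() is modelled.
class Region
{
public:
    explicit Region(std::string name)
        : name_(std::move(name))
    {}
    const std::string& GetName() const
    {
        return name_;
    }

private:
    std::string name_;
};

// Hypothetical stand-in for the region accessors of TosaSerializationHandler.
class Handler
{
public:
    // Illustration-only helper; not part of the documented API.
    void AddRegion(const std::string& name)
    {
        regions_.push_back(std::make_unique<Region>(name));
    }
    const std::vector<std::unique_ptr<Region>>& GetRegions() const
    {
        return regions_;
    }
    // Shortcut for the first region (assumed: nullptr when there is none).
    Region* GetMainRegion() const
    {
        return regions_.empty() ? nullptr : regions_.front().get();
    }
    // Linear search by name (assumed: nullptr when nothing matches).
    Region* GetRegionByName(const std::string& name) const
    {
        for (const auto& region : regions_)
        {
            if (region->GetName() == name)
            {
                return region.get();
            }
        }
        return nullptr;
    }

private:
    std::vector<std::unique_ptr<Region>> regions_;
};

// Checks the rule stated above: the first region of a valid graph is `main`.
bool HasValidMainRegion(const Handler& handler)
{
    const Region* first = handler.GetMainRegion();
    return first != nullptr && first->GetName() == "main";
}
```

A caller would typically run a check like `HasValidMainRegion()` once after loading a file, and test the result of every by-name lookup before dereferencing it.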

## TosaSerializationRegion

This is the region class. It contains a vector of `TosaSerializationBasicBlock` objects
and provides an API for block access.

    a. `GetName()`:

        Returns name of the region.

    b. `GetBlocks()`:

        Returns the vector of `TosaSerializationBasicBlock` objects. A valid region
        must have at least one block.

    c. `GetBlockByName(name)`:

        Returns the `TosaSerializationBasicBlock` with name `name`. Returns `nullptr`
        if nothing matches.

## TosaSerializationBasicBlock

This is the basic-block class. It contains vectors of
`TosaSerializationOperator` and `TosaSerializationTensor` objects. Once a
basic block is entered, all of the operators within the block are evaluated
in order.

Upon reaching a TOSA control flow operator (`tosa.WHILE` or
`tosa.COND_IF`), the state of the current unfinished block is saved, and
the regions specified in the control flow operator are evaluated as needed. Once
the control flow regions finish their evaluation, the state of the original
unfinished block is restored and evaluation continues.  This is more
analogous to a function call than to a compiler basic block.

    a. `GetName()`:

        Returns the name of the basic block.

    b. `GetRegionName()`:

        Returns the name of the region containing the basic block.

    c. `GetOperators()`:

        Returns the vector of `TosaSerializationOperator` objects.

    d. `GetTensors()`:

        Returns the vector of `TosaSerializationTensor` objects.

    e. `GetTensorByName(name)`:

        Returns the `TosaSerializationTensor` with name `name`. Returns `nullptr`
        if nothing matches.

    f. `GetInputs()` / `GetOutputs()`:

        Returns the array of input/output tensor names of the basic block.
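The function-call analogy for control flow operators can be sketched with plain recursion: the call stack keeps the position in the unfinished block while a nested region runs. The types `Op`, `Block`, `Region`, `Graph` and the function `EvaluateRegion()` are hypothetical stand-ins for illustration, not the library's classes. As a simplification, every nested region is visited exactly once, whereas a real evaluator would select a branch for `tosa.COND_IF` or iterate for `tosa.WHILE`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins, not the library's classes.
struct Op
{
    std::string name;                           // e.g. "ADD" or "COND_IF"
    std::vector<std::string> nested_regions;    // empty for ordinary operators
};
struct Block
{
    std::vector<Op> ops;
};
struct Region
{
    std::vector<Block> blocks;
};
using Graph = std::map<std::string, Region>;

// Visits every operator of the named region in order and appends its name
// to `trace`. Throws std::out_of_range if the region does not exist.
void EvaluateRegion(const Graph& graph, const std::string& region_name, std::vector<std::string>& trace)
{
    const Region& region = graph.at(region_name);
    for (const Block& block : region.blocks)
    {
        for (const Op& op : block.ops)
        {
            trace.push_back(op.name);
            // The loop position of the current block stays on the call stack,
            // so evaluation resumes with the next operator once the nested
            // region has finished.
            for (const std::string& nested : op.nested_regions)
            {
                EvaluateRegion(graph, nested, trace);
            }
        }
    }
}
```

For a `main` region whose `COND_IF` operator refers to a `then` region, the operators of `then` are visited between the `COND_IF` and the operator that follows it in `main`.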

## TosaSerializationOperator

The operator class contains (1) the TOSA Op, (2) its attribute
(compile-time-known input), and (3) the input/output tensor names.

    a. `GetOp()`:

        Returns TOSA Op. Defined in schema `tosa.fbs`.

    b. `GetAttribute()` / `GetAttributeType()`:

        `GetAttribute()` returns the base object of the attribute.
        `GetAttributeType()` returns the attribute type that the base object
        needs to be cast to.  Attribute types are defined in `tosa.fbs` and
        `include/attribute.def`.

    c. `GetInputTensorNames()` / `GetOutputTensorNames()`:

        Returns the array of input/output tensor names of the operator.
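The pairing of `GetAttribute()` and `GetAttributeType()` follows a common tagged-downcast pattern: the type tag tells the caller which concrete class the base pointer may be cast to. The sketch below shows that pattern with hypothetical stand-ins. `AttrType`, `AttrBase`, `AxisAttr`, `ClampAttr`, `Operator`, and `DescribeAttribute()` are illustration-only names; the real attribute types are those defined in `tosa.fbs` and `include/attribute.def`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical attribute kinds, for illustration only.
enum class AttrType
{
    None,
    Axis,
    Clamp
};

struct AttrBase
{
    virtual ~AttrBase() = default;
};
struct AxisAttr : AttrBase
{
    int axis = 0;
};
struct ClampAttr : AttrBase
{
    int min_val = 0;
    int max_val = 0;
};

// Hypothetical stand-in for TosaSerializationOperator.
class Operator
{
public:
    Operator(AttrType type, std::unique_ptr<AttrBase> attr)
        : type_(type)
        , attr_(std::move(attr))
    {}
    AttrType GetAttributeType() const
    {
        return type_;
    }
    AttrBase* GetAttribute() const
    {
        return attr_.get();
    }

private:
    AttrType type_;
    std::unique_ptr<AttrBase> attr_;
};

// Casts the base object only after checking the type tag.
std::string DescribeAttribute(const Operator& op)
{
    switch (op.GetAttributeType())
    {
        case AttrType::Axis: {
            const auto* attr = static_cast<const AxisAttr*>(op.GetAttribute());
            return "axis=" + std::to_string(attr->axis);
        }
        case AttrType::Clamp: {
            const auto* attr = static_cast<const ClampAttr*>(op.GetAttribute());
            return "clamp=[" + std::to_string(attr->min_val) + "," + std::to_string(attr->max_val) + "]";
        }
        case AttrType::None:
            break;
    }
    return "no attribute";
}
```

Switching on the tag before casting keeps the `static_cast` safe as long as the tag and the stored object are always set together, which the stand-in constructor leaves to the caller.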

## TosaSerializationTensor

The tensor class contains (1) name, (2) shape, (3) data type, (4) data value, and
(5) additional properties, such as whether the tensor is a variable or unranked.

    a. `GetName()` / `SetName(name)`:

        `GetName()` returns the name of the tensor. `SetName()` sets the name
        of the tensor.

    b. `GetShape()`:

        Returns the shape of the tensor as `vector<int32_t>`.

    c. `GetDtype()` / `SetDtype(dtype)`:

        `GetDtype()` returns the data type of the tensor. `SetDtype()` sets the
        data type of the tensor. DType is defined in `tosa.fbs`.

    d. `GetData()` / `SetData(data)`:

        `GetData()` returns a vector of `uint8_t` values which stores the constant
        value for a constant tensor, or the initialization value for a variable tensor.
        `SetData()` sets the constant value for a constant tensor, or the initialization
        value for a variable tensor.

    e. `GetVariable()`:

        Returns whether the tensor is a TOSA Variable.

    f. `GetIsUnranked()` / `SetIsUnranked(value)`:

        `GetIsUnranked()` returns whether the tensor is unranked.
        `SetIsUnranked()` sets whether the tensor is unranked.
        When a tensor is unranked, its shape should be ignored.

    g. `GetVariableName()`:

        Returns the variable name for a TOSA Variable tensor.
        Returns the empty string "" for tensors that are not Variable tensors.
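Because `GetData()` exposes tensor values as raw bytes, callers have to convert between typed values and `uint8_t`. The sketch below shows one way to do that for `int32_t` data with `std::memcpy`. `PackInt32()` and `UnpackInt32()` are hypothetical helpers, and the tightly packed, host-byte-order layout is an assumption that this README does not confirm; check the library sources for the layout it actually uses.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies int32_t values into a byte buffer (assumed layout: tightly packed,
// host byte order).
std::vector<uint8_t> PackInt32(const std::vector<int32_t>& values)
{
    std::vector<uint8_t> bytes(values.size() * sizeof(int32_t));
    if (!bytes.empty())
    {
        std::memcpy(bytes.data(), values.data(), bytes.size());
    }
    return bytes;
}

// Reads int32_t values back from a byte buffer with the same assumed layout.
// Trailing bytes that do not form a complete value are ignored.
std::vector<int32_t> UnpackInt32(const std::vector<uint8_t>& bytes)
{
    std::vector<int32_t> values(bytes.size() / sizeof(int32_t));
    if (!values.empty())
    {
        std::memcpy(values.data(), bytes.data(), values.size() * sizeof(int32_t));
    }
    return values;
}
```

`std::memcpy` is used instead of a pointer cast so that the conversion does not depend on the alignment of the byte buffer.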
# Unit Tests

The C++ and Python versions of the *TOSA Serialization Library* can be tested with GoogleTest and
pytest, respectively. After building, unit tests can be run with the following commands.

- `ctest` from the project's build directory
- `pytest` from the project's root directory
    - `pytest --leave-tmp` preserves temporary files at `python/pytests/tmp/` for debugging.

# Pre Commit Checks

Before pushing a commit, pre commit checks must be run to ensure conformity.

##### Prerequisites
* Do as instructed in the main [Prerequisites section](#prerequisites) and [Compilation section](#compilation)

##### Install Additional pip Packages
* pre-commit (tested with 3.8.0)
* clang-format (tested with 14)

Any platform
```bash
pip install pre-commit==3.8.0 clang-format==14
```

##### Run Pre Commit Checks

Any platform
```bash
pre-commit run --all
```
Note: `regenerate-headers` is currently only supported on POSIX-compliant
platforms. If you change the schema, regenerate on a POSIX platform to get
the new headers and `.py` files.

# License

The *TOSA Serialization Library* is licensed under Apache-2.0.

## Third Party Projects

Third-party projects are referenced in the `CMakeLists.txt` file and are
licensed under the licenses stated in their respective projects.