CoDec

This repository contains the code for CoDec: Prefix-Shared Decoding Kernel for LLMs.

Environment

CUDA Toolkit 12.9

For CoDec on Ascend, the required hardware and software are as follows:

Ascend hardware:
- Atlas A2 Training / Inference Series Products
- Atlas A3 Training / Inference Series Products
- Ascend 950PR/Ascend 950DT
CPU architecture: aarch64/x86_64
OS: Linux distributions supported by CANN, such as Ubuntu 20.04/22.04 and openEuler 22.03 SP4
Software dependencies:
- gcc >= 7.5, < 13.0
- cmake >= 3.16
- python >= 3.8, < 3.12
- CANN Toolkit >= 8.5.0 (https://www.hiascend.com/cann)
Recommended configurations:

OS	CANN	gcc	cmake	python
Ubuntu 20.04.5	8.5.0	9.3	3.16	3.10
Ubuntu 22.04.5	8.5.0	11.3	3.22	3.10
openEuler 22.03 SP4	8.5.0	10.3	3.22	3.10
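The version ranges listed under "Software dependencies" can be checked before building. The following is a minimal sketch, not part of the project: the helper names (parse_version, in_range, tool_version, check_environment) are hypothetical, the version is read heuristically from each tool's `--version` output, only major.minor is compared, and the CANN Toolkit version is not checked.

```python
import re
import subprocess
import sys

# Bounds copied from the dependency list: (minimum inclusive, maximum exclusive or None).
REQUIREMENTS = {
    "gcc": ((7, 5), (13, 0)),
    "cmake": ((3, 16), None),
    "python": ((3, 8), (3, 12)),
}


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints, or None."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def in_range(version, minimum, maximum):
    """Check minimum <= version < maximum, comparing major.minor only."""
    key = version[:2]
    if key < minimum:
        return False
    return maximum is None or key < maximum


def tool_version(tool):
    """Run `<tool> --version` and parse its output; None if the tool is not installed."""
    try:
        result = subprocess.run(
            [tool, "--version"], capture_output=True, text=True, check=False
        )
    except OSError:
        return None
    return parse_version(result.stdout)


def check_environment():
    """Map each dependency name to True if it was found and falls within range."""
    found = {
        "gcc": tool_version("gcc"),
        "cmake": tool_version("cmake"),
        "python": tuple(sys.version_info[:3]),
    }
    report = {}
    for name, (minimum, maximum) in REQUIREMENTS.items():
        version = found[name]
        report[name] = version is not None and in_range(version, minimum, maximum)
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing or out of range'}")
```

Running the script prints one line per dependency. A tool reported as "missing or out of range" is either not on PATH or outside the bounds above.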

Installation

uv pip install torch
uv pip install -Ue . --no-build-isolation

For CoDec on Ascend, install the CANN toolkit and then build the project as follows:

Install the Community Edition CANN toolkit package

Depending on your Ascend product, download the matching CANN toolkit package, Ascend-cann-toolkit_{version}_linux-{arch}.run. See the CANN toolkit page for the download link.

Then install the CANN toolkit package (for details, refer to the CANN Installation Guide).

# Ensure the installer has executable permission
chmod +x Ascend-cann-toolkit_{version}_linux-{arch}.run
# Install the CANN toolkit package
./Ascend-cann-toolkit_{version}_linux-{arch}.run --full --force --install-path=${install_path}
# Enable the CANN environment. The path below assumes a default-path installation as root;
# for non-root users, replace /usr/local with ${install_path}.
source /usr/local/Ascend/ascend_toolkit/set_env.sh

{version}: CANN package version.
{arch}: system architecture (aarch64 or x86_64).
${install_path}: installation path; the default is /usr/local/Ascend.
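As an illustration of how the placeholders expand, here is a minimal sketch. The helper names installer_name and set_env_path are hypothetical. The sketch assumes that the {arch} field uses the same string that Python's platform.machine() reports on Linux (aarch64 or x86_64), and that set_env.sh sits at the location shown in the command above.

```python
import platform


def installer_name(version, arch=None):
    """Fill in the {version} and {arch} placeholders of the installer file name."""
    if arch is None:
        # Assumption: the package naming matches platform.machine() on Linux.
        arch = platform.machine()
    return f"Ascend-cann-toolkit_{version}_linux-{arch}.run"


def set_env_path(install_path="/usr/local/Ascend"):
    """Location of set_env.sh under the given install path, following the command above."""
    return f"{install_path}/ascend_toolkit/set_env.sh"


if __name__ == "__main__":
    print(installer_name("8.5.0"))
    print(set_env_path())
```

The arch argument can be passed explicitly when the package is downloaded on a different machine from the one it will be installed on.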

Download and install dependencies

Download the source code of this project, then run the following commands from the project directory.

# Download the project source code and enter the project directory
git clone https://github.com/wzbxpy/codec.git
cd codec
# Install the Python environment dependencies according to the requirements file.

# Build the specified example
cd catlass-faInfer-shared-prefix
bash scripts/build.sh flash_attention_infer_tla

If the following message appears, the build is successful.

"[INFO] Target "{flash_attention_infer_tla}" built successfully."

Run the operator

A script is provided for running and testing the operator:

bash examples/flash_attention_infer_tla/run.sh

You can modify the following parameters in the script:

batch, qSeqlen, kvSeqlen, numHeads, kvHeads, headSize, dtype ("bf16" or "half"), device, accCheck
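To get a feel for how these parameters interact, the sketch below estimates tensor sizes for one attention call. It is an illustration under stated assumptions, not the project's code: the function name attention_footprint is hypothetical, numHeads is assumed to be a multiple of kvHeads (grouped-query attention), both dtypes are taken as 2 bytes per element, and the K/V estimate assumes a plain dense layout in which every sequence in the batch stores its own keys and values, that is, without any prefix sharing.

```python
# Bytes per element for the two dtypes the script accepts.
DTYPE_BYTES = {"bf16": 2, "half": 2}


def attention_footprint(batch, qSeqlen, kvSeqlen, numHeads, kvHeads, headSize, dtype="bf16"):
    """Estimate Q and K/V sizes in bytes for a dense, non-shared layout (assumption)."""
    if numHeads % kvHeads != 0:
        raise ValueError("numHeads must be a multiple of kvHeads")
    elem = DTYPE_BYTES[dtype]
    q_bytes = batch * qSeqlen * numHeads * headSize * elem
    # Factor 2 accounts for storing both keys and values.
    kv_bytes = 2 * batch * kvSeqlen * kvHeads * headSize * elem
    return {
        "group_size": numHeads // kvHeads,  # query heads per K/V head
        "q_bytes": q_bytes,
        "kv_bytes": kv_bytes,
    }


if __name__ == "__main__":
    print(attention_footprint(batch=2, qSeqlen=1, kvSeqlen=1024,
                              numHeads=32, kvHeads=8, headSize=128))
```

Under these assumptions the K/V size grows linearly with batch, so a larger batch at a fixed kvSeqlen costs proportionally more K/V memory.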

Evaluation

# kernel evaluation
scripts/kernel.sh

# end to end evaluation
scripts/e2e.sh
