Comparison of DGEMV (Double) and SGEMV (Single) across Vector Targets (Vector Sizes 256-4096)
Copyright 2025
Benchmarks were captured from OpenBLAS 0.3.30 (official release), which was released on June 19, 2025.
Detailed guidelines for running these benchmarks and collecting the performance data presented in this report, including build configurations, runtime parameters, and measurement methodology, are described in the benchmark.md documentation.
Released by the Fedora-V Force team.
Download link: images.fedoravforce.org
Manufactured by SpacemiT.
CPU: 8 cores, model SpacemiT® X60.
ISA Profile:
rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintpause_zihpm_zfh_zfhmin_zca_zcd_zba_zbb_zbc_zbs_zkt_zve32f_zve32x_zve64d_zve64f_zve64x_zvfh_zvfhmin_zvkt_sscofpmf_sstc_svinval_svnapot_svpbmt
MMU Mode: sv39.
GEMV performs matrix-vector multiplication: y = α·A·x + β·y, where A is a matrix and x, y are vectors. Unlike GEMM (matrix-matrix), GEMV is memory-bandwidth bound rather than compute-bound, making it more challenging to optimize. Each element of A is typically accessed only once, limiting opportunities for data reuse in cache.
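The operation above can be sketched as a naive reference in plain Python (this is illustrative only, not the optimized OpenBLAS kernel). It makes the bandwidth argument concrete: each element of A is loaded exactly once and used for just one multiply-add, so arithmetic intensity is low.

```python
def gemv(alpha, A, x, beta, y):
    """Naive GEMV reference: y := alpha * A @ x + beta * y.

    A is an m-by-n matrix stored as a list of rows; x has length n,
    y has length m. Each A[i][j] is read exactly once, so A gets no
    cache reuse: roughly 2 flops per 8-byte load in double precision,
    which is why GEMV is memory-bandwidth bound rather than
    compute-bound.
    """
    m = len(A)
    n = len(x)
    for i in range(m):
        acc = 0.0
        for j in range(n):          # single pass over row i of A
            acc += A[i][j] * x[j]
        y[i] = alpha * acc + beta * y[i]
    return y

# Example: 2x3 matrix, alpha = 1, beta = 0
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x = [1.0, 1.0, 1.0]
y = [0.0, 0.0]
gemv(1.0, A, x, 0.0, y)  # y becomes [6.0, 15.0]
```

In double precision (DGEMV) each matrix element costs 8 bytes of traffic for 2 flops; single precision (SGEMV) halves the traffic per element, which is one reason the two precisions can scale differently across vector sizes in the tables below.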
Unlike GEMM, GEMV optimization yields highly variable results across vector sizes and precisions:
| Precision | Target | LMUL | Vector Size | Performance (MFLOPS) | Time (s) |
|---|---|---|---|---|---|