135 questions
0
votes
0
answers
124
views
calculate flops in Whisper model
I am trying to calculate the FLOP count of a single pass through Whisper's forward() function. I am using the thop library and getting a FLOP count of 0. I believe it's because in its register_hooks, ...
0
votes
0
answers
126
views
calculating Flops of the model
@al2
I am using the post https://stackoverflow.com/a/63426284/25649468 to calculate the FLOPs. I have a my_model.h5 file. I am getting a huge difference in the FLOPs calculations. When the other ...
2
votes
1
answer
298
views
looking for a tool to calculate FLOPs of XLA-HLO computational graph
I'm looking for a tool to calculate the FLOPs of a given XLA-HLO computational graph.
Does anyone know of HLO cost models or analytical models for printing the FLOPs of an operator node for ...
1
vote
0
answers
114
views
Achieving More FMA3 Performance Than The Theoretical Maximum [duplicate]
For an assignment, I am trying to calculate the theoretical maximum achievable GFLOPS/sec of a single core of my processor, an AMD Ryzen 9 5900HS. According to Agner Fog's tables for a Zen 3 AMD ...
1
vote
1
answer
188
views
The Intel MKL LINPACK test indicates too big performance
I ran an Intel MKL LINPACK test on an Intel Core i7-14700K processor and got a peak performance of 557 GFLOPS which seems quite unrealistic.
Size LDA Align. Average Maximal
1000 1000 4 ...
-1
votes
1
answer
161
views
How to compute the model complexity of FasterRCNNFPN pretrained from torchvision?
I got the pretrained FASTERRCNN_RESNET50_FPN model from PyTorch (torchvision); here's the link.
Now I want to compute the model's complexity (number of parameters and FLOPs) as reported from ...
1
vote
0
answers
1k
views
How to calculate the FLOPS of a Python program?
I have the following Python program for the linear search algorithm:
import numpy as np
a, item = [2.6778716682529704, 8.224004328108661, 8.819020166860604, 25.04500044837642,
114.6788167136755, 147....
0
votes
0
answers
371
views
Counting FLOPS in tensorflow
Is there a way to count FLOPS for the training and prediction of TensorFlow models?
The models are running on a CPU using TensorFlow 2.8.0 and I would not like to use an external (e.g. command line) ...
2
votes
2
answers
14k
views
calculate flops in a custom pytorch model
I have a deeply nested PyTorch model and want to calculate the FLOPs per layer. I tried using the flopth, ptflops, and pytorch-OpCounter libraries but couldn't run them for such a deeply nested model. How to ...
0
votes
0
answers
129
views
calculate program execution time in embedded device from python code
I have a Python program which I want to deploy on an MCU. Before selecting an MCU for this task, I want to estimate the absolute base requirements for the MCU. On an M1 Pro chip the self CPU execution ...
0
votes
0
answers
193
views
why is Rpeak different from Rmax when measuring performance?
Rmax is the maximum performance.
Rpeak is the theoretical maximum performance.
But why can't supercomputers reach Rpeak? What causes the inefficiency?
An explanation of the cause of the inefficiency.
1
vote
0
answers
254
views
Why do I get higher Whetstone FLOPS from SiSoft Sandra when I disable extensions (SSE, AVX, FMA)?
I'm working on a college assignment for my computer architecture class and we have to run different benchmark tests on our personal computers to determine how different technologies affect its ...
0
votes
1
answer
1k
views
Is it possible that the inference time is large while number of parameters and flops are low in pytorch?
I calculated the FLOPs of a network using PyTorch.
I used the 'profile' function in the 'thop' library.
In my experiment, my network showed that
Flops : 619.038M
Parameters : 4.191M
Inference time : 25.911
...
0
votes
1
answer
267
views
Difficulty understanding FLOPS in this scenario
Given FLOPS are the floating point operations per second, would that not be dependent on the power of the machine rather than the model and how many parameters it has? What am I missing here? ...
1
vote
0
answers
146
views
TensorFlow Object Detection API - determining FLOPS and number of Parameters
I have trained a custom SSD-MobileNetV2-FPN-Lite (320x320) from the TensorFlow Model zoo (TF2) and would like to know how many trainable parameters and FLOPs this network has. Does anyone know how to ...