All Questions

0 votes · 1 answer · 63 views
Does TensorFlow or XLA provide a Python API to read and parse the dumped MHLO MLIR module?
I turned on XLA when running TensorFlow, and in order to further optimize the fused kernels, I added export XLA_FLAGS="--xla_dump_to=/tmp/xla_dump" and got the dumped IRs, including lmhlo....
— asked by StayFoolish

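As a hedged sketch of the setup the question describes (the dump flags are as quoted there; the script name is hypothetical), the IR dump is produced by pointing XLA_FLAGS at a directory before running any XLA-enabled TF program — whether a dedicated Python parsing API exists for the resulting files is exactly what the question asks:

```shell
# Sketch (flags as used in the question): dump XLA's intermediate IR stages,
# including the (l)mhlo modules, for any TF program run with XLA enabled.
export XLA_FLAGS="--xla_dump_to=/tmp/xla_dump --xla_dump_hlo_as_text"
python my_tf_program.py          # hypothetical script name
ls /tmp/xla_dump                 # one file per module / compilation stage
```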
1 vote · 1 answer · 123 views
How to compile TensorFlow Serving (tensorflow/xla) so that llvm/mlir are shared objects rather than statically linked into the binary?
I am trying to compile the tensorflow serving project and I would like to have llvm/mlir compiled as shared objects. The project is tensorflow serving -> tensorflow -> xla and compiles to a ...
— asked by Capybara

0 votes · 0 answers · 56 views
How XLA loads a saved model and gets tensor information
Context: I want to use XLA (the one within the tensorflow repo) to load a model and input data, and get the output. HloRunner executes the model via Literal: https://github.com/tensorflow/tensorflow/blob/...
— asked by Tinyden

2 votes · 0 answers · 416 views
No registered 'RaggedTensorToTensor' OpKernel for XLA_GPU_JIT devices
In short, I get the following error when running a keras_cv/retina_net-based object-detection model: "No registered 'RaggedTensorToTensor' OpKernel for XLA_GPU_JIT devices ...
— asked by user4711

4 votes · 0 answers · 3k views
Is there a way to suppress STDERR messages from TensorFlow and XLA?
When I run my Python script, I get the messages below: WARNING: All log messages before absl::InitializeLog() is called are written to STDERR I0000 00:00:1701341037.989729 1542352 device_compiler.h:...
— asked by xxx yyy

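A hedged sketch of the usual mitigation for this entry (the script name is hypothetical; TF_CPP_MIN_LOG_LEVEL is a real TensorFlow variable, and it must be set before TensorFlow is imported):

```shell
# TF_CPP_MIN_LOG_LEVEL filters TensorFlow's native (C++) logging:
# 0 = everything, 1 = drop INFO, 2 = also drop WARNING, 3 = also drop ERROR.
export TF_CPP_MIN_LOG_LEVEL=3
python my_script.py              # hypothetical script name

# Blunt fallback if early absl/XLA lines still reach STDERR:
python my_script.py 2>/dev/null
```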
0 votes · 1 answer · 258 views
Is it okay to use Python operators for TensorFlow tensors?
TL;DR Is (a and b) equivalent to tf.logical_and(a, b) in terms of optimization and performance? (a and b are TensorFlow tensors.) Details: I use Python with TensorFlow. My first priority is to make the ...
— asked by Daniel S.

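The core of this question can be shown without TensorFlow at all: Python's `and` cannot be overloaded because it short-circuits through `__bool__`, which a tensor-like object can only answer with a single truth value, while `&` dispatches to `__and__` and can stay element-wise. A minimal pure-Python sketch (the `BoolVec` class is illustrative, not a TF type):

```python
# Sketch of why `a and b` is not an element-wise logical_and for tensor-like
# objects: `and` consults __bool__ (scalar), `&` dispatches to __and__.
class BoolVec:
    def __init__(self, values):
        self.values = list(values)

    def __and__(self, other):          # overloadable: element-wise AND
        return BoolVec(a and b for a, b in zip(self.values, other.values))

    def __bool__(self):                # what `and` consults; must be scalar
        raise TypeError("truth value of a vector is ambiguous")

x = BoolVec([True, True, False])
y = BoolVec([True, False, False])
print((x & y).values)                  # [True, False, False]
try:
    x and y                            # short-circuits via __bool__ -> raises
except TypeError as e:
    print(e)
```

TensorFlow tensors behave analogously: `a & b` maps to the element-wise op, while `a and b` forces a scalar truth value, which fails inside traced graphs.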
2 votes · 1 answer · 2k views
Why does tf.function (without jit_compile) speed up forward passes of a Keras model?
XLA can be enabled using model = tf.function(model, jit_compile=True). Some model types are faster that way, some are slower. So far, so good. But why can model = tf.function(model, jit_compile=None) ...
— asked by Tobias Hermann

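One plausible, hedged piece of the answer: even with jit_compile left off, tf.function traces the Python body once per input signature into a graph that the runtime replays, skipping per-op Python dispatch on later calls. A toy, TensorFlow-free sketch of that trace-cache idea (all names here are illustrative, not TF internals):

```python
# Toy illustration of per-signature tracing, NOT TensorFlow's implementation:
# the wrapped body is "traced" once per argument-type signature and reused.
def trace_once(fn):
    cache = {}
    def wrapper(*args):
        sig = tuple(type(a).__name__ for a in args)  # crude input signature
        if sig not in cache:
            wrapper.traces += 1                      # a real system records ops here
            cache[sig] = fn
        return cache[sig](*args)
    wrapper.traces = 0
    return wrapper

@trace_once
def forward(x):
    return x * 2

print(forward(3), forward(4), forward(2.5))  # 6 8 5.0
print(forward.traces)                        # 2: one trace for int, one for float
```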
0 votes · 0 answers · 41 views
Get the computation cost of running a TensorFlow graph
I have a frozen tensorflow graph, and I'm wondering what the best method is to get the computation cost of running it (assuming it only uses deterministic operations and nothing that makes it Turing ...
— asked by Dan8757

1 vote · 0 answers · 273 views
TensorFlow with XLA causing a memory leak
I'm training the EfficientDet neural network with TensorFlow 2.9 in a Docker container. Without XLA compilation, everything runs fine. With XLA, I'm getting a 4x performance boost! However, there is a ...
— asked by Fred

4 votes · 0 answers · 425 views
Visualize TensorFlow graphs before and after Grappler passes?
I've been trying to visualize the graph of a tf.function with and without Grappler optimizations, but so far I'm not managing to see any difference in the generated graphs. Here is the process I ...
— asked by Paul Delestrac

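A hedged sketch of one way to get at the before/after graphs (TF_DUMP_GRAPH_PREFIX is a real TensorFlow debugging variable; which passes honor it, and the script name, are assumptions here):

```shell
# Assumption: with TF_DUMP_GRAPH_PREFIX set, several TF graph passes
# (Grappler among them) write the GraphDefs they see into this directory.
export TF_DUMP_GRAPH_PREFIX=/tmp/tf_graphs
python my_model.py               # hypothetical script name
# Diff the dumped before/after .pbtxt files, or load them into TensorBoard.
```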
1 vote · 1 answer · 844 views
In TensorFlow 1.15, what's the difference between explicit XLA compilation and auto-clustering?
I'm trying to learn how to use XLA for my models, and I'm looking at the official doc here: https://www.tensorflow.org/xla#enable_xla_for_tensorflow_models. It is documented that there are two ...
— asked by StayFoolish

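For context, a hedged sketch of the two modes the linked doc contrasts (the flag is the documented auto-clustering switch; the script name is hypothetical, and in TF 1.x the explicit-compilation API differs slightly from the tf.function form shown in the comment):

```shell
# Auto-clustering: XLA itself picks eligible subgraphs; no code changes needed.
export TF_XLA_FLAGS=--tf_xla_auto_jit=2
python train.py                  # hypothetical script name

# Explicit compilation instead marks functions in code, e.g. (TF 2.x form):
#   fn = tf.function(fn, jit_compile=True)
```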
0 votes · 0 answers · 140 views
How to get a coarse-grained op-level graph in TensorFlow
I want to use tensorflow to get the full computation graph (including forward, backward and parameter update). I tried tf.function, but the graph I got is too fine-grained, as many ops (Adam for ...
— asked by Jason

1 vote · 1 answer · 514 views
TensorFlow with XLA doesn't fully utilize CPU capacity
I have created a Monte Carlo simulation model implemented in TensorFlow 2.5. The model mostly consists of vector multiplications inside a tf.while_loop. I am benchmarking the performance on a Linux ...
— asked by photon1981

1 vote · 0 answers · 123 views
Why does TensorFlow XLA need many new XLA op kernels?
In the TensorFlow code for XLA, I see kernels for many ops, such as compiler/tf2xla/kernels/concat_op. It seems like a repetition of core/kernels/concat_op. Why Ops like compiler/tf2xla/kernels/concat_op ...
— asked by liym27

0 votes · 0 answers · 106 views
XLA rng-bit-generator takes too much memory
XLA allocates 4G of memory to this tensor, and its size seems to scale with the batch size. That doesn't make sense to me; it doesn't seem to be part of the model graph to be stored in HBM. I ...
— asked by iordanis

