Insights: pytorch/pytorch
September 24, 2024 – September 27, 2024
Overview
17 Pull requests merged by 8 people
-
SDPA regression fix to work around high-precision by default
#136536 merged
Sep 27, 2024 -
[Docs] fix inconsistent docs in conv1d, conv2d, and conv3d
#136813 merged
Sep 27, 2024 -
[Update] Update note for Getting Started with PyTorch on Intel GPUs
#136731 merged
Sep 27, 2024 -
Fix ROCm skip decorator for test_ddp_tp and multiprocess UTs (#136161)
#136801 merged
Sep 26, 2024 -
Update current maintainers
#136769 merged
Sep 26, 2024 -
Constraint setuptools to 72.1.0 or older in requirements.txt
#136729 merged
Sep 26, 2024 -
Revert "Trace fwd graph under no_grad mode #134872"
#136734 merged
Sep 26, 2024 -
Make test_skip_data_serialization regex more flexible
#136710 merged
Sep 26, 2024 -
Disable iOS workflow
#136706 merged
Sep 26, 2024 -
[RELEASE-ONLY CHANGES] Don't push to https://ghcr.io/
#136703 merged
Sep 26, 2024 -
Fix hardcoded ROCm paths in `Caffe2Targets.cmake`
#136700 merged
Sep 26, 2024 -
[ROCm] upgrade ROCm CI builds to py3.10 (#134108)
#136696 merged
Sep 26, 2024 -
[ROCm][CI] upgrade CI to ROCm 6.2 (#132555)
#136467 merged
Sep 25, 2024 -
[ROCm] Cherry-pick unit test fixes to release/2.5
#136557 merged
Sep 25, 2024 -
fix stride compare failed when size value equal to one in ForeachUtils.h
#136426 merged
Sep 25, 2024 -
[ROCm] [BUGFIX] Re-enable rocm-specific tuning parameters v2 (#133852)
#136139 merged
Sep 25, 2024 -
Fix test_skip_data_serialization pickle exception match
#136617 merged
Sep 25, 2024
131 Pull requests opened by 80 people
-
[DeviceMesh][EZ] Add group description to new group
#136558 opened
Sep 24, 2024 -
Sac ilp
#136562 opened
Sep 24, 2024 -
Not for commit -- debugging CI
#136564 opened
Sep 24, 2024 -
Rel 2.5 dummy change
#136569 opened
Sep 24, 2024 -
Enable regression test for add loop benchmarks
#136573 opened
Sep 24, 2024 -
[SymmetricMemory] improve multicast initialization/fallback logic
#136577 opened
Sep 24, 2024 -
[DTensor] Add shard method
#136589 opened
Sep 25, 2024 -
[export] simplify automatic dynamic shapes processing
#136591 opened
Sep 25, 2024 -
Bump webrick from 1.7.0 to 1.8.2 in /ios/TestApp
#136593 opened
Sep 25, 2024 -
[Partitioner] Enumerate partitions by iterating partition ids
#136598 opened
Sep 25, 2024 -
[DeviceMesh] Remove set_device
#136604 opened
Sep 25, 2024 -
[FSDP2] Added `shard_placement_fn` only for `_FlatShard`
#136606 opened
Sep 25, 2024 -
[Partitioner] Remove unnecessary upstream nodes in dependency viewer
#136608 opened
Sep 25, 2024 -
[inductor] Reduce block sizes when using Triton CPU backend
#136612 opened
Sep 25, 2024 -
[Partitioner] Reduce time consuming of partitions merger
#136614 opened
Sep 25, 2024 -
[DeviceMesh] Respect user's device type in complimentary PG init
#136615 opened
Sep 25, 2024 -
[Partitioner] Speed up the update of partition map
#136616 opened
Sep 25, 2024 -
Enable ruff's unused variable checking for all pytorch
#136625 opened
Sep 25, 2024 -
Add out_dtype kw argument to optimize_bsr_dense_addmm
#136626 opened
Sep 25, 2024 -
Set RUNPATH so installed tests can find the required shared libraries
#136627 opened
Sep 25, 2024 -
[aotd] Test rrelu noise mutation in compile
#136629 opened
Sep 25, 2024 -
Fix 136201-Compile with USE_CPP_CODE_COVERAGE=ON throw erros: use lld…
#136632 opened
Sep 25, 2024 -
Add output dtype support to count_nonzeros
#136635 opened
Sep 25, 2024 -
Remove potentially unnecessary decomps
#136641 opened
Sep 25, 2024 -
[DO NOT MERGE] Test .github/workflows/inductor-perf-compare.yml on AWS A100 infra
#136646 opened
Sep 25, 2024 -
[ts_converter] Fix prim::If buffer names
#136648 opened
Sep 25, 2024 -
[ts_converter] Support as_tensor
#136649 opened
Sep 25, 2024 -
Fix six broken tests in test_ops.py
#136653 opened
Sep 25, 2024 -
add types to _dynamo/code_context.py
#136665 opened
Sep 25, 2024 -
Migrate ARM64 Linux binary jobs to runner determinator
#136666 opened
Sep 25, 2024 -
Don't generate implicit value ranges for missing symbols.
#136667 opened
Sep 25, 2024 -
Revert a bunch of stuff
#136668 opened
Sep 25, 2024 -
inductor: use previous guards to know if a size is 1 for broadcasting
#136670 opened
Sep 25, 2024 -
Tensorify compute on Python scalars
#136674 opened
Sep 25, 2024 -
[aotd] No AOT compilation for backward
#136675 opened
Sep 25, 2024 -
enable auto functionalize v2 by default
#136685 opened
Sep 25, 2024 -
[user triton] Make tl.constexpr specialization work for triton_op & capture_triton
#136686 opened
Sep 25, 2024 -
[Inductor][CPP] Cache weight tiles in L1D for AMX int8 WoQ GEMM
#136688 opened
Sep 25, 2024 -
Rewrite fake mode detector
#136690 opened
Sep 25, 2024 -
Scoped extension building for C++ backed custom ops tests
#136695 opened
Sep 26, 2024 -
## Fix `devices` Parameter Type in `benchmark_utilization` Function
#136698 opened
Sep 26, 2024 -
[inductor] Test scheme to minimize mem overhead of autotuning
#136701 opened
Sep 26, 2024 -
[WIP][Inductor] auto-chunker
#136702 opened
Sep 26, 2024 -
TEMP
#136707 opened
Sep 26, 2024 -
Delete duplicate bindings in torch/csrc/autograd/python_torch_functions_manual.cpp
#136711 opened
Sep 26, 2024 -
[PT2][Inductor] Add runtime numeric check for the post grad pass
#136724 opened
Sep 26, 2024 -
Add diagonal_copy to torch/_decomp/__init__.py
#136730 opened
Sep 26, 2024 -
[Inductor] change user_visible_outputs to user_visible_output_idxs
#136732 opened
Sep 26, 2024 -
[compiled autograd] initialize cudagraph tls from context manager
#136735 opened
Sep 26, 2024 -
Add type check for `f` in `torch.package.PackageExporter`
#136738 opened
Sep 26, 2024 -
[Quant] Check stride > 0 for QConv and QConvTranspose
#136739 opened
Sep 26, 2024 -
[no land] test fail due to win
#136740 opened
Sep 26, 2024 -
[compiled autograd] undo view_to_reshape inductor fx pass in node name matching
#136741 opened
Sep 26, 2024 -
[AOTI] Support generate c shim layer for Intel GPU.
#136742 opened
Sep 26, 2024 -
Wrap torch_python with torch_compile_options
#136743 opened
Sep 26, 2024 -
Fix overflow error when `torch.bincount()` handles a large tensor
#136745 opened
Sep 26, 2024 -
change GPT2ForSequenceClassification inference accuracy tolerance
#136749 opened
Sep 26, 2024 -
[Intel GPU] qlinear.pointwise with mixed dtype support
#136753 opened
Sep 26, 2024 -
compile time benchmarks for AOTDispatcher (inference/training/subclasses)
#136759 opened
Sep 26, 2024 -
compile time benchmarks for AOTDispatcher (partitioner)
#136760 opened
Sep 26, 2024 -
[Inductor UT] Generalize device-bias code introduced from #136472
#136761 opened
Sep 26, 2024 -
TEST
#136763 opened
Sep 26, 2024 -
Fix for MSVC problem on Windows Arm64
#136765 opened
Sep 26, 2024 -
[aoti][inplace] Support skipping model buffers
#136770 opened
Sep 26, 2024 -
Preserve custom ops via run_decomps
#136773 opened
Sep 26, 2024 -
Remove dtype check on meta device
#136774 opened
Sep 26, 2024 -
[WIP] Add dtype attribute to TritonCSEVariable
#136778 opened
Sep 26, 2024 -
[Inductor] Ensure that the strides of user-visible outputs remain unchanged after post_grad passes
#136779 opened
Sep 26, 2024 -
Add generator parameter to rand*_like functions
#136780 opened
Sep 26, 2024 -
[inductor] add a threshold for membw saving during fusion
#136782 opened
Sep 26, 2024 -
Enable experiments for protected branches
#136785 opened
Sep 26, 2024 -
Download pre-compiled AOTriton from GitHub unless AOTRITON_INSTALL_FROM_SOURCE=1 is set
#136786 opened
Sep 26, 2024 -
add ToFloat, TruncToInt to PythonReferenceAnalysis
#136787 opened
Sep 26, 2024 -
override bool(), is_nonzero for real tensor tracing
#136788 opened
Sep 26, 2024 -
[NCCL] Implement ncclCommInitRankScalable
#136789 opened
Sep 26, 2024 -
[c10d] Fix the device query story of ProcessGroup
#136790 opened
Sep 26, 2024 -
FlexAttention support for NJT
#136792 opened
Sep 26, 2024 -
Init threadpool with user defined num_threads before default
#136793 opened
Sep 26, 2024 -
[BE] Add script to keept the runner-determinator scripts in sync
#136794 opened
Sep 26, 2024 -
Skip the torch.compile in torch::deploy
#136795 opened
Sep 26, 2024 -
[TorchRec][PT2 compile] enable dynamo in _get_user_embeddings
#136798 opened
Sep 26, 2024 -
[CI] upload_metrics function to upload to s3 instead of dynamo
#136799 opened
Sep 26, 2024 -
update the torch.linalg.solve tests for NumPy 2
#136800 opened
Sep 26, 2024 -
[export] add translations for SymInt/Bool deserialization; FloorDiv
#136802 opened
Sep 26, 2024 -
[hack/POC] get DTensor to work with compiled autograd
#136803 opened
Sep 26, 2024 -
[pipelining] Clean up dead code
#136804 opened
Sep 26, 2024 -
[RELEASE-ONLY CHANGES] Delete slow workflows
#136805 opened
Sep 26, 2024 -
Enable tracing through auot_functionalized_v2 in compiled autograd
#136806 opened
Sep 26, 2024 -
[Pytorch][AO] Update choose_qparams_per_token op to output correct shape for scales and zp
#136807 opened
Sep 27, 2024 -
[inductor] Improve operatorbench.py
#136808 opened
Sep 27, 2024 -
[inductor] Benchmark Halide in operatorbench.py
#136809 opened
Sep 27, 2024 -
[halide-backend] Fix ops.fma codegen
#136810 opened
Sep 27, 2024 -
Companion PR to https://github.com/pytorch/pytorch/pull/134022
#136818 opened
Sep 27, 2024 -
Avoid sqrt calculations with values less than zero
#136824 opened
Sep 27, 2024 -
[DONOTMERGE] Update xpu.txt
#136825 opened
Sep 27, 2024 -
Added some tests to prevent regressions in partitioning and flexattention
#136826 opened
Sep 27, 2024 -
[cpu] Modify inductor opt flag --- ftree-loop-vectorize
#136827 opened
Sep 27, 2024 -
[Inductor] Handle device property `warp_size` is None but used on XPU.
#136834 opened
Sep 27, 2024 -
Add back DistributedDataParallel types that were lost when pyi was removed
#136835 opened
Sep 27, 2024 -
[Distributed][Test] Fix todo in distributed test files
#136836 opened
Sep 27, 2024 -
Add option to disable operator profiling
#136838 opened
Sep 27, 2024 -
Update maintainers for inductor and x86 CPU
#136839 opened
Sep 27, 2024 -
[SymmetricMemory] expose the multicast_ptr
#136840 opened
Sep 27, 2024 -
[TEST ONLY][hack/POC] get DTensor to work with compiled autograd
#136841 opened
Sep 27, 2024 -
Traceable FSDP2 + TP
#136842 opened
Sep 27, 2024 -
[fsdp2] based on device, use stream and Event
#136843 opened
Sep 27, 2024 -
Use static variables
#136847 opened
Sep 27, 2024 -
Fix clang-tidy warnings
#136848 opened
Sep 27, 2024 -
Enable XNNPACK for quantized add
#136850 opened
Sep 27, 2024 -
Enable clang-tidy on torch/csrc/lazy
#136851 opened
Sep 27, 2024 -
Run Aarch64 Dashboard with TORCHINDUCTOR_FREEZING and TORCHINDUCTOR_CPP_WRAPPER
#136853 opened
Sep 27, 2024 -
[DCP] use global coordinator rank for distributed ops in _DistWrapper
#136854 opened
Sep 27, 2024 -
Change aarch64 dashboard config to use float32 inference
#136855 opened
Sep 27, 2024 -
[WIP][Inductor UT] Generalize newly introduced inductor UTs for intel GPU (Part 2)
#136856 opened
Sep 27, 2024 -
Get rid of quadratic tests to has_same_metadata
#136857 opened
Sep 27, 2024 -
Avoid reorder in mkldnn_to_dense when output is already in a public format
#136859 opened
Sep 27, 2024 -
[MPS] Error checking/bf16 support for `torch.normal`
#136863 opened
Sep 27, 2024 -
[reland][Elastic] Skip store barrier and store get in host assign
#136865 opened
Sep 27, 2024 -
[inductor] Enable coordinate descent tuning with max-autotune
#136867 opened
Sep 27, 2024 -
testing
#136868 opened
Sep 27, 2024 -
[export] Draft of draft export
#136869 opened
Sep 27, 2024 -
drafting
#136870 opened
Sep 27, 2024 -
Improve is_fbcode functionality
#136871 opened
Sep 27, 2024 -
Fix prefix store seg fault
#136872 opened
Sep 27, 2024 -
[AOTI] Add TORCH_CHECK_STD_ERROR
#136873 opened
Sep 27, 2024 -
Bump triton pin to latest 3.1.x release branch
#136874 opened
Sep 27, 2024 -
Fix autograd.Function + NJT when an output grad is None
#136875 opened
Sep 27, 2024 -
[testing] reenable kernel_benchmark.py tests
#136876 opened
Sep 27, 2024 -
upload test stats: remove nan/inf when uploading
#136877 opened
Sep 27, 2024 -
dont let partitioner think it can fuse pointwise ops into user triton kernels
#136878 opened
Sep 27, 2024
196 Issues closed by 23 people
-
[Doc issue] RMSNorm formula
#136597 closed
Sep 27, 2024 -
Severe SDPA Performance Regression 2.5.0-RC1
#135778 closed
Sep 27, 2024 -
Could not find a configuration file for package "HIP" - requested 1.0, found 6.0.0
#128313 closed
Sep 27, 2024 -
PyTorch for ROCm on a Supported Device Throws "hipErrorNoBinaryForGpu"
#73534 closed
Sep 27, 2024 -
test_eig_with_eigvec_cuda_float64 is flaky on ROCm
#57128 closed
Sep 27, 2024 -
Error on installation
#83795 closed
Sep 27, 2024 -
DISABLED test_unary_ops (__main__.TestTensorExprFuser)
#105119 closed
Sep 27, 2024 -
`torch.igamma` error and gives wrong results on float64 ROCm
#46531 closed
Sep 27, 2024 -
[ROCm] test failures during 4.1 upgrade
#54535 closed
Sep 27, 2024 -
from torch._C import default_generator ImportError: cannot import name 'default_generator'
#40295 closed
Sep 27, 2024 -
ROCm 2.1: test_gamma_gpu_sample test fails
#16661 closed
Sep 27, 2024 -
SDPA batching rules need randomness handling
#135020 closed
Sep 27, 2024 -
`torch._export.aot_compile` CUDA version not compatible with C++ ABI
#134777 closed
Sep 27, 2024 -
Time cost bug in "torch.linalg.cholesky()"
#136823 closed
Sep 27, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_3_cpu_bool (__main__.TestInductorOpInfoCPU)
#135986 closed
Sep 27, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_0_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135985 closed
Sep 27, 2024 -
DISABLED test_comprehensive_nn_functional_cosine_embedding_loss_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135981 closed
Sep 27, 2024 -
DISABLED test_comprehensive_short_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135987 closed
Sep 27, 2024 -
DISABLED test_comprehensive_repeat_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135984 closed
Sep 27, 2024 -
DISABLED test_comprehensive_nn_functional_interpolate_nearest-exact_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135988 closed
Sep 27, 2024 -
DISABLED test_comprehensive_nn_functional_interpolate_nearest-exact_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135983 closed
Sep 27, 2024 -
DISABLED test_comprehensive_nn_functional_pad_replicate_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135975 closed
Sep 27, 2024 -
DISABLED test_comprehensive_softmax_with_dtype_cpu_bool (__main__.TestInductorOpInfoCPU)
#135976 closed
Sep 27, 2024 -
DISABLED test_comprehensive_special_spherical_bessel_j0_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135982 closed
Sep 27, 2024 -
DISABLED test_comprehensive_round_decimals_neg_3_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135980 closed
Sep 27, 2024 -
DISABLED test_comprehensive_nn_functional_pad_replicate_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135977 closed
Sep 27, 2024 -
DISABLED test_comprehensive_scatter_reduce_amax_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135978 closed
Sep 27, 2024 -
DISABLED test_comprehensive_signal_windows_nuttall_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135989 closed
Sep 27, 2024 -
DISABLED test_comprehensive_nn_functional_pad_constant_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135974 closed
Sep 27, 2024 -
Torchvision.transforms.v2 does nothing / fails silently with numpy arrays
#136844 closed
Sep 27, 2024 -
`torch._dynamo.exc.Unsupported: torch.* op returned non-Tensor bool call_method is_inference`
#135439 closed
Sep 27, 2024 -
DISABLED test_comprehensive_sinc_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135940 closed
Sep 27, 2024 -
DISABLED test_comprehensive_put_cpu_bool (__main__.TestInductorOpInfoCPU)
#135952 closed
Sep 27, 2024 -
DISABLED test_comprehensive_scatter_reduce_sum_cpu_bool (__main__.TestInductorOpInfoCPU)
#135947 closed
Sep 27, 2024 -
DISABLED test_comprehensive_round_decimals_3_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135951 closed
Sep 27, 2024 -
DISABLED test_comprehensive_prod_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135942 closed
Sep 27, 2024 -
DISABLED test_comprehensive_rsub_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135946 closed
Sep 27, 2024 -
DISABLED test_comprehensive_scatter_reduce_prod_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135944 closed
Sep 27, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_2_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135950 closed
Sep 27, 2024 -
DISABLED test_comprehensive_special_scaled_modified_bessel_k1_cpu_bool (__main__.TestInductorOpInfoCPU)
#135941 closed
Sep 27, 2024 -
DISABLED test_comprehensive_put_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135938 closed
Sep 27, 2024 -
DISABLED test_comprehensive_scatter_reduce_amin_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135939 closed
Sep 27, 2024 -
DISABLED test_comprehensive_special_ndtri_cpu_bool (__main__.TestInductorOpInfoCPU)
#135948 closed
Sep 27, 2024 -
DISABLED test_comprehensive_square_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135943 closed
Sep 27, 2024 -
DISABLED test_comprehensive_ones_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135945 closed
Sep 27, 2024 -
DISABLED test_comprehensive_remainder_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135937 closed
Sep 27, 2024 -
models `.forward` and exported onnx are not the same
#130826 closed
Sep 27, 2024 -
torch.onnx.errors.UnsupportedOperatorError
#131635 closed
Sep 27, 2024 -
torch.onnx.export Doesn't allow dynamic shapes when updating a tensor
#135233 closed
Sep 27, 2024 -
ONNX dynamic sized model export with torch.onnx.dynamo_export fails when .copy_() / roll / fftn is used
#128324 closed
Sep 27, 2024 -
onnx.export() fails on aten::embedding_bag with padding_idx
#128930 closed
Sep 27, 2024 -
[ONNX] How to export the FlashAttention kernel
#135645 closed
Sep 27, 2024 -
`torch.nn.functional._in_projection_packed` Failed to export to ONNX
#135764 closed
Sep 27, 2024 -
[ONNX] Support `operator.mod`
#136524 closed
Sep 27, 2024 -
DISABLED test_comprehensive_scatter_add_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135899 closed
Sep 27, 2024 -
DISABLED test_comprehensive_norm_inf_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135908 closed
Sep 27, 2024 -
DISABLED test_comprehensive_ones_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135902 closed
Sep 27, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_4_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135900 closed
Sep 27, 2024 -
DISABLED test_comprehensive_ones_like_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135905 closed
Sep 27, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_2_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135904 closed
Sep 27, 2024 -
DISABLED test_comprehensive_special_scaled_modified_bessel_k0_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135901 closed
Sep 27, 2024 -
DISABLED test_comprehensive_roll_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135898 closed
Sep 27, 2024 -
DISABLED test_comprehensive_special_xlog1py_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135895 closed
Sep 27, 2024 -
DISABLED test_comprehensive_special_modified_bessel_k1_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135903 closed
Sep 27, 2024 -
DISABLED test_comprehensive_rand_like_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135897 closed
Sep 27, 2024 -
DISABLED test_comprehensive_remainder_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135907 closed
Sep 27, 2024 -
DISABLED test_comprehensive_nn_functional_conv3d_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135896 closed
Sep 27, 2024 -
DISABLED test_comprehensive_signbit_cpu_bool (__main__.TestInductorOpInfoCPU)
#135906 closed
Sep 27, 2024 -
DISABLED test_comprehensive_reciprocal_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135909 closed
Sep 27, 2024 -
DISABLED test_comprehensive_repeat_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135910 closed
Sep 27, 2024 -
vector norm is drastically different for different data types
#123645 closed
Sep 27, 2024 -
Long queue for Linux runners
#136762 closed
Sep 26, 2024 -
[torch.profiler] double counting CUDA wrapper self-cuda-time
#60783 closed
Sep 26, 2024 -
matrix_norm performance vastly underwhelming vs deprecated torch.norm
#136360 closed
Sep 26, 2024 -
Can't load AOT Inductor binary on cuda:1 device
#136369 closed
Sep 26, 2024 -
[NJT] Gradients for bias do not get populated for nn.Linear
#136652 closed
Sep 26, 2024 -
[RFC] Cuda support matrix for Release 2.5
#134015 closed
Sep 26, 2024 -
DISABLED test_torch_function_mode_guards_ignored_types_py (__main__.TorchFunctionModeTests)
#135102 closed
Sep 26, 2024 -
RuntimeError: "arange_mps" not implemented for 'BFloat16'
#136624 closed
Sep 26, 2024 -
Wrapper subclasses utilizing reentrant dispatch break when a TorchDispatchMode is enabled
#136565 closed
Sep 26, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_0_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135856 closed
Sep 26, 2024 -
DISABLED test_comprehensive_special_modified_bessel_k1_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135846 closed
Sep 26, 2024 -
DISABLED test_comprehensive_nn_functional_conv2d_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135839 closed
Sep 26, 2024 -
DISABLED test_comprehensive_randn_like_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135840 closed
Sep 26, 2024 -
DISABLED test_comprehensive_special_scaled_modified_bessel_k1_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135850 closed
Sep 26, 2024 -
DISABLED test_comprehensive_nn_functional_threshold_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135844 closed
Sep 26, 2024 -
DISABLED test_comprehensive_remainder_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135855 closed
Sep 26, 2024 -
DISABLED test_comprehensive_where_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135842 closed
Sep 26, 2024 -
DISABLED test_comprehensive_nn_functional_triplet_margin_loss_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135854 closed
Sep 26, 2024 -
DISABLED test_comprehensive_rsub_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135848 closed
Sep 26, 2024 -
DISABLED test_comprehensive_signbit_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135853 closed
Sep 26, 2024 -
DISABLED test_comprehensive_rad2deg_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135852 closed
Sep 26, 2024 -
DISABLED test_comprehensive_sigmoid_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135849 closed
Sep 26, 2024 -
DISABLED test_comprehensive_randint_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135838 closed
Sep 26, 2024 -
DISABLED test_comprehensive_zeros_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135845 closed
Sep 26, 2024 -
DISABLED test_comprehensive_special_hermite_polynomial_he_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135847 closed
Sep 26, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_2_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135851 closed
Sep 26, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_3_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135841 closed
Sep 26, 2024 -
DISABLED test_comprehensive_nn_functional_triplet_margin_loss_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135798 closed
Sep 26, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_3_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135738 closed
Sep 26, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_4_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135744 closed
Sep 26, 2024 -
DISABLED test_comprehensive_nn_functional_smooth_l1_loss_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135746 closed
Sep 26, 2024 -
DISABLED test_comprehensive_prod_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135810 closed
Sep 26, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_0_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135753 closed
Sep 26, 2024 -
DISABLED test_scaled_dot_product_fused_attention_overrideable (__main__.TestSDPAPrivateUse1Only)
#134601 closed
Sep 26, 2024 -
`input` parameter of `index_select()` with a 0D tensor works
#136636 closed
Sep 26, 2024 -
The doc of `linalg.matrix_norm()` should say that there is `input` parameter instead of `A` parameter
#136619 closed
Sep 26, 2024 -
DISABLED test_comprehensive_ones_cpu_bool (__main__.TestInductorOpInfoCPU)
#135740 closed
Sep 26, 2024 -
DISABLED test_comprehensive_prod_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135807 closed
Sep 26, 2024 -
DISABLED test_comprehensive_randint_like_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135799 closed
Sep 26, 2024 -
DISABLED test_multi_output_unbacked_custom_op_cuda (__main__.TestInductorDynamicCUDA)
#135755 closed
Sep 26, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_4_cpu_bool (__main__.TestInductorOpInfoCPU)
#135751 closed
Sep 26, 2024 -
DISABLED test_closure_out_of_scope_cell_with_mutation (__main__.MiscTests)
#135556 closed
Sep 26, 2024 -
DISABLED test_comprehensive_nn_functional_rms_norm_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135739 closed
Sep 26, 2024 -
DISABLED test_comprehensive_prod_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135800 closed
Sep 26, 2024 -
DISABLED test_comprehensive_scatter_reduce_prod_cpu_bool (__main__.TestInductorOpInfoCPU)
#135752 closed
Sep 26, 2024 -
DISABLED test_comprehensive_vdot_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135748 closed
Sep 26, 2024 -
DISABLED test_comprehensive_special_modified_bessel_i0_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135743 closed
Sep 26, 2024 -
DISABLED test_comprehensive_where_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135784 closed
Sep 26, 2024 -
DISABLED test_comprehensive_special_scaled_modified_bessel_k0_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135801 closed
Sep 26, 2024 -
DISABLED test_comprehensive_triu_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135737 closed
Sep 26, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_1_cpu_bool (__main__.TestInductorOpInfoCPU)
#135812 closed
Sep 26, 2024 -
DISABLED test_comprehensive_nn_functional_softmin_with_dtype_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135813 closed
Sep 26, 2024 -
DISABLED test_comprehensive_special_modified_bessel_i1_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135814 closed
Sep 26, 2024 -
DISABLED test_comprehensive_nn_functional_tanhshrink_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135804 closed
Sep 26, 2024 -
DISABLED test_comprehensive_special_bessel_y1_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135805 closed
Sep 26, 2024 -
DISABLED test_comprehensive_nn_functional_softshrink_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135742 closed
Sep 26, 2024 -
DISABLED test_comprehensive_transpose_copy_cpu_bool (__main__.TestInductorOpInfoCPU)
#135782 closed
Sep 26, 2024 -
DISABLED test_comprehensive_nn_functional_upsample_nearest_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135802 closed
Sep 26, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_4_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135749 closed
Sep 26, 2024 -
DISABLED test_comprehensive_special_airy_ai_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135750 closed
Sep 26, 2024 -
DISABLED test_comprehensive__unsafe_masked_index_cuda_int32 (__main__.TestInductorOpInfoCUDA)
#131118 closed
Sep 26, 2024 -
DISABLED test_comprehensive_nn_functional_binary_cross_entropy_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135806 closed
Sep 26, 2024 -
DISABLED test_comprehensive_special_ndtr_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135754 closed
Sep 26, 2024 -
DISABLED test_comprehensive_pow_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135815 closed
Sep 26, 2024 -
DISABLED test_python_ref_executor__refs_stft_executor_aten_cuda_complex128 (__main__.TestCommonCUDA)
#135756 closed
Sep 26, 2024 -
DISABLED test_comprehensive_scatter_reduce_prod_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135797 closed
Sep 26, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_3_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135745 closed
Sep 26, 2024 -
DISABLED test_comprehensive_unravel_index_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135808 closed
Sep 26, 2024 -
DISABLED test_comprehensive_pca_lowrank_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135811 closed
Sep 26, 2024 -
DISABLED test_comprehensive_nn_functional_avg_pool3d_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135747 closed
Sep 26, 2024 -
DISABLED test_comprehensive_short_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135803 closed
Sep 26, 2024 -
[Break XPU] Device bias code introduced from #134874: create CUDA tensor on XPU device.
#136595 closed
Sep 26, 2024 -
The doc of `Sigmoid()` says there are `*args` and `**kwargs` but they don't work
#133688 closed
Sep 26, 2024 -
The doc of `Softsign()` says there are `*args` and `**kwargs` but they don't work
#133684 closed
Sep 26, 2024 -
The doc of `Tanh()` says there are `*args` and `**kwargs` but they don't work
#133683 closed
Sep 26, 2024 -
Disable Python torch.library calls under torch::deploy
#136177 closed
Sep 26, 2024 -
`/opt/rocm/lib/libamdhip64.so` is hardcoded in `Caffe2Targets.cmake` in ROCm wheels
#131701 closed
Sep 26, 2024 -
[Flex attention] Error in create_block_mask with _compile=True on Torch 2.6
#136306 closed
Sep 26, 2024 -
ValueError: Pointer argument (at 3) cannot be accessed from Triton
#136078 closed
Sep 25, 2024 -
AttributeError: module 'distutils' has no attribute '_msvccompiler'
#136541 closed
Sep 25, 2024 -
Discrepancy between scaled_dot_product_attention and flex_attention outputs
#136651 closed
Sep 25, 2024 -
quantize_fx module not working on x86 machine for any torch vision model
#136511 closed
Sep 25, 2024 -
tl.constexpr inputs to user-defined triton kernels should not be dynamic
#136504 closed
Sep 25, 2024 -
DISABLED test_b2b_gemm_left_assoc_good_shape (__main__.B2BGEMMTest)
#133233 closed
Sep 25, 2024 -
DISABLED test_b2b_gemm_trivial_right_assoc_good_shape (__main__.B2BGEMMTest)
#134143 closed
Sep 25, 2024 -
DISABLED test_b2b_gemm_trivial_left_assoc_good_shape (__main__.B2BGEMMTest)
#133403 closed
Sep 25, 2024 -
DISABLED test_b2b_gemm_right_assoc_good_shape (__main__.B2BGEMMTest)
#133311 closed
Sep 25, 2024 -
[inductor][cpu] inductor_max_autotune models accuracy failure in 2024-08-10 nightly release
#133465 closed
Sep 25, 2024 -
DISABLED test_comprehensive_nn_functional_prelu_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135667 closed
Sep 25, 2024 -
DISABLED test_comprehensive_trapezoid_cpu_int64 (__main__.TestInductorOpInfoCPU)
#135670 closed
Sep 25, 2024 -
DISABLED test_comprehensive_rot90_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135666 closed
Sep 25, 2024 -
DISABLED test_comprehensive_xlogy_cpu_bool (__main__.TestInductorOpInfoCPU)
#135674 closed
Sep 25, 2024 -
DISABLED test_comprehensive_softmax_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135682 closed
Sep 25, 2024 -
DISABLED test_comprehensive_special_hermite_polynomial_h_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135677 closed
Sep 25, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_3_cpu_int32 (__main__.TestInductorOpInfoCPU)
#135669 closed
Sep 25, 2024 -
DISABLED test_comprehensive_nn_functional_soft_margin_loss_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135673 closed
Sep 25, 2024 -
DISABLED test_autograd_cpp_node_saved_dynamic (__main__.TestCompiledAutograd)
#135685 closed
Sep 25, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_2_cpu_bool (__main__.TestInductorOpInfoCPU)
#135684 closed
Sep 25, 2024 -
DISABLED test_comprehensive_nn_functional_prelu_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135672 closed
Sep 25, 2024 -
DISABLED test_comprehensive_polygamma_polygamma_n_1_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135679 closed
Sep 25, 2024 -
DISABLED test_comprehensive_nn_functional_adaptive_max_pool2d_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135671 closed
Sep 25, 2024 -
DISABLED test_comprehensive_norm_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135681 closed
Sep 25, 2024 -
DISABLED test_comprehensive_nn_functional_unfold_cpu_bool (__main__.TestInductorOpInfoCPU)
#135683 closed
Sep 25, 2024 -
DISABLED test_comprehensive_t_copy_cpu_bool (__main__.TestInductorOpInfoCPU)
#135665 closed
Sep 25, 2024 -
DISABLED test_comprehensive_nn_functional_smooth_l1_loss_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135678 closed
Sep 25, 2024 -
DISABLED test_comprehensive_nn_functional_upsample_nearest_cpu_uint8 (__main__.TestInductorOpInfoCPU)
#135676 closed
Sep 25, 2024 -
DISABLED test_comprehensive_signal_windows_nuttall_cpu_float32 (__main__.TestInductorOpInfoCPU)
#135675 closed
Sep 25, 2024 -
DISABLED test_comprehensive_nn_functional_kl_div_cpu_float64 (__main__.TestInductorOpInfoCPU)
#135668 closed
Sep 25, 2024 -
`.eval()` and `.train()` don't set value of `.training` properly on `torch.compile()` module
#132986 closed
Sep 25, 2024 -
DISABLED test_comprehensive_zeros_cpu_float16 (__main__.TestInductorOpInfoCPU)
#135642 closed
Sep 25, 2024 -
AdEMAMix: Adaptive Exponential Moving Average Mix Optimizer
#135609 closed
Sep 25, 2024 -
Libtorch build for ROCM error: “aten/src/THH” not exist
#126640 closed
Sep 25, 2024 -
[ONNX] Replace ONNXProgram class
#136274 closed
Sep 24, 2024 -
torch.linalg.lstsq generating different solutions on CPU and GPU
#136443 closed
Sep 24, 2024 -
Pytorch unable to compile with gcc version later than 12
#136556 closed
Sep 24, 2024 -
[Feature Request] Calculating FLOPs for computational graph operations
#5013 closed
Sep 24, 2024 -
[torch.export] Automate export constrains like in onnx dynamo
#136210 closed
Sep 24, 2024 -
[torch.export] Can't load UNet after compiling ExportedProgram with torch_tensorrt.dynamo.compile and saving
#136317 closed
Sep 24, 2024
94 Issues opened by 60 people
-
fused_scaled_matmul_reduce_scatter report error with channel-wise scaling
#136866 opened
Sep 27, 2024 -
`enforce_cond_guards_match` (completely unused)
#136864 opened
Sep 27, 2024 -
Cleanup stale Dynamo feature flags
#136862 opened
Sep 27, 2024 -
Error when calling multiple backward passes on FSDP model
#136861 opened
Sep 27, 2024 -
ONNX export: torch.onnx.errors.SymbolicValueError: Unsupported prim::Constant kind: 'ival'
#136860 opened
Sep 27, 2024 -
AOTAutograd has_same_metadata call in collect_metadata_analysis.py is quadratic
#136852 opened
Sep 27, 2024 -
Pytorch picks wrong cuda version for building extensions
#136845 opened
Sep 27, 2024 -
Compiling a module leads to `AssertionError: expected size 64==64, stride 1==49 at dim=1`
#136837 opened
Sep 27, 2024 -
Thread safety issue with torch.compile()
#136833 opened
Sep 27, 2024 -
false INTERNAL ASSERT FAILED was triggered when torch.device is mkldnn
#136831 opened
Sep 27, 2024 -
false INTERNAL ASSERT FAILED in `torch.jit.set_fusion_strategy`
#136829 opened
Sep 27, 2024 -
false INTERNAL ASSERT FAILED in `torch.empty`/`torch.ones`
#136828 opened
Sep 27, 2024 -
[Break XPU] device_props.warp_size is None on XPU.
#136820 opened
Sep 27, 2024 -
Aborted (core dumped) in `torch.hsmm`/`torch.hspmm`/`torch.hsmm`/`torch.sspaddmm`
#136819 opened
Sep 27, 2024 -
Segmentation fault (core dumped) in `torch.profiler.profile`
#136817 opened
Sep 27, 2024 -
Aborted (core dumped) in `torch.cuda.caching_allocator_delete`
#136815 opened
Sep 27, 2024 -
Dynamo inlining errors with some calls to nested functions that use captured variables
#136814 opened
Sep 27, 2024 -
[rfc] [pipelining] shape inference + cached buffer allocation
#136811 opened
Sep 27, 2024 -
ValueRange division breaks with pow_by_natural
#136797 opened
Sep 26, 2024 -
compilation of rrelu_with_noise with bfloat16 input does not capture noise mutation
#136784 opened
Sep 26, 2024 -
torch.is_grad_enabled() is False when using custom_op decorator
#136771 opened
Sep 26, 2024 -
Pipelining zero bubble and activation checkpointing bug
#136766 opened
Sep 26, 2024 -
PyTorch_ROCm use CPU rather than GPU
#136758 opened
Sep 26, 2024 -
torch.export.export fails to trace through a binary operator
#136757 opened
Sep 26, 2024 -
Onnx scaled_dot_product_attention does not allow to export model
#136756 opened
Sep 26, 2024 -
Some virtual functions in `AcceleratorHooksInterface` are not overridden
#136751 opened
Sep 26, 2024 -
Poor-quality random numbers generated by torch.poisson on gpus
#136750 opened
Sep 26, 2024 -
Inconsistent behavior of cdist with half-precision inputs
#136748 opened
Sep 26, 2024 -
Provide `gather_mm` functionality and/or expand nested tensor support
#136747 opened
Sep 26, 2024 -
torch._int_mm accuracy issue on AMD CPU
#136746 opened
Sep 26, 2024 -
Onnx exporting bug
#136737 opened
Sep 26, 2024 -
torch.unravel_index does not check out-of-bounds
#136736 opened
Sep 26, 2024 -
Be smart about autograd formulas saving either the input or output, depending on context
#136733 opened
Sep 26, 2024 -
Aborted (core dumped) in `torch.package.package_exporter.PackageExporter`/`torch.package.PackageExporter`
#136728 opened
Sep 26, 2024 -
Aborted (core dumped) in `torch.distributed.rpc`
#136726 opened
Sep 26, 2024 -
Segmentation fault (core dumped) in `torch.distributed.dist.TCPStore`
#136725 opened
Sep 26, 2024 -
Segmentation fault (core dumped) in `torch.distributed.PrefixStore`
#136723 opened
Sep 26, 2024 -
Segmentation fault (core dumped) in `torch.bincount`
#136720 opened
Sep 26, 2024 -
Segmentation fault (core dumped) in `torch.nn.functional.max_pool1d`
#136719 opened
Sep 26, 2024 -
Floating point exception (core dumped) in `torch.ao.nn.quantized.Conv1d/Conv2d/Conv3d` when stride=0
#136718 opened
Sep 26, 2024 -
Aborted (core dumped) in `torch.linalg.ldl_solve` with double free or corruption (out)
#136714 opened
Sep 26, 2024 -
Segmentation fault (core dumped) in `torch.ao.nn.quantized.dynamic.LSTMCell/GRUCell`
#136712 opened
Sep 26, 2024 -
Segmentation fault (core dumped) in `torch._weight_norm`/`torch._weight_int8pack_mm`
#136709 opened
Sep 26, 2024 -
Segmentation fault (core dumped) in `torch._fft_r2c`/`torch._fft_c2`/`torch._fft_c2r`
#136704 opened
Sep 26, 2024 -
false INTERNAL ASSERT FAILED in `torch._add_batch_dim`
#136699 opened
Sep 26, 2024 -
Incorrect Type for `devices` Parameter in `benchmark_utilization` Function
#136697 opened
Sep 26, 2024 -
`slow` workflow has been broken for 4+ weeks
#136694 opened
Sep 25, 2024 -
`windows.g4dn.xlarge` are periodically unavailable
#136693 opened
Sep 25, 2024 -
NotImplementedError: The operator 'aten::linalg_matrix_exp' is not currently implemented for the MPS device.
#136692 opened
Sep 25, 2024 -
Segfaulting/aborting unit tests do not show up in "Show Additional Test Info" section
#136691 opened
Sep 25, 2024 -
DISABLED test_pinned_memory_empty_cache (__main__.TestCuda)
#136687 opened
Sep 25, 2024 -
[triton_op] Automatically `tl.constexpr` user-written kernel params when they are static integers
#136681 opened
Sep 25, 2024 -
torch.nn.InstanceNorm3d producing inconsistent output for float16 tensors on CPU and GPU
#136680 opened
Sep 25, 2024 -
[BUG] torch/extension.h: undefined symbol
#136664 opened
Sep 25, 2024 -
Composition of torch.compile and torch.func.grad silently produces a wrong result.
#136662 opened
Sep 25, 2024 -
[NJT] Dropout(0.0) with NJT increments cuda rng_state (only for no-compile)
#136656 opened
Sep 25, 2024 -
[TorchScript] typing_extensions.deprecated doesn't work
#136654 opened
Sep 25, 2024 -
torch.compile HUD dashboard should have repro commands
#136647 opened
Sep 25, 2024 -
Add support for immutable tensors in torch.export
#136642 opened
Sep 25, 2024 -
inductor can't broadcast tensors when they have dynamic shapes:
#136640 opened
Sep 25, 2024 -
[inductor][cpu]jx_nest_base fp32 inductor_max_autotune accuracy failure in 2024_09_23 nightly release
#136639 opened
Sep 25, 2024 -
DISABLED test_graph_optims_RMSprop_cuda_float32 (__main__.TestCudaOptimsCUDA)
#136638 opened
Sep 25, 2024 -
`all_gather_object` fails
#136637 opened
Sep 25, 2024 -
torch.unique does not keep order of occurrences even with "sorted=False"
#136633 opened
Sep 25, 2024 -
[torch.export.load] failed while executing `pow_by_natural`
#136628 opened
Sep 25, 2024 -
The MPS Backend sometimes samples outside of distribution domain with `multinomial`
#136623 opened
Sep 25, 2024 -
DISABLED test_graph_optims_RAdam_cuda_float32 (__main__.TestCudaOptimsCUDA)
#136620 opened
Sep 25, 2024 -
tensor.triu_(1) not working properly with large matrix
#136611 opened
Sep 25, 2024 -
automatic_dynamic_shapes for mark_unbacked
#136605 opened
Sep 25, 2024 -
DISABLED test_graph_optims_NAdam_cuda_float32 (__main__.TestCudaOptimsCUDA)
#136602 opened
Sep 25, 2024 -
[ONNX] Single model export for HF models prompt and token phase
#136592 opened
Sep 25, 2024 -
DistributedSampler shuffle option doesn't work as expected
#136588 opened
Sep 24, 2024 -
"Python Replay Stack is Empty"
#136587 opened
Sep 24, 2024 -
support FakeTensor input for torch.compile
#136586 opened
Sep 24, 2024 -
torch.export support for the latest transformers `DynamicCache` as input
#136582 opened
Sep 24, 2024 -
Bool convolutions (and other integral types)
#136578 opened
Sep 24, 2024 -
DISABLED test_graph_optims_Adamax_cuda_float32 (__main__.TestCudaOptimsCUDA)
#136576 opened
Sep 24, 2024 -
[ONNX] Dynamic shapes: support `torch.sym_not`
#136572 opened
Sep 24, 2024 -
Setting a `complex` tensor to `linalg.vector_norm()` returns a `float` tensor
#136568 opened
Sep 24, 2024 -
[RFC] Offload collectives to NVSwitch when possible
#136567 opened
Sep 24, 2024 -
The doc of `linalg.vector_norm()` should not say `ord` parameter accepts the `str` value `fro` or `nuc`
#136563 opened
Sep 24, 2024 -
The doc of `linalg.vector_norm()` should say there is `x` or `input` parameter
#136560 opened
Sep 24, 2024 -
torch.compile errors when tracing numpy.random.uniform with numpy2
#136559 opened
Sep 24, 2024 -
The doc of `linalg.norm()` should say there is `input` parameter instead of `A` parameter for `linalg.norm()`
#136555 opened
Sep 24, 2024
310 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Add `truediv` support in export serializer
#136364 commented on
Sep 27, 2024 • 13 new comments -
Implements user buffer registration using MemPool
#133603 commented on
Sep 27, 2024 • 10 new comments -
[scan] support jit inductor
#135603 commented on
Sep 27, 2024 • 10 new comments -
Fix AOTI CPP GEMM Template issue without freezing
#136421 commented on
Sep 27, 2024 • 10 new comments -
Enable Windows Arm64
#133088 commented on
Sep 27, 2024 • 9 new comments -
[inductor] Add a OperatorBench to benchmark custom operations
#136169 commented on
Sep 27, 2024 • 8 new comments -
[inductor] Support freezing with FX graph caching
#136505 commented on
Sep 27, 2024 • 8 new comments -
Add SVE implementation of embedding_lookup_idx
#133995 commented on
Sep 27, 2024 • 7 new comments -
[dynamo] add torch._dynamo.enable and fix compile/enable/disable interaction
#132926 commented on
Sep 27, 2024 • 7 new comments -
Create copies for all replica `dict` attributes (namely hook-tracking dicts) in `_replicate_for_data_parallel`
#128272 commented on
Sep 26, 2024 • 7 new comments -
raw_alloc ignores PYTORCH_NO_CUDA_MEMORY_CACHING
#131114 commented on
Sep 27, 2024 • 6 new comments -
[sparse][semi-structured] Add float8 dtype support to 24 sparsity
#136397 commented on
Sep 27, 2024 • 6 new comments -
Make IPC features extendable on third-party devices
#133222 commented on
Sep 27, 2024 • 6 new comments -
Enable failing diffs on regression
#136551 commented on
Sep 27, 2024 • 6 new comments -
Add lowering for aten.searchsorted
#135701 commented on
Sep 27, 2024 • 6 new comments -
[Inductor] Enable Cpp wrapper for Intel GPU.
#135318 commented on
Sep 27, 2024 • 6 new comments -
Ensure noncontiguous tensor creation tests offsetting
#136396 commented on
Sep 27, 2024 • 5 new comments -
Fix AOT Graph capture not propagating non_blocking copy parameter to …
#136513 commented on
Sep 27, 2024 • 5 new comments -
Fix tensor subclass + dynamic shapes in torch.compile + aot autograd
#125941 commented on
Sep 27, 2024 • 5 new comments -
Fix PT2 Source Code Annotations
#136460 commented on
Sep 27, 2024 • 5 new comments -
[aotd] Subclasses profile logging
#136478 commented on
Sep 27, 2024 • 5 new comments -
Introduce torch.sym_sum
#136429 commented on
Sep 26, 2024 • 4 new comments -
Allow async ops for all gather with gather dim != 0
#136428 commented on
Sep 27, 2024 • 4 new comments -
[ROCm] fastSpecializedAtomicAdd for MI300
#135770 commented on
Sep 27, 2024 • 4 new comments -
Simplify find_localzeros
#133325 commented on
Sep 27, 2024 • 4 new comments -
Lowerings: remove restriction on TensorBox keyword arguments
#136055 commented on
Sep 27, 2024 • 4 new comments -
Add support for `@contextmanager` in Dynamo
#136033 commented on
Sep 27, 2024 • 3 new comments -
Add UTs for accelerator device-agnostic runtime APIs
#133572 commented on
Sep 25, 2024 • 3 new comments -
Introduce a device-agnostic runtime API design
#132204 commented on
Sep 25, 2024 • 3 new comments -
Enable XPUEvent elapsed_time function
#134666 commented on
Sep 27, 2024 • 3 new comments -
Improve decomposition for constant_pad_nd
#123661 commented on
Sep 25, 2024 • 3 new comments -
[inductor] refine loop split logic
#128812 commented on
Sep 26, 2024 • 3 new comments -
Add CI for Triton CPU backend
#135342 commented on
Sep 27, 2024 • 2 new comments -
[ROCm] Tunableop record untuned
#128813 commented on
Sep 27, 2024 • 2 new comments -
[Inductor] Pick ISA for inductor based on ATEN_CPU_CAPABILITY
#123514 commented on
Sep 27, 2024 • 2 new comments -
Remove unused Python variables outside torch/ and test/
#136359 commented on
Sep 25, 2024 • 2 new comments -
[ARM][feat]: Add KleidiAI Backend & enable 4 bit matmul operators
#134124 commented on
Sep 27, 2024 • 2 new comments -
[RELEASE ONLY CHANGES] Revert XNNPACK Update
#136522 commented on
Sep 27, 2024 • 2 new comments -
Add deterministic path for CUDA `cumsum`
#136224 commented on
Sep 27, 2024 • 2 new comments -
Enable additional tests for MPS CI runs
#134356 commented on
Sep 26, 2024 • 2 new comments -
[1/N] Fix clang-tidy warnings in torch/csrc/api/
#134545 commented on
Sep 27, 2024 • 2 new comments -
multiprocessing.spawn: allow a grace period when shutdown
#131278 commented on
Sep 27, 2024 • 1 new comment -
[wip][compiled autograd] Lifted C++ lambdas
#135402 commented on
Sep 27, 2024 • 1 new comment -
Error message for allow_in_graph decorator and arbitrary function combo
#135972 commented on
Sep 27, 2024 • 1 new comment -
fix sampler - force cpu device for .tolist tensors
#135990 commented on
Sep 25, 2024 • 1 new comment -
Extend vectorization with SVE(ARM) with Torch Compile (Inductor)
#134672 commented on
Sep 25, 2024 • 1 new comment -
Enable -Werror on s390x
#136527 commented on
Sep 27, 2024 • 1 new comment -
Fix adaptive_max_pool2d fallback
#136367 commented on
Sep 24, 2024 • 1 new comment -
Add doc for device-agnostic runtime APIs
#133323 commented on
Sep 25, 2024 • 1 new comment -
[WIP] add support for bias grads in flexattention inductor
#136077 commented on
Sep 27, 2024 • 1 new comment -
Add Support for Tracking Parameter Names (named_parameters) in Optimizer State Dict
#134107 commented on
Sep 25, 2024 • 1 new comment -
fix sequence number for group
#134578 commented on
Sep 25, 2024 • 1 new comment -
[ONNX] Remove deprecated OperatorExportTypes and ExportTypes
#136277 commented on
Sep 27, 2024 • 1 new comment -
[scan] support closure
#135602 commented on
Sep 27, 2024 • 1 new comment -
Properly uses ref-counting for torch.cuda.use_mem_pool
#133600 commented on
Sep 27, 2024 • 0 new comments -
Adds snapshot API for MemPools to get pool memory segments
#133601 commented on
Sep 27, 2024 • 0 new comments -
Refactors empty_cache to return only MemPool memory to the system
#133602 commented on
Sep 27, 2024 • 0 new comments -
Add decomposition for squeeze_copy
#130941 commented on
Sep 26, 2024 • 0 new comments -
Reuse UT for Intel GPU backend [Part1]
#127602 commented on
Sep 27, 2024 • 0 new comments -
Enable Bert with Semi Structure Sparsity on ROCm
#133934 commented on
Sep 27, 2024 • 0 new comments -
test_execution_trace.py: Use instantiate_device_type_tests to run GPU tests on HPU as well
#133975 commented on
Sep 27, 2024 • 0 new comments -
[inductor] enable bf32 for mkldnn linear pointwise/binary in inductor
#127294 commented on
Sep 25, 2024 • 0 new comments -
[inductor] enable bf32 test for mkldnn conv
#127293 commented on
Sep 25, 2024 • 0 new comments -
Fix lru_cache where config is used
#134235 commented on
Sep 25, 2024 • 0 new comments -
INT8 SDPA API
#134317 commented on
Sep 26, 2024 • 0 new comments -
Fix unbind_copy and add its decomposition
#134319 commented on
Sep 26, 2024 • 0 new comments -
WIP - Prologue Fusion
#134532 commented on
Sep 27, 2024 • 0 new comments -
Disable AMP when propagating fake tensors
#134583 commented on
Sep 27, 2024 • 0 new comments -
Remove unused Python variables in test/
#134665 commented on
Sep 25, 2024 • 0 new comments -
DISABLED test_grad_scaler_with_preset_grad_scale_in_place_unscale_False_Adam_cuda_float32 (__main__.TestCudaOptimsCUDA)
#135721 commented on
Sep 26, 2024 • 0 new comments -
Fix constant propagation in builtins and UserClasses
#131354 commented on
Sep 26, 2024 • 0 new comments -
[triton ci] Allow building with triton hash
#131371 commented on
Sep 27, 2024 • 0 new comments -
[BE] typing for decorators - _jit_internal
#131573 commented on
Sep 26, 2024 • 0 new comments -
torch.fx.Tracer.record_stack_traces fix
#131741 commented on
Sep 27, 2024 • 0 new comments -
Generalization of distributed UT content to enable non cuda device execution
#131758 commented on
Sep 27, 2024 • 0 new comments -
Danielmic decouple some api
#131882 commented on
Sep 27, 2024 • 0 new comments -
Add support for more dtypes for serialization
#131939 commented on
Sep 27, 2024 • 0 new comments -
Tentative fix for fake tensor SymInt
#131943 commented on
Sep 25, 2024 • 0 new comments -
Fix bmm_sparse_cuda illegal memory access
#131977 commented on
Sep 27, 2024 • 0 new comments -
Delete cmake/Modules_CUDA_fix directory
#132035 commented on
Sep 27, 2024 • 0 new comments -
Add Weighted Loss Functions to PyTorch : WMSE, WMAE, and Weighted Huber Loss
#132049 commented on
Sep 26, 2024 • 0 new comments -
Refactor serialization IO infrastructure and support HDFS/HTTP
#130913 commented on
Sep 27, 2024 • 0 new comments -
use spawn as default start method to create dataloader subprocess
#132210 commented on
Sep 25, 2024 • 0 new comments -
xpu: support sycl with torch.utils.cpp_extension APIs
#132945 commented on
Sep 24, 2024 • 0 new comments -
S390x update builder image
#132983 commented on
Sep 27, 2024 • 0 new comments -
[Intel GPU] qconv at XPU backend
#133080 commented on
Sep 26, 2024 • 0 new comments -
support zb1p and zb2p algorithms
#130752 commented on
Sep 27, 2024 • 0 new comments -
Inductor annotations
#130429 commented on
Sep 26, 2024 • 0 new comments -
[inductor][cpp] Add BMM kernel template for autotuning
#129772 commented on
Sep 27, 2024 • 0 new comments -
[Intel GPU] qlinear at XPU backend
#133307 commented on
Sep 26, 2024 • 0 new comments -
Add Triton CPU as an Inductor backend
#133408 commented on
Sep 27, 2024 • 0 new comments -
[c10d]add coalesce support for device types other than cuda
#133429 commented on
Sep 25, 2024 • 0 new comments -
Remove unused variables in torch/
#133492 commented on
Sep 26, 2024 • 0 new comments -
Removed std namespace from log function calls
#133565 commented on
Sep 25, 2024 • 0 new comments -
Make dot and vdot structured ops (#64)
#134671 commented on
Sep 26, 2024 • 0 new comments -
Replace vmap custom ctx manager by one annotated with `@contextmanager`
#136053 commented on
Sep 27, 2024 • 0 new comments -
Unify cpp_extension build directory removal
#136059 commented on
Sep 27, 2024 • 0 new comments -
[WIP][Inductor UT] Generalize newly introduced inductor UTs for intel GPU (Part 1)
#136069 commented on
Sep 27, 2024 • 0 new comments -
[prototype] Invoke subgraph higher order op
#136171 commented on
Sep 27, 2024 • 0 new comments -
[Dynamo][autograd.Function] Use fake tensor prop to infer fwd output
#136184 commented on
Sep 27, 2024 • 0 new comments -
[inductor] Pass `device_type` argument to `do_bench`
#136189 commented on
Sep 25, 2024 • 0 new comments -
Add deterministic kernel for reflection2d
#136241 commented on
Sep 24, 2024 • 0 new comments -
Remove _preserve_ops from export
#136247 commented on
Sep 27, 2024 • 0 new comments -
Set output num_float_feature to have dynamic dimension
#136268 commented on
Sep 27, 2024 • 0 new comments -
[QAT] Make Fused modules torchscriptable
#136285 commented on
Sep 27, 2024 • 0 new comments -
Introduce _ArglessActivation base class for parameterless activation functions
#136296 commented on
Sep 26, 2024 • 0 new comments -
Fix parameter names in docstrings
#136297 commented on
Sep 24, 2024 • 0 new comments -
Add int1 to int7 dtypes
#136301 commented on
Sep 27, 2024 • 0 new comments -
Pass rounding_mode for div reference inputs through kwargs
#136308 commented on
Sep 25, 2024 • 0 new comments -
[dynamo] Replace __str__ with __repr__ in some places
#136316 commented on
Sep 24, 2024 • 0 new comments -
SuperResolution Adaround experiment
#136328 commented on
Sep 26, 2024 • 0 new comments -
[PyTorch] Port ExecuTorch bfdot improvement back to ATen BlasKernel
#136331 commented on
Sep 25, 2024 • 0 new comments -
Add a new distributed backend (XCCL) for Intel GPUs
#136343 commented on
Sep 26, 2024 • 0 new comments -
[test] add types to composite_compliance.py
#136385 commented on
Sep 26, 2024 • 0 new comments -
Backward pass ac
#136431 commented on
Sep 24, 2024 • 0 new comments -
Increase update_hint_regression problem size to 1000
#136434 commented on
Sep 24, 2024 • 0 new comments -
Fix to() method on sparse tensors.
#136435 commented on
Sep 26, 2024 • 0 new comments -
init
#136475 commented on
Sep 27, 2024 • 0 new comments -
[WIP] Add py3.13t wheel
#136490 commented on
Sep 27, 2024 • 0 new comments -
Limit the option value of TORCH_SHOW_DISPATCH_TRACE
#136510 commented on
Sep 27, 2024 • 0 new comments -
Make Context to be Device-agnostic Step by Step (1/N)
#136519 commented on
Sep 27, 2024 • 0 new comments -
Make Context to be Device-agnostic Step by Step (2/N)
#136526 commented on
Sep 27, 2024 • 0 new comments -
[AOTI] Refactor call chain of generate_kernel_call
#136531 commented on
Sep 24, 2024 • 0 new comments -
[AOTI] Turn on the ABI-compatible mode as default
#136534 commented on
Sep 27, 2024 • 0 new comments -
[mha] Disable native_mha(fast_path) in dynamo compilation
#136542 commented on
Sep 24, 2024 • 0 new comments -
Add missing mappings to support torch.uint16 in quantization and export
#136547 commented on
Sep 27, 2024 • 0 new comments -
[not for commit] Benchmark Triton CPU backend
#134725 commented on
Sep 26, 2024 • 0 new comments -
xpu: support SyclExtension class APIs
#134735 commented on
Sep 24, 2024 • 0 new comments -
Make device-specific event inherits from torch.Event
#134845 commented on
Sep 27, 2024 • 0 new comments -
Use torch.Stream&torch.Event for Dynamo capture
#134850 commented on
Sep 27, 2024 • 0 new comments -
Improvements for associative_scan - lifted_args for combine_mode='generic'
#134921 commented on
Sep 26, 2024 • 0 new comments -
update CMAKE_PREFIX_PATH setting command
#134934 commented on
Sep 26, 2024 • 0 new comments -
[c10d] fix sequence numbers for coalesced operations
#135132 commented on
Sep 25, 2024 • 0 new comments -
[Inductor][Precompile cache] Lookup cache before calling precompile inside the precompiling future
#135166 commented on
Sep 25, 2024 • 0 new comments -
[Intel GPU] qconv_pointwise.binary XPU support
#135189 commented on
Sep 26, 2024 • 0 new comments -
[merge rules] Add ONNX team to docs/source/conf.py
#135228 commented on
Sep 27, 2024 • 0 new comments -
Tests Generalization for multiple accelerator devices
#135242 commented on
Sep 27, 2024 • 0 new comments -
[executorch hash update] update the pinned executorch hash
#135287 commented on
Sep 27, 2024 • 0 new comments -
[Inductor] Rename test_cuda_cpp_wrapper.py to test_gpu_cpp_wrapper.py,
#135320 commented on
Sep 27, 2024 • 0 new comments -
[Intel GPU] qlinear_pointwise.binary[_tensor] XPU support
#135337 commented on
Sep 26, 2024 • 0 new comments -
add supports_coalescing property in c10d::Backend to determine whether backend supports coalescing
#135338 commented on
Sep 26, 2024 • 0 new comments -
Torchbench nightly MPS runs
#135386 commented on
Sep 27, 2024 • 0 new comments -
Don't uselessly recompute axiom dict every static eval call
#135429 commented on
Sep 27, 2024 • 0 new comments -
[Intel GPU] qconv.pointwise with mixed dtype XPU support
#135465 commented on
Sep 26, 2024 • 0 new comments -
Add BFloat16 support for BRGEMM flash attention forward kernel
#135473 commented on
Sep 27, 2024 • 0 new comments -
Download pre-compiled AOTriton from GitHub unless AOTRITON_INSTALL_FROM_SOURCE=1 is set
#135560 commented on
Sep 24, 2024 • 0 new comments -
Fix tensor.data_ptr() representation overflow
#135567 commented on
Sep 25, 2024 • 0 new comments -
[scan] fix typo in signature and remove wrapper
#135600 commented on
Sep 27, 2024 • 0 new comments -
[scan] flatten subgraph output and make subgraph inputs to be a slice
#135601 commented on
Sep 27, 2024 • 0 new comments -
[ROCm][AOTI] add CK backend
#135641 commented on
Sep 25, 2024 • 0 new comments -
Use a custom Symbol class for performance
#135651 commented on
Sep 25, 2024 • 0 new comments -
[compiled autograd] log placeholder origin in verbose
#135663 commented on
Sep 26, 2024 • 0 new comments -
[SDPA] Bump `grad_query` fudge factor for Flash Attention
#135711 commented on
Sep 27, 2024 • 0 new comments -
Migrate to training ir in quantization_pt2e_qat unittests
#135769 commented on
Sep 27, 2024 • 0 new comments -
[ROCm] Update to AOTriton 0.7b (Cherry-picked)
#135869 commented on
Sep 27, 2024 • 0 new comments -
[FlexAttention] Remove restriction on QK headdim > V headdim
#135884 commented on
Sep 26, 2024 • 0 new comments -
[aoti] Add warning to ask users to switch to new API
#135893 commented on
Sep 27, 2024 • 0 new comments -
Failure of iOS Build Test: Build (default, 1, 1, macos-14-xlarge, SIMULATOR, arm64)
#136284 commented on
Sep 25, 2024 • 0 new comments -
DISABLED test_closure_recompiles (__main__.MiscTests)
#135687 commented on
Sep 25, 2024 • 0 new comments -
Get the error: AttributeError: Can't pickle local object 'convert_frame.<locals>._convert_frame'
#93470 commented on
Sep 25, 2024 • 0 new comments -
dataclasses.replace not supported by dynamo
#136481 commented on
Sep 25, 2024 • 0 new comments -
Inductor handling of large (13K+) nodes graph resulted in nccl timeout (10mins)
#136447 commented on
Sep 25, 2024 • 0 new comments -
Allow Inductor to Compose with FakeTensorMode to Estimate Memory Usage
#136446 commented on
Sep 25, 2024 • 0 new comments -
Make it possible to run pr_time_benchmarks without explicitly specifying PYTHONPATH
#136430 commented on
Sep 25, 2024 • 0 new comments -
[Flex attention] RuntimeError with vmap when using torch.compile in create_mask
#136427 commented on
Sep 25, 2024 • 0 new comments -
[Tracker] Move nested tensors to beta
#112398 commented on
Sep 25, 2024 • 0 new comments -
torch.compiled custom Triton kernels can output incorrect results
#136550 commented on
Sep 25, 2024 • 0 new comments -
nn.CosineSimilarity returns value larger than 1
#78064 commented on
Sep 25, 2024 • 0 new comments -
Bug Report: Distributed Process Group Hangs with NCCL and GLOO Backends
#132003 commented on
Sep 25, 2024 • 0 new comments -
Don't create caffe2::pthreadpool() with getDefaultNumThreads()-many threads in set_num_threads(1)
#134714 commented on
Sep 25, 2024 • 0 new comments -
Issues compiling `torch` with `mkl`
#133823 commented on
Sep 25, 2024 • 0 new comments -
Noisy warning - torch.fx.experimental.symbolic_shapes: [WARNING] Ignored guard (...), this could result in accuracy problems
#101265 commented on
Sep 25, 2024 • 0 new comments -
Any plans for a "torch.minmax" (min-max normalization) function?
#128785 commented on
Sep 25, 2024 • 0 new comments -
[torch.export] `torch._export.serde.serialize.SerializeError: Serializing <built-in function truediv> is not supported`
#136113 commented on
Sep 25, 2024 • 0 new comments -
RuntimeError: NVML_SUCCESS == DriverAPI::get()->nvmlInit_v2_() INTERNAL ASSERT FAILED at "../c10/cuda/CUDACachingAllocator.cpp":813, please report a bug to PyTorch.
#130486 commented on
Sep 25, 2024 • 0 new comments -
dynamo (re)compilation issues: shape (1,1), nn.Parameter, mark_dynamic
#135011 commented on
Sep 25, 2024 • 0 new comments -
Microsoft Visual C++ Redistributable is not installed, this may lead to the DLL load failure.
#126507 commented on
Sep 26, 2024 • 0 new comments -
RuntimeError: Unrecognized CachingAllocator option: expandable_segments
#123505 commented on
Sep 26, 2024 • 0 new comments -
Confusing error message for DataLoader with num_workers=0 and non-zero timeout
#106634 commented on
Sep 26, 2024 • 0 new comments -
`to()` Method Does Not Move Internal Components of Sparse Tensors
#136258 commented on
Sep 26, 2024 • 0 new comments -
DISABLED test_autograd_function_backed_op (__main__.TestCustomOp)
#132115 commented on
Sep 26, 2024 • 0 new comments -
[dynamo] Format string with __class__
#118675 commented on
Sep 26, 2024 • 0 new comments -
[MPS] MPSNDArray error: product of dimension sizes > 2**32
#134177 commented on
Sep 26, 2024 • 0 new comments -
[compiled autograd][cudagraphs] TLS is gc'd between unit test cudagraph runs
#126934 commented on
Sep 26, 2024 • 0 new comments -
☂️ 150+ MacOS tests were marked flaky recently
#135885 commented on
Sep 26, 2024 • 0 new comments -
[MPS] Inconsistent performance issues
#136003 commented on
Sep 26, 2024 • 0 new comments -
Real tensor prop for bool cast on x.eq().any() call fails export
#135630 commented on
Sep 26, 2024 • 0 new comments -
[dynamo] enable TorchDispatchMode for eager part when graph breaks
#136495 commented on
Sep 26, 2024 • 0 new comments -
DISABLED test_grad_scaler_with_preset_grad_scale_in_place_unscale_False_AdamW_cuda_float32 (__main__.TestCudaOptimsCUDA)
#135692 commented on
Sep 26, 2024 • 0 new comments -
Significant Accuracy Difference between Compiled and Eager Flex Attention
#135161 commented on
Sep 25, 2024 • 0 new comments -
Improve inductor codegen for writing out tensor and tensor.t() in the same kernel
#133242 commented on
Sep 25, 2024 • 0 new comments -
linux-aarch64 CI tests are being timed out resulting in test failures
#136192 commented on
Sep 25, 2024 • 0 new comments -
Improve the Inductor generated kernel for the pattern of `output1 = pointwise(input); output2 = transpose(output1)`
#130015 commented on
Sep 25, 2024 • 0 new comments -
Can't run Flex-Attention on CPU - NoValidChoicesError during autotuneSelectAlgorithm
#136525 commented on
Sep 25, 2024 • 0 new comments -
AOTDispatcher debug mode
#136272 commented on
Sep 25, 2024 • 0 new comments -
DISABLED test_pattern_matcher_multi_user_cpu (__main__.CpuTests)
#135296 commented on
Sep 24, 2024 • 0 new comments -
Wrong results when sampling from the beta distribution for small alpha=beta
#136532 commented on
Sep 24, 2024 • 0 new comments -
dynamo creates unnecessary buffers
#124653 commented on
Sep 24, 2024 • 0 new comments -
[triton x pt2] Dynamo should trace data_ptr accesses
#136271 commented on
Sep 24, 2024 • 0 new comments -
DISABLED test_angle_cpu (__main__.CpuTritonTests)
#136124 commented on
Sep 24, 2024 • 0 new comments -
[export] Export CIA preservation doesn't work well with prim decomposition as custom decomp.
#136050 commented on
Sep 24, 2024 • 0 new comments -
Use Incremental Fake Tensor Updater more uniformly across torch.compile compilation
#120116 commented on
Sep 24, 2024 • 0 new comments -
ImportError: libcudnn.so.8: cannot open shared object file: No such file or directory
#104259 commented on
Sep 24, 2024 • 0 new comments -
The difference between input grad computed by channels last backward and the input grad computed by channels first backward of Hardswish on MPS is too large
#107214 commented on
Sep 24, 2024 • 0 new comments -
DISABLED test_roi_align_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#103156 commented on
Sep 24, 2024 • 0 new comments -
profiler for PT2 can give wrong compilation frame ID
#136235 commented on
Sep 24, 2024 • 0 new comments -
Allow partial fallback when frame recompiles failed or exceed the cache size limit
#135458 commented on
Sep 24, 2024 • 0 new comments -
torch.compile 100x slower than eager mode for torch.cumprod backward pass
#136263 commented on
Sep 24, 2024 • 0 new comments -
[ONNX] Handle autocast HOP
#136545 commented on
Sep 24, 2024 • 0 new comments -
Dynamo inlining should compile partial subgraphs / improve graph break with recursive calls
#111003 commented on
Sep 24, 2024 • 0 new comments -
torch.library.custom_op with mutated inputs can have silent incorrectness under torch.compile
#130487 commented on
Sep 24, 2024 • 0 new comments -
capture_triton gets skipped by dynamo
#136056 commented on
Sep 24, 2024 • 0 new comments -
Symmetric memory's rendezvous throws an error
#136494 commented on
Sep 24, 2024 • 0 new comments -
[Inductor][FP8] AOTInductor appears to ignore a transpose in `test_fp8_view_of_param_non_abi_compatible_cuda`
#136209 commented on
Sep 24, 2024 • 0 new comments -
Investigate torch.compile Windows support.
#122094 commented on
Sep 25, 2024 • 0 new comments -
Errors with torch.compile after upgrading to 2.4.0
#133571 commented on
Sep 25, 2024 • 0 new comments -
distributed.scatter memory leak in source rank
#104174 commented on
Sep 25, 2024 • 0 new comments -
DISABLED test_gelu_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#135222 commented on
Sep 25, 2024 • 0 new comments -
forward AD implementation: _scaled_dot_product_efficient_attention
#98164 commented on
Sep 25, 2024 • 0 new comments -
Stochastic rounding in bfloat16
#120376 commented on
Sep 25, 2024 • 0 new comments -
DISABLED test_metadata_consistency_check (__main__.DTensorMeshTest)
#131598 commented on
Sep 25, 2024 • 0 new comments -
[PTD BE DAY]Burn Down Distributed Disabled Tests!!
#132845 commented on
Sep 25, 2024 • 0 new comments -
DISABLED test_vdd_clamp_cpu (__main__.CpuTests)
#135328 commented on
Sep 25, 2024 • 0 new comments -
torch._dynamo.exc.Unsupported: Unexpected type in sourceless builder torch.Tensor when running Mamba models in vLLM
#136497 commented on
Sep 25, 2024 • 0 new comments -
torch._dynamo.exc.Unsupported: 'immutable_list' object does not support mutation when running MiniCPM-Llama model in vLLM
#136499 commented on
Sep 25, 2024 • 0 new comments -
torch._dynamo.exc.Unsupported: ObservedKeyError exception running Gguf llama model in vLLM
#136502 commented on
Sep 25, 2024 • 0 new comments -
torch.compile error
#113537 commented on
Sep 25, 2024 • 0 new comments -
Add Test for Inductor and Dynamo Config BC Breakages
#133040 commented on
Sep 25, 2024 • 0 new comments -
Make TRITON_INTERPRET=1 work with inductor generated kernels
#123956 commented on
Sep 24, 2024 • 0 new comments -
input buffer sometimes incorrectly marked as inplace when it's not in Inductor
#120217 commented on
Sep 24, 2024 • 0 new comments -
Dynamo should prune non-live captured variables
#127350 commented on
Sep 24, 2024 • 0 new comments -
[feature request] Varlen indexing function for lookup and concat of varlen BPE tokens from a tensor vocab (i.e. `detokenize(...)` and arrays of strings)
#135704 commented on
Sep 24, 2024 • 0 new comments -
DISABLED test_graph_optims_Adam_cuda_float32 (__main__.TestCudaOptimsCUDA)
#136537 commented on
Sep 24, 2024 • 0 new comments -
[inductor] online softmax
#127011 commented on
Sep 27, 2024 • 0 new comments -
CUDA deps cannot be preloaded under Bazel
#117350 commented on
Sep 27, 2024 • 0 new comments -
Dynamo is not thread safe
#118260 commented on
Sep 27, 2024 • 0 new comments -
Periodic ROCM distributed jobs are broken
#91630 commented on
Sep 27, 2024 • 0 new comments -
Potential memory leak in Adam optimizer in AMD chips (CPU)
#75949 commented on
Sep 27, 2024 • 0 new comments -
[channels_last] Segmentation fault with aten.convolution
#136348 commented on
Sep 27, 2024 • 0 new comments -
[CTA] Let's Stamp Out Flaky Tests!
#74590 commented on
Sep 27, 2024 • 0 new comments -
DISABLED test_grad_scaler_with_preset_grad_scale_in_place_unscale_True_SGD_cuda_float32 (__main__.TestCudaOptimsCUDA)
#135925 commented on
Sep 27, 2024 • 0 new comments -
GPU Vendor-Agnosticism via Vulkan
#47996 commented on
Sep 27, 2024 • 0 new comments -
PyTorch + ROCm + Windows
#106529 commented on
Sep 27, 2024 • 0 new comments -
[FSDP2 Related]`torch.split_with_sizes_copy` of the GPU does not update the version counter of `out` correctly.
#132014 commented on
Sep 27, 2024 • 0 new comments -
Label tracking meta-issue (edit me to get automatically CC'ed on issues! cc bot)
#24422 commented on
Sep 27, 2024 • 0 new comments -
Performance regression in torch.compile
#136254 commented on
Sep 27, 2024 • 0 new comments -
DISABLED test_grad_scaler_with_preset_grad_scale_in_place_unscale_True_Adam_cuda_float32 (__main__.TestCudaOptimsCUDA)
#135881 commented on
Sep 27, 2024 • 0 new comments -
DISABLED test_scatter_fallback_abi_compatible_cuda (__main__.AOTInductorTestABICompatibleCuda)
#131627 commented on
Sep 27, 2024 • 0 new comments -
Build failure with Xcode 15 linker
#111086 commented on
Sep 27, 2024 • 0 new comments -
Function Request: np.interp
#50334 commented on
Sep 27, 2024 • 0 new comments -
[mark_dynamic] Assertion errors when marking tensor with outer and inner functions
#135568 commented on
Sep 27, 2024 • 0 new comments -
DISABLED test_grad_scaler_with_preset_grad_scale_in_place_unscale_True_AdamW_cuda_float32 (__main__.TestCudaOptimsCUDA)
#135829 commented on
Sep 27, 2024 • 0 new comments -
DISABLED test_device_mode_ops_sparse_mm_reduce_cpu_bfloat16 (__main__.TestDeviceUtilsCPU)
#132494 commented on
Sep 27, 2024 • 0 new comments -
Fix max_width computation in _tensor_str._Formatter
#126859 commented on
Sep 26, 2024 • 0 new comments -
[Storage_ipc] Option II: Provides IPC extensions for 3rd devices.
#126373 commented on
Sep 27, 2024 • 0 new comments -
[WIP] Warn on future divergent behavior for conditional views
#126129 commented on
Sep 25, 2024 • 0 new comments -
allow to use bf16 as fp32 internal precision for mkldnn conv backward
#126054 commented on
Sep 25, 2024 • 0 new comments -
allow to use bf16 as fp32 internal precision for mkldnn conv
#126050 commented on
Sep 25, 2024 • 0 new comments -
refine fp32 precision api
#125888 commented on
Sep 25, 2024 • 0 new comments -
[vision hash update] update the pinned vision hash
#125806 commented on
Sep 27, 2024 • 0 new comments -
[Storage_ipc] Provides IPC extensions for 3rd devices.
#125122 commented on
Sep 27, 2024 • 0 new comments -
[ROCm] hipSPARSELt Integration
#124320 commented on
Sep 27, 2024 • 0 new comments -
Switch batch norm stack to consolidated ops
#119496 commented on
Sep 24, 2024 • 0 new comments -
Automated submodule update: FBGEMM
#115316 commented on
Sep 27, 2024 • 0 new comments -
Automated submodule update: kineto
#106149 commented on
Sep 27, 2024 • 0 new comments -
[MPS] Possible persistent infinite loop in `nn.ReplicationPad1d`
#135442 commented on
Sep 27, 2024 • 0 new comments -
[ONNX] Cannot view a tensor with shape torch.Size([1, 512, 32, 128]) and strides (2097152, 128, 65536, 1) as a tensor with shape (1, 512, 4096)
#136543 commented on
Sep 27, 2024 • 0 new comments -
[MPS] BatchNorm2D produces incorrect results for column first tensors
#134580 commented on
Sep 27, 2024 • 0 new comments -
[MPS] Incorrect result from batch norm with sliced inputs
#133520 commented on
Sep 27, 2024 • 0 new comments -
aot_export is not currently supported with traceable tensor subclass- error comes when distributed tensor is an input to aot_export_joint_simple
#136289 commented on
Sep 27, 2024 • 0 new comments -
Python 3.13 support for PyTorch
#130249 commented on
Sep 27, 2024 • 0 new comments -
[RFC] Default torch.compile backend customization
#136118 commented on
Sep 27, 2024 • 0 new comments -
[ONNX] Exporter improvement tasks
#129274 commented on
Sep 27, 2024 • 0 new comments -
[ONNX] rfftn/irfftn produces incorrect shapes
#125903 commented on
Sep 27, 2024 • 0 new comments -
Stack trace is symbolized when no exception is thrown
#133979 commented on
Sep 27, 2024 • 0 new comments -
DISABLED test_input_mutation2_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#135295 commented on
Sep 26, 2024 • 0 new comments -
`torch.compile` cannot be used in official Docker runtime images
#116696 commented on
Sep 26, 2024 • 0 new comments -
[RFC] Integrate NCCL scalable init API
#136539 commented on
Sep 26, 2024 • 0 new comments -
Cannot Convert Pytorch model with fft_rfftn layers to ONNX using latest torch.onnx.dynamo_export
#133785 commented on
Sep 26, 2024 • 0 new comments -
torch._dynamo.exc.InternalTorchDynamoError when tracing through torch.ops.prim.NumToTensor
#136448 commented on
Sep 26, 2024 • 0 new comments -
Inductor pattern doesn't match on dynamic tensor marked with torch.dynamo.mark_dynamic
#136329 commented on
Sep 26, 2024 • 0 new comments -
Iterating dataloader fails on sliced dataset
#131883 commented on
Sep 26, 2024 • 0 new comments -
On AMD GPUs (ROCm 5.7-6.2), cannot backpropagate loss tensor containing more than `2e8` elements
#136291 commented on
Sep 26, 2024 • 0 new comments -
Remove cusparselt deprecated API usage
#136553 commented on
Sep 26, 2024 • 0 new comments -
Wonder why _MultiProcessingDataLoaderIter.__next__ too slow?
#132492 commented on
Sep 26, 2024 • 0 new comments -
[torch.export] Detect internal constraints
#136216 commented on
Sep 26, 2024 • 0 new comments -
DISABLED test_aot_export_cond_simple_cuda_float32 (__main__.TestHOPCUDA)
#123096 commented on
Sep 26, 2024 • 0 new comments -
DISABLED test_grad_scaler_with_preset_grad_scale_in_place_unscale_False_SGD_cuda_float32 (__main__.TestCudaOptimsCUDA)
#135783 commented on
Sep 26, 2024 • 0 new comments -
torch.compile doesn't generate any graphs when modules are patched with new parameters class
#136257 commented on
Sep 26, 2024 • 0 new comments -
missing thrust/complex.h PyTorch_rocm on gfx1032
#136442 commented on
Sep 26, 2024 • 0 new comments -
[inductor] Graph breaks in CohereForAI/aya-23-8b
#128095 commented on
Sep 26, 2024 • 0 new comments -
ReduceLROnPlateau will throw IndexError: list index out of range with modified optimizer's param_groups.
#104361 commented on
Sep 27, 2024 • 0 new comments -
`import torch` fails on stock ubuntu if one uses XPU nightly binaries
#135867 commented on
Sep 27, 2024 • 0 new comments -
DISABLED test_scaled_dot_product_fused_attention_overrideable_backward (__main__.TestSDPAPrivateUse1Only)
#134602 commented on
Sep 27, 2024 • 0 new comments -
DISABLED test_fused_sdp_choice_privateuseone (__main__.TestSDPAPrivateUse1Only)
#134600 commented on
Sep 27, 2024 • 0 new comments -
[inductor][cpu]GPT2ForSequenceClassification AMP static/dynamic shape default/cpp wrapper single thread accuracy crash
#123503 commented on
Sep 27, 2024 • 0 new comments -
Support `divmod` for tensors
#90820 commented on
Sep 27, 2024 • 0 new comments -
DISABLED test_autograd_cpp_node_data_dependent (__main__.TestCompiledAutograd)
#125579 commented on
Sep 27, 2024 • 0 new comments -
DISABLED test_pointwise_bessel_y1_cuda (__main__.GPUTests)
#127756 commented on
Sep 27, 2024 • 0 new comments -
xpu: huggingface levit test_retain_grad_hidden_states_attentions test hangs on exit on PVC
#136007 commented on
Sep 27, 2024 • 0 new comments -
torch.nn.InstanceNorm2d and torch.nn.InstanceNorm3d returns nan with tensors of float16 dtype on cpu
#135542 commented on
Sep 27, 2024 • 0 new comments -
torch.onnx.export with dynamic axes fails for torch.nn.InstanceNorm1d with track_running_stats=True
#128501 commented on
Sep 27, 2024 • 0 new comments -
InternalTorchDynamoError on converting llama-2 to onnx using torch.onnx.dynamo_export
#128480 commented on
Sep 27, 2024 • 0 new comments -
[ONNX] view(dtype=dtype) is not supported by both onnx.export and onnx.dynamo_export
#126921 commented on
Sep 27, 2024 • 0 new comments -
[v.2.5.0] Release Tracker
#135522 commented on
Sep 27, 2024 • 0 new comments -
ONNX Export Fails with Dynamic Slicing on Data-Dependent Value
#136083 commented on
Sep 27, 2024 • 0 new comments -
autograd.Function x Dynamo tracing incorrectly returns Tensors that don't require grad
#129963 commented on
Sep 27, 2024 • 0 new comments -
PyTorch 2.5.0 exposes statically linked `libstdc++` CXX11 ABI symbols.
#133437 commented on
Sep 27, 2024 • 0 new comments -
The call to ncclCommSplit does not have the correct config parameters set
#129862 commented on
Sep 27, 2024 • 0 new comments -
MPS code contains references to undocumented APIs
#135637 commented on
Sep 27, 2024 • 0 new comments