[iOS][CI] Update dev certs #66004

Closed
wants to merge 8 commits into from

@malfet (Contributor) commented Oct 1, 2021

Fixes #65988

@pytorch-probot (bot) commented Oct 1, 2021

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/85878e84765806be0d6f00c1bb2dae5ad9d3748e/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

| Workflows | Labels (bold enabled) | Status |
| --- | --- | --- |
| Triggered Workflows | | |
| linux-bionic-py3.6-clang9 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/xla | ✅ triggered |
| linux-xenial-cuda11.3-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux | ✅ triggered |
| linux-xenial-py3.6-clang7-asan | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers | ✅ triggered |
| linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux | ✅ triggered |
| linux-xenial-py3.6-gcc7-bazel-test | ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux | ✅ triggered |
| win-vs2019-cpu-py3 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/win | ✅ triggered |
| win-vs2019-cuda11.3-py3 | ciflow/all, ciflow/cuda, ciflow/default, ciflow/win | ✅ triggered |
| Skipped Workflows | | |
| libtorch-linux-xenial-cuda10.2-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux | 🚫 skipped |
| libtorch-linux-xenial-cuda11.3-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux | 🚫 skipped |
| linux-bionic-cuda10.2-py3.9-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow | 🚫 skipped |
| linux-xenial-cuda10.2-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow | 🚫 skipped |
| parallelnative-linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux | 🚫 skipped |
| periodic-libtorch-linux-xenial-cuda11.1-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-linux-xenial-cuda11.1-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-win-vs2019-cuda11.1-py3 | ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win | 🚫 skipped |
| puretorch-linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux | 🚫 skipped |

You can add a comment to the PR and tag @pytorchbot with the following commands:
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and trigger the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

@facebook-github-bot (Contributor) commented Oct 1, 2021

💊 CI failures summary and remediations

As of commit 85878e8 (more details on the Dr. CI page):


  • 11/13 failures introduced in this PR
  • 2/13 broken upstream at merge base eac218d on Oct 01 from 6:49am to 9:44am

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build linux-xenial-py3.6-gcc5.4 / test (distributed, 1, 1, linux.2xlarge) (1/3)

Step: "Unknown"

2021-10-01T17:00:02.7254160Z test_remote_mess...yUniqueId(created_on=0, local_id=0) to be created.
2021-10-01T16:59:38.7279436Z frame #11: <unknown function> + 0x4053651 (0x7fea1d3aa651 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-10-01T16:59:38.7281414Z frame #12: c10::ThreadPool::main_loop(unsigned long) + 0x2a3 (0x7fea1910ce93 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-10-01T16:59:38.7282807Z frame #13: <unknown function> + 0xc92bd (0x7fea1903b2bd in /opt/conda/lib/libstdc++.so.6)
2021-10-01T16:59:38.7284241Z frame #14: <unknown function> + 0x76ba (0x7fea2e4296ba in /lib/x86_64-linux-gnu/libpthread.so.0)
2021-10-01T16:59:38.7285576Z frame #15: clone + 0x6d (0x7fea2e15f51d in /lib/x86_64-linux-gnu/libc.so.6)
2021-10-01T16:59:38.7286176Z 
2021-10-01T16:59:38.9182876Z ok (3.215s)
2021-10-01T16:59:46.1400800Z   test_remote_message_dropped_pickle (__main__.FaultyFaultyAgentRpcTest) ... ok (7.222s)
2021-10-01T16:59:53.3609162Z   test_remote_message_dropped_pickle_to_self (__main__.FaultyFaultyAgentRpcTest) ... ok (7.221s)
2021-10-01T16:59:59.6802171Z   test_remote_message_script_delay_timeout (__main__.FaultyFaultyAgentRpcTest) ... ok (6.319s)
2021-10-01T17:00:02.7254160Z   test_remote_message_script_delay_timeout_to_self (__main__.FaultyFaultyAgentRpcTest) ... [E request_callback_no_python.cpp:559] Received error while processing request type 260: falseINTERNAL ASSERT FAILED at "/var/lib/jenkins/workspace/torch/csrc/distributed/rpc/rref_context.cpp":387, please report a bug to PyTorch. Expected OwnerRRef with id GloballyUniqueId(created_on=0, local_id=0) to be created.
2021-10-01T17:00:02.7257430Z Exception raised from getOwnerRRef at /var/lib/jenkins/workspace/torch/csrc/distributed/rpc/rref_context.cpp:387 (most recent call first):
2021-10-01T17:00:02.7260382Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x69 (0x7f3944c85059 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-10-01T17:00:02.7263165Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xd2 (0x7f3944c81602 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-10-01T17:00:02.7266805Z frame #2: c10::detail::torchInternalAssertFail(char const*, char const*, unsigned int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x4e (0x7f3944c82f9e in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-10-01T17:00:02.7269915Z frame #3: torch::distributed::rpc::RRefContext::getOwnerRRef(torch::distributed::rpc::GloballyUniqueId const&, bool) + 0x4a4 (0x7f3948ef1264 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-10-01T17:00:02.7273922Z frame #4: torch::distributed::rpc::RequestCallbackNoPython::assignOwnerRRef(torch::distributed::rpc::GloballyUniqueId const&, torch::distributed::rpc::GloballyUniqueId const&, c10::intrusive_ptr<c10::ivalue::Future, c10::detail::intrusive_target_default_null_type<c10::ivalue::Future> >) const + 0x71 (0x7f3948ee15c1 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-10-01T17:00:02.7278375Z frame #5: torch::distributed::rpc::RequestCallbackImpl::processScriptRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x12a (0x7f395135a15a in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
2021-10-01T17:00:02.7282563Z frame #6: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x14c (0x7f3948ee5dcc in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-10-01T17:00:02.7286745Z frame #7: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x65 (0x7f3951357025 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
2021-10-01T17:00:02.7289534Z frame #8: <unknown function> + 0x402396a (0x7f3948ee296a in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)

See GitHub Actions build linux-xenial-py3.6-gcc7-bazel-test / build-and-test (2/3)

Step: "Unknown"

2021-10-01T16:43:47.5657491Z ModuleNotFoundError: No module named 'boto3'
2021-10-01T16:43:47.5650124Z   File "/home/ec2-user/actions-runner/_work/pytorch/pytorch/tools/stats/upload_binary_size_to_scuba.py", line 155, in <module>
2021-10-01T16:43:47.5650878Z     register_rds_schema("binary_size", schema_from_sample(sample_data))
2021-10-01T16:43:47.5651784Z   File "/home/ec2-user/actions-runner/_work/pytorch/pytorch/tools/stats/scribe.py", line 94, in register_rds_schema
2021-10-01T16:43:47.5652376Z     invoke_rds(event)
2021-10-01T16:43:47.5653090Z   File "/home/ec2-user/actions-runner/_work/pytorch/pytorch/tools/stats/scribe.py", line 72, in invoke_rds
2021-10-01T16:43:47.5653829Z     return invoke_lambda("rds-proxy", events)
2021-10-01T16:43:47.5654606Z   File "/home/ec2-user/actions-runner/_work/pytorch/pytorch/tools/stats/scribe.py", line 30, in invoke_lambda
2021-10-01T16:43:47.5655383Z     res = aws_lambda().invoke(FunctionName=name, Payload=json.dumps(payload).encode())
2021-10-01T16:43:47.5656289Z   File "/home/ec2-user/actions-runner/_work/pytorch/pytorch/tools/stats/scribe.py", line 21, in aws_lambda
2021-10-01T16:43:47.5656894Z     import boto3  # type: ignore[import]
2021-10-01T16:43:47.5657491Z ModuleNotFoundError: No module named 'boto3'
2021-10-01T16:43:47.5722349Z ##[group]Run # detached container should get cleaned up by teardown_ec2_linux
2021-10-01T16:43:47.5723127Z # detached container should get cleaned up by teardown_ec2_linux
2021-10-01T16:43:47.5723629Z export SHARD_NUMBER=0
2021-10-01T16:43:47.5724093Z # TODO: Stop building test binaries as part of the build phase
2021-10-01T16:43:47.5724690Z # Make sure we copy test results from bazel-testlogs symlink to
2021-10-01T16:43:47.5725256Z # a regular directory ./test/test-reports
2021-10-01T16:43:47.5725707Z container_name=$(docker run \
2021-10-01T16:43:47.5726113Z   -e BUILD_ENVIRONMENT \
2021-10-01T16:43:47.5726532Z   -e CUSTOM_TEST_ARTIFACT_BUILD_DIR \
2021-10-01T16:43:47.5726916Z   -e GITHUB_ACTIONS \

See GitHub Actions build Lint / clang-tidy (3/3)

Step: "Check for warnings"

2021-10-01T16:38:34.3344594Z /__w/pytorch/pytor...e [performance-for-range-copy,-warnings-as-errors]
2021-10-01T16:38:34.3333467Z             ^
2021-10-01T16:38:34.3333778Z   const    &
2021-10-01T16:38:34.3335724Z /__w/pytorch/pytorch/torch/csrc/jit/tensorexpr/kernel.cpp:1321:13: error: the variable 'b' is copy-constructed from a const reference but is only used as const reference; consider making it a const reference [performance-unnecessary-copy-initialization,-warnings-as-errors]
2021-10-01T16:38:34.3337964Z   BufHandle b = c10::get<BufHandle>(inputs[2]);
2021-10-01T16:38:34.3338406Z             ^
2021-10-01T16:38:34.3338666Z   const    &
2021-10-01T16:38:34.3340878Z /__w/pytorch/pytorch/torch/csrc/jit/tensorexpr/kernel.cpp:1857:23: error: the const qualified variable 'rhs' is copy-constructed from a const reference; consider making it a const reference [performance-unnecessary-copy-initialization,-warnings-as-errors]
2021-10-01T16:38:34.3342505Z       const BufHandle rhs = c10::get<BufHandle>(inputs[1]);
2021-10-01T16:38:34.3342883Z                       ^
2021-10-01T16:38:34.3343134Z                      &
2021-10-01T16:38:34.3344594Z /__w/pytorch/pytorch/torch/csrc/jit/tensorexpr/kernel.cpp:2309:23: error: loop variable is copied but only used as const reference; consider making it a const reference [performance-for-range-copy,-warnings-as-errors]
2021-10-01T16:38:34.3345727Z             for (auto a : axes) {
2021-10-01T16:38:34.3346002Z                       ^
2021-10-01T16:38:34.3346273Z                  const  &
2021-10-01T16:38:34.3346579Z Warnings detected!
2021-10-01T16:38:34.3346889Z Summary:
2021-10-01T16:38:34.3347473Z [performance-for-range-copy] occurred 3 times
2021-10-01T16:38:34.3348170Z     /__w/pytorch/pytorch/torch/csrc/jit/tensorexpr/kernel.cpp:463
2021-10-01T16:38:34.3348799Z     /__w/pytorch/pytorch/torch/csrc/jit/tensorexpr/kernel.cpp:1161
2021-10-01T16:38:34.3349361Z     /__w/pytorch/pytorch/torch/csrc/jit/tensorexpr/kernel.cpp:2309
2021-10-01T16:38:34.3349696Z 
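
The clang-tidy diagnostics in the log above (performance-unnecessary-copy-initialization and performance-for-range-copy) all suggest the same kind of fix: bind to a const reference instead of making a copy that is only read. The following is a minimal sketch of that change, not PyTorch code. It assumes a hypothetical `Handle` type standing in for `BufHandle` and a hypothetical `get_handle` accessor standing in for `c10::get<BufHandle>`, and it counts copies so the difference is observable.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for BufHandle: a copyable type that counts its copies.
struct Handle {
  std::string name;
  explicit Handle(std::string n) : name(std::move(n)) {}
  Handle(const Handle& other) : name(other.name) { ++copy_count(); }
  Handle(Handle&& other) noexcept : name(std::move(other.name)) {}
  // Global counter of copy-constructions, exposed for inspection.
  static int& copy_count() {
    static int n = 0;
    return n;
  }
};

// Hypothetical accessor that returns a const reference, playing the role
// that c10::get<BufHandle>(inputs[i]) plays in the log.
const Handle& get_handle(const std::vector<Handle>& inputs, std::size_t i) {
  assert(i < inputs.size());
  return inputs[i];
}

// Before: the patterns clang-tidy flags. `first` is copy-constructed from a
// const reference but only read, and the loop variable copies each element.
std::size_t total_name_length_copying(const std::vector<Handle>& inputs) {
  Handle first = get_handle(inputs, 0);
  std::size_t total = first.name.size();
  for (auto h : inputs) {
    total += h.name.size();
  }
  return total;
}

// After: both bindings are const references, so no copies are made.
std::size_t total_name_length_by_ref(const std::vector<Handle>& inputs) {
  const Handle& first = get_handle(inputs, 0);
  std::size_t total = first.name.size();
  for (const auto& h : inputs) {
    total += h.name.size();
  }
  return total;
}
```

Both functions return the same value; only the number of `Handle` copy-constructions differs, which is what the two checks are reporting.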

8 failures not recognized by patterns:

| Job | Step |
| --- | --- |
| GitHub Actions linux-xenial-py3.6-gcc5.4 / test (default, 1, 2, linux.2xlarge) | Unknown |
| GitHub Actions linux-xenial-py3.6-gcc5.4 / build-docs (python) | Unknown |
| GitHub Actions linux-bionic-py3.6-clang9 / test (default, 2, 2, linux.2xlarge) | Unknown |
| GitHub Actions linux-bionic-py3.6-clang9 / test (default, 1, 2, linux.2xlarge) | Unknown |
| GitHub Actions linux-xenial-py3.6-gcc5.4 / test (default, 2, 2, linux.2xlarge) | Unknown |
| GitHub Actions linux-bionic-py3.6-clang9 / test (noarch, 1, 1, linux.2xlarge) | Unknown |
| GitHub Actions linux-xenial-py3.6-gcc5.4 / build-docs (cpp) | Build cpp docs |
| GitHub Actions Lint / mypy | Run mypy |

🚧 2 fixed upstream failures:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch.

If your commit is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

@facebook-github-bot added the "oncall: jit" label (Add this issue/PR to JIT oncall triage queue) on Oct 1, 2021

@facebook-github-bot (Contributor) commented

@malfet has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot deleted the gh/xta0/136/head branch on October 5, 2021, 14:17

malfet pushed a commit to malfet/pytorch that referenced this pull request on Oct 6, 2021
Summary:
Fixes pytorch#65988

Pull Request resolved: pytorch#66004

Reviewed By: xta0

Differential Revision: D31340893

Pulled By: malfet

fbshipit-source-id: 3bf0be266e9686a73d62e86c5cf0bebeb0416260

malfet added a commit that referenced this pull request on Oct 6, 2021
Summary:
Fixes #65988

Pull Request resolved: #66004

Reviewed By: xta0

Differential Revision: D31340893

Pulled By: malfet

fbshipit-source-id: 3bf0be266e9686a73d62e86c5cf0bebeb0416260

Co-authored-by: Tao Xu <taox@fb.com>

Labels: oncall: jit (Add this issue/PR to JIT oncall triage queue)
Linked issue (may be closed by merging this pull request): Many iOS jobs are broken due to expired cert
3 participants