Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-24263: Fix unittest to discover tests named with non-ascii characters #13149

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

abadger
Copy link
Contributor

@abadger abadger commented May 7, 2019

This is a continuation of #1338 from @louisom which is itself based on a patch in the bpo issue tracker fromsih4sing5hong5.

In addition to rebasing louisom's work to current master, I've done some work to address comments on the earlier code:

it looks for isidentifier on $thing.py, but not on just $thing (which we need to do to handle packages, vs modules).

I've addressed that by adding a test using isidentifier() in the isdirectory section of the conditional: https://github.com/python/cpython/pull/13149/files#diff-c1c31e722f730423a5e6dcc3eb6eef67R471

and adding directory names which are not valid identifiers (but have __init__.py inside of them) to the test cases.

https://bugs.python.org/issue24263

louisom and others added 4 commits May 7, 2019
The unitteset test discovery was incorrectly looking for tests in
directories which had non-identifiers in the directory name.  This
change makes the discovery only enter directories which can be valid
python identifiers
@abadger abadger force-pushed the rebase-unittest-unicode-pattern branch from cf97858 to 86faeec Compare May 7, 2019
Lib/unittest/loader.py Outdated Show resolved Hide resolved
Lib/unittest/loader.py Outdated Show resolved Hide resolved
Lib/unittest/loader.py Show resolved Hide resolved
path_lists = [['test2.py', 'test1.py', 'not_a_test.py', 'test_dir',
'test.foo', 'test-not-a-module.py', 'another_dir'],
path_lists = [ # Valid module and package names
['test2.py', 'test1.py', 'not_a_test.py', 'test_dir',
Copy link
Member

@ezio-melotti ezio-melotti May 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move the [ before the comment, otherwise it seems like the comment applies to the whole list.

Copy link
Contributor Author

@abadger abadger May 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Lib/unittest/test/test_discovery.py Outdated Show resolved Hide resolved
# of a package directory
['測試4.py', '測試3.py', ]]
os.listdir = lambda path: path_lists.pop(0)
self.addCleanup(restore_listdir)
Copy link
Member

@ezio-melotti ezio-melotti May 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of defining all the restore_* functions, I would just do something like self.addCleanup(setattr, os, 'listdir', original_listdir).
Another option is using mock.patch, but maybe here it's overkill.

Copy link
Contributor Author

@abadger abadger May 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, for the remainder of these comments.... It appears that this test_find_tests_with_unicode() function is a copy of the pre-existing test, test_find_tests(), adjusted to use new data.

If you'd like me to make these changes, I can, but I think I'd pair it with merging the two tests.


def isfile(path):
# '另外的_資料夾' is not a package and so shouldn't be recursed into
return not path.endswith('資料夾') and not '另外的_資料夾' in path
Copy link
Member

@ezio-melotti ezio-melotti May 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return not path.endswith('資料夾') and '另外的_資料夾' not in path


def isdir(path):
return path.endswith('資料夾')
os.path.isdir = isdir
Copy link
Member

@ezio-melotti ezio-melotti May 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why here you defined an isdir function instead of using a lambda?
(FWIW I think for the sake of brevity lambdas are fine in this context.)

# execution order.
expected = [[name + ' module tests'] for name in
('測試1', '測試2', '測試_資料夾')]
expected.extend([[('測試_資料夾.%s' % name) + ' module tests'] for name in
Copy link
Member

@ezio-melotti ezio-melotti May 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected.extend([['測試_資料夾.%s module tests' % name] for name in

@bedevere-bot
Copy link

bedevere-bot commented May 12, 2019

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@abadger
Copy link
Contributor Author

abadger commented May 12, 2019

@rbtcollins, it seems like the people presently interested in this PR are leaning towards removing the Python2 compatible code so I'm going to go ahead and remove that.

@abadger
Copy link
Contributor Author

abadger commented May 12, 2019

@ezio-melotti Updated. I believe with all of the changes that you and @vstinner asked for.

@abadger abadger force-pushed the rebase-unittest-unicode-pattern branch from d870a5e to 883876f Compare May 12, 2019
* Merge these two discovery tests as they really are just variants on the
  data provided rather than the code of the test.

* Simplify a few sections of the code as suggested by ezio melotti

* Fix some formatting issues
@abadger abadger force-pushed the rebase-unittest-unicode-pattern branch from 883876f to b4808be Compare May 12, 2019
@abadger
Copy link
Contributor Author

abadger commented May 12, 2019

I have made the requested changes; please review again

@bedevere-bot
Copy link

bedevere-bot commented May 12, 2019

Thanks for making the requested changes!

@ezio-melotti: please review the changes made to this pull request.

@csabella csabella requested a review from ezio-melotti Jan 25, 2020
# what about .pyc (etc)
# we would need to avoid loading the same tests multiple times
# from '.py', *and* '.pyc'
VALID_MODULE_NAME = re.compile(r'[_a-z]\w*\.py$', re.IGNORECASE)
Copy link
Member

@ezio-melotti ezio-melotti May 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to keep this global and update it to use r'[^\W\d]\w*\.py$', but as @serhiy-storchaka pointed out in #68451 (comment) this doesn't match all identifiers.

Perhaps a is_valid_module_name() global function could be added instead -- this will still provide a way to control what is valid, instead of hardcoding the check in _find_test_path().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants