-
-
Notifications
You must be signed in to change notification settings - Fork 31.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-16512: one jpeg image added with different profile, code modified to detect all types of jpg #8322
Conversation
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. When your account is ready, please add a comment in this pull request Thanks again for your contribution, we look forward to reviewing it! |
I have signed the CLA. |
Lib/test/test_imghdr.py
Outdated
@@ -79,7 +80,7 @@ def test_bad_args(self): | |||
imghdr.what() | |||
with self.assertRaises(AttributeError): | |||
imghdr.what(None) | |||
with self.assertRaises(TypeError): | |||
with self.assertRaises(Exception):#TypeError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this. Why changing the TypeError
to a general Exception
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In imghdr.py when an integer is passed instead of a file name, some functions such as test_gif throws TypeError while other functions like test_png throw AttributeError.
Initially, test_imghdr was passing because first function was test_jpeg and it was throwing TypeError. Now after the change it throws AttributeError. So I changed it to Exception because the error was depending on the order of the functions written. Or, Should I change error type to AttributeError?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As now h.startswith(b'\xff\xd8')
raises AttributeError
instead, you need to assert on that:
with self.assertRaises(AttributeError):
If you do this change, all tests in test_imghdr
should pass.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
I have made the requested changes; please review again |
Thanks for making the requested changes! @pablogsal: please review the changes made to this pull request. |
I have made the requested changes; please review again |
Thanks for making the requested changes! @pablogsal: please review the changes made to this pull request. |
I have made the requested changes; please review again |
Thanks for making the requested changes! @pablogsal: please review the changes made to this pull request. |
@pablogsal Hi, This is my first commit to an open source project. So, thanks in advance. I have one query. Should I make a change to test_imghdr.py file or not? Or everything is done from my end? |
@gov-vj Thank you very much for your first contribution! :) You should make sure that the test suite covers your change. Each Pull Request should add tests (or modify existing ones) that checks that the code you are adding works correctly. In this case, you are modifying how Please, add the image of issue21230 to the test suite unless the one you added is covering that case. |
@@ -0,0 +1 @@ | |||
Fix bug in imghdr.py: some jpeg files were not detecting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not an English native speaker, but I think this is incorrect. Let's improve this NEWS entry with something in the lines of:
Fixed bug in
imghdr
that was causing some jpeg image files not being detected as such.
(Notice the double backticks in imghdr
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, add the image of issue21230 to the test suite unless the one you added is covering that case.
The image is covering the test case.
I edited the NEWS entry.
My image starts with b'\xff\xd8\xff\xdb\x00C\x00\x06\x04\x05\x06\x05\x04\x06\x06\x05\x06\x07\x07\x06\x08\n\x10\n\n\t\t\n\x14\x0e\x0f\x0c'. Issue 21230 image starts with '\xff\xd8\xff\xee\x00\x0eAdobe\x00d\x00\x00\x00\x00\x00\xff\xed\x008Photoshop '. In both cases first few bytes of jpeg image does not contain JFIF or Exif. So, I think my image cover the test case for Issue 21230. |
@pablogsal Hi, I forgot to tag you in previous reply. Here is the summary of previous comment:
The image in the issue 21230 and the image I added both does not contains JFIF or Exif. |
@@ -36,7 +36,7 @@ def what(file, h=None): | |||
|
|||
def test_jpeg(h, f): | |||
"""JPEG data in JFIF or Exif format""" | |||
if h[6:10] in (b'JFIF', b'Exif'): | |||
if h.startswith(b'\xff\xd8'): | |||
return 'jpeg' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While technically this test is correct, there is a risk to have false positive. Would it be possible to design a stricter test?
My Hachoir parser for JPEG picture checks that it's able to parse at least 3 chunks:
https://github.com/vstinner/hachoir/blob/15f61aef34115a7ed1e76efc84b9176be55032a4/hachoir/parser/image/jpeg.py#L565-L578
A chunk must start with 0xFF, then have a tag in the 0xD0..0xD9 range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vstinner Hi,
I think I understand your code. For example in b'\xff\xd8\xff\xdb\x00C....':
- First 2 bytes must equal to '\xff\xd8'
- Then you are looking for the tag (defined in JpegChunk.TAG_INFO) in the next 3 bytes( i.e. '\xff\xdb\x00C').
Am I correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vstinner Hi,
I had trouble understainding the code you provided. Can you please elebroate more?
Thanks.
@gov-vj, are you interested in continuing to work on this PR? Another PR was opened for this issue and since they implement the same change, I'd like to close one of them. Since you started working on this first, I would leave this PR open and close the other one if you're still interested. Otherwise, this one can be closed. Thank you! |
Thanks for the PR, @gov-vj. Unfortunately, the imghdr module is now deprecated following the acceptance of PEP 594 by the Steering Council. Bugfixes and improvements to the module are therefore now considered to be very low priority. Given the large backlog of PRs in the CPython repo, and the fact that this module has no active maintainer in the core dev team, I am therefore going to close this PR as per the policy laid out in the dev guide. I hope that this does not dissuade you from contributing to CPython in the future 🙂 |
I have added one jpeg file which does not have JFIF and Exif.
Changed test_imghdr.py: added the extra file in the test case.
https://bugs.python.org/issue16512