Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-16512: one jpeg image added with different profile, code modified to detect all types of jpg #8322

Closed
wants to merge 4 commits into from

Conversation

gov-vj
Copy link

@gov-vj gov-vj commented Jul 18, 2018

I have added one jpeg file which does not have JFIF and Exif.

Changed test_imghdr.py: added the extra file in the test case.

https://bugs.python.org/issue16512

@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

When your account is ready, please add a comment in this pull request
and a Python core developer will remove the CLA not signed label
to make the bot check again.

Thanks again for your contribution, we look forward to reviewing it!

@gov-vj
Copy link
Author

gov-vj commented Jul 18, 2018

I have signed the CLA.

@pablogsal pablogsal changed the title issue16512: one jpeg image added with different profile, code modified to detect all types of jpg bpo-16512: one jpeg image added with different profile, code modified to detect all types of jpg Jul 18, 2018
@@ -79,7 +80,7 @@ def test_bad_args(self):
imghdr.what()
with self.assertRaises(AttributeError):
imghdr.what(None)
with self.assertRaises(TypeError):
with self.assertRaises(Exception):#TypeError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this. Why changing the TypeError to a general Exception?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In imghdr.py when an integer is passed instead of a file name, some functions such as test_gif throws TypeError while other functions like test_png throw AttributeError.

Initially, test_imghdr was passing because first function was test_jpeg and it was throwing TypeError. Now after the change it throws AttributeError. So I changed it to Exception because the error was depending on the order of the functions written. Or, Should I change error type to AttributeError?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As now h.startswith(b'\xff\xd8') raises AttributeError instead, you need to assert on that:

with self.assertRaises(AttributeError):

If you do this change, all tests in test_imghdr should pass.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@gov-vj
Copy link
Author

gov-vj commented Jul 18, 2018

I have made the requested changes; please review again

@bedevere-bot
Copy link

Thanks for making the requested changes!

@pablogsal: please review the changes made to this pull request.

@gov-vj
Copy link
Author

gov-vj commented Jul 18, 2018

I have made the requested changes; please review again

@bedevere-bot
Copy link

Thanks for making the requested changes!

@pablogsal: please review the changes made to this pull request.

@gov-vj
Copy link
Author

gov-vj commented Jul 18, 2018

I have made the requested changes; please review again

@bedevere-bot
Copy link

Thanks for making the requested changes!

@pablogsal: please review the changes made to this pull request.

@gov-vj
Copy link
Author

gov-vj commented Jul 20, 2018

@pablogsal Hi,

This is my first commit to an open source project. So, thanks in advance. I have one query. Should I make a change to test_imghdr.py file or not? Or everything is done from my end?

@pablogsal
Copy link
Member

pablogsal commented Jul 24, 2018

@gov-vj Thank you very much for your first contribution! :)

You should make sure that the test suite covers your change. Each Pull Request should add tests (or modify existing ones) that checks that the code you are adding works correctly. In this case, you are modifying how imghdr recognizes a jpeg image. As there are already tests that checks that the previous code was working, these tests also work for your case but you should add new tests that checks that the images that were not working before are working now, such as the image in https://bugs.python.org/issue21230 and the one you already added.

Please, add the image of issue21230 to the test suite unless the one you added is covering that case.

@@ -0,0 +1 @@
Fix bug in imghdr.py: some jpeg files were not detecting.
Copy link
Member

@pablogsal pablogsal Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not an English native speaker, but I think this is incorrect. Let's improve this NEWS entry with something in the lines of:

Fixed bug in imghdr that was causing some jpeg image files not being detected as such.

(Notice the double backticks in imghdr).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, add the image of issue21230 to the test suite unless the one you added is covering that case.

The image is covering the test case.

I edited the NEWS entry.

@gov-vj
Copy link
Author

gov-vj commented Jul 24, 2018

Please, add the image of issue21230 to the test suite unless the one you added is covering that case.

My image starts with b'\xff\xd8\xff\xdb\x00C\x00\x06\x04\x05\x06\x05\x04\x06\x06\x05\x06\x07\x07\x06\x08\n\x10\n\n\t\t\n\x14\x0e\x0f\x0c'.

Issue 21230 image starts with '\xff\xd8\xff\xee\x00\x0eAdobe\x00d\x00\x00\x00\x00\x00\xff\xed\x008Photoshop '.

In both cases first few bytes of jpeg image does not contain JFIF or Exif. So, I think my image cover the test case for Issue 21230.

@gov-vj
Copy link
Author

gov-vj commented Jul 30, 2018

@pablogsal Hi,

I forgot to tag you in previous reply. Here is the summary of previous comment:

  • I have also corrected test_imghdr.py.
  • I have made the change in Misc/NEWS.d/next/Library/2018-07-18-19-57-19.[bpo-16512](https://www.bugs.python.org/issue16512).Jh27PQ.rst
  • The image I added is I think covers the issue 21230. Reason:

The image in the issue 21230 and the image I added both does not contains JFIF or Exif.

@@ -36,7 +36,7 @@ def what(file, h=None):

def test_jpeg(h, f):
"""JPEG data in JFIF or Exif format"""
if h[6:10] in (b'JFIF', b'Exif'):
if h.startswith(b'\xff\xd8'):
return 'jpeg'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While technically this test is correct, there is a risk to have false positive. Would it be possible to design a stricter test?

My Hachoir parser for JPEG picture checks that it's able to parse at least 3 chunks:
https://github.com/vstinner/hachoir/blob/15f61aef34115a7ed1e76efc84b9176be55032a4/hachoir/parser/image/jpeg.py#L565-L578

A chunk must start with 0xFF, then have a tag in the 0xD0..0xD9 range.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vstinner Hi,

I think I understand your code. For example in b'\xff\xd8\xff\xdb\x00C....':

  1. First 2 bytes must equal to '\xff\xd8'
  2. Then you are looking for the tag (defined in JpegChunk.TAG_INFO) in the next 3 bytes( i.e. '\xff\xdb\x00C').

Am I correct?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vstinner Hi,

I had trouble understainding the code you provided. Can you please elebroate more?
Thanks.

@csabella csabella requested a review from vstinner January 25, 2020 20:24
@csabella
Copy link
Contributor

csabella commented Feb 6, 2020

@gov-vj, are you interested in continuing to work on this PR? Another PR was opened for this issue and since they implement the same change, I'd like to close one of them. Since you started working on this first, I would leave this PR open and close the other one if you're still interested. Otherwise, this one can be closed. Thank you!

@AlexWaygood
Copy link
Member

Thanks for the PR, @gov-vj. Unfortunately, the imghdr module is now deprecated following the acceptance of PEP 594 by the Steering Council. Bugfixes and improvements to the module are therefore now considered to be very low priority. Given the large backlog of PRs in the CPython repo, and the fact that this module has no active maintainer in the core dev team, I am therefore going to close this PR as per the policy laid out in the dev guide.

I hope that this does not dissuade you from contributing to CPython in the future 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants