Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-31553: add --json-lines option to json.tool #10051

Merged
merged 4 commits into from Nov 7, 2018

Conversation

hongweipeng
Copy link
Contributor

@hongweipeng hongweipeng commented Oct 23, 2018

Copy link
Member

@vstinner vstinner left a comment

The change LGTM, but I'm not 100% sure about the usefulness of the feature :-) See discussion at https://bugs.python.org/issue31553

Lib/json/tool.py Outdated
with infile:
try:
obj = json.load(infile)
if jsonlines:
objs = [json.loads(line) for line in infile]
Copy link
Member

@vstinner vstinner Oct 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of loading all lines and only later dump all objects, it may be better to: load on object, dump it, then load the second object, etc.

The behavior difference is that first JSONs are written into outfile if they are valid, but if the last one is invalid. With the current code, it's all-or-nothing. Maybe it's done on purpose?

The other advantage of working on a single object is to reduce the memory footprint when working with large JSON files.

Copy link
Member

@serhiy-storchaka serhiy-storchaka Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

Just make objs an iterator:

objs = (json.loads(line) for line in infile)

or

objs = map(json.loads, infile)

Copy link
Contributor Author

@hongweipeng hongweipeng Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, that's great. Thank you for your guidance.

Lib/json/tool.py Outdated
@@ -26,19 +26,26 @@ def main():
help='write the output of infile to outfile')
parser.add_argument('--sort-keys', action='store_true', default=False,
help='sort the output of dictionaries alphabetically by key')
parser.add_argument('--jsonlines', action='store_true', default=False,
Copy link
Member

@serhiy-storchaka serhiy-storchaka Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to name the option --json-lines or just --lines.

Copy link
Contributor Author

@hongweipeng hongweipeng Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--json-lines named similar to the existing --sort-keys,it will be more suitable.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Needed the documentation and an entry in What's New.

@@ -26,19 +26,26 @@ def main():
help='write the output of infile to outfile')
parser.add_argument('--sort-keys', action='store_true', default=False,
help='sort the output of dictionaries alphabetically by key')
parser.add_argument('--jsonlines', action='store_true', default=False,
help='parse input using the jsonlines format')
Copy link
Member

@serhiy-storchaka serhiy-storchaka Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe 'parse every input line as separate JSON object'?

Copy link
Contributor Author

@hongweipeng hongweipeng Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Write whatsnew in 3.8?

Lib/json/tool.py Outdated
with infile:
try:
obj = json.load(infile)
if jsonlines:
objs = [json.loads(line) for line in infile]
Copy link
Member

@serhiy-storchaka serhiy-storchaka Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

Just make objs an iterator:

objs = (json.loads(line) for line in infile)

or

objs = map(json.loads, infile)

@bedevere-bot
Copy link

bedevere-bot commented Oct 25, 2018

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@@ -135,6 +135,11 @@ by right-clicking the button. (Contributed by Tal Einat in :issue:`1529353`.)
The changes above have been backported to 3.7 maintenance releases.


json.tool
-------
Copy link
Member

@serhiy-storchaka serhiy-storchaka Oct 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an empty line after "---". And the number of minuses should match the length of the header.

Copy link
Contributor Author

@hongweipeng hongweipeng Oct 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have limited knowledge about Doc.Thank you for your patience.

@@ -135,6 +135,11 @@ by right-clicking the button. (Contributed by Tal Einat in :issue:`1529353`.)
The changes above have been backported to 3.7 maintenance releases.


json.tool
-------
Add option ``--json-lines`` to parse every input line as separate JSON object.
Copy link
Member

@serhiy-storchaka serhiy-storchaka Oct 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add "(Contributed by ... in :issue:`...`.)"

@hongweipeng
Copy link
Contributor Author

hongweipeng commented Oct 26, 2018

I have made the requested changes; please review again.

@bedevere-bot
Copy link

bedevere-bot commented Oct 26, 2018

Thanks for making the requested changes!

@serhiy-storchaka: please review the changes made to this pull request.

@hongweipeng hongweipeng changed the title bpo-31553: add jsonlines option to json.tool bpo-31553: add --json-lines option to json.tool Oct 26, 2018
Copy link
Member

@matrixise matrixise left a comment

You can add your full name if you prefer.

@hongweipeng
Copy link
Contributor Author

hongweipeng commented Oct 26, 2018

@matrixise It's my full name :)

@hongweipeng
Copy link
Contributor Author

hongweipeng commented Nov 5, 2018

ping

@serhiy-storchaka serhiy-storchaka merged commit f194479 into python:master Nov 7, 2018
5 checks passed
@bedevere-bot
Copy link

bedevere-bot commented Nov 7, 2018

@serhiy-storchaka: Please replace # with GH- in the commit message next time. Thanks!

@serhiy-storchaka
Copy link
Member

serhiy-storchaka commented Nov 7, 2018

Damn it! I have pressed Enter by mistake while edited commit message.

Thank you for your contribution @hongweipeng!

@hongweipeng
Copy link
Contributor Author

hongweipeng commented Nov 7, 2018

@serhiy-storchaka Oh!It is a beautiful mistake for me. Will you revoke this merger?

@serhiy-storchaka
Copy link
Member

serhiy-storchaka commented Nov 7, 2018

It can't be revoked without making larger harm.

@hongweipeng hongweipeng deleted the jsonlines-for-json.tool branch Mar 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants