pythongh-102140 : False neg csv header bug fix #102787

Drakariboo · 2023-03-17T18:25:14Z

gh-102140: We've improved the heuristic of the has_header() method in Lib/csv.py. We wanted to respect the methodology behind the original function, even though its determining factor, string length, is meaningless. We added more checks before a column is deleted for inconsistency:

similiratyWords is a dictionary in which we store the number of repeated words per column.

compareWords is a list in which we store every word of an element, extracted with a regex. We compare two such lists, which represent for example element_line1 and element_line2, and increment similarityWords if the same values appear in both.
This way, we respect the methodology of comparing each element of every row in a column.

We compute the average string length of each column and compare it to the header length, to stay consistent with the existing length-based comparison.

We check whether the header is a single word.

All of these checks are combined in the vote at the end of the function.
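The checks listed above could be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function names `words` and `header_vote`, the word regex, and the way each signal adds to the vote are all assumptions made for the example. `similarity_words` plays the role the description gives to similarityWords, and `words()` the role of compareWords.

```python
import re
from collections import defaultdict

# Assumption: a "word" is any run of word characters.
WORD_RE = re.compile(r"\w+")


def words(cell):
    """Split one cell into lowercase words."""
    return WORD_RE.findall(cell.lower())


def header_vote(header, rows):
    """Return a vote total; a positive value suggests the first row is a header.

    Hypothetical scoring, chosen only to illustrate the three checks.
    """
    similarity_words = defaultdict(int)  # column index -> shared-word count
    vote = 0
    for col, title in enumerate(header):
        # Skip rows whose width differs from the header row.
        cells = [row[col] for row in rows if len(row) == len(header)]
        if not cells:
            continue

        # 1. Count consecutive rows that share at least one word in this column.
        for first, second in zip(cells, cells[1:]):
            if set(words(first)) & set(words(second)):
                similarity_words[col] += 1
        body_words = {w for cell in cells for w in words(cell)}
        # A title sharing no word with a repetitive column body stands out.
        if similarity_words[col] and not set(words(title)) & body_words:
            vote += 1

        # 2. Compare the header length with the average cell length.
        average = sum(len(cell) for cell in cells) / len(cells)
        vote += 1 if len(title) != round(average) else -1

        # 3. A single-word title is typical of a column header.
        if len(words(title)) == 1:
            vote += 1
    return vote


header = ["city", "country"]
rows = [
    ["New York", "United States"],
    ["New Delhi", "India"],
    ["New Orleans", "United States"],
]
print(header_vote(header, rows) > 0)
```

Because the length comparison is kept as one of the signals, a first row of ordinary data can still collect positive votes in this sketch, which is the limitation of the string-length criterion acknowledged in the description.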

Where: gh-102140
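For context, the kind of false negative tracked in gh-102140 can be reproduced with a sample in which every column holds text of varying length. The sample below is a hypothetical one written for this illustration, not the data from the issue report.

```python
import csv

# Hypothetical sample: all columns are strings and cell lengths vary by row,
# so the stock heuristic discards every column before the vote.
sample = "name,city\nAlice,Paris\nBob,Amsterdam\nCharlotte,Rome\n"

detected = csv.Sniffer().has_header(sample)
print(detected)
```

On a CPython build without a fix for the issue, no column survives to the vote, so `has_header()` is expected to report no header even though the first row is clearly one.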

Contributors: @Drakariboo & @Vanille-22

ghost · 2023-03-17T18:25:17Z

All commit authors signed the Contributor License Agreement.

bedevere-bot · 2023-03-17T18:25:18Z

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

Drakariboo and others added 2 commits March 17, 2023 18:58

similiraty and average added in the heuristic

7d64135

Single word header check vote

443c730

bedevere-bot mentioned this pull request Mar 17, 2023

False negative from csv.Sniffer.has_header with only strings #102140

Open

bedevere-bot added the awaiting review label Mar 17, 2023

📜🤖 Added by blurb_it.

d45b1ac

Drakariboo closed this Mar 17, 2023
