_splitnetloc in urllib splits netloc wrongly if username or password contains special characters #110869

Closed as not planned
@Mattia2700

Bug report

Bug description:

The following snippet is taken from urllib.parse (Lib/urllib/parse.py):

def _splitnetloc(url, start=0):
    delim = len(url)   # position of end of domain part of url, default is end
    for c in '/?#':    # look for delimiters; the order is NOT important
        wdelim = url.find(c, start)        # find first of this delim
        if wdelim >= 0:                    # if found
            delim = min(delim, wdelim)     # use earliest delim position
    return url[start:delim], url[delim:]   # return (domain, rest)

As the title says, if the username or password part of the URL (http://user:pass@domain.com/path) contains any of the delimiter characters "/", "?" or "#", the netloc is cut short and the path/query/fragment is obtained wrongly. The function stops at the first occurrence of any of these characters after the "//", so it cannot work when one of them appears before the "@" symbol.
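A minimal reproduction sketch using the public urlsplit(), which calls _splitnetloc internally. The URLs and credentials are made up for illustration. The second half shows one possible workaround, assuming the caller can percent-encode the credentials before building the URL; it is not the fix proposed in this issue.

```python
from urllib.parse import quote, unquote, urlsplit

# Hypothetical URL whose password ("pa/ss") contains a raw "/".
raw_url = "http://user:pa/ss@domain.com/path"
raw = urlsplit(raw_url)

# The split happens at the first "/" after "//", i.e. inside the password.
print(repr(raw.netloc))  # netloc ends before the "/" in the password
print(repr(raw.path))    # the rest, including "@domain.com", lands in the path

# Possible workaround (an assumption, not part of the report):
# percent-encode the password so no raw delimiter appears before "@".
password = "pa/ss"
encoded_url = f"http://user:{quote(password, safe='')}@domain.com/path"
enc = urlsplit(encoded_url)

print(repr(enc.netloc))             # full "user:...@domain.com" netloc
print(repr(enc.path))               # only the real path
print(repr(unquote(enc.password)))  # decode to recover the original password
```

Note that urlsplit() returns the password still percent-encoded, so it has to be passed through unquote() before use.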

I have a possible fix ready and will open a PR soon.

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Labels: stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)
