bpo-39170: Sqlite3 row_factory for attribute access: sqlite3.NamedRow #17768

jidn · 2019-12-31T07:17:03Z

Currently, sqlite3 returns rows either as tuples or as sqlite3.Row objects, which support dict-style and index access. I constantly find myself wanting attribute access for rows, as with namedtuple. I find attribute access cleaner without the brackets and quoted field names. However, unlike previous discussions (https://bugs.python.org/issue13299), I don't want to use the namedtuple object. I appreciate the simple API and minimal memory consumption of sqlite3.Row and used it as my guide in creating sqlite3.NamedRow, which allows access by index and by attribute.

Why a new object instead of adding attribute access to the existing sqlite3.Row?
sqlite3.Row already has a keys() method, so any table with a column named "keys" would cause a hard-to-debug, easily avoidable collision.
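The collision can be illustrated with the existing sqlite3.Row. This is a small sketch; the column values and the "id" column are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
# A result set with a column literally named "keys" (illustrative values).
row = conn.execute('SELECT 1 AS id, 42 AS "keys"').fetchone()

# Dict-style access reaches the column value ...
by_name = row["keys"]
# ... but the attribute of the same name is already taken by Row.keys().
column_names = row.keys()

print(by_name)
print(column_names)
```

If attribute access were added to sqlite3.Row, `row.keys` would have to mean either the method or the column, which is the ambiguity a separate class avoids.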

Features

Optimized in C, so it will be faster than any Python implementation.
Access columns by attribute for all valid names and by index for all names.
Iterate over fields by name/value pairs.
Works with the built-in len() and the in operator.
Identical memory consumption to sqlite3.Row with two references: the data tuple and the cursor description.
Identical speed to sqlite3.Row if not faster. Timing usually has it slightly faster for index by name or attribute, but it is almost identical.
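
For readers who want to experiment without building the C extension, the behavior listed above can be approximated in pure Python. This is only a sketch: the class name NamedRowSketch is hypothetical, it copies the column names instead of holding the cursor description, and it will not match the speed or memory use of the C implementation.

```python
import sqlite3


class NamedRowSketch:
    """Pure-Python stand-in for the proposed row type (hypothetical name)."""

    __slots__ = ("_names", "_values")

    def __init__(self, cursor, values):
        # row_factory callables receive the cursor and the raw row tuple.
        self._names = tuple(col[0] for col in cursor.description)
        self._values = values

    def _index(self, name):
        return self._names.index(name)  # ValueError if no such column

    def __len__(self):
        return len(self._values)

    def __contains__(self, name):
        return name in self._names

    def __iter__(self):
        # Iteration yields (name, value) pairs, so dict(row) works.
        return iter(zip(self._names, self._values))

    def __getitem__(self, key):
        if isinstance(key, str):
            try:
                return self._values[self._index(key)]
            except ValueError:
                raise IndexError(f"No item with key {key!r}") from None
        return self._values[key]  # integer index or slice

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. for column names.
        if name in NamedRowSketch.__slots__:
            raise AttributeError(name)
        try:
            return self._values[self._index(name)]
        except ValueError:
            raise AttributeError(name) from None


conn = sqlite3.connect(":memory:")
conn.row_factory = NamedRowSketch
row = conn.execute("SELECT 'A' AS letter, '.-' AS morse, 65 AS ord").fetchone()
print(row.morse, row["ord"], row[0], len(row), "letter" in row)
print(dict(row))
```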

Examples

    >>> import sqlite3
    >>> c = sqlite3.Connection(":memory:").cursor()
    >>> c.row_factory = sqlite3.NamedRow
    >>> named_row = c.execute("SELECT 'A' AS letter, '.-' AS morse, 65 AS ord").fetchone()

    >>> len(named_row)
    3
    >>> 'letter' in named_row
    True
    >>> named_row == named_row
    True
    >>> hash(named_row)
    5512444875192833987

    >>> named_row[0]
    'A'
    >>> named_row[1:]
    ('.-', 65)

    >>> named_row["ord"]
    65

    >>> named_row.morse
    '.-'

    >>> dict(named_row)
    {'letter': 'A', 'morse': '.-', 'ord': 65}
    >>> tuple(named_row)
    (('letter', 'A'), ('morse', '.-'), ('ord', 65))

How sqlite3.NamedRow differs from sqlite3.Row

The class defines only dunder methods, so any valid attribute name is available for columns. When a column name would be an invalid attribute name, you have two options: either alias it with SQL AS in the SELECT statement or index by name.

    >>> row = cursor.execute("SELECT count(*) FROM mytable").fetchone()
    >>> row.count
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'sqlite3.NamedRow' object has no attribute 'count'
    >>> tuple(row)
    (('count(*)', 104),)
    >>> row["count(*)"], row[0]
    (104, 104)
    >>> row = cursor.execute("SELECT count(*) AS cnt FROM mytable").fetchone()
    >>> tuple(row)
    (('cnt', 104),)
    >>> row.cnt
    104
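
To find out ahead of time which columns of a query need an AS alias, the names in cursor.description can be checked with str.isidentifier(). A small sketch follows; the attribute_safe helper is hypothetical, and also flagging Python keywords is my assumption, since a column named "class" could not be written as row.class either.

```python
import keyword
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER)")
cur = conn.execute('SELECT count(*), count(*) AS cnt, 1 AS "class" FROM mytable')


def attribute_safe(name):
    """True if `name` can be written after a dot in Python source."""
    return name.isidentifier() and not keyword.iskeyword(name)


# Map each column name to whether attribute access could reach it.
usable = {col[0]: attribute_safe(col[0]) for col in cur.description}
print(usable)
```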

To get the field names, iterate over the row, which yields (name, value) pairs, or read them from cursor.description.

    titles = [x[0] for x in row]
    titles = [x[0] for x in cursor.description]
    titles = dict(row).keys()

Unlike sqlite3.Row, attribute and dict-style (by-name) access are case-sensitive. There are four reasons for this.

Case-insensitive comparison only works well for ASCII characters. In a Unicode world, case-insensitive edge cases create unnecessary errors. Looking at several existing codebases, this feature of Row is almost never used, and I believe it is not needed in NamedRow.
Case-insensitivity is not allowed for attribute access. This "feature" would treat attribute access differently from the rest of Python, and "special cases aren't special enough to break the rules". When row.name, row.Name, and row.NAME all refer to the same thing, it gives off the faint code smell of something wrong. When case-insensitivity is needed and the SELECT query cannot be modified, sqlite3.Row is still there.
Code is simpler and easier to maintain.
It is faster.
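
The first reason can be seen with the existing sqlite3.Row and plain Python strings. In this sketch the query is illustrative, and the German sharp s is just one example of a Unicode case mapping that is not one-to-one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
row = conn.execute("SELECT 65 AS ord").fetchone()

# sqlite3.Row matches ASCII column names regardless of case.
same = row["ord"] == row["ORD"] == row["Ord"]

# Outside ASCII, lower(), upper(), and casefold() can disagree,
# so "ignore case" has no single obvious meaning.
sharp_s = "ß"
forms = (sharp_s.lower(), sharp_s.upper(), sharp_s.casefold())

print(same)
print(forms)
```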

Timing Results

NamedRow is faster than sqlite3.Row for index-by-name access, but speed is not the focus. I just want to show that it is not any slower than sqlite3.Row when dealing with thousands of results.
I have published a graph and the methodology of my testing. In the worst case scenario, it is just as fast as sqlite3.Row without any extra memory.
For more information, see the post at https://jidn.com/posts/2019/10/namedrow-better-python-sqlite3-row-factory/
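
A comparison of this kind can be set up with timeit. The harness below is a minimal sketch and not the methodology from the linked post: the table, row count, and repeat count are arbitrary assumptions, and it only measures tuple and sqlite3.Row because NamedRow is not available in a stock build.

```python
import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (letter TEXT, morse TEXT, ord INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [("A", ".-", 65)] * 1000)


def fetch_all(factory=None):
    """Fetch every row, optionally through the given row factory."""
    cur = conn.cursor()
    if factory is not None:
        cur.row_factory = factory
    return cur.execute("SELECT letter, morse, ord FROM t").fetchall()


def time_access(rows, key, repeat=20):
    """Total seconds to read rows[i][key] for every row, `repeat` times."""
    return timeit.timeit(lambda: [r[key] for r in rows], number=repeat)


tuple_rows = fetch_all()
row_rows = fetch_all(sqlite3.Row)
timings = {
    "tuple, by index": time_access(tuple_rows, 2),
    "sqlite3.Row, by index": time_access(row_rows, 2),
    "sqlite3.Row, by name": time_access(row_rows, "ord"),
}
for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f}s")
```

Passing another row factory to fetch_all adds it to the comparison without changing the rest of the harness.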

https://bugs.python.org/issue39170


        Initial commit

+ Creates object in sqlite3
+ row_factory works
+ Get by index
+ Get by dict
+ Case insensitive
+ Underscore acceptable replacement for space or dash
+ AttributeError works
+ IndexError works


        Working version passing tests


        Fix Issue38175

Test both the cursor descriptor and data.


        Add improved timing test


        Add row creating timing

Does the cursor.execute().fetchone() affect the general shape of the data? Add testing which only covers object creation and access.


        Merge remote-tracking branch 'upstream/master' into sqlite3-namedrow


        Fix column access

+ Faster
+ Remove case insensitive


        Add documentation


        Add tests


        📜🤖 Added by blurb_it.

Clinton James added 9 commits Sep 8, 2019

Initial commit

dc87278

Working version passing tests

1e4e907

Fix Issue38175

9013073

Add improved timing test

958e579

Add row creating timing

b4ea904

Merge remote-tracking branch 'upstream/master' into sqlite3-namedrow

1c32c1a

Fix column access

7d28522

Add documentation

bd34ea6

Add tests

f05be4f

jidn requested review from berkerpeksag and python/windows-team as code owners Dec 31, 2019

the-knights-who-say-ni added the CLA signed label Dec 31, 2019

bedevere-bot added the awaiting review label Dec 31, 2019

📜🤖 Added by blurb_it.

f6f5e33