
Duplicates values in identity column when mixing redshift.copy and redshift.to_sql #2996

@laderjk

Description


Hello everyone! We are seeing duplicated values in an identity column when executing redshift.to_sql() after executing redshift.copy(). When inserting more than 1000 records we use redshift.copy(), and when inserting fewer than 1000 records we use redshift.to_sql(), as recommended in the docs.
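For reference, our dispatch logic is roughly the sketch below (write_df, COPY_THRESHOLD, and the s3_path argument are simplified placeholders, not our production names; conn is assumed to be an open redshift_connector connection):

import awswrangler as wr

COPY_THRESHOLD = 1000  # cutoff between the two write paths

def write_df(df, conn, schema, table, s3_path):
    # Large batches are staged in S3 and loaded with COPY;
    # small batches are written with INSERT statements via to_sql().
    if len(df) >= COPY_THRESHOLD:
        wr.redshift.copy(
            df=df,
            con=conn,
            path=s3_path,
            schema=schema,
            table=table,
            use_column_names=True,
            mode='append',
        )
    else:
        wr.redshift.to_sql(
            df=df,
            con=conn,
            schema=schema,
            table=table,
            use_column_names=True,
            mode='append',
        )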

This is the structure of the table:

create table if not exists dim_testing
(
    id BIGINT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    name varchar
);

This is the code with which we can reproduce the issue:

import random
import string

import awswrangler as wr
import pandas as pd

def generate_df(length):
    # Build a single-column frame of random 30-character strings.
    rows = [''.join(random.choices(string.ascii_letters + string.digits, k=30)) for _ in range(length)]
    return pd.DataFrame(rows, columns=['name'])

# redshift.conn is an open redshift_connector connection (e.g. from wr.redshift.connect()).
df = generate_df(1500)
wr.redshift.copy(  # large batch: staged in S3 and loaded via COPY
    df=df,
    con=redshift.conn,
    path='s3://s3_bucket_path/dim_testing/',
    schema='public',
    table='dim_testing',
    use_column_names=True,
    mode='append'
)

df = generate_df(500)
wr.redshift.to_sql(  # small batch: written with INSERT statements
    df,
    redshift.conn,
    schema='public',
    table='dim_testing',
    use_column_names=True,
    mode='append'
)

After executing that code, if we look for duplicates in the identity column id, we find duplicated values:

select id, count(*)
from dim_testing
group by id
having count(*) > 1;
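The same check can also be run from Python over the same connection; a minimal sketch (wr.redshift.read_sql_query returns the query result as a DataFrame):

import awswrangler as wr

dupes = wr.redshift.read_sql_query(
    sql="""
        select id, count(*) as cnt
        from dim_testing
        group by id
        having count(*) > 1
    """,
    con=redshift.conn,
)
print(len(dupes))  # non-zero after running copy() followed by to_sql()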

If we only use to_sql() or only copy(), the issue does not happen; it only appears when executing to_sql() after a copy().
Is this normal? Are we doing something wrong? Should we pick only one write method and not mix the two?

Thanks!
