Concatenating two dfs in Pandas

Question

I wanted to know if there was a way to verify if a value in one df that was being Concatenated with another existed in both dfs or the other df as well prior to stacking them on top of each other??

I had two dfs as

df = pd.concat([df1, df2])
df = df.sort_values(by=['id', 'timestamp']).reset_index(drop=True)
df

   id  timestamp
0   1 1959-06-01
1   1 2019-01-01
2   1 2019-01-02
3   2 1989-12-01
4   2 2019-01-15
5   3 1999-01-25
6   3 2019-01-17
7   3 2019-02-01
8   3 2019-02-03

Was there a way to to verify that the I.D in df1 existed in df2 before concatenating similar to a merge? I didn't need to merge but instead concatenate the dfs on top of each other.

One df has multiple ids and timestamps and the other had only one ID and I wanted to make sure only IDs that existed in both were in the resulting concatenated df

thanks!

you want to check for one column or entire row?
– Pyd
Commented Feb 28, 2019 at 7:17 — Pyd, Commented Feb 28, 2019 at 7:17

Andy Hayden · Accepted Answer · 2019-02-28 07:14:57Z

9

One workaround you can do here is to create a dummy column:

df1["df"] = 1
df2["df"] = 2
df = pd.concat([df1, df2])

That way you can see where each row was derived.

answered Feb 28, 2019 at 7:14

Andy Hayden

377k109 gold badges639 silver badges541 bronze badges

Add a comment |

Cerebration · Accepted Answer · 2019-02-28 10:00:08Z

is this what you are looking for? attaching a sample code.

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],'B': ['B0', 'B1', 'B2']})
new = df.set_index('key').join(other.set_index('key'))
new.dropna(axis=0, how='any', thresh=None, subset=None, inplace=True)
print(new)

Or

import pandas as pd 
df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],'A': ['A0', 'A1', 'A2']})
new = pd.merge(df , other , how = 'inner')
print(new)

does this help you?

Pyd · Accepted Answer · 2019-02-28 07:22:01Z

0

check for entire row

df3=pd.concat([df1,df2[~df2.isin(df1)]],ignore_index=True).dropna()

check for one column

df3=pd.concat([df1,df2[~df2['col_name'].isin(df1['col_name'])]],ignore_index=True).dropna()

answered Feb 28, 2019 at 7:22

Pyd

6,16919 gold badges57 silver badges117 bronze badges

Add a comment |

Asha Choudhary · Accepted Answer · 2024-07-11 10:23:58Z

0

I think you would need to specify the axis=1 for concatenating column wise and axis=0 for concatenating row wise

df=pd.concat([df1, df2], axis =1)

answered Jul 11, 2024 at 10:23

Asha Choudhary

1651 silver badge5 bronze badges

Add a comment |

Sapan Gupta · Accepted Answer · 2019-02-28 07:32:40Z

-2

You can use df.isin please try..

answered Feb 28, 2019 at 7:32

Sapan Gupta

1087 bronze badges

Add a comment |

Collectives™ on Stack Overflow

Concatenating two dfs in Pandas

5 Answers 5

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

Your Answer

Sign up or log in

Post as a guest

Related