All Questions
306 questions
2 votes, 1 answer, 53 views
Get a row subset of a Pandas Dataframe based on conditions with query
I would like to get a subset of a Pandas DataFrame using query, if possible, by giving several conditions based on column values, where rows are selected only until the conditions appear for the ...
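The multi-condition part of this question can be sketched with `DataFrame.query`. The column names `a` and `b` and the sample values are hypothetical, since the question's data is not shown, and the "select only until" requirement is cut off in the excerpt, so it is not addressed here.

```python
import pandas as pd

# Hypothetical data; the question's actual columns are not shown.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": ["x", "y", "x", "x"]})

# Combine several column conditions in a single query string.
subset = df.query("a > 1 and b == 'x'")
```

The same filter can be written with boolean masks, `df[(df["a"] > 1) & (df["b"] == "x")]`, which may be easier to build up programmatically.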
0 votes, 1 answer, 91 views
pick n different random samples from subgroup
I have two Dataframes that look like this:
df = pd.DataFrame({'PERSONALNUMMER': {4756: '0209740',4820: '0234212',4855: '0251297',4750: '0209326',4992: '4000404'},
'MANDANT': {4756: 'OM', 4820: 'OM', ...
0 votes, 1 answer, 93 views
Sampling in python with multiple conditions and percentages
Person ID | Condition 1 | Condition 2 | Condition 3
A         | Yes         | No          | Yes
B         | No          | Yes         | No
C         | Yes         | No          | No
Hi! I have to generate a sample from a fairly large dataset, and the inclusion criteria are a little more complicated ...
0 votes, 1 answer, 88 views
dataframe how to get subset of rows in a dataframe
I have the following code:
import yfinance as yf
stocks1 = ['AAL','AAPL','ABBV']
new_df1 = yf.download(tickers=stocks1,
                      start='2023-10-01',
                      end='2023-10-10')
...
2 votes, 3 answers, 142 views
(Python/Pandas) Subset DataFrame based on non-missing values from a column
I have a pd dataframe:
import pandas as pd
column1 = [None,None,None,4,8,9,None,None,None,2,3,5,None]
column2 = [None,None,None,None,5,1,None,None,6,3,3,None,None]
column3 = [None,None,None,3,None,7,...
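One possible reading of this question is keeping only the rows where a chosen column is not missing. The sketch below uses the two lists that appear in full in the excerpt; `column3` is truncated, so it is left out, and the choice of `column1` as the filter column is an assumption.

```python
import pandas as pd

column1 = [None, None, None, 4, 8, 9, None, None, None, 2, 3, 5, None]
column2 = [None, None, None, None, 5, 1, None, None, 6, 3, 3, None, None]
df = pd.DataFrame({"column1": column1, "column2": column2})

# Keep rows where column1 is not missing (assumed filter column).
non_missing = df[df["column1"].notna()]

# Equivalent result using dropna restricted to that column.
same = df.dropna(subset=["column1"])
```

Both forms leave the other columns untouched, so missing values in `column2` remain in the result.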
0 votes, 0 answers, 43 views
Hourly data out of 15 mins interval dataset [duplicate]
I am looking for help with Python code.
I have data at 15-minute intervals for 1 year. I just want to extract the values at hourly intervals. No need to apply a mean or any other function. Just take the data in a ...
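A minimal sketch of taking the on-the-hour rows without aggregating, assuming the data has a `DatetimeIndex`. The one-day range and the column name `value` are hypothetical stand-ins for the year of data described.

```python
import pandas as pd

# Hypothetical 15-minute series for one day; "value" is an assumed column name.
idx = pd.date_range("2023-01-01", periods=96, freq="15min")
df = pd.DataFrame({"value": range(96)}, index=idx)

# Keep only the rows that fall exactly on the hour; no aggregation is applied.
hourly = df[df.index.minute == 0]
```

If the timestamps live in a regular column instead of the index, the same test can be applied through the `.dt` accessor, for example `df[df["timestamp"].dt.minute == 0]` with `timestamp` as an assumed column name.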
0 votes, 2 answers, 47 views
How to filter dates of time window relative to column value in Pandas?
I have a Pandas data frame df with columns of ID, DATE (continuous year-month dates of the same six-month-long period) and FIX_DATE (constant year-month date per ID always falling in the second half ...
1 vote, 1 answer, 140 views
Split dataframe into subsets by pairs of columns
I have a wide dataframe with N columns. Columns are presented in pairs like this: the first two go together, as do the next two, and so on until the end of the dataframe:
XS0552790049 Unnamed: 5583 ...
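Splitting by consecutive column pairs can be sketched with positional slicing. The frame below is hypothetical (the column names `id1`, `val1`, `id2`, `val2` are invented for illustration), and the sketch assumes an even number of columns.

```python
import pandas as pd

# Hypothetical wide frame with four columns forming two pairs.
df = pd.DataFrame({"id1": [1, 2], "val1": [10, 20], "id2": [3, 4], "val2": [30, 40]})

# Slice the columns two at a time by position.
pairs = [df.iloc[:, i:i + 2] for i in range(0, df.shape[1], 2)]
```

Each element of `pairs` is a two-column DataFrame that keeps the original row index.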
1 vote, 1 answer, 562 views
Pandas creating new vs. overwriting existing dataframe
I am working with a ~70 GB data frame consisting of around 7 million rows and 56 columns. I want to subset this dataframe to a smaller one, taking 100,000 random rows out of the original dataframe.
...
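Drawing random rows can be sketched with `DataFrame.sample`. The small frame below is a hypothetical stand-in for the large one described; for the case in the question, `n` would be `100_000`.

```python
import pandas as pd

# Small hypothetical stand-in for the large frame described in the question.
df = pd.DataFrame({"x": range(1000)})

# Draw rows at random without replacement; random_state makes the draw repeatable.
sample = df.sample(n=100, random_state=0)
```

Assigning the result to a new name keeps the original frame available, while assigning it back to `df` rebinds the name to the sample.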
0 votes, 0 answers, 69 views
Un-subset a dataset to model data another way
I want to run my analysis based on one subset of data, then re-run my analysis on another subset of data all coming from the same dataframe.
I have a dataset of survey responses. For simplification, ...
0 votes, 2 answers, 36 views
Sub stringing a column by a number of characters after a certain value
I have a field that looks like this: Non Compliance Risk Situation 1 - domain D-G101 Regulatory licenses and relations with regulators. I want to create a new column that starts after "domain"...
0 votes, 1 answer, 60 views
Groupby a dataframe conditioned on "subset" relationship?
Generate a sample dataframe using:
import pandas as pd
pd.DataFrame({'A': [{'A', 'B'}, {'A', 'B', 'C', 'E'}, {'B', 'D'}, {'C', 'B'}, {'A', 'B', 'D'}, {'X'}], 'B': [111, 222, 333, 444, 555, 666]})
...
0 votes, 0 answers, 149 views
Finding a subset whose sum is zero
I have an Excel file containing two rows, one containing numbers and the other an ID (basically a serial number).
Now, these numbers are both positive and negative. I have to find if there exists a subset ...
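The existence check can be sketched as a brute-force search over combinations. This is an illustration under assumptions, not an answer from the thread: the function name `zero_sum_subset` and the sample numbers are hypothetical, the IDs mentioned in the question are not handled, and the search is exponential in the number of values, so it only suits small inputs.

```python
from itertools import combinations

def zero_sum_subset(numbers):
    """Return the first non-empty subset that sums to zero, or None."""
    # Try subsets from smallest to largest.
    for size in range(1, len(numbers) + 1):
        for combo in combinations(numbers, size):
            if sum(combo) == 0:
                return combo
    return None

# Hypothetical sample values.
result = zero_sum_subset([3, -1, 4, -3, 7])
```

With floating-point numbers an exact comparison to zero may miss near-zero sums, so a tolerance would be needed in that case.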
0 votes, 0 answers, 48 views
print a second df by pulling the same values from two different df columns that do not match
I have two data frames whose columns and rows do not match; one of them is actually a subset of the other, so we can call them the main and subset dfs. I want to find matched values between subset ...
0 votes, 1 answer, 226 views
How to drop rows (or subset other rows) based on values in lists in pandas? Create mutually exclusive subsets of dfs
How to drop rows which have at least 1 element from both lists? Looking for something iterative over more than 100 columns.
Minimal example with 3 columns is:
list1 = ["abc1", "def"...