-1 votes
0 answers
14 views

Longitudinal Analysis with High Variability in Time Entries per Subject

I am working with retrospective data from a symptom tracking app and I aim to identify different symptom trajectory classes within this data. After reviewing relevant literature, I have found that ...
Carine's user avatar
  • 1
0 votes
0 answers
12 views

pooling phase of multiple imputation

Regarding the pooling phase of multiple imputation, the standard approach is to apply Rubin's rules for combining parameter estimates. For example, let's say we have a dataset with missing values for ...
flea85's user avatar
  • 1
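For readers unfamiliar with the pooling step this question refers to, here is a minimal sketch of Rubin's rules for a single scalar parameter. It is not taken from the question itself: the function name `pool_rubin` and the example numbers are hypothetical, and it assumes each of the m imputed datasets has already produced a point estimate and its squared standard error.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one scalar parameter across m imputed datasets (Rubin's rules).

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)       # pooled point estimate
    u_bar = mean(variances)       # within-imputation variance
    b = variance(estimates)       # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total


# Hypothetical values for illustration only, not from any real analysis.
example_estimates = [1.0, 2.0, 3.0]
example_variances = [0.5, 0.5, 0.5]
pooled_estimate, total_variance = pool_rubin(example_estimates, example_variances)
```

The pooled standard error is the square root of `total_variance`. This sketch stops at the variance and does not compute the degrees of freedom used for confidence intervals.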
0 votes
2 answers
47 views

Missing values in olive oil dataset

I have a dataset of olive oil samples and the goal of creating a classification model for oil quality. I'm having trouble deciding how to deal with missing data. Have a look at the data here if you ...
BOBTHEBUILDER's user avatar
-1 votes
3 answers
140 views

Drop rows with missing values in all columns [duplicate]

It looks like tidyr's drop_na will drop rows if any of the specified columns contain missing values. Example: > library(tidyverse) > df <- data.frame(a=c(1,NA,2,NA), b=c(3,4,NA,NA)) > df ...
robertspierre's user avatar
-1 votes
0 answers
29 views

Best practices for handling missing data in pandas to maintain model accuracy [closed]

I’m working with a dataset in pandas that contains several columns with missing values (NaN). I’m trying to decide on the best strategy to handle this missing data before feeding it into a machine ...
0 votes
0 answers
70 views

Why does R evaluate `NA==T|F` as NA, but `NA==F|T` as True? (and related Qs) [duplicate]

If you run the following lines of code in R, you may be surprised by the results (printed above each line as a comment) #1: NA NA==T #2: NA NA==F #3: NA NA==T&T #4: FALSE NA==F&F #5: NA NA==F&...
Some Attribute's user avatar
1 vote
0 answers
32 views

How can I add zero for empty or missing rows?

I have been trying to resolve this for two days and feel the need for help. I've created a cumulative graph, only it's showing as cumulative! That is because there aren't necessarily rows of data ...
Sally Parkes's user avatar
-4 votes
0 answers
46 views

In data analysis [closed]

How do I handle missing data in a large dataset using Python? I’m working on a large dataset using Python and I noticed that some columns have missing values (NaN). I tried using df.dropna() and df....
Sa ra's user avatar
  • 15
1 vote
3 answers
139 views

Python - How to check for missing values not represented by NaN? [duplicate]

I am looking for guidance on how to check for missing values in a DataFrame that are not the typical "NaN" or "np.nan" in Python. I have a dataset/DataFrame that has a string ...
gnocchi17's user avatar
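One common way to approach this kind of check, sketched under assumptions: the excerpt does not say which placeholder string the asker's data uses, so the `SENTINELS` list and the example DataFrame below are hypothetical. The idea is to combine `DataFrame.isna()` with `DataFrame.isin()` so that both real NaN/None values and placeholder strings count as missing.

```python
import pandas as pd

# Hypothetical placeholder strings; replace with whatever the data actually uses.
SENTINELS = ["", "N/A", "missing", "?"]

# Hypothetical example data, for illustration only.
df = pd.DataFrame(
    {
        "name": ["a", "N/A", "c", None],
        "score": ["1", "?", "", "4"],
    }
)

# True where a cell is NaN/None or matches one of the placeholder strings.
missing_mask = df.isna() | df.isin(SENTINELS)

# Number of missing-like cells per column.
missing_counts = missing_mask.sum()
```

Inspecting `df[col].value_counts(dropna=False)` for each suspicious column is a reasonable way to discover which placeholder strings are present before filling in the list.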
1 vote
1 answer
40 views

Why does RandomForestClassifier in scikit-learn predict even on all-NaN input?

I am training a random forest classifier in Python with sklearn; see the code below: from sklearn.ensemble import RandomForestClassifier rf = RandomForestClassifier(random_state=42) rf.fit(X = df.drop("...
lsr729's user avatar
  • 844
0 votes
0 answers
42 views

Why does ydata-profiling not detect missing values in PySpark DataFrame when using None?

I'm using ydata-profiling to generate profiling reports from a large PySpark DataFrame without converting it to Pandas (to avoid memory issues on large datasets). Some columns contain the string "...
hexxetexxeh's user avatar
0 votes
0 answers
49 views

R mice leaves missing values when I use a where-matrix

I have a large data frame with many variables measured at three time points: t1, t2 and t3. I only want to impute those missing values where the corresponding time point was answered at all, that is where ...
Qwertzu-iop's user avatar
1 vote
1 answer
47 views

Creating Artificial Gaps in R Dataset [duplicate]

I am processing data using Random Forest, and I am trying to create random artificial gaps in my dataset so that I can test how accurate the random forest predictions are. TIMESTAMP <- c(2001:2020) ...
shrimp's user avatar
  • 101
0 votes
0 answers
46 views

How do I get my data to not disappear when I click another fragment? Android Studio

I am trying to make an app that controls aspects of a garden: changing the temperature, the humidity, the wind, etc. My new issue is that my data keeps disappearing after I click another ...
Isa's user avatar
  • 21
0 votes
0 answers
13 views

Pandas - How to backfill a main dataframe with values from another while prioritizing the main dataframe [duplicate]

SET UP MY PROBLEM I have two pandas dataframes. First, I have main: import pandas as pd import numpy as np main = pd.DataFrame({"foo":{"a":1.0,"b":2.0,"c":3.0,&...
bismo's user avatar
  • 1,461
