Questions tagged [python-xarray]

xarray (formerly xray) is an open source library that provides a range of N-dimensional data structures.

16
votes
1answer
12k views

What is the pandas.Panel deprecation warning actually recommending?

I have a package that uses pandas Panels to generate MultiIndex pandas DataFrames. However, whenever I use pandas.Panel, I get the following DeprecationError: DeprecationWarning: Panel is ...
15
votes
3answers
5k views

How to get the coordinates of the maximum in xarray?

Simple question: I don't only want the value of the maximum but also the coordinates of it in an xarray DataArray. How to do that? I can, of course, write my own simple reduce function, but I wonder ...
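One possible approach, sketched under the assumption of a small 2D array with hypothetical dimensions `x` and `y`: mask everything except the maximum with `where(..., drop=True)` and read the surviving coordinate labels. If the maximum occurs more than once, more than one cell survives and the `squeeze()` step below no longer yields a single point.

```python
import xarray as xr

# Hypothetical toy data; the dimension names and coordinate values are illustrative only.
da = xr.DataArray(
    [[1, 5], [7, 2]],
    dims=("x", "y"),
    coords={"x": [10, 20], "y": [100, 200]},
)

# Keep only the cell equal to the global maximum and drop all-NaN rows/columns.
peak = da.where(da == da.max(), drop=True).squeeze()

# After squeeze(), the coordinates of the maximum are scalar coordinates on `peak`.
peak_x = peak["x"].item()
peak_y = peak["y"].item()
peak_value = peak.item()
```

Recent xarray versions also offer `argmax` and `idxmax`, which may be more direct depending on the installed version.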
15
votes
1answer
4k views

When to use multiindexing vs. xarray in pandas

The pandas pivot tables documentation seems to recommend dealing with more than two dimensions of data by using multiindexing: In [1]: import pandas as pd In [2]: import numpy as np In [3]: import ...
15
votes
1answer
3k views

Join/merge multiple NetCDF files using xarray

I have a folder with NetCDF files from 2006-2100, in ten year blocks (2011-2020, 2021-2030 etc). I want to create a new NetCDF file which contains all of these files joined together. So far I have ...
14
votes
3answers
6k views

Speeding up reading of very large netcdf file in python

I have a very large netCDF file that I am reading using netCDF4 in Python. I cannot read this file all at once since its dimensions (1200 x 720 x 1440) are too big for the entire file to be in memory ...
14
votes
3answers
1k views

combining spatial netcdf files using xarray python

Is there a way to merge 2 or more netCDF files with the same time dimension but different spatial domains into a single netCDF file? The spatial domains are specified by latitude and longitude ...
13
votes
4answers
2k views

How to apply linear regression to every pixel in a large multi-dimensional array containing NaNs?

I have a 1D array of independent variable values (x_array) that match the timesteps in a 3D numpy array of spatial data with multiple time-steps (y_array). My actual data is much larger: 300+ ...
11
votes
2answers
4k views

Is it possible to append to an xarray.Dataset?

I've been using the .append() method to concatenate two tables (with the same fields) in pandas. Unfortunately this method does not exist in xarray, is there another way to do it?
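A minimal sketch of the closest xarray equivalent, assuming the two datasets hold the same variables and differ only along one dimension (a hypothetical `time` here): `xr.concat` joins them along that dimension.

```python
import xarray as xr

# Two hypothetical datasets with the same variable "t" on a shared "time" dimension.
a = xr.Dataset({"t": ("time", [1.0, 2.0])}, coords={"time": [0, 1]})
b = xr.Dataset({"t": ("time", [3.0])}, coords={"time": [2]})

# Concatenate along the existing dimension, similar in spirit to appending rows in pandas.
combined = xr.concat([a, b], dim="time")
```

Unlike an in-place append, this returns a new Dataset and leaves `a` and `b` unchanged.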
10
votes
5answers
1k views

Get hourly average for each month from a netcdf file

I have a netCDF file with the time dimension containing data by the hour for 2 years. I want to average it to get an hourly average for each hour of the day for each month. I tried this: import ...
10
votes
2answers
3k views

Concise way to filter data in xarray

I need to apply a very simple 'match statement' to the values in an xarray array: Where the value > 0, make 2 Where the value == 0, make 0 Where the value is NaN, make NaN Here's my current solution....
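The three rules can be sketched with a single `xr.where` call, shown here on a hypothetical 1D array. Because `NaN > 0` evaluates to False, NaN cells fall through to the original values, as do zeros. Negative values, which the question does not mention, are also left unchanged in this sketch.

```python
import numpy as np
import xarray as xr

# Hypothetical input covering the three cases: positive, zero, NaN.
da = xr.DataArray([3.0, 0.0, np.nan, 1.5], dims="x")

# Where the value is > 0, use 2; everywhere else keep the original value
# (so 0 stays 0 and NaN stays NaN).
result = xr.where(da > 0, 2.0, da)
```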
10
votes
1answer
737 views

Avoid overlapping colorbar in xarray facet grid plot

import xarray as xr import cartopy.crs as ccrs USA_PROJ = ccrs.AlbersEqualArea(central_longitude=-97., central_latitude=38.) g_simple = ds_by_month.t2m.plot(x='longitude', ...
10
votes
1answer
2k views

boolean indexing in xarray

I have some arrays with dims 'time', 'lat', 'lon' and some with just 'lat', 'lon'. I often have to do this in order to mask time-dependent data with a 2d (lat-lon) mask: x.data[:, mask.data] = np.nan ...
9
votes
2answers
347 views

xarray reverse interpolation (on coordinate, not on data)

I have the following DataArray: arr = xr.DataArray([[0.33, 0.25],[0.55, 0.60],[0.85, 0.71],[0.92,0.85],[1.50,0.96],[2.5,1.1]],[('x',[0.25,0.5,0.75,1.0,1.25,1.5]),('y',[1,2])]) This gives the ...
8
votes
5answers
4k views

add dimension to an xarray DataArray

I need to add a dimension to a DataArray, filling the values across the new dimension. Here's the original array. a_size = 10 a_coords = np.linspace(0, 1, a_size) b_size = 5 b_coords = np.linspace(0,...
7
votes
3answers
2k views

python-xarray: open_mfdataset concat along two dimensions

I have files which are made of 10 ensembles and 35 time files. One of these files looks like: >>> xr.open_dataset('ens1/CCSM4_ens1_07ic_19820701-19820731_NPac_Jul.nc') <xarray.Dataset> ...
7
votes
2answers
5k views

Substitute dataset coordinates in xarray (Python)

I have a dataset stored in NetCDF4 format that consists of Intensity values with 3 dimensions: Loop, Delay and Wavelength. I named my coordinates the same as the dimensions (I don't know if it's good ...
7
votes
2answers
3k views

Extract coordinate values in xarray

I would like to extract the values of the coordinate variables. For example I create a DataArray as: import xarray as xr import numpy as np import pandas as pd years_arr=range(1982,1986) time = pd....
7
votes
1answer
162 views

Writing xarray multiindex data in chunks

I am trying to efficiently restructure a large multidimensional dataset. Let's assume I have a number of remotely sensed images over time with a number of bands with coordinates x y for pixel location, ...
7
votes
1answer
812 views

Memory errors using xarray + dask - use groupby or apply_ufunc?

I am using xarray as the basis of my workflow for analysing fluid turbulence data, but I'm having trouble leveraging dask correctly to limit memory usage on my laptop. I have a dataarray n with ...
6
votes
1answer
3k views

Select xarray/pandas index based on specific months

I have an xarray DataArray from which I want to select the months April, May, June (similar to time.season=='JJA') for an entire time series. It's structured like: <xarray.DataArray 't2m' (time: 492, ...
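One way this selection might look, assuming a datetime-typed `time` coordinate: build a boolean mask from the `.dt.month` accessor and pass it to `sel`. The monthly toy series below is hypothetical and only stands in for the `t2m` array from the question.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical monthly series for one year; the real data has 492 time steps.
time = pd.date_range("2000-01-01", periods=12, freq="MS")
da = xr.DataArray(np.arange(12.0), dims="time", coords={"time": time}, name="t2m")

# Boolean mask that is True for April, May and June.
is_amj = da["time"].dt.month.isin([4, 5, 6])

# Keep only the time steps where the mask is True.
amj = da.sel(time=is_amj)
```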
6
votes
1answer
2k views

Specify encoding/compression for many variables in xarray dataset when write to_netcdf

I have been writing out some xarray.Datasets that have multiple variables. Currently, in order to keep the size manageable I specify the encoding, e.g. zlib, but needs to be applied on a variable (...
6
votes
1answer
1k views

Grouping by multiple dimensions

Grouping by a single dimension works fine for xarray DataArrays: d = xr.DataArray([1, 2, 3], coords={'a': ['x', 'x', 'y']}, dims=['a']) d.groupby('a').mean() # -> DataArray (a: 2) array([1.5, 3. ...
6
votes
1answer
889 views

xarray automatically applying _FillValue to coordinates on netCDF output

I'm trying to create a cf compliant netcdf file. I can get it about 98% cf compliant with xarray but there is one issue that I am running into. When I do an ncdump on the file that I am creating, I ...
6
votes
1answer
4k views

replace values in xarray dataset with None

I want to replace values in a variable in an xarray dataset with None. I tried this approach but it did not work: da[da['var'] == -9999.]['var'] = None I get this error: *** TypeError: unhashable ...
6
votes
2answers
2k views

Xarray: slice coordinates with no dimensions

I am having difficulty with this topic, even though it seems like it should be rather simple. I want to slice an xarray dataset using a set of latitude and longitude coordinates. Here is what my ...
6
votes
1answer
2k views

xarray too slow for performance critical code

I planned to use xarray extensively in some numerically intensive scientific code that I am writing. So far, it makes the code very elegant, but I think I will have to abandon it as the performance ...
6
votes
3answers
2k views

How to join data from multiple netCDF files with xarray in Python?

I'm trying to open multiple netCDF files with xarray in Python. The files have data with same shape and I want to join them, creating a new dimension. I tried to use concat_dim argument for xarray....
6
votes
1answer
1k views

How to convert an xarray dataset to pandas dataframes inside a dask dataframe

I have a calculation that expects a pandas dataframe as input. I'd like to run this calculation on data stored in a netCDF file that expands to 51GB - currently I've been opening the file with xarray....
6
votes
2answers
934 views

Importing and decoding dataset in xarray to avoid conflicting _FillValue and missing_value

When using xarray open_dataset or open_mfdataset to load a NARR netcdf dataset (e.g. ftp://ftp.cdc.noaa.gov/Datasets/NARR/monolevel/air.2m.2010.nc), xarray returns an error regarding "conflicting ...
6
votes
1answer
209 views

Parallelized bootstrapping with replacement with xarray/dask

I want to perform N=1000 bootstrapping with replacement on gridded data. One computation takes about 0.5s. I have access to a supercomputer exclusive node with 48 cores. Because the resampling are ...
6
votes
1answer
872 views

How best to rechunk a NetCDF file collection to Zarr dataset

I'm trying to rechunk a NetCDF file collection and create a Zarr dataset on AWS S3. I have 168 original NetCDF4 classic files with arrays of dimension time: 1, y: 3840, x: 4608 chunked as chunks={'...
5
votes
1answer
2k views

Drop duplicate times in xarray

I'm reading NetCDF files with open_mfdataset, which contain duplicate times. For each duplicate time I only want to keep the first occurrence, and drop the second (it will never occur more often). The ...
5
votes
2answers
2k views

Add 'constant' dimension to xarray Dataset

I have a series of monthly gridded datasets in CSV form. I want to read them, add a few dimensions, and then write to netcdf. I've had great experience using xarray (xray) in the past so thought I'd ...
5
votes
3answers
9k views

Python Xarray add DataArray to Dataset

Very simple question but I can't find the answer online. I have a Dataset and I just want to add a named DataArray to it. Something like dataset.add({"new_array": new_data_array}). I know about merge ...
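A minimal sketch of the usual pattern: assign the DataArray to the Dataset with dictionary-style item assignment. The names `ds` and `new_data_array` and the `x` coordinate are hypothetical; the key becomes the variable name.

```python
import xarray as xr

# Hypothetical existing dataset with one variable on dimension "x".
ds = xr.Dataset({"a": ("x", [1, 2, 3])}, coords={"x": [0, 1, 2]})

# Hypothetical DataArray to add; it shares the "x" coordinate with the dataset.
new_data_array = xr.DataArray([10, 20, 30], dims="x", coords={"x": [0, 1, 2]})

# Dictionary-style assignment adds the array as a new data variable in place.
ds["new_array"] = new_data_array
```

`ds.assign(new_array=new_data_array)` does the same but returns a new Dataset instead of modifying `ds`.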
5
votes
2answers
5k views

python mask netcdf data using shapefile

I am using the following packages: import pandas as pd import numpy as np import xarray as xr import geopandas as gpd I have the following objects storing data: print(precip_da) Out[]: <...
5
votes
2answers
928 views

Python xarray.concat then xarray.to_netcdf generates huge new file size

So I have 3 netcdf4 files (each approx 90 MB), which I would like to concatenate using the package xarray. Each file has one variable (dis) represented at a 0.5 degree resolution (lat, lon) for 365 ...
5
votes
2answers
482 views

With xarray, how to parallelize 1D operations on a multidimensional Dataset?

I have a 4D xarray Dataset. I want to carry out a linear regression between two variables on a specific dimension (here time), and keep the regression parameters in a 3D array (the remaining ...
5
votes
2answers
4k views

Create and write xarray DataArray to NetCDF in chunks

Is it also possible to create an out-of-core DataArray, and write it chunk-by-chunk to a NetCDF4 file using xarray? For example, I want to be able to do this in an out-of-core fashion when the ...
5
votes
1answer
2k views

How to merge xArray datasets with conflicting coordinates

Let's say I have two data sets, each containing a different variable of interest and with incomplete (but not conflicting) indices: In [1]: import xarray as xr, numpy as np In [2]: ages = xr.Dataset( ...
5
votes
3answers
2k views

How to flatten an xarray dataset into a 1D numpy array?

Is there a simple way of flattening an xarray dataset into a single 1D numpy array? For example, flattening the following test dataset: xr.Dataset({ 'a' : xr.DataArray( data=[...
5
votes
2answers
1k views

Python Xarray, sort by index or dimension?

Is there a sort_index or sort_by_dimension method of some kind in xarray, much like pandas.DataFrame.sort_index(), where I can sort a xarray.DataArray object by one of its dimensions? In terms of ...
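xarray provides `sortby` for this. A short sketch on a hypothetical 1D array whose `x` coordinate is out of order:

```python
import xarray as xr

# Hypothetical array with an unsorted "x" coordinate.
da = xr.DataArray([30, 10, 20], dims="x", coords={"x": [3, 1, 2]})

# Sort the data by the values of the "x" coordinate (ascending by default).
sorted_da = da.sortby("x")
```

Passing `ascending=False` reverses the order.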
5
votes
1answer
132 views

Calculate the percentile rank of a value in a multi-dimensional array along an axis

I have a 3D array. >>> M2 = np.arange(24).reshape((4, 3, 2)) >>> print(M2) array([[[ 0, 1], [ 2, 3], [ 4, 5]], [[ 6, 7], [ 8, 9], ...
5
votes
1answer
487 views

Sparse DataArray Xarray search

Using DataArray objects in xarray, what is the best way to find all cells that have values != 0? For example, in pandas I would do df.loc[df.col1 > 0]. In my specific example I'm trying to look at 3 ...
5
votes
1answer
1k views

Do xarray or dask really support memory-mapping?

In my experimentation so far, I've tried: xr.open_dataset with chunks arg, and it loads the data into memory. Set up a NetCDF4DataStore, and call ds['field'].values and it loads the data into memory. ...
5
votes
2answers
381 views

python get month of maximum value xarray

How to get the month of maximum runoff I want to get the month of maximum runoff for each year, and for the time series as a whole. The idea is to characterise global seasonality by looking at the ...
5
votes
2answers
779 views

create netcdf using xarray with time stamp beyond year 2263

Is there a way to create a netCDF file with time dimension beyond year 2263 using xarray? Here is how a netCDF toy dataset can be created http://xarray.pydata.org/en/stable/time-series.html However ...
5
votes
0answers
89 views

How to create a gdal.Dataset or xarray.Dataset object from a django.contrib.gis.gdal.GDALRaster object?

I am working on a Django project in which I'm trying to get all the raster data from my Database. Here is my model in models.py from django.contrib.gis.db import models class RasterWithName(models....
5
votes
0answers
120 views

Parallel appending to a zarr store via xarray.to_zarr and Dask

I am in a situation where I want to load objects, transform them into an xarray.Dataset and write that into a zarr store on s3. However, to make the loading of objects faster, I do it in parallel ...
5
votes
0answers
163 views

xarray/dask - limiting the number of threads/cpus

I'm fairly new to xarray and I'm currently trying to leverage it to subset some NetCDFs. I'm running this on a shared server and would like to know how best to limit the processing power used by ...
