10

Is it possible to change the order of columns in a dataframe in place?

If yes, would that be faster than making a copy? I am working with a large dataframe with 100 million+ rows.

I see how to change the order with a copy: How to change the order of DataFrame columns?
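For comparison, the copy-based reordering referred to above can be sketched as follows. This is a minimal illustration on a toy frame, not a benchmark; the names `new_order`, `reordered` and `same` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 1], "B": [2, 3], "C": [4, 5]})
new_order = ["C", "A", "B"]

# Selecting with a list of labels returns a NEW DataFrame object
# with the columns in the requested order; df itself is unchanged.
reordered = df[new_order]

# reindex(columns=...) is an equivalent spelling of the same idea.
same = df.reindex(columns=new_order)

print(list(reordered.columns))
print(reordered is df)
```

Both forms leave the original `df` untouched, which is exactly why they cost extra memory on a very large frame.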

2
  • how often do you need to do this? why does the order of columns even matter? how long does it take to do it by doing something like newdf = df[new_column_order]?
    – acushner
    Commented Sep 16, 2014 at 21:20
  • 1
    @acushner it is useful for presenting data, e.g. in a Jupyter notebook. Formatting may be important in that case.
    – Moustache
    Commented Jul 12, 2020 at 1:37

3 Answers

5

There is no easy way to do this without making a copy. In theory it is possible if you have ONLY a single dtype (or are only changing columns WITHOUT the labels changing dtypes). But it is fairly complicated, and hence is not implemented.

That said, if you are careful you can do this. You should ONLY do this with a single-dtyped frame (you are forewarned).

In [21]: import numpy as np; from pandas import DataFrame

In [22]: df = DataFrame(np.random.randn(5,3),columns=list('ABC'))

In [23]: df
Out[23]: 
          A         B         C
0 -0.696593 -0.459067  1.935033
1  1.783658  0.612771  1.553773
2 -0.572515  0.634174  0.113974
3 -0.908203  1.454289  0.509968
4  0.776575  1.629816  1.630023

If df is multi-dtyped then df.values WILL NOT BE A VIEW (of course you can subselect out the single-dtyped frame, which is a view itself). Another note: it is NOT ALWAYS POSSIBLE to have this come out as a view. It depends on what you are doing, YMMV.

e.g. df.values.take([2,0,1],axis=1) gives you the same result BUT IS A COPY.

In [24]: df2 = DataFrame(df.values[:,[2,0,1]],columns=list('ABC'))

In [25]: df2
Out[25]: 
          A         B         C
0  1.935033 -0.696593 -0.459067
1  1.553773  1.783658  0.612771
2  0.113974 -0.572515  0.634174
3  0.509968 -0.908203  1.454289
4  1.630023  0.776575  1.629816

We have a view on the original values

In [26]: df2.values.base
Out[26]: 
array([[ 1.93503267,  1.55377291,  0.1139739 ,  0.5099681 ,  1.63002264],
       [-0.69659276,  1.78365777, -0.5725148 , -0.90820288,  0.7765751 ],
       [-0.45906706,  0.61277136,  0.63417392,  1.45428912,  1.62981613]])

Note that if you then assign to df2 (another float column for instance), you will trigger a copy. So you have to be extremely careful with this.

That said, creating a frame from a view of another frame takes almost no memory (it is just a pointer), so it is very fast.
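Since whether a reordering comes out as a view or a copy is the crux here, it may help to check it directly. The sketch below uses plain NumPy arrays (not DataFrames) and `np.shares_memory`; at the NumPy level, indexing with a list of integers allocates a new array, while basic slicing returns a view. Whether a DataFrame built on top of such an array keeps the view also depends on the pandas version and its copy settings, which this sketch does not cover.

```python
import numpy as np

# A small single-dtype array standing in for the values of a frame.
arr = np.arange(15, dtype=float).reshape(5, 3)

# Indexing with a list of integers (fancy indexing) allocates a new array.
fancy = arr[:, [2, 0, 1]]

# Basic slicing (here: reversing the column order) returns a view.
sliced = arr[:, ::-1]

print(np.shares_memory(arr, fancy))   # False
print(np.shares_memory(arr, sliced))  # True
```

Running the same check on your own arrays before relying on a "view" is cheap and avoids surprises with 100 million+ rows.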

2

Here is a short and even more memory-efficient way (because no additional temporary variable needs to be saved):

import pandas as pd

df = pd.DataFrame({"A": [0, 1], "B": [2, 3], "C": [4, 5]})

new_order = ["B", "C", "A"]
for column in new_order:
    df[column] = df.pop(column)

This works because each column is removed and then re-assigned, so it lands at the end of the DataFrame; going through new_order in sequence leaves the columns in the new order. pop returns a column and deletes it from the DataFrame.
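One caveat with the pop-and-reassign loop: any column missing from `new_order` stays at the front, so the list should be a full permutation of the columns. A small wrapper can guard against that. The function name `reorder_columns_by_pop` is hypothetical, and the sketch assumes unique column labels.

```python
import pandas as pd


def reorder_columns_by_pop(df, new_order):
    """Reorder df's columns by popping and re-assigning each one.

    Hypothetical helper; assumes column labels are unique.
    The same DataFrame object is modified and returned.
    """
    # Require a full permutation: unlisted columns would otherwise
    # be left in front of the reordered ones.
    if len(new_order) != len(df.columns) or set(new_order) != set(df.columns):
        raise ValueError("new_order must contain every column exactly once")
    for column in new_order:
        # pop removes the column; assignment appends it at the end.
        df[column] = df.pop(column)
    return df


df = pd.DataFrame({"A": [0, 1], "B": [2.0, 3.0], "C": ["x", "y"]})
before = id(df)
reorder_columns_by_pop(df, ["B", "C", "A"])
print(list(df.columns))
```

Keeping the same object does not by itself guarantee that pandas avoids copying data internally.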

1

Hmm... no one proposed drop and insert:

import pandas as pd

df = pd.DataFrame([['a','b','c']],columns=list('ABC'))

print('Before', id(df))

# Move each column to its target position: save it, drop it, re-insert it.
for i,col in enumerate(['C','B', 'A']):
    tmp = df[col]
    df.drop(labels=[col],axis=1,inplace=True)
    df.insert(i,col,tmp)

print('After ', id(df))
df.head()

The result is still the original DataFrame object (its id is unchanged):

Before 140441780394360
After  140441780394360

   C    B   A
   ----------
0  c    b   a
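A slightly shorter variant of the same idea combines `DataFrame.pop` with `DataFrame.insert`, which avoids the temporary variable and the separate `drop` call. The helper name `move_columns_to_front` is made up for this sketch, and it assumes unique column labels. Columns not listed in `order` keep their relative order after the listed ones.

```python
import pandas as pd


def move_columns_to_front(df, order):
    """Place the columns in `order` first, in that order, on the same object.

    Hypothetical helper; assumes column labels are unique.
    """
    for position, col in enumerate(order):
        # pop removes the column and returns it as a Series;
        # insert puts it back at the requested position.
        df.insert(position, col, df.pop(col))
    return df


df = pd.DataFrame([["a", "b", "c", "d"]], columns=list("ABCD"))
before = id(df)
move_columns_to_front(df, ["C", "A"])
print(list(df.columns))
print(id(df) == before)
```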
1
  • The problem is that a pandas DataFrame has an internal BlockManager which will consolidate columns with the same dtype into a single contiguous memory block. The consolidation may happen when pandas "thinks" (read "hard to tell") it's better to do so. df[col] does not make a copy, but after insertion the three columns are no longer contiguous in the given order. Thus it's possible that pandas makes a copy when we do some calculation on the DataFrame.
    – cyfdecyf
    Commented Jun 23, 2021 at 8:20
