1 Introduction
Scientific notation isn’t helpful when you are trying to make quick comparisons across your dataset. However, pandas displays floats in scientific notation by default when the values are very small or very large. In this post I want to show how to get around this problem.
2 Scientific notation
Scientific notation (numbers with e) is a compact way of writing very large or very small numbers. The part after the e is the power of ten by which the leading number is multiplied, so 7e-04 stands for 7 × 10^-4, i.e. 0.0007. Unfortunately, many people find this form hard to read. Here are two examples of how to convert the scientifically written numbers into more readable ones.
Now we know how to convert these numbers. But doing this by hand or with a calculator every time is tedious. Fortunately, there are a few ways to do it automatically.
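As a minimal sketch of such a conversion, the snippet below rewrites two numbers given in scientific notation as plain decimals. The helper name to_plain and the two sample values are made up for illustration; the conversion uses the standard library's decimal module.

```python
from decimal import Decimal


def to_plain(text):
    """Rewrite a number written in scientific notation as a plain decimal string."""
    # Formatting a Decimal with "f" yields fixed-point notation without an exponent.
    return format(Decimal(text), "f")


small = to_plain("7e-04")    # 7 * 10**-4: move the decimal point 4 places to the left
large = to_plain("1.5e+06")  # 1.5 * 10**6: move the decimal point 6 places to the right

print(small)
print(large)
```

A negative exponent shifts the decimal point to the left, a positive one shifts it to the right.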
3 Import the libraries
import pandas as pd
import numpy as np
Here are a few examples of how differently floats are displayed, depending on their magnitude.
n_1 = 0.0007
n_1
n_2 = 0.0000035
n_2
n_3 = 15622098465455462.02
n_3
n_ensemble = (n_1, n_2, n_3)
n_ensemble
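To see which of these values Python itself shows with an exponent, one can inspect their repr. This is only a small sketch; the names shown and uses_exponent are assumptions introduced for illustration.

```python
n_1 = 0.0007
n_2 = 0.0000035
n_3 = 15622098465455462.02

# repr() is what an interactive session or a notebook cell displays for a float.
shown = [repr(n) for n in (n_1, n_2, n_3)]
print(shown)

# Flag the values that are displayed with an exponent.
uses_exponent = ["e" in text for text in shown]
print(uses_exponent)
```

Only the first value is small enough in exponent to be shown in plain form; the other two are switched to scientific notation.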
4 Display Values as Strings
'{:.7f}'.format(n_2)
['{:.7f}'.format(x) for x in n_ensemble]
Hint: the number before the f sets the number of decimal places (default = 6).
['{:f}'.format(x) for x in n_ensemble]
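The same format specification also works inside f-strings, which may read more naturally in newer code. The sketch below is one possible variant, not the only way to do it; the variable names fixed_small and fixed_large are made up for illustration.

```python
n_2 = 0.0000035
n_3 = 15622098465455462.02

# Seven decimal places, no exponent.
fixed_small = f"{n_2:.7f}"

# Two decimal places, with a comma as thousands separator.
fixed_large = f"{n_3:,.2f}"

print(fixed_small)
print(fixed_large)
```

Note that the result is a string in both cases, so it is meant for display and not for further calculations.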
5 Functions
For the following examples we create two artificial datasets:
df = pd.DataFrame(np.random.random(5)**10, columns=['random_numbers'])
df
df1 = pd.DataFrame(np.random.random(5)**10, columns=['random_numbers1'])
df2 = pd.DataFrame(np.random.random(5)**10, columns=['random_numbers2'])
df_multiple = pd.concat([df1, df2], axis=1)
df_multiple
5.2 Use apply()
We can also apply a lambda function that formats each value as a string:
df['random_numbers'].apply(lambda x: '%.5f' % x)
df_apply1 = df_multiple['random_numbers1'].apply(lambda x: '%.5f' % x)
df_apply2 = df_multiple['random_numbers2'].apply(lambda x: '%.5f' % x)
df_multiple_apply = pd.concat([df_apply1, df_apply2], axis=1)
df_multiple_apply
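With several columns, formatting each one separately and concatenating the results gets repetitive. A possible shortcut, sketched here under the assumption that every column is numeric, is to map the format function over each column in a single apply() call. The seeded generator and the name df_formatted are only for illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption, not part of the post).
rng = np.random.default_rng(0)
df_multiple = pd.DataFrame({
    "random_numbers1": rng.random(5) ** 10,
    "random_numbers2": rng.random(5) ** 10,
})

# apply() passes each column as a Series; map() formats every value in it.
df_formatted = df_multiple.apply(lambda col: col.map("{:.5f}".format))
print(df_formatted)
```

The original df_multiple keeps its float values; df_formatted contains strings and is therefore only suitable for display.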
5.3 Use set_option()
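Finally, I would like to introduce the set_option() function. Note that set_option() changes the display globally for the whole session (for example, for every subsequent cell in a Jupyter Notebook), so it is not a temporary fix. It only affects how floats are displayed; the underlying values stay unchanged.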
pd.set_option('display.float_format', lambda x: '%.5f' % x)
df
df_multiple
To revert pandas to its default behaviour, use reset_option().
pd.reset_option('display.float_format')
df
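If the format should only apply to a single output, pandas also offers option_context(), which sets an option inside a with block and restores the previous value afterwards. The following is a minimal sketch; the DataFrame df_demo and the variable names are assumptions made for illustration.

```python
import pandas as pd

df_demo = pd.DataFrame({"value": [0.0000035, 0.0007]})

# The float format is only active inside the with block.
with pd.option_context("display.float_format", "{:.7f}".format):
    inside = df_demo.to_string()

# After the block the option is back to its previous value.
outside_setting = pd.get_option("display.float_format")

print(inside)
print(outside_setting)
```

This avoids having to remember a matching reset_option() call.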