Channel: Robin on Linux

Constructing a DataFrame more efficiently


The old Python code looked like this:

import pandas as pd

temp = pd.DataFrame()

for record in table:
    df = pd.DataFrame(record)
    temp = pd.concat([temp, df])

# The final result
result = temp

The snippet above takes about 7 seconds to run on my laptop.
The culprit is pd.concat(): each call copies all the data accumulated so far into a new DataFrame, so calling it inside a loop gets expensive quickly. So let's replace it with a plain Python dictionary:

import pandas as pd

temp = {}

for record in table:
    temp[record[column_name]] = record[column_value]
    ...

# The final result
result = pd.DataFrame.from_dict(temp)

This snippet takes only 0.03 seconds, which is far more efficient.
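
For reference, here is a minimal, self-contained sketch of the same idea: collect everything in plain Python containers first, then build the DataFrame once at the end. The contents of `table` below are made up for illustration (the original post does not show them), and the row-wise variant using a list of dicts is an additional assumption about how the records might be shaped, not something taken from the code above.

```python
import pandas as pd

# Hypothetical input: each record is a (column_name, column_values) pair.
table = [("a", [1, 2, 3]), ("b", [4, 5, 6])]

# Column-wise: fill a plain dict, then construct the DataFrame once.
columns = {}
for name, values in table:
    columns[name] = values
result = pd.DataFrame.from_dict(columns)

# Row-wise variant (assumed record shape): collect one dict per row
# in a list, then construct the DataFrame once.
rows = []
for i in range(3):
    rows.append({"a": i, "b": i * 10})
result_rows = pd.DataFrame(rows)
```

Either way, the loop only touches cheap Python objects, and pandas allocates the final table a single time.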

