
Improving speed when creating dynamic columns in a DataFrame

Discussion in 'Programming/Internet' started by Yeison H. Arias, Oct 8, 2018.

  1. I am creating a DataFrame with the following information:

    import numpy as np
    import pandas as pd
    from time import time

    start_time = time()

    columns = 60

    Data = pd.DataFrame(np.random.randint(low=0, high=10, size=(700000, 3)), columns=['a', 'b', 'c'])
    Data['f'] = (Data.index % 60) + 1
    Data['column_-1'] = 100
    for i in range(columns):
        Data['column_' + str(i)] = np.where(  # Condition 1
            Data['f'] == 1,
            1000 + i,
            np.where(  # Condition 2
                i < Data['f'],
                0,
                np.where(  # Condition 3
                    Data['a'] > Data['b'],
                    Data['column_' + str(-1)] * Data['c'],
                    Data['column_' + str(-1)]
                )
            )
        )

    elapsed_time = time() - start_time
    print("Elapsed time: %.10f seconds." % elapsed_time)


    Elapsed time: 1.0710000992 seconds.

    I would like to know whether there is a better way to do this, generating the columns dynamically and improving the script's speed. Thanks.
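One possible direction, offered as a sketch rather than a definitive answer: Condition 3 does not depend on `i`, so it can be computed once per row, and all 60 columns can then be built in a single NumPy broadcasting step and attached with one `pd.concat` instead of 60 separate column assignments. The helper name `add_columns_vectorized` and its `base_value` parameter are hypothetical, introduced only for this illustration, and it assumes `column_-1` holds the constant 100 as in the question. Whether it is actually faster on a given machine should be checked by timing it against the loop.

```python
import numpy as np
import pandas as pd


def add_columns_vectorized(data, columns=60, base_value=100):
    """Hypothetical helper: build every column_<i> at once via broadcasting."""
    f = data['f'].to_numpy()[:, None]   # shape (n, 1)
    i = np.arange(columns)[None, :]     # shape (1, columns)

    # Condition 3 is independent of i, so evaluate it once per row.
    # Assumption: 'column_-1' is the constant base_value, as in the question.
    base = np.where(
        data['a'].to_numpy() > data['b'].to_numpy(),
        base_value * data['c'].to_numpy(),
        base_value,
    )[:, None]                          # shape (n, 1)

    # Conditions 1 and 2, broadcast to shape (n, columns).
    values = np.where(f == 1, 1000 + i, np.where(i < f, 0, base))

    new_cols = pd.DataFrame(
        values,
        index=data.index,
        columns=['column_' + str(k) for k in range(columns)],
    )
    # A single concat avoids inserting columns one at a time.
    return pd.concat([data, new_cols], axis=1)


# Small usage example (600 rows instead of 700000 to keep it quick).
rng = np.random.default_rng(0)
small = pd.DataFrame(rng.integers(0, 10, size=(600, 3)), columns=['a', 'b', 'c'])
small['f'] = (small.index % 60) + 1
small['column_-1'] = 100
result = add_columns_vectorized(small)
```

The intermediate array has `n * columns` elements, so memory use is similar to the final result of the original loop; for 700000 rows and 60 columns that is 42 million integers.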
