
Normalization vs Standardization for multivariate time-series

Discussion in 'Education' started by Marko, Oct 8, 2018.

  1. Marko (Guest)

    I'm using DTW (dynamic time warping) as a distance measure for comparing two multivariate time series. I want to be able to cluster data using DTW as the distance measure, since the time series may be shifted or skewed.

    Since the series have several variables (parameters), I should scale them so that all variables have the same influence when determining whether two time series are similar. I'm using Euclidean distance as the local distance for DTW.
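
    For context, here's a minimal sketch of the kind of DTW computation I mean, with Euclidean distance as the local cost (plain NumPy; the function name and array shapes are just illustrative):

    import numpy as np

    def dtw_distance(a, b):
        # a and b are multivariate series of shape (n, d) and (m, d)
        n, m = len(a), len(b)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                # Euclidean distance between the two d-dimensional samples
                local = np.linalg.norm(a[i - 1] - b[j - 1])
                cost[i, j] = local + min(cost[i - 1, j],      # insertion
                                         cost[i, j - 1],      # deletion
                                         cost[i - 1, j - 1])  # match
        return cost[n, m]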

    My question is: how do I determine whether I should use normalization (subtract the minimum and divide by the range, max − min) or standardization (subtract the mean and divide by the standard deviation)?
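
    To make the two options concrete, this is what I mean by each, assuming every series is a NumPy array of shape (length, n_variables) and scaling is done per variable (column); the function names are my own:

    import numpy as np

    def min_max_normalize(x):
        # Normalization: (x - min) / (max - min), per column -> values in [0, 1]
        return (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))

    def standardize(x):
        # Standardization (z-score): (x - mean) / std, per column -> mean 0, std 1
        return (x - x.mean(axis=0)) / x.std(axis=0)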

    Moreover, can anyone explain to me what the point of standardization is? I understand that it tells me how many standard deviations a value lies from its mean, but why would that improve my similarity measure when comparing two time series?

    I'm not a statistician, so any explanation would be great. I understand that normalization gives me values in the range [0, 1] so that all parameters fall in the same range, but what do I get from standardization?

    Finally, should I divide each time series by the standard deviation of the whole dataset, or only by the standard deviation of the time series I'm standardizing?
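
    In code, the two options I'm deciding between look roughly like this (series_list is just a stand-in for my real data):

    import numpy as np

    # Toy data standing in for my real series: two multivariate
    # time series of different lengths, three variables each
    series_list = [np.random.randn(50, 3), np.random.randn(80, 3)]

    # Option 1: per-series -- scale each series by its own mean and std
    z_per_series = [(s - s.mean(axis=0)) / s.std(axis=0) for s in series_list]

    # Option 2: global -- scale every series by the mean and std pooled
    # over all observations in the whole dataset
    pooled = np.vstack(series_list)
    mu, sigma = pooled.mean(axis=0), pooled.std(axis=0)
    z_global = [(s - mu) / sigma for s in series_list]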

    I must also emphasize that my data does not follow a normal distribution.

