
Understanding Embeddings input and output sizes

Discussion in 'Computer Science' started by Juan Antonio Gomez, Oct 8, 2018.

  1. I have been trying for a while to understand the dimensionality of embeddings in neural networks, and I think things have finally clicked in my brain. However, I would love to check whether or not my understanding is correct.

    1. Embeddings are an effective way to transform words into vectors, or at least to reduce the dimensionality of the data (the Bag of Words approach essentially does not work well because the resulting vectors are sparse and high-dimensional)
    2. If I have a text corpus that contains, say, 5000 sentences, I could pad each sentence to a standard length, for example 150 tokens, and then use embeddings (possibly the pretrained GloVe ones) to map each token to a vector of dimensionality 100. That means I would end up with 5000 x 150 x 100 elements.
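
    To make those shapes concrete, here is a minimal NumPy sketch of that lookup. The vocabulary size of 10,000 and the random values are just assumptions for illustration; in practice the matrix rows would come from a pretrained file such as GloVe:

    ```python
    import numpy as np

    # Sizes from the question: 5000 sentences, padded to 150 tokens,
    # embedding dimension 100. The vocabulary size is an assumption.
    num_sentences, max_len, embed_dim = 5000, 150, 100
    vocab_size = 10_000

    rng = np.random.default_rng(0)

    # Padded corpus of integer token ids, shape (5000, 150).
    token_ids = rng.integers(0, vocab_size, size=(num_sentences, max_len))

    # An embedding is just a lookup table with one row per vocabulary
    # word; a pretrained GloVe matrix would be loaded here instead.
    embedding_matrix = rng.standard_normal((vocab_size, embed_dim),
                                           dtype=np.float32)

    # The lookup replaces every token id with its 100-d vector.
    embedded = embedding_matrix[token_ids]
    print(embedded.shape)  # (5000, 150, 100)
    ```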

    Is my understanding correct? If so, this means I can start training my network using mini-batches of, say, 16 x 150 x 100 elements; the layer after the embedding one could then be an LSTM, and so on...
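
    Slicing that tensor along the first axis does give the (batch, timesteps, features) shape an LSTM layer expects. A quick sketch of the batching step, again with stand-in data in place of the real embedded corpus:

    ```python
    import numpy as np

    # Stand-in for the embedded corpus: (sentences, timesteps, features).
    embedded = np.zeros((5000, 150, 100), dtype=np.float32)

    def minibatches(data, batch_size=16):
        """Yield successive mini-batches along the sentence axis."""
        for start in range(0, len(data), batch_size):
            yield data[start:start + batch_size]

    batches = list(minibatches(embedded))
    print(batches[0].shape)   # (16, 150, 100): what an LSTM layer consumes
    print(len(batches))       # 313: the last batch holds 5000 % 16 = 8 rows
    print(batches[-1].shape)  # (8, 150, 100)
    ```

    Note that 5000 is not divisible by 16, so the final batch is smaller; most frameworks handle ragged last batches automatically.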
