In this post, we will briefly study memory networks.
We will mainly be looking at two papers: Memory Networks and End-to-End Memory Networks.
Memory Networks paper summary
Memory Networks combine inference components with a long-term memory component. They are used in the context of Question Answering (QA), with the memory component acting as a (dynamic) knowledge base. There are four learnable components:
- I - input feature map: converts the raw input into an internal feature representation.
- G - generalization: updates old memories given the new input x (i.e., writes the input to memory).
- O - output: reads from memory and performs inference, computing which memories are most relevant for producing a good response.
- R - response: produces the final response given the output of O.

Efficient memory via hashing: hash the input into buckets and only score memories in the same bucket. Hashing via clustering word embeddings performs better than just hashing words (a bucketing sketch follows the component sketch below).
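To make the I/G/O/R division of labor concrete, here is a minimal sketch of the four components wired together. Everything here is hypothetical (the class, the `hops` default, the callables passed in); the paper deliberately leaves each component's implementation open, from hand-written rules to neural networks.

```python
# Minimal sketch of the I/G/O/R pipeline. All names are illustrative;
# this only shows how the pieces fit together, not any specific model.

class MemoryNetwork:
    def __init__(self, feature_fn, score_fn, response_fn):
        self.memory = []        # long-term memory slots
        self.I = feature_fn     # input feature map: raw text -> features
        self.score = score_fn   # match function used by O
        self.R = response_fn    # maps selected memories to an answer

    def G(self, x_features):
        # Generalization: the simplest useful choice is "write to the
        # next empty slot"; richer variants could rewrite old slots.
        self.memory.append(x_features)

    def O(self, q_features, hops=2):
        # Output: repeatedly pick the single best supporting memory
        # (hard argmax), conditioning each hop on what was found so far.
        support = [q_features]
        for _ in range(hops):
            best = max(self.memory, key=lambda m: self.score(support, m))
            support.append(best)
        return support

    def store(self, statement):
        self.G(self.I(statement))

    def answer(self, question):
        return self.R(self.O(self.I(question)))
```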
 
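And for the hashing trick above, a rough sketch of the embedding-cluster variant, assuming pre-trained word vectors and scikit-learn's KMeans; the helper names and the token-list memory format are my own.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_buckets(memories, word_vecs, n_buckets=100):
    """Cluster word embeddings into buckets, then index each memory
    (a list of tokens) under the bucket of every word it contains."""
    vocab = sorted(word_vecs)
    km = KMeans(n_clusters=n_buckets).fit(
        np.stack([word_vecs[w] for w in vocab]))
    word_bucket = dict(zip(vocab, km.labels_))
    buckets = {}
    for i, words in enumerate(memories):
        for w in words:
            if w in word_bucket:
                buckets.setdefault(word_bucket[w], set()).add(i)
    return word_bucket, buckets

def candidates(question_words, word_bucket, buckets):
    """Only memories sharing a bucket with the question get scored,
    instead of scoring the entire memory."""
    cand = set()
    for w in question_words:
        if w in word_bucket:
            cand |= buckets.get(word_bucket[w], set())
    return cand
```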
Thoughts:
- Memory Networks outperform both RNNs and LSTMs on these QA tasks.
- Training requires full supervision: the supporting facts for each question must be labeled, not just the final answer.
 
End-to-End Memory Networks (MemN2N) summary
- Reads from memory with soft attention (a minimal sketch of the lookup follows this list).
- Performs multiple lookups (hops) over the memory.
- Trained end-to-end (E2E) with backprop.
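Here is a minimal numpy sketch of that soft, multi-hop lookup, assuming bag-of-words sentence encodings and ignoring details the paper also covers (temporal encoding, the adjacent vs. layer-wise weight-tying schemes). The matrix names loosely follow the paper: A and C embed the story into input and output memories, B embeds the question, W maps to the answer.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def memn2n_forward(story_bows, q_bow, A, B, C, W, hops=3):
    """story_bows: (n_sentences, vocab) bag-of-words rows
    q_bow: (vocab,) question bag-of-words
    A, B, C: (vocab, d) embeddings; W: (d, n_answers) output map."""
    m = story_bows @ A       # input memory representations, (n, d)
    c = story_bows @ C       # output memory representations, (n, d)
    u = q_bow @ B            # controller state from the question, (d,)
    for _ in range(hops):
        p = softmax(m @ u)   # soft attention over all memories
        o = p @ c            # attention-weighted read, (d,)
        u = u + o            # feed the read back in for the next hop
    return softmax(u @ W)    # distribution over candidate answers
```

Because every step is differentiable, the answer loss alone is enough to train all the embedding matrices by backprop, which is exactly what makes the model end-to-end.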
Differences from Memory Networks:
- Memory Networks use hard attention, while MemN2N uses soft attention.
- Memory Networks require full supervision, while MemN2N requires supervision only on the final output.
- Memory Networks select memories with an argmax, MemN2N with a softmax.
- Because the argmax is not differentiable, end-to-end backprop is a problem in Memory Networks (see the toy example below).
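A toy numpy example of why the last two points matter: the softmax read is a differentiable mixture over all memories, while the argmax read gives nothing useful for gradients to flow through.

```python
import numpy as np

scores = np.array([2.0, 0.5, -1.0])      # match scores for 3 memories

# MemN2N: soft read -- a differentiable convex combination.
p = np.exp(scores - scores.max())
p /= p.sum()                              # roughly [0.79, 0.18, 0.04]
# d p_i / d score_j = p_i * ((i == j) - p_j): every score receives a
# gradient, so the answer loss alone can train the whole lookup.

# Memory Networks: hard read -- commit to exactly one memory.
best = int(np.argmax(scores))             # index 0
# argmax is piecewise constant, so its gradient w.r.t. the scores is
# zero almost everywhere; the selection step has to be trained with
# labeled supporting facts instead of end-to-end backprop.
```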
Both of these networks are evaluated on the bAbI tasks, a suite of 20 synthetic QA tasks in which each task requires a different type of reasoning.