Outline:

Intro to RNN

  • In feedforward networks, there is no sense of order in the inputs.
  • Idea: build order into the network (include information about the order of the inputs).
    • split the data into parts (text -> words)
    • route the hidden layer output from the previous step back into the hidden layer
  • This architecture is called a Recurrent Neural Network (RNN).
    • the total input of the hidden layer is the sum of the combinations from the input layer and the previous hidden layer. steep-rnn.png
  • Example
    • word -> characters. (steep -> ‘s’, ‘t’, ‘e’, ‘e’, ‘p’) steep-example.png steep-example-num.png
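The recurrence above can be sketched in a few lines. This is a minimal toy example, not the full architecture: the inputs are made-up numeric stand-ins for the characters of "steep", and the scalar weights are arbitrary.

```python
import math

def rnn_step(x, h_prev, w_x, w_h):
    """One recurrent step: the hidden layer's total input is the
    combination from the current input plus the combination from the
    previous hidden state, squashed by tanh."""
    return math.tanh(w_x * x + w_h * h_prev)

# Feed the sequence one part at a time, routing the previous hidden
# state back in. Values are hypothetical codes for 's','t','e','e','p'.
h = 0.0
for x in [0.1, 0.2, 0.3, 0.3, 0.4]:
    h = rnn_step(x, h, w_x=0.5, w_h=0.9)
```

After the loop, `h` carries information about the whole sequence seen so far, which is exactly what the feedforward network lacked.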

LSTM

hidden-multiply.png

  • In an RNN, repeated hidden layer multiplication leads to a problem: the gradient becomes
    • really small and vanishes
    • really large and explodes vanishing-exploding.png
  • We can think of an RNN as
    • a bunch of cells with inputs and outputs
    • inside the cells there are network layers rnn-cell.png
  • To solve the problem of vanishing gradients
    • use more complicated cells, called LSTM (Long Short-Term Memory) cells
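A toy illustration of why the repeated multiplication is a problem (the weight values and step count are assumptions, chosen only to show the effect): backpropagation through time multiplies by the recurrent weight once per time step, so the product either shrinks toward zero or blows up.

```python
def repeated_product(w, steps):
    """Multiply by the same recurrent factor once per time step,
    as happens to the gradient during backpropagation through time."""
    grad = 1.0
    for _ in range(steps):
        grad *= w
    return grad

vanished = repeated_product(0.5, 50)  # factor < 1: gradient vanishes
exploded = repeated_product(1.5, 50)  # factor > 1: gradient explodes
```

With a factor of 0.5 the gradient is already below 1e-10 after 50 steps, while 1.5 grows past a million; LSTM cells are designed so the cell state avoids this repeated squashing.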

LSTM cell

lstm-cell.png

  • 4 network layers shown as yellow boxes
    • each of them with their own weights
    • σ is the sigmoid function
    • tanh is the hyperbolic tangent function
      • similar to the sigmoid in that it squashes inputs
      • output is between -1 and 1
  • Red circles are point-wise (element-wise) operations
  • Cell state, labeled C
    • goes through the LSTM cell with little interaction
      • allowing information to flow easily through the cells
    • modified only by element-wise operations, which function as gates
    • the hidden state is calculated from the cell state
  • Forget Gate
    • the network can learn to forget information that causes incorrect predictions (output: 0)
    • and to keep long-range information that is helpful (output: 1) forget-gate.png
  • Update State
    • this gate updates the cell state from the input and the previous hidden state update-state.png
  • Cell State to Hidden Output
    • the cell state is used to produce the hidden state, which is passed to the next hidden cell
    • sigmoid gates let the network learn which information to keep and which to get rid of. hidden-state.png
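The gates above can be sketched for a single scalar unit (a rough sketch, not the real cell: actual LSTMs use weight matrices and bias vectors per gate; all weight values here are hypothetical):

```python
import math

def sigmoid(z):
    """Squashes any input into (0, 1) — used by the gates."""
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    """One scalar LSTM step; w holds a made-up weight per gate."""
    # Forget gate: output 0 drops the old cell state, 1 keeps it.
    f = sigmoid(w["f"] * x + w["uf"] * h_prev)
    # Update gate and candidate values, from input + previous hidden state.
    i = sigmoid(w["i"] * x + w["ui"] * h_prev)
    c_tilde = math.tanh(w["c"] * x + w["uc"] * h_prev)
    # New cell state: only element-wise gate operations touch it.
    c = f * c_prev + i * c_tilde
    # Output gate: decides what part of the cell state becomes
    # the hidden state passed to the next cell.
    o = sigmoid(w["o"] * x + w["uo"] * h_prev)
    h = o * math.tanh(c)
    return h, c

w = {k: 0.5 for k in ["f", "uf", "i", "ui", "c", "uc", "o", "uo"]}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.0, w=w)
```

Note how `c` is changed only by a multiply (forget) and an add (update), so gradients can flow through the cell state without the repeated tanh squashing that makes plain RNN gradients vanish.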

Character-wise RNN

Resources

Here are a few great resources for learning more about recurrent neural networks. We'll also continue to cover RNNs over the coming weeks.