EDUNEX ITB
Transcript of EDUNEX ITB
![Page 1: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/1.jpg)
![Page 2: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/2.jpg)
IF4074 Weeks 9-15
• Week 9 (18 October 2021): LSTM + RNN architectures + Major Assignment (Tubes) 2
• Week 10 (25 October 2021): RNN exercises + BPTT
• Week 11 (1 November 2021): Guest lecture (sharing on ML applications at Gojek)
• Week 12 (8 November 2021): RNN lab session
• Week 13 (15 November 2021): Feature Engineering 1 / Assignment: experiment design
• Week 14 (22 November 2021): Quiz 2
• Week 15 (29 November 2021): Feature Engineering lab session 2
![Page 3: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/3.jpg)
04 LSTM: What & Why
Pembelajaran Mesin Lanjut (Advanced Machine Learning)
Masayu Leylia Khodra ([email protected])
KK IF – Teknik Informatika – STEI ITB
Module 4: Recurrent Neural Network
![Page 4: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/4.jpg)
Long Short-Term Memory (LSTM): Why
https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
$h_t = f(U x_t + W h_{t-1} + b_{xh})$
$y_t = f(V h_t + b_{hy})$
RNN: the long-term dependency problem
[Diagram: RNN cell with weights U, W, V and states $x_t$, $h_{t-1}$, $h_t$]
• Suffers from short-term memory (forward propagation).
• Suffers from the vanishing gradient problem (backward propagation). RNNs fail to learn dependencies longer than 5-10 time steps; in the worst case, this may completely stop the neural network from further training.
![Page 5: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/5.jpg)
Long Short-Term Memory (LSTM): What
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
LSTMs are explicitly designed to avoid the long-term dependency problem.
Introduced by Hochreiter & Schmidhuber (1997)
LSTM is a special kind of RNN; the difference lies in the operations within the LSTM's cells. In a standard RNN, the repeating module has a very simple structure; in an LSTM, the repeating module contains four interacting layers.
![Page 6: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/6.jpg)
LSTM: Cell State & Gates
Cell state
• acts as the "memory" of the network.
• acts as a transport highway that transfers relevant information throughout the processing of the sequence.
Forget gate
• decides what information should be thrown away or kept.
• values closer to 0 mean forget; values closer to 1 mean keep.
Input gate
• decides what information from the current step is relevant to add.
• updates the cell state using the hidden state and the current input.
Output gate
• decides what the next hidden state should be.
• the hidden state contains information on previous inputs and is also used for predictions.
https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
![Page 7: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/7.jpg)
Forget Gate
https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
$f_t = \sigma(U_f x_t + W_f h_{t-1} + b_f)$
A value of 1 means "completely keep this," while a 0 means "completely get rid of this."
[Diagram: forget gate $f_t$ computed from $x_t$ and $h_{t-1}$, applied to cell state $c_{t-1}$]
![Page 8: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/8.jpg)
Input Gate
https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
$i_t = \sigma(U_i x_t + W_i h_{t-1} + b_i)$
$\tilde{C}_t = \tanh(U_c x_t + W_c h_{t-1} + b_c)$
[Diagram: input gate $i_t$ and candidate $\tilde{c}_t$ computed from $x_t$ and $h_{t-1}$]
![Page 9: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/9.jpg)
Cell State
https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$
[Diagram: cell state update combining $f_t$, $i_t$, $\tilde{c}_t$ with $c_{t-1}$]
![Page 10: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/10.jpg)
Output Gate
https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
$o_t = \sigma(U_o x_t + W_o h_{t-1} + b_o)$
$h_t = o_t \odot \tanh(C_t)$
[Diagram: output gate $o_t$ producing $h_t$ from the updated cell state]
![Page 11: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/11.jpg)
LSTM Forward Propagation: Example
https://medium.com/@aidangomez/let-s-do-this-f9b699de31d9
[Diagram: one LSTM unit with 2-dimensional input $\vec{x}$ and hidden size 1; input weights $U_f, U_i, U_c, U_o$ and recurrent weights $W_f, W_i, W_c, W_o$]

| A1  | A2 | Target |
|-----|----|--------|
| 1   | 2  | 0.5    |
| 0.5 | 3  | 1      |
| …   | …  | …      |

$f_t = \sigma(U_f x_t + W_f h_{t-1} + b_f)$
$i_t = \sigma(U_i x_t + W_i h_{t-1} + b_i)$
$\tilde{C}_t = \tanh(U_c x_t + W_c h_{t-1} + b_c)$
$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$
$o_t = \sigma(U_o x_t + W_o h_{t-1} + b_o)$
$h_t = o_t \odot \tanh(C_t)$

Weights: $U_f = [0.700, 0.450]$, $U_i = [0.950, 0.800]$, $U_c = [0.450, 0.250]$, $U_o = [0.600, 0.400]$; $W_f = 0.100$, $b_f = 0.150$; $W_i = 0.800$, $b_i = 0.650$; $W_c = 0.150$, $b_c = 0.200$; $W_o = 0.250$, $b_o = 0.100$. Initial state: $h_0 = 0$, $C_0 = 0$.
![Page 12: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/12.jpg)
Computing ht and ct: Timestep t1
t1: $x_1 = [1, 2]$, target 0.5; $h_0 = 0$, $C_0 = 0$

| gate      | $U \cdot x_t$ | $W h_{t-1} + b$ | net   | value |
|-----------|---------------|-----------------|-------|-------|
| forget    | 1.600         | 0.150           | 1.750 | $f_1 = 0.852$ |
| input     | 2.550         | 0.650           | 3.200 | $i_1 = 0.961$ |
| candidate | 0.950         | 0.200           | 1.150 | $\tilde{c}_1 = 0.818$ |
| output    | 1.400         | 0.100           | 1.500 | $o_1 = 0.818$ |

Result: $C_1 = 0.786$, $h_1 = 0.536$
https://medium.com/@aidangomez/let-s-do-this-f9b699de31d9
![Page 13: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/13.jpg)
Computing ht and ct: Timestep t2
t2: $x_2 = [0.5, 3]$, target 1; $h_1 = 0.536$, $C_1 = 0.786$

| gate      | $U \cdot x_t$ | $W h_{t-1} + b$ | net   | value |
|-----------|---------------|-----------------|-------|-------|
| forget    | 1.700         | 0.204           | 1.904 | $f_2 = 0.870$ |
| input     | 2.875         | 1.079           | 3.954 | $i_2 = 0.981$ |
| candidate | 0.975         | 0.280           | 1.255 | $\tilde{c}_2 = 0.850$ |
| output    | 1.500         | 0.234           | 1.734 | $o_2 = 0.850$ |

Result: $C_2 = 1.518$, $h_2 = 0.772$
https://medium.com/@aidangomez/let-s-do-this-f9b699de31d9
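The two timesteps above can be reproduced with a short pure-Python sketch of the six forward equations (function and variable names are mine, not from the slides):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, U, W, b):
    """One forward step of a 1-unit LSTM with 2-dimensional input x.
    U maps gate name -> input weights; W and b map gate name -> scalars."""
    def net(g):  # U_g . x_t + W_g * h_{t-1} + b_g
        return U[g][0] * x[0] + U[g][1] * x[1] + W[g] * h_prev + b[g]
    f = sigmoid(net('f'))           # forget gate
    i = sigmoid(net('i'))           # input gate
    c_tilde = math.tanh(net('c'))   # candidate cell state
    o = sigmoid(net('o'))           # output gate
    c = f * c_prev + i * c_tilde    # new cell state
    h = o * math.tanh(c)            # new hidden state
    return h, c

# weights from the worked example
U = {'f': (0.70, 0.45), 'i': (0.95, 0.80), 'c': (0.45, 0.25), 'o': (0.60, 0.40)}
W = {'f': 0.10, 'i': 0.80, 'c': 0.15, 'o': 0.25}
b = {'f': 0.15, 'i': 0.65, 'c': 0.20, 'o': 0.10}

h1, c1 = lstm_step((1.0, 2.0), 0.0, 0.0, U, W, b)  # timestep t1
h2, c2 = lstm_step((0.5, 3.0), h1, c1, U, W, b)    # timestep t2
print(round(c1, 3), round(h1, 3))  # 0.786 0.536
print(round(c2, 3), round(h2, 3))  # 1.518 0.772
```

Running it reproduces the slide values $C_1 = 0.786$, $h_1 = 0.536$, $C_2 = 1.518$, $h_2 = 0.772$.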
![Page 14: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/14.jpg)
Implementing LSTM on Keras: Many to One
from keras import Sequential
from keras.layers import LSTM, Dense
model = Sequential()
model.add(LSTM(10, input_shape=(50, 1)))  # 10 units; each input is a 50x1 sequence
model.add(Dense(1, activation='linear'))  # linear output for a regression problem
https://towardsdatascience.com/a-comprehensive-guide-to-working-with-recurrent-neural-networks-in-keras-f3b2d5e2fa7f
[Diagram: $\vec{x}$ (1) → $h$ (10) → $y$ (1) with weights U, W, V]
# predict Amazon stock closing prices, LSTM over 50 timesteps
![Page 15: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/15.jpg)
Number of Parameters
[Diagram: $\vec{x}$ (1) → $h$ (10) → $y$ (1) with weights U, W, V]
Total parameters = (1+10+1)*4*10 + (10+1)*1 = 491
An equivalent Simple RNN network has 131 parameters.
U: matrix of hidden neurons x (input dimension + 1)
W: matrix of hidden neurons x hidden neurons
V: matrix of output neurons x (hidden neurons + 1)
Total parameters for an LSTM layer with n units, m-dimensional input, and k-dimensional output = (m+n+1)*4*n + (n+1)*k
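The formula can be sanity-checked in plain Python (the helper names below are mine, not a Keras API):

```python
def lstm_params(m, n):
    # 4 gates, each with an n x m input matrix, an n x n recurrent matrix, and n biases
    return (m + n + 1) * 4 * n

def simple_rnn_params(m, n):
    # one input matrix, one recurrent matrix, one bias vector
    return (m + n + 1) * n

def dense_params(n_in, n_out):
    return (n_in + 1) * n_out

# LSTM(10) on 1-dimensional input, followed by Dense(1):
print(lstm_params(1, 10) + dense_params(10, 1))        # 491
# SimpleRNN(10) on 1-dimensional input, followed by Dense(1):
print(simple_rnn_params(1, 10) + dense_params(10, 1))  # 131
```

These match the 491 and 131 totals quoted on the slide.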
![Page 16: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/16.jpg)
RNN → LSTM → GRU → ReGU
• 1985: Recurrent nets
• 1997: LSTM, Bi-RNN
• 2014: GRU
• 2017: Residual LSTM
• 2019: Residual Gated Unit (ReGU)
GRU: no cell state, 2 gates
ReGU: shortcut connection
![Page 17: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/17.jpg)
Summary
• LSTMs avoid the long-term dependency problem
• LSTMs have a cell state and 3 gates (forget, input, output)
• Computing $h_t$ and $c_t$
• Backpropagation Through Time
![Page 18: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/18.jpg)
03 RNN Architecture
Pembelajaran Mesin Lanjut (Advanced Machine Learning)
Masayu Leylia Khodra ([email protected])
KK IF – Teknik Informatika – STEI ITB
Module 4: Recurrent Neural Network
![Page 19: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/19.jpg)
General Architecture
[Diagram: stacked RNN with i-dimensional input $\vec{x}$, hidden layers $h_1$ (j) through $h_h$ (k), and m-dimensional output $\vec{y}$; input weights $U_{xh_1}, U_{h_1 h_2}, \dots$, recurrent weights $W_{h_1}, \dots, W_{h_h}$, and output weights V. Unrolled over n timesteps $\vec{x}_1, \vec{x}_2, \dots, \vec{x}_n$.]
n timesteps; return_sequences = True/False
![Page 20: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/20.jpg)
Architecture
• fixed-size input vector $x_t$
• fixed-size output vector $o_t$
• RNN state $s_t$
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
• One to many: image captioning
• Many to one: text classification
• Many to many: machine translation, video frame classification, POS tagging
![Page 21: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/21.jpg)
One to Many: Image Captioning
CNN Encoder (Inception) - RNN Decoder (LSTM) (Vinyals et al., 2014)
![Page 22: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/22.jpg)
Many to One: Text Classification
https://www.oreilly.com/learning/perform-sentiment-analysis-with-lstms-using-tensorflow
![Page 23: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/23.jpg)
Many to Many: Sequence Tagging
https://www.depends-on-the-definition.com/guide-sequence-tagging-neural-networks-python/
Input is a sequence of words, and output is the sequence of POS tags, one for each word.
![Page 24: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/24.jpg)
Many to Many: Machine Translation
http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/
● Machine Translation: input is a sequence of words in source language (e.g. German). Output is a sequence of words in target language (e.g. English).
● A key difference is that our output only starts after we have seen the complete input, because the first word of our translated sentences may require information captured from the complete input sequence.
![Page 25: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/25.jpg)
Implementing RNN on Keras: Many to One
from keras import Sequential
from keras.layers import SimpleRNN, Dense
model = Sequential()
model.add(SimpleRNN(10, input_shape=(50, 1)))  # simple recurrent layer: 10 neurons, processes 50x1 sequences
model.add(Dense(1, activation='linear'))  # linear output because this is a regression problem
https://towardsdatascience.com/a-comprehensive-guide-to-working-with-recurrent-neural-networks-in-keras-f3b2d5e2fa7f
[Diagram: $\vec{x}$ (1) → $h$ (10) → $y$ (1) with weights U, W, V]
# predict Amazon stock closing prices, RNN over 50 timesteps
![Page 26: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/26.jpg)
Number of Parameters
[Diagram: $\vec{x}$ (1) → $h$ (10) → $y$ (1) with weights U, W, V]
Total parameters = (1+10+1)*10 + (10+1)*1 = 131
Simple RNN:
U: matrix of hidden neurons x (input dimension + 1)
W: matrix of hidden neurons x hidden neurons
V: matrix of output neurons x (hidden neurons + 1)
![Page 27: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/27.jpg)
Number of Parameters: Example 2
model = Sequential()  # initialize model
model.add(SimpleRNN(64, input_shape=(50, 1), return_sequences=True))  # 64 neurons
model.add(SimpleRNN(32, return_sequences=True))  # 32 neurons
model.add(SimpleRNN(16))  # 16 neurons
model.add(Dense(8, activation='tanh'))
model.add(Dense(1, activation='linear'))
Total parameters = 8257
= (1+64+1)*64 = 4224
+ (64+32+1)*32 = 3104
+ (32+16+1)*16 = 784
+ (16+1)*8 = 136
+ (8+1)*1 = 9
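The per-layer counts above can be reproduced with the same counting rule (helper names are mine):

```python
def simple_rnn_params(m, n):
    # input weights (n x m) + recurrent weights (n x n) + biases (n)
    return (m + n + 1) * n

def dense_params(n_in, n_out):
    return (n_in + 1) * n_out

layers = [
    simple_rnn_params(1, 64),   # SimpleRNN(64) on 1-D input -> 4224
    simple_rnn_params(64, 32),  # SimpleRNN(32) -> 3104
    simple_rnn_params(32, 16),  # SimpleRNN(16) -> 784
    dense_params(16, 8),        # Dense(8) -> 136
    dense_params(8, 1),         # Dense(1) -> 9
]
print(layers, sum(layers))  # total 8257
```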
![Page 28: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/28.jpg)
Bidirectional RNNs
• In many applications we want to output a prediction of y (t) which may depend on the whole input sequence. E.g. co-articulation in speech recognition, right neighbors in POS tagging, etc.
• Bidirectional RNNs combine an RNN that moves forward through time beginning from the start of the sequence with another RNN that moves backward through time beginning from the end of the sequence.
https://www.cs.toronto.edu/~tingwuwang/rnn_tutorial.pdf
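The combination can be illustrated with a minimal pure-Python sketch (toy scalar weights and names of my own, not a real Keras layer): one recurrence runs left-to-right, another right-to-left, and each position concatenates the two hidden states.

```python
import math

def rnn_pass(xs, w_in, w_rec):
    """Simple tanh RNN with scalar input and state; returns the hidden state per timestep."""
    h, hs = 0.0, []
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
        hs.append(h)
    return hs

def bidirectional(xs, w_in=0.5, w_rec=0.3):
    fwd = rnn_pass(xs, w_in, w_rec)  # left-to-right pass
    # right-to-left pass, then flip back so index i aligns with position i
    bwd = list(reversed(rnn_pass(list(reversed(xs)), w_in, w_rec)))
    # each position now carries both past (fwd) and future (bwd) context
    return list(zip(fwd, bwd))

states = bidirectional([1.0, -1.0, 2.0])
print(len(states), len(states[0]))  # 3 positions, 2 states per position
```

This is why bidirectional models suit tasks like POS tagging, where the label at position i may depend on words to the right of i.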
![Page 29: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/29.jpg)
Bidirectional RNNs for Information Extraction
https://www.depends-on-the-definition.com/sequence-tagging-lstm-crf/
![Page 30: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/30.jpg)
Summary
• Architecture: 1-to-n, n-to-1, n-to-n
• Number of parameters
• RNN
• Bidirectional RNN
• LSTM
![Page 31: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/31.jpg)
05 Backpropagation Through Time
Pembelajaran Mesin Lanjut (Advanced Machine Learning)
Masayu Leylia Khodra ([email protected])
KK IF – Teknik Informatika – STEI ITB
Module 4: Recurrent Neural Network
![Page 32: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/32.jpg)
Backpropagation Through Time (BPTT)
Forward pass: get the current output for the sequence.
Backward pass: compute $\delta gates_t$, $\delta x_t$, $\Delta out_{t-1}$, $\delta U$, $\delta W$, $\delta b$.
Update weights: $w_{new} = w_{old} - \eta \cdot \delta w_{old}$
The BPTT learning algorithm is an extension of standard backpropagation that performs gradient descent on an unfolded network.
![Page 33: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/33.jpg)
Example
[Diagram: the LSTM (2-dimensional input, 1 hidden unit) unfolded over two timesteps: $\vec{x}_1 = [1, 2]$ gives $h_1 = 0.536$; $\vec{x}_2 = [0.5, 3]$ gives $h_2 = 0.772$; targets are 0.5 and 1.25.]
U (columns f, i, c, o):
0.70 0.95 0.45 0.60
0.45 0.80 0.25 0.40
W: 0.100 0.800 0.150 0.250
![Page 34: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/34.jpg)
LSTM: Backward Propagation Timestep t
Forward:
$f_t = \sigma(U_f x_t + W_f h_{t-1} + b_f)$
$i_t = \sigma(U_i x_t + W_i h_{t-1} + b_i)$
$\tilde{C}_t = \tanh(U_c x_t + W_c h_{t-1} + b_c)$
$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$
$o_t = \sigma(U_o x_t + W_o h_{t-1} + b_o)$
$h_t = o_t \odot \tanh(C_t)$
Backward:
$\delta out_t = \Delta_t + \Delta out_t$
$\delta C_t = \delta out_t \odot o_t \odot (1 - \tanh^2 C_t) + \delta C_{t+1} \odot f_{t+1}$
$\delta \tilde{C}_t = \delta C_t \odot i_t \odot (1 - \tilde{C}_t^2)$
$\delta i_t = \delta C_t \odot \tilde{C}_t \odot i_t \odot (1 - i_t)$
$\delta f_t = \delta C_t \odot C_{t-1} \odot f_t \odot (1 - f_t)$
$\delta o_t = \delta out_t \odot \tanh(C_t) \odot o_t \odot (1 - o_t)$
$\delta x_t = U^T \cdot \delta gates_t$
$\Delta out_{t-1} = W^T \cdot \delta gates_t$
![Page 35: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/35.jpg)
Computing $\delta gates_t$ for timestep t=2
Last timestep: $\Delta out_t = 0$; $f_{t+1} = 0$; $\delta C_{t+1} = 0$
$E = \frac{1}{2}(target - h)^2$, so $\Delta_t = \partial E / \partial h = -(target - h) = h - target$
t2: $\Delta_2 = 0.772 - 1.25 = -0.478 \Rightarrow \delta out_2 = -0.478 + 0 = -0.478$
$\delta C_2 = -0.478 \cdot 0.850 \cdot (1 - \tanh^2 1.518) + 0 \cdot 0 = -0.071$
$\delta f_2 = -0.071 \cdot 0.786 \cdot 0.870 \cdot (1 - 0.870) = -0.006$
$\delta i_2 = -0.071 \cdot 0.850 \cdot 0.981 \cdot (1 - 0.981) = -0.001$
$\delta \tilde{C}_2 = -0.071 \cdot 0.981 \cdot (1 - 0.850^2) = -0.019$
$\delta o_2 = -0.478 \cdot \tanh(1.518) \cdot 0.850 \cdot (1 - 0.850) = -0.055$
$\delta gates_2 = [-0.006, -0.001, -0.019, -0.055]$ (order: f, i, $\tilde{c}$, o)
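Using the forward values from the worked example ($C_1 = 0.786$, $C_2 = 1.518$, and the t2 gate activations), the deltas can be re-derived in a few lines of Python (a sketch; variable names are mine):

```python
import math

# forward values from the worked example
C1, C2 = 0.786, 1.518
f2, i2, c_tilde2, o2 = 0.870, 0.981, 0.850, 0.850
h2, target2 = 0.772, 1.25

d_out2 = h2 - target2                          # -0.478 (last step: no future term)
d_C2 = d_out2 * o2 * (1 - math.tanh(C2) ** 2)  # delta C_2, future terms are zero
d_f2 = d_C2 * C1 * f2 * (1 - f2)               # forget-gate delta
d_i2 = d_C2 * c_tilde2 * i2 * (1 - i2)         # input-gate delta
d_ctilde2 = d_C2 * i2 * (1 - c_tilde2 ** 2)    # candidate delta
d_o2 = d_out2 * math.tanh(C2) * o2 * (1 - o2)  # output-gate delta

print([round(v, 3) for v in (d_f2, d_i2, d_ctilde2, d_o2)])
# [-0.006, -0.001, -0.019, -0.055]
```

The rounded results match the slide's $\delta gates_2$ vector.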
![Page 36: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/36.jpg)
Computing $\delta x_2$ and $\Delta out_1$ for timestep t=2
$\delta x_t = U^T \cdot \delta gates_t$; $\Delta out_{t-1} = W^T \cdot \delta gates_t$
U (columns f, i, c, o):
0.70 0.95 0.45 0.60
0.45 0.80 0.25 0.40
$\delta gates_2 = [-0.006, -0.001, -0.019, -0.055]$
$\delta x_2 = [-0.047, -0.030]$
W: 0.100 0.800 0.150 0.250
$\Delta out_1 = -0.018$
![Page 37: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/37.jpg)
Computing for timestep t=1: $\Delta out_1 = -0.018$
$\delta out_1 = 0.036 - 0.018 = 0.018$
$\delta C_1 = -0.053$; $\delta f_1 = 0$; $\delta i_1 = -0.0017$; $\delta \tilde{C}_1 = -0.017$; $\delta o_1 = 0.0018$
$\delta gates_1 = [0.0000, -0.0017, -0.0170, 0.0018]$
$\delta x_1 = [-0.0082, -0.0049]$
$\Delta out_0 = -0.0035$
![Page 38: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/38.jpg)
Computing $\delta U$, $\delta W$, $\delta b$

$\delta U = \sum_{t=1}^{2} \delta gates_t \cdot x_t^T = [0, -0.0017, -0.0170, 0.0018]^T [1\ \ 2] + [-0.006, -0.001, -0.019, -0.055]^T [0.5\ \ 3]$
$\delta W = \sum_t \delta gates_{t+1} \cdot h_t = [-0.006, -0.001, -0.019, -0.055]^T \cdot [0.536]$
$\delta b = \sum_{t=1}^{2} \delta gates_t$

dU (rows f, i, c, o):
-0.0032 -0.0189
-0.0022 -0.0067
-0.0267 -0.0922
-0.0259 -0.1626
dW: -0.0034 -0.0006 -0.0104 -0.0297
db: -0.00631 -0.00277 -0.03641 -0.05362
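Given the two $\delta gates$ vectors from the previous slides, the gradient accumulation and the update $w_{new} = w_{old} - \eta\,\delta w$ with $\eta = 0.1$ can be checked directly (a sketch; variable names are mine):

```python
# delta-gates vectors (order f, i, c~, o) from the previous slides,
# carried at slightly higher precision than the printed 3-decimal values
dgates1 = [0.0, -0.00165, -0.01703, 0.00176]        # timestep 1
dgates2 = [-0.00631, -0.00112, -0.01938, -0.05536]  # timestep 2
x1, x2, h1 = [1.0, 2.0], [0.5, 3.0], 0.536

# dU accumulates the outer product of delta-gates with the input at each timestep
dU = [[dgates1[g] * x1[j] + dgates2[g] * x2[j] for j in range(2)] for g in range(4)]
# dW only sees h1 (since h0 = 0); db sums the delta-gates over time
dW = [dgates2[g] * h1 for g in range(4)]
db = [dgates1[g] + dgates2[g] for g in range(4)]

# gradient-descent update with learning rate eta = 0.1
eta = 0.1
U_old = [[0.70, 0.45], [0.95, 0.80], [0.45, 0.25], [0.60, 0.40]]  # rows f, i, c, o
U_new = [[U_old[g][j] - eta * dU[g][j] for j in range(2)] for g in range(4)]

print([round(v, 4) for v in dW])  # close to the slide's dW: -0.0034 -0.0006 -0.0104 -0.0297
print(round(U_new[0][0], 4))      # first entry of the updated U_f, close to 0.7003
```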
![Page 39: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/39.jpg)
Update Weights ($\eta = 0.1$): $w_{new} = w_{old} - \eta \cdot \delta w_{old}$

dU (rows f, i, c, o):
-0.0032 -0.0189
-0.0022 -0.0067
-0.0267 -0.0922
-0.0259 -0.1626
dW: -0.0034 -0.0006 -0.0104 -0.0297
db: -0.00631 -0.00277 -0.03641 -0.05362

U_old (columns f, i, c, o):
0.70 0.95 0.45 0.60
0.45 0.80 0.25 0.40
U_new:
0.7003 0.9502 0.4527 0.6026
0.4519 0.8007 0.2592 0.4163

W_old: 0.100 0.800 0.150 0.250 → W_new: 0.1003 0.8001 0.1510 0.2530
b_old: 0.1500 0.6500 0.2000 0.1000 → b_new: 0.1506 0.6503 0.2036 0.1054
![Page 40: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/40.jpg)
Truncated BPTT
https://deeplearning4j.org/docs/latest/deeplearning4j-nn-recurrent
Truncated BPTT was developed in order to reduce the computational complexity of each parameter update in a recurrent neural network.
![Page 41: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/41.jpg)
Summary
Backpropagation through time for
LSTMTruncated BPTT
![Page 42: EDUNEX ITB](https://reader033.fdocuments.mx/reader033/viewer/2022042107/6256e0bfe553b922b92bc5e3/html5/thumbnails/42.jpg)