How to Make a Chatbot – Intro to Deep Learning #12

Let's make a question-answering chatbot using the bleeding edge in deep learning (the Dynamic Memory Network). We'll go over different chatbot methodologies, then dive into how memory networks work, with accompanying code in Keras.

Code + Challenge for this video:

Nemanja's Winning Code:

Vishal's Runner up code:

Web app to run the code yourself:

Please subscribe! And like. And comment. That's what keeps me going.

More Learning resources:

Join us in the Wizards Slack channel:

And please support me on Patreon:



  1. Imagine a chatbot that outputs Bash commands. Command-line programs would get an interface more natural than a GUI, without having to be modified (note: non-graphical Linux apps work on Windows). Also, could I submit that for the coding challenge?

  2. I think that all Q&A problems are NLP tasks. NLP and linguistics are the foundations on which higher levels of abstraction build harder things like disambiguation (to answer factoid questions), translation (to understand or reformulate questions), etc.
    Maybe they are equivalent, but I don't think so.
    Great video and code btw! :):)

  3. I’m a beginner student of deep learning. So your videos help me a lot to understand this universe. Thanks.

  4. Siraj, have you considered doing more videos where you focus on the machine learning itself, with just NumPy? I love these practical application videos so much, but it wouldn't be a bad idea to go into a more detailed level.

  5. I didn't know we can have explicit memory in an ANN. Awesome video, just as it always is!

  6. Hey Siraj
    Is it possible to find the similarity of two sentences using TensorFlow? I mean, I would want the result to be a non-binary value. Is there a link where I could find more information about this? I have already looked at Denny Britz's code for text classification. Thank you!

    1. My corpus is pretty small. Could you please direct me to any Python implementation of sequence autoencoders? I am very new to this and have no background.

  7. Hello Siraj and community, does anyone know what I would have to change or tweak in this method to build a Q&A system in a language other than English? Any ideas for pre-trained language models for Portuguese? Keep up the nice work Siraj, congrats! Cheers to all.

  8. Hi Siraj, I had a question regarding the chatbot models. How do I combine a generative model with a retrieval-based model? The generative model would be for training the network to learn the language, and the retrieval model would be for retrieving domain knowledge. Can you suggest something?

  9. Things are happening so fast that he had to cover three different architectures in order to catch people up

  10. Wow this was way more complicated than the typical video I’ve seen on your channel. I would need an hour long version of this to understand anything.

  11. Questions after viewing (notes on the video are below):
    Episodic module: what changes on each pass? If the facts are sent through the same module, wouldn't the output be the same?
    Is the code he was showing in the video available to see? I didn't see it on the GitHub page.
    Notes on the video:
    Dynamic Memory Network
    Types of memory: semantic (input text), episodic (additional info)
    GRU cell replaces the LSTM cell; simplifies by using only two gates (update and reset) and no separate memory unit
    Create sequences of (GloVe) vectors from the input text
    Separate training data from testing data
    Feed to the input module, which creates a hidden state after each sentence
    Hidden states are known as facts
    ?? Matrix multiplication, bias term. Gets into how to use the GRU cell with update and reset gates
    Trying to find whether the current fact is relevant to the answer
    Output from the input module feeds the question module, which processes the question word by word
    Question module outputs a vector using the same GRU and weights as the input module
    Model inspired by the function of the hippocampus in the human brain
    Attention function assigns a gate value between 0 and 1 to each fact
    Multiple episodes are created; processes all facts x times
    Helps determine what info is relevant
    But what changes on each pass? If sent through the same module, wouldn't the output be the same?
    Loss is categorical cross-entropy. Stochastic gradient descent implementation (RMSprop)

  12. #Siraj has this incredible way of blending Advanced Technical knowledge with humor. Awesome #Siraj. U r helping me understand and learn AI easily . Thanks a lot.
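
On the question raised in comment 11 (what changes on each pass of the episodic module?): the following is a minimal NumPy sketch, not the code from the video. It assumes random, untrained weights, a simplified attention feature vector, and facts and question vectors of the same size; the function names `gru_step`, `attention_gate`, and `episodic_memory` are made up for illustration. The point it tries to show is that the attention gate for each fact depends on the current memory vector, and the memory is updated after every pass, so the same facts can receive different gate values on the next pass.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(dim, rng):
    """Random (untrained) GRU parameters; illustration only."""
    params = {}
    for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh"):
        params[name] = rng.normal(scale=0.5, size=(dim, dim))
    for name in ("bz", "br", "bh"):
        params[name] = np.zeros(dim)
    return params

def gru_step(x, h, p):
    """One GRU step: update gate z, reset gate r, candidate state h_tilde."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])
    h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])
    return z * h_tilde + (1.0 - z) * h

def attention_gate(fact, question, memory, w):
    """Scalar gate in (0, 1) scoring how relevant a fact looks right now.

    Assumption: a single linear layer over a small feature vector. The
    features compare the fact with both the question and the memory.
    """
    features = np.concatenate([
        fact * question,
        fact * memory,
        np.abs(fact - question),
        np.abs(fact - memory),
    ])
    return sigmoid(w @ features)

def episodic_memory(facts, question, passes, rng):
    dim = question.shape[0]
    episode_gru = init_gru(dim, rng)
    memory_gru = init_gru(dim, rng)
    w = rng.normal(scale=0.3, size=4 * dim)

    memory = question.copy()  # memory starts out as the question vector
    gates_per_pass = []
    for _ in range(passes):
        h = np.zeros(dim)
        gates = []
        for fact in facts:
            g = attention_gate(fact, question, memory, w)
            # The gate decides how much this fact updates the episode state.
            h = g * gru_step(fact, h, episode_gru) + (1.0 - g) * h
            gates.append(g)
        # The finished episode updates the memory, which changes the
        # attention features (and so the gates) on the next pass.
        memory = gru_step(h, memory, memory_gru)
        gates_per_pass.append(np.array(gates))
    return memory, gates_per_pass

rng = np.random.default_rng(0)
facts = rng.normal(size=(3, 8))   # three hypothetical "fact" vectors
question = rng.normal(size=8)     # hypothetical question vector
memory, gates = episodic_memory(facts, question, passes=2, rng=rng)
for i, g in enumerate(gates, start=1):
    print(f"pass {i} gates:", np.round(g, 3))
```

In a trained model the weights would be learned, so the second pass could attend to facts that only become relevant once the first pass has stored something in memory.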
