This is Shuochen Wang from the R&D department at Flect. In this blog post, I explain how I implemented a Slack bot using Google Dialogflow.
According to a market report on the global chatbot market published by Transparency Market Research, the market was valued at US$ 274.5 Mn in 2019 and is projected to reach US$ 2,358.2 Mn by 2027, at a CAGR of 31.3% during the forecast period from 2020 to 2027 [1]. Since the outbreak of the coronavirus, the need for automatic chat support is greater than ever before.

The purpose of this project is to create an automatic Q&A bot that can answer various questions about company regulations in Slack via Google Dialogflow.

The role of Dialogflow is to detect the user's intent, mainly the parameter (the term the user wants to search for), and pass this information to a webhook service, which queries an external API (a database). The webhook then passes the response back to Dialogflow, which finally returns the response to Slack.

There are many chatbot products on the market. To name a few: Microsoft Bot Framework, Dialogflow, IBM Watson, Amazon Lex, Wit.ai and Botkit. Wit.ai and Botkit are open source, and the rest are paid services. To accommodate future consumers, I would consider an open-source product only if no suitable paid service were available. To decide which product best fits the situation, I considered the following factors: After careful consideration, I chose Google Dialogflow for the following reasons:

I first faced an important decision: whether to use an external API with Dialogflow. To return customized responses, Dialogflow can either connect to an external API or retrieve the responses from a stored JSON file. However, Dialogflow limits JSON files to 2000 entries. If the limit is exceeded, one has to create another Dialogflow instance. For the sake of scalability, I chose to connect to an external API.

The natural choice is to connect to an external database, which stores the list of entries and their definitions.
In this case I chose Google Firebase because it keeps the stack consistent with the other Google products and because it is a NoSQL database, which offers greater scalability.

Apart from maintaining consistency, Google is the fastest overall cloud platform among Microsoft, Amazon and Google. Google also has the second cheapest average pricing [3]. Even after choosing Google as the platform, there are still several products to choose from, namely Google Cloud Functions, Google App Engine and Google Cloud Run. After careful testing and evaluation, I chose Google Cloud Run because it had the best latency and the most reasonable cost. The Cloud Run service is written in Python.

Although both Google Dialogflow and Firebase Firestore require some customization, these two products are mostly used as they are. The Cloud Run script is the part that requires the most effort. It is the most important part of the Slack bot because it acts as an intermediary between Google Dialogflow and Firebase Firestore. I explain how this is achieved in the following sections.

There are several ways to implement a web application in Python, for example Django, Flask, web2py and so on. Which framework you use should not matter, as they can all achieve the same result. In this example I chose Flask because it is simple and easy to set up. Secondly, because I chose Firebase Firestore, I need the firebase_admin module to access it. Originally I used the dialogflow_fulfillment module, but I removed it in my final version for greater flexibility. Finally, we also need several Slack modules for sending and receiving custom Slack messages and actions. The libraries I used are the Slack API client for Python and slack_bolt (Bolt for Python).

The main method works like that of any other Flask application.
First declare the Flask instance, then run it in the main method: If you have never seen this snippet before, it means the following: if the file is launched as the main program (instead of being imported from another file), start the Flask app and listen on the default port for requests. This is the beauty of the Flask framework: each step is basically one line of code, and everything else is handled by the framework itself. Readers who are more familiar with Flask may wonder why the instance is called "flask_app" instead of the conventional "app". The reason is that the name "app" is also used by another library, Bolt for Python. In this project, the Flask instance is referenced only once, whereas the Bolt for Python instance is referenced many times.

So far, all this application can do is listen for requests on port 8080 (the Cloud Run default). However, we have not yet told it what to do once a request is received, so nothing will happen even if we connect to this port. I will show the code for the route function first, then explain it in greater detail: In the first line, one has to specify which path the function listens on. It should listen on the root path, "/". It also accepts only "POST" requests, because all webhook requests are sent as "POST". You can check the format of the webhook request in this Google documentation: Therefore our goal is to handle the format described in the Google documentation.

The first thing we need to do is to get the request body using "request_ = request.get_json(force=True)". This parses the request JSON into a dictionary. Not every request will be in a valid format. Because this is a public URL, someone might send a malformed request that crashes our application. To prevent this, use try and except in case the request_ variable does not contain the channel information.
This does not have to be the "channel" key; it can be any other valid Dialogflow key. The "channel" key is used because we need it to post messages to the specific channel. If you want to log every request, the logging module can do this. I do not need it, so I have commented it out. Dialogflow also provides a value called "intentDetectionConfidence", which is the confidence level of the detected intent. We assume that if the confidence is too low, the detected intent is likely to be incorrect, and therefore we reject the request. This check can also be commented out, because we do not know how Dialogflow calculates the confidence level. At this point we are sure that the request is valid, and we can go ahead and process it in the following section.

The request handler extracts the keyword to be searched, then searches for the term in Firebase Firestore. If the term exists in the database, it returns the success message; if not, it returns the failure message: The first thing the handler does is extract the keyword that the user actually wants to search for. We extract this value and store it in the variable "search". Next, the handler searches the "questions" collection of the database. I created a collection called "questions" in my database. You can name it anything you like; just make sure the names match. Currently Google does not offer a comprehensive search option for Firestore, as quoted here: Therefore we only do an exact text match with "query_ref = q_ref.where(u'Term', u'==', search)". In English this would be quite problematic, as misspellings are common. You might need to add another function that corrects potentially misspelled words before doing the exact search. This is less of a problem in Japanese, because most Japanese words are written in kanji, which means there are limited combinations of words.
Misspelling is unlikely because the default Japanese input method will not offer such input. The success handler returns the exact value from the database, so I will not explain it further. The fallback_handler works in the same way. However, it is not very helpful to just tell the user "Sorry, we cannot find what you are looking for". It would be much more user friendly to at least return the closest matching entry from the database. This is exactly what final_response contains.

However, there is one problem. Dialogflow only accepts specific response formats. What do we do if we want to post polls, buttons, links or Slack emoji? The answer is to post the reply to the user first by using the Slack API, before returning the response to Dialogflow. The standard way of posting a Slack message is to call client.chat_postMessage and specify the channel (we have the channel information from the route function), the text and the blocks as JSON. You can find out more about Slack Block Kit here: Once you understand how Slack blocks work, you can create a variety of Slack messages including URLs, pictures, polls, and even interactive widgets.

As I stated before, it would be helpful to return the closest matching entry from the document collection. How can we achieve this? The following code achieves this: It is actually simple to implement such a function. We borrow a module from the Python standard library, difflib. Its get_close_matches function returns the closest matches to the query word from a list, if any exist. All that is left is to convert the document collection to a list and then call difflib with our search term.

Not only do I want to return a custom message stating the closest matching word, I also want to offer the user an option to report the miss to the developer (either as a DM or to the debug channel for this app). More specifically, this is implemented by sending a button inside the Slack message block.
However, the button will not work if we do not set up a listener for it. In this case, "app" refers to the Slack app, not to be confused with the Flask app. The listener is a method from Bolt for Python. Bolt for Python supports many kinds of listeners, but we only need the action listener, registered for "button_click". First we retrieve the word from the button. We do not need to do anything with the acknowledgement, so that line simply calls ack(). Bolt for Python's API is slightly different from the Slack API, but say() is the equivalent of client.chat_postMessage(). Now we can successfully redirect the message to the developer channel.

Once the request is finished, the final job of the webhook app is to return the response to the sender. This has to be in a valid format as well, otherwise Dialogflow will return a default fallback message. The format is again JSON. Since we return a message on many occasions, it makes sense to wrap this in a function and call it whenever needed. The function below returns the message in the valid format:

In this blog post, I have explained how to set up a Slack bot using Google Dialogflow. First, I explained why I chose Google Dialogflow, Google Cloud Run and Firebase Firestore. Secondly, I explained how to implement the Cloud Run script in Python. This script listens for requests from Google Dialogflow, checks for the term in the database, and finally returns the response to Google Dialogflow. In the case of a failed match, it also allows the user to report back to the developer for future improvement.

That being said, this Slack bot is far from perfect. While implementing it, I discovered that another big obstacle for chatbots is the creation of the knowledge base. Currently I use another script that reads all the Q&A data from an Excel file. All the data entries have to be made by hand, which is time consuming. Can the data entry be automated?
What if someone wants to update this list? How can the list always be kept up to date automatically? In the future I want to make the knowledge base creation as automatic as possible. The complete source code will be uploaded to my company GitHub.

GitHub - wang-shuochen/Slack-bot: Python implementation of Slack bot using Google dialogflow
Slack Web Client Python: https://slack.dev/python-slack-sdk/web/index.html
Bolt for Python: https://slack.dev/bolt-python/tutorial/getting-started
Python difflib module:
Introduction
Why do we need a chatbot?
Design brief
Purpose of the Slack bot
The process
Why choose Google Dialogflow? (Figure 1 steps 3 and 7)
Why use external API?
What kind of external API? (Figure 1 step 5)
Why choose Google for the webhook? (Figure 1 steps 4 and 6)
Cloud Run script specifics
Libraries/modules to be used
Setting up the main web application
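import os
from flask import Flask
flask_app = Flask(__name__)
if __name__ == '__main__':
    # Cloud Run passes the port in the PORT environment variable (8080 by default)
    flask_app.run(host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))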
Setting up the app route
from typing import Dict
from flask import request

@flask_app.route('/', methods=['POST'])
def webhook() -> Dict:
    """Handle webhook requests from Dialogflow."""
    # Get WebhookRequest object
    request_ = request.get_json(force=True)
    try:
        # Slack channel the message came from; also serves as a validity check
        channel = request_["originalDetectIntentRequest"]["payload"]["data"]["event"]["channel"]
    except (KeyError, TypeError):
        return "Something went wrong"
    # Don't process if the confidence level is low
    if request_["queryResult"]["intentDetectionConfidence"] <= 0.5:
        # "I am not confident that I can answer this question."
        return returnMessage('この質問を答える自信がありません。')
    return handler(request_, channel)
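The request validation in the route function can also be isolated into a small, framework-independent function, which makes it easier to test without a live Dialogflow agent. The sketch below is an illustration under assumptions, not the code used in the bot: the function names `extract_channel_and_confidence` and `is_confident`, the sample payload and the channel ID are hypothetical. Only the key paths and the 0.5 threshold come from the route code in this post.

```python
from typing import Any, Dict, Optional, Tuple


def extract_channel_and_confidence(
    payload: Dict[str, Any],
) -> Optional[Tuple[str, float]]:
    """Return (channel, confidence), or None if the payload is malformed."""
    try:
        channel = payload["originalDetectIntentRequest"]["payload"]["data"]["event"]["channel"]
        confidence = float(payload["queryResult"]["intentDetectionConfidence"])
    except (KeyError, TypeError, ValueError):
        # A key is missing, a non-dict appeared where a dict was expected,
        # or the confidence value is not numeric.
        return None
    return channel, confidence


def is_confident(confidence: float, threshold: float = 0.5) -> bool:
    """Mirror the route's rule: reject when confidence <= threshold."""
    return confidence > threshold


# Hypothetical, heavily trimmed payload for illustration only.
sample = {
    "queryResult": {
        "intentDetectionConfidence": 0.8,
        "parameters": {"any": "overtime"},
    },
    "originalDetectIntentRequest": {
        "payload": {"data": {"event": {"channel": "C0123456789"}}}
    },
}

result = extract_channel_and_confidence(sample)
malformed = extract_channel_and_confidence({"queryResult": {}})
```

With this split, the route only has to decide what to return when the helper yields None or when `is_confident` is false.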
Setting up the request handler
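# db is the Firestore client and client is the Slack WebClient, both created at module level
def handler(request_, channel):
    # Keyword the user wants to look up (the "any" parameter of the intent)
    search = request_["queryResult"]["parameters"]["any"]
    q_ref = db.collection(u'questions')
    # Firestore exact-match query on the "Term" field
    query_ref = q_ref.where(u'Term', u'==', search)
    if not query_ref.get():
        # No match: build the fallback reply and post a Slack message with a report button
        final_response = fallback_handler(search)
        attachments_json = setReply(search)
        client.chat_postMessage(
            channel=channel, text="", blocks=attachments_json)
    else:
        final_response = success_handler(search)
    return final_response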
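The handler calls setReply, which is not shown in this post. As a rough idea of what such a function could return, here is a minimal sketch of a Block Kit payload with one text section and one button. The function name `build_not_found_blocks`, the English wording and the button label are assumptions for illustration; the `action_id` of "button_click" is chosen to match the Bolt listener registered in this post.

```python
from typing import Any, Dict, List


def build_not_found_blocks(word: str) -> List[Dict[str, Any]]:
    """Build Slack Block Kit blocks: a notice plus a 'report' button."""
    return [
        {
            "type": "section",
            "text": {
                "type": "mrkdwn",
                "text": f"No entry was found for *{word}*.",
            },
        },
        {
            "type": "actions",
            "elements": [
                {
                    "type": "button",
                    "text": {"type": "plain_text", "text": "Report to developer"},
                    # Must match the name registered with @app.action(...)
                    "action_id": "button_click",
                    # The listener reads this back from body["actions"][0]["value"]
                    "value": word,
                }
            ],
        },
    ]


blocks = build_not_found_blocks("overtime")
```

The resulting list can be passed as the blocks argument of client.chat_postMessage, ideally together with a non-empty text argument, which Slack uses as the fallback text for notifications.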
Setting up the fallback handler
import difflib

def fallback_handler(word):
    docs = db.collection('questions').stream()
    # Store every term in the collection as a list
    entries = []
    for doc in docs:
        entries.append(doc.to_dict()['Term'])
    # "Sorry, {} has not been added yet."
    phrase = '申し訳ありません。{}はまだ追加されておりません。\n'.format(word)
    closeMatch = difflib.get_close_matches(word, entries)
    if closeMatch:
        # "The closest word in the question list is {}." (matches are sorted, best first)
        return returnMessage('{}質問リストに追加されている一番近い単語は{}です。'.format(phrase, closeMatch[0]))
    else:
        return returnMessage(phrase)
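To try the closest-match lookup without a Firestore connection, difflib can be exercised on a plain list. This is a minimal sketch: the helper name `closest_term` and the sample terms are hypothetical, while `n` and `cutoff` are real parameters of `difflib.get_close_matches` (their defaults are 3 and 0.6).

```python
import difflib
from typing import List, Optional


def closest_term(word: str, entries: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the entry most similar to `word`, or None if nothing is close enough."""
    # n=1 asks for only the best match; cutoff is the minimum similarity ratio (0 to 1).
    matches = difflib.get_close_matches(word, entries, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical knowledge-base terms.
terms = ["vacation", "overtime", "salary"]

typo_match = closest_term("vacaton", terms)  # a one-letter typo of "vacation"
no_match = closest_term("xyz", terms)        # nothing similar in the list
```

Raising the cutoff makes the suggestion stricter, and lowering it returns looser matches. The same helper could also serve as the spelling-correction step for English queries before the exact Firestore lookup.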
Setting the interactive button for Slack
@app.action("button_click")
def action_button_click(body, ack, say):
    # The button's value carries the word that was searched
    word = body["actions"][0]["value"]
    # Acknowledge the action
    ack()
    # "{} was searched for, but no answer was found."
    say(channel="dialogflowinfo", text="{}が検索されて、回答が見つかりませんでした。".format(word))
Setting the return message function
def returnMessage(phrase):
    """Wrap a phrase in the response format Dialogflow expects, as a Slack custom payload."""
    text = {
        "fulfillmentMessages": [{"payload": {"slack": {"text": phrase}}}]
    }
    return text
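To see what Dialogflow actually receives, the dictionary can be serialized to JSON. The sketch below repeats the function under the hypothetical name `build_response` so that it runs on its own; the sample phrase is an assumption.

```python
import json
from typing import Any, Dict


def build_response(phrase: str) -> Dict[str, Any]:
    """Wrap a phrase in a Dialogflow webhook response with a Slack custom payload."""
    return {"fulfillmentMessages": [{"payload": {"slack": {"text": phrase}}}]}


body = json.dumps(build_response("Overtime must be approved in advance."))
# Round-trip to confirm the structure survives serialization.
decoded = json.loads(body)
```

When logging Japanese phrases, passing ensure_ascii=False to json.dumps keeps the text readable; the default escaped form is equally valid JSON.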
Conclusion and future work
Reference