Developing a Rust Skill Server for a Kakao Chatbot on AWS Lambda
The Kakao Talk chatbot service has a ‘skill’ feature. A skill is a kind of REST API that validates data from an HTTP request sent by the chatbot, or processes data used by the chatbot, and returns an HTTP response. I was asked to develop the skill server for a chatbot used by a friend’s company. This post records where and how I built the skill server and what problems I ran into along the way.
Purpose
The skill server had a single purpose: to verify whether the person using the chatbot is an employee of the company. At first, I thought the chatbot itself might have a verification function, but since a chatbot is basically open to an indefinite number of people, it does not provide authentication for a particular group. However, authentication was clearly being done in chatbots such as Lotte’s internal chatbot, so I thought there must be a way. Since the chatbot’s authentication block can be used to check user information, I figured I could implement verification by comparing that information with employee data. I considered using the company’s internal server for the employee data, but since I couldn’t communicate with the IT team without being an employee, I decided to build the employee data myself.
In summary, I implemented three functions.
- Building and automating employee data updates
- Checking user information of the chatbot
- Comparing user information with employee data
First Implementation in Python
Local? Serverless? : AWS Lambda + DynamoDB
First, I needed to decide where to run the functions above. Since buying a computer just for a chatbot would be unusual, I decided to use AWS Lambda, which is very cheap. Because AWS Lambda costs only a few dollars per million requests, an internal chatbot used by a few thousand people can run on it at almost no cost. Python code can also be edited and tested directly in the web console, so it was easy to develop.
Having chosen Lambda, I also looked for a DB on AWS to store the employee data. I knew even less about DBs than about Lambda, so I decided to use DynamoDB, which can be used for free. I was not very satisfied with it, but it was not too bad.
Automating employee data updates
Since the company has several hundred employees, updating employee data is not an easy task. Editing each employee’s record by hand is inefficient, and entering records one by one is not practical when the DB is first populated, so I implemented the Lambda function that updates the data first. The function reads CSV files uploaded to S3 and writes their contents to DynamoDB.
First, I created a DynamoDB table for the data. Since I wanted to look up employees by phone number later, I set the phone number as the primary key. Then, I created a bucket on S3 and uploaded the CSV files.
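For reference, the table described above could also be defined in code rather than in the console. The following is only a sketch: the helper name `employee_table_definition`, the table name `employees`, and the on-demand billing mode are assumptions for illustration, not settings taken from this post. It builds the keyword arguments that a DynamoDB `create_table` call expects, with `phone` as the single string key.

```python
def employee_table_definition(table_name):
    """Return keyword arguments for a DynamoDB create_table call.

    The phone number is the only key attribute and is stored as a string ("S").
    """
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "phone", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "phone", "KeyType": "HASH"},  # partition key
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand billing
    }


if __name__ == "__main__":
    # "employees" is a placeholder table name.
    print(employee_table_definition("employees"))
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("dynamodb").create_table(**employee_table_definition("employees"))`.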
I created a Python function in Lambda and granted it read and write permissions for S3 and DynamoDB.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "dynamodb:BatchWriteItem",
                "dynamodb:PutItem",
                "dynamodb:UpdateItem"
            ],
            "Resource": [
                "arn:aws:s3:::{BucketName}",
                "arn:aws:s3:::{BucketName}/*",
                "arn:aws:dynamodb:ap-northeast-2:{AccountID}:table/{TableName}"
            ]
        }
    ]
}
The Python code is as follows.
import boto3  # AWS SDK for Python
import csv

def lambda_handler(event, context):
    # Open S3 client
    s3 = boto3.client('s3')
    # Open DynamoDB client
    db = boto3.client('dynamodb')
    # Read the {File name}.csv file from the {Bucket Name} bucket on S3.
    response = s3.get_object(Bucket='{Bucket Name}', Key='{File name}.csv')
    # Read the Body (a streaming body) like a file handle and split it line by line.
    rec_data = response.get('Body').read().splitlines()
    # The data is in bytes, so decode each line to a string.
    # The first line (index 0) is the header, so it is skipped.
    rec_list = [x.decode(encoding='utf8') for x in rec_data[1:]]
    # Parse the CSV: split on commas, but ignore commas inside double quotes.
    csv_reader = csv.reader(rec_list, delimiter=',', quotechar='"')
    # Loop through the parsed rows.
    for row in csv_reader:
        name = row[1]   # Name
        pos = row[2]    # Position
        dept = row[3]   # Department
        phone = row[4]  # Phone number
        email = row[5]  # Email
        # Rows without a phone number have no primary key, so skip them.
        if phone == '':
            continue
        # Write to DynamoDB.
        # If the primary key already exists in the DB, the item is overwritten.
        response = db.put_item(
            TableName='{Table Name}',
            Item={
                'phone': {'S': phone},
                'name': {'S': name},
                'dept': {'S': dept},
                'email': {'S': email},
                'position': {'S': pos},
            }
        )
    # Done
    print('succeeded')
The only problem with this code is that the most convenient way to run it is from the web console, where the default timeout is too short to upload the whole CSV file in one run. The timeout is 3 seconds by default, so it needs to be increased. Under the Configuration settings of the Lambda function, Timeout can be raised to about 10 minutes.
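Part of the run time comes from calling put_item once per row. The policy above already allows dynamodb:BatchWriteItem, which accepts up to 25 put requests per call, so one way to shorten the run is to group rows into batches first. The sketch below is an assumption about how that could look, not the code I actually ran: `rows_to_items` and `to_batches` are hypothetical helper names, and the column order is the one used in the function above.

```python
import csv
import io

BATCH_LIMIT = 25  # BatchWriteItem accepts at most 25 put/delete requests per call


def rows_to_items(csv_text):
    """Parse CSV text (with a header row) into DynamoDB attribute maps."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    items = []
    for row in reader:
        if len(row) < 6 or row[4] == "":
            continue  # skip short rows and rows without a phone number
        items.append({
            "phone": {"S": row[4]},
            "name": {"S": row[1]},
            "position": {"S": row[2]},
            "dept": {"S": row[3]},
            "email": {"S": row[5]},
        })
    return items


def to_batches(items, size=BATCH_LIMIT):
    """Wrap items as PutRequest entries and split them into lists of at most `size`."""
    requests = [{"PutRequest": {"Item": item}} for item in items]
    return [requests[i:i + size] for i in range(0, len(requests), size)]
```

Each batch could then be sent with `db.batch_write_item(RequestItems={'{Table Name}': batch})`. The response can contain UnprocessedItems, which would need to be sent again, and a single batch must not contain the same phone number twice.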
Checking chatbot user information
Now, let’s create a new Lambda function to handle the chatbot’s authentication requests. First, we need to get the user information that will be compared with the DB. The request from the chatbot’s authentication block contains an otp key (a URL) through which the skill can retrieve user information.
{
...
"origin": "https://talk-plugin-capi.kakao.com/otp/62247d59d36c554596ea60eb/profile",
"value": "{\"otp\":\"https://talk-plugin-capi.kakao.com/otp/62247d59d36c554596ea60eb/profile\",\"app_user_id\":2113105765}"
...
}
If you send a request to Kakao for the otp key together with the chatbot administrator’s REST API key, it responds with the user’s name, phone number, profile photo address, and so on. The functions that extract the otp key and return the user information are as follows.
import json
import urllib3

def parse_userinfo_api(event):
    # event : HTTP request passed to Lambda
    body = json.loads(event['body'])
    otp = body['value']['origin']
    return otp

def get_user_info(otp):
    # Append the REST API key to the otp URL and request the user's profile.
    url = otp + "?rest_api_key={rest_api_key}"
    http = urllib3.PoolManager()
    response = http.request('GET', url)
    data = json.loads(response.data.decode('utf-8'))
    name = data['nickname']
    # Drop the leading '+82 ': '+82 10-2222-3333' -> '010-2222-3333'
    phone = '0' + data['phone_number'][4:]
    return (name, phone)
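The URL building and the phone number conversion can be pulled out into small pure functions, which makes them easy to check without calling Kakao. This is a sketch under assumptions: `build_profile_url` and `normalize_phone` are hypothetical names, and the phone number is assumed to arrive as '+82 10-2222-3333' (country code, a space, then the number without its leading zero).

```python
from urllib.parse import urlencode


def build_profile_url(otp_url, rest_api_key):
    """Append the REST API key to the OTP URL as a query parameter."""
    return otp_url + "?" + urlencode({"rest_api_key": rest_api_key})


def normalize_phone(phone_number):
    """Convert '+82 10-2222-3333' to '010-2222-3333'.

    Numbers without the '+82 ' prefix are returned unchanged.
    """
    prefix = "+82 "
    if phone_number.startswith(prefix):
        return "0" + phone_number[len(prefix):]
    return phone_number


if __name__ == "__main__":
    # The URL and key below are made-up test values.
    print(build_profile_url("https://example.com/otp/abc/profile", "my-key"))
    print(normalize_phone("+82 10-2222-3333"))
```

Checking the prefix explicitly, rather than always slicing off four characters, keeps a number in an unexpected format from being silently cut.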
Comparing user information with employee data in DB
Now that we have the user’s phone number, we just need to compare it with the DB and respond with the result. Since this requires read permission for the DB, modify the Lambda function’s permissions as follows.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "dynamodb:BatchGetItem",
                "dynamodb:GetItem",
                "dynamodb:Scan",
                "dynamodb:Query"
            ],
            "Resource": [
                "arn:aws:dynamodb:ap-northeast-2:{Account ID}:table/{Table Name}"
            ]
        }
    ]
}
Then, write a function that checks whether the DB contains an item whose primary key is the user’s phone number, as follows.
def return_data_api(phone):
    dynamodb = boto3.client('dynamodb')
    exist = dynamodb.get_item(TableName='{Table Name}', Key={'phone': {'S': phone}})
    if 'Item' in exist:
        return {
            'statusCode': 200,
            'body': json.dumps({
                "version": "2.0",
                "status": "SUCCESS"
            }),
        }
    else:
        return {
            'statusCode': 200,
            'body': json.dumps({
                "version": "2.0",
                "status": "FAIL"
            }),
        }
A skill response can also carry data, but for authentication it only needs to report success or failure, so the response above is sufficient. Combining these functions, the entire code is as follows.
import json
import boto3
import urllib3

def lambda_handler(event, context):
    otp = parse_userinfo_api(event)
    name, phone = get_user_info(otp)
    return return_data_api(phone)

def parse_userinfo_api(event):
    ...  # same as above

def return_data_api(phone):
    ...  # same as above

def get_user_info(otp):
    ...  # same as above
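Before deploying, the flow of the handler can be exercised locally with a hand-built event. The sketch below is not the deployed code: `handle` and `build_response` are hypothetical names, and the Kakao request and the DynamoDB lookup are passed in as plain functions so that they can be replaced with fakes. The event contains only the fields the handler reads.

```python
import json


def parse_userinfo_api(event):
    """Extract the OTP URL from the skill request body."""
    body = json.loads(event["body"])
    return body["value"]["origin"]


def build_response(is_employee):
    """Build the response that reports the authentication result."""
    status = "SUCCESS" if is_employee else "FAIL"
    return {
        "statusCode": 200,
        "body": json.dumps({"version": "2.0", "status": status}),
    }


def handle(event, fetch_profile, employee_exists):
    """Same flow as lambda_handler, with the two network calls passed in."""
    otp = parse_userinfo_api(event)
    _name, phone = fetch_profile(otp)
    return build_response(employee_exists(phone))


if __name__ == "__main__":
    # The URL, name, and phone number are made-up test values.
    event = {"body": json.dumps({"value": {"origin": "https://example.com/otp/abc/profile"}})}
    fake_db = {"010-2222-3333"}
    result = handle(event, lambda otp: ("Kim", "010-2222-3333"), lambda phone: phone in fake_db)
    print(result)
```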
Converting to Rust
Cold start problem
Although the function worked well in Python, there was one fairly big problem. Lambda is designed to go idle when no request arrives for a certain period and to start up again when a request comes in (a cold start). So if a request arrives after a long idle period, the response time increases from 0.5 seconds to 1.5 seconds, and since the chatbot treats a response that takes more than about 1 second as a delay, it shows an authentication failure message. Because a cold start occurs after only 10 to 20 minutes without requests, authentication failures happened quite often.
So I thought that if I switched from Python to Rust, which is compiled ahead of time, the compile step at startup would be unnecessary and the function would start faster. The actual execution would also be faster, so it might respond before the timeout even on a cold start. As a result, the warm-start execution time dropped from 0.5 seconds to 0.2 seconds, and the cold-start time dropped from 1.5 seconds to 0.9 seconds. I thought that would solve it, but unfortunately the Kakao chatbot still timed out. So I worked around it by raising the number of retry attempts after an authentication failure to 3.
Uploading Rust executable files
Unlike Python, Rust code cannot be edited directly on AWS Lambda.
You just need to upload the built executable to Lambda in a fixed format, but this part was more difficult than expected.
First, you need to cross-compile for the x86_64-unknown-linux-gnu target to match the OS used by Lambda.
The setup for this is as follows.
rustup target add x86_64-unknown-linux-gnu
cargo install cross
Then, you can build it with cargo as follows (cross accepts the same arguments in place of cargo).
cargo build --release --target x86_64-unknown-linux-gnu
However, in this case, you need to rename the built executable to bootstrap, compress it into a zip file (for example, bootstrap.zip), and upload it.
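The manual packaging step can be scripted. The following is a minimal sketch, assuming the binary has already been built: `package_bootstrap` is a hypothetical helper, not part of any AWS tool. It stores the file in the archive under the name bootstrap and marks it executable inside the zip, so the result does not depend on the permissions of the file on disk.

```python
import zipfile


def package_bootstrap(binary_path, zip_path):
    """Write the binary into zip_path under the name 'bootstrap', mode 0755."""
    info = zipfile.ZipInfo("bootstrap")
    info.external_attr = 0o100755 << 16  # regular file, rwxr-xr-x
    info.compress_type = zipfile.ZIP_DEFLATED
    with open(binary_path, "rb") as src, zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr(info, src.read())
    return zip_path
```

For the build above, the input would be `target/x86_64-unknown-linux-gnu/release/{binary name}`, for example `package_bootstrap("target/x86_64-unknown-linux-gnu/release/{binary name}", "bootstrap.zip")`.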
You can do this process at once with cargo-lambda.
cargo install cargo-lambda
cargo lambda build --release --target x86_64-unknown-linux-gnu --output-format zip
cargo-lambda also provides an environment to test the lambda function.
Since the lambda function runs on the server and receives and responds to requests, you need to implement a local server environment to test it.
You can set up a local server with the cargo lambda start command and send a request to the function with the cargo lambda invoke command.
Dependency problem
Compared to Python, writing the Lambda function in Rust required a lot of crates.
- tokio, because the handler has to be written as an asynchronous function
- aws-sdk-dynamodb, lambda_runtime, and lambda_http, in place of the boto3 package
- serde and serde_json, for JSON parsing
- reqwest, to make HTTP requests
If the reqwest crate is not configured carefully, compilation fails because of a version problem with the openssl crate.
I also tried simply lowering the reqwest version, but it didn’t work at all. The only solution was to change the features as follows, so that reqwest uses rustls instead of its default OpenSSL-based TLS backend.
reqwest = { version = "0.11.10", features = [ "json", "rustls-tls" ], default-features = false }
Quotation in JSON
The second problem was quotation marks.
Since Python accepts both single and double quotes as string delimiters, writing JSON inside a string was not a problem.
However, in Rust, single quotes are used only for single characters and strings must use double quotes, so JSON text containing quotation marks is awkward to write. In addition, calling to_string() on a serde_json Value returns the serialized JSON, so a string value comes back wrapped in double quotes.
Therefore, it is necessary to remove the extra quote characters or use the unescape function of snailquote. (For a string value, as_str() on the Value also returns the text without the quotes.)
use serde_json::{Value, from_str};
use snailquote::unescape;

// StringError is the error type defined in the full listing below.
fn parse_otp(event : Value) -> Result<String, Box<StringError>>{
    // to_string() returns the serialized JSON, including the surrounding quotes.
    let mut otp = event["value"]["origin"].to_string();
    // Strip the double quotes.
    otp.retain(|c| c != '\"');
    Ok(otp)
}

async fn get_user_info(mut otp : String) -> Result<(String, String), Box<StringError>>{
    otp.push_str("?rest_api_key={Rest API key}");
    let res = reqwest::Client::new().get(otp).send().await
        .map_err(|e| StringError::new(format!("OTP request failed with error : {:?}", e)))?
        .text().await
        .map_err(|e| StringError::new(format!("Parse response to text failed with error : {:?}", e)))?;
    let data : Value = from_str(&res).unwrap();
    // unescape removes the surrounding quotes and resolves escape sequences.
    let name = unescape(&data["nickname"].to_string()).unwrap();
    let phone = unescape(&data["phone_number"].to_string().replace("+82 ", "0")).unwrap();
    return Ok((name, phone));
}
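The source of the extra quotes can be reproduced in a few lines of Python, which may make the Rust behavior easier to see. This is only an analogy and an assumption about how the two libraries correspond: serializing a string value with json.dumps plays the role of to_string() on a serde_json Value, and reading the value directly plays the role of as_str(). The payload is a made-up example.

```python
import json

# Made-up profile payload for illustration.
payload = json.loads('{"nickname": "Kim", "phone_number": "+82 10-2222-3333"}')

# Serializing a string value back to JSON text keeps the surrounding quotes.
serialized = json.dumps(payload["nickname"])

# Reading the value directly gives the plain string.
plain = payload["nickname"]

print(serialized)  # "Kim"
print(plain)       # Kim
```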
Result
The crate versions and the final code are as follows.
[dependencies]
tokio = { version = "1.17", features = ["full"] }
serde = "1.0.100"
serde_json="1.0.79"
log = "0.4.16"
lambda_runtime = "0.5.1"
lambda_http = "0.5.1"
http = "0.2"
aws-config = "0.10.1"
aws-sdk-dynamodb = "0.10.1"
reqwest = { version = "0.11.10", features = [ "json", "rustls-tls" ], default-features = false }
snailquote = "0.3.1"
use aws_config::meta::region::RegionProviderChain;
use aws_sdk_dynamodb::{Client, model::AttributeValue};
use lambda_runtime::service_fn;
use lambda_http::{Response, Request, Body, RequestExt};
use http::status::StatusCode;
use std::convert::Into;
use serde_json::{Value, from_str};
use reqwest;
use snailquote::unescape;
#[derive(Debug)]
struct StringError{
    body : String,
}

impl StringError{
    pub fn new(s : String) -> Box<Self>{
        Box::new(Self{
            body : s,
        })
    }
}

impl std::fmt::Display for StringError{
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}", self.body)
    }
}

impl std::error::Error for StringError {}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync + 'static>> {
    let func = service_fn(func);
    lambda_http::run(func).await
        .map_err(|e| StringError::new(format!("Lambda run failed with error : {:?}", e)))?;
    Ok(())
}

async fn func(req: Request) -> Result<Response<&'static str>, Box<dyn std::error::Error + Send + Sync + 'static>> {
    let event : Value = match req.body(){
        Body::Text(s) => from_str(s).unwrap(),
        _ => {return Err(StringError::new("Request body is mal-formed".to_string()));}
    };
    let _context = req.lambda_context();

    let region_provider = RegionProviderChain::default_provider().or_else("ap-northeast-2");
    let config = aws_config::from_env().region(region_provider).load().await;
    let client = Client::new(&config);

    let otp = parse_otp(event)?;
    let (name, phone) = get_user_info(otp).await?;
    let res = check_exist(&client, phone).await?;
    let resp = response_builder(res).unwrap();
    Ok(resp)
}

async fn check_exist<T>(client : &Client, phone : T) -> Result<bool, Box<StringError>>
where T : Into<String> + Clone{
    let resp = client.get_item().table_name("{Table Name}")
        .key("phone", AttributeValue::S(phone.into()))
        .send().await
        .map_err(|e| StringError::new(format!("DB function failed with error : {:?}", e)))?;
    let item = resp.item;
    match item{
        Some(_) => Ok(true),
        None => Ok(false),
    }
}

fn parse_otp(event : Value) -> Result<String, Box<StringError>>{
    let mut otp = event["value"]["origin"].to_string();
    // println!("Raw otp : {}", otp);
    otp.retain(|c| c != '\"');
    Ok(otp)
}

async fn get_user_info(mut otp : String) -> Result<(String, String), Box<StringError>>{
    otp.push_str("?rest_api_key={Rest API key}");
    let res = reqwest::Client::new().get(otp).send().await
        .map_err(|e| StringError::new(format!("OTP request failed with error : {:?}", e)))?
        .text().await
        .map_err(|e| StringError::new(format!("Parse response to text failed with error : {:?}", e)))?;
    let data : Value = from_str(&res).unwrap();
    let name = unescape(&data["nickname"].to_string()).unwrap();
    let phone = unescape(&data["phone_number"].to_string().replace("+82 ", "0")).unwrap();
    return Ok((name, phone));
}

fn response_builder(res : bool) -> http::Result<Response<&'static str>>{
    let builder = Response::builder().status(StatusCode::OK);
    let body = match res {
        true => {
            r#"{"version" : "2.0","status" : "SUCCESS"}"#
        },
        false => {
            r#"{"version" : "2.0","status" : "FAIL"}"#
        },
    };
    builder.body(body)
}