Building WhatTask: A WhatsApp Bot for Task Management
Managing tasks and reminders efficiently has become a necessity. WhatTask is a web application and WhatsApp bot that lets users create tasks and reminders by sending a message. This blog post explains the technical journey of developing WhatTask, exploring its architecture, the challenges faced, and the successes achieved.
The Concept
WhatTask is designed to make task management accessible and convenient. Users can simply send a message to the WhatTask bot on WhatsApp, for example "remind me to clean my car tomorrow". The bot processes this request to set a reminder. The core functionality revolves around the integration of the GPT-4 API, which interprets the user's message and returns structured data as JSON, which is then used to display reminders and tasks to the user.
Architectural Overview
The architecture of WhatTask is structured around three main components:
- The WhatsApp bot interface,
- The backend processing with the GPT-4 API,
- The task management interface built with Next.js and Firestore.
WhatsApp Bot Interface
The entry point for users is the WhatsApp bot. This interface uses the WhatsApp Business API (via Twilio) to both receive messages from users and send responses. When a user sends a message, the bot forwards it to the backend for processing.
// Simplified code for the Twilio API route (Next.js)
// Excludes error handling etc., for easy reading
// The full code is available on GitHub
// getResponse and saveDataToUserTasks are the project's own helpers (imports omitted)
const MessagingResponse = require("twilio").twiml.MessagingResponse;

export async function POST(request: Request) {
  const twiml = new MessagingResponse();
  // Read the raw form-encoded body that Twilio posts to the webhook
  const incomingMessageBody = await request.text();
  // Parse the incoming message body into key-value pairs
  const params = new URLSearchParams(incomingMessageBody);
  const messageBody = params.get("Body");
  const messageFrom = params.get("From");
  const phone = messageFrom as string;
  // Ask the model for structured task data, then save it for this user
  const openai = await getResponse(messageBody as string);
  const taskObject = await saveDataToUserTasks(phone, openai);
  const responseMessage = `Task ${taskObject} was created successfully`;
  // Reply to the user with a TwiML message
  twiml.message(responseMessage);
  return new Response(twiml.toString(), {
    headers: { "Content-Type": "text/xml" },
  });
}
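The parsing step of the route can be isolated into a small pure function, which makes it easy to test without Twilio. The sketch below is illustrative only: `parseTwilioWebhook` and `IncomingWhatsAppMessage` are hypothetical names, and stripping the `whatsapp:` prefix that Twilio puts on WhatsApp sender addresses is an assumption, not something the WhatTask code is known to do.

```typescript
// Hypothetical helper: these names are illustrative and not taken from the WhatTask repository.
interface IncomingWhatsAppMessage {
  phone: string; // sender number without the "whatsapp:" channel prefix
  body: string; // trimmed message text
}

function parseTwilioWebhook(rawBody: string): IncomingWhatsAppMessage | null {
  // Twilio posts webhooks as application/x-www-form-urlencoded key-value pairs
  const params = new URLSearchParams(rawBody);
  const from = params.get("From");
  const body = params.get("Body");
  // Reject requests that have no sender or no usable message text
  if (!from || !body || body.trim() === "") {
    return null;
  }
  return {
    // Assumption: the bare phone number is what gets stored for the user
    phone: from.replace(/^whatsapp:/, ""),
    body: body.trim(),
  };
}
```

Returning `null` for incomplete requests lets the route decide how to respond instead of casting possibly missing values with `as string`.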
Backend Processing with GPT-4
Upon receiving a task reminder, the backend uses the OpenAI GPT-4 API to interpret the user's message. The API can understand natural language, allowing it to parse a wide range of task descriptions. It returns a JSON array with structured information about the task, which can be shown on the frontend. Using an array allows the user to send multiple tasks in a single message. A key element of this work was engineering a suitable and reliable system prompt that precedes the user's message. In addition, each user message is followed by the current date and time, so that relative dates such as "tomorrow" can be resolved accurately.
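// Simplified code for the OpenAI helper
// Excludes error handling etc., for easy reading
// The full code is available on GitHub
// `openai` is an initialized OpenAI client (setup omitted)
export default async function getResponse(message: string) {
  try {
    // Create a thread containing the user's message and start a run on the assistant
    const run = await openai.beta.threads.createAndRun({
      assistant_id: process.env.ASST_ID as string,
      thread: {
        messages: [
          {
            role: "user",
            content:
              message +
              " - context: the current date/time is " +
              getCurrentDateTime(),
          },
        ],
      },
    });
    // Note: the run must finish before the assistant's reply appears in the thread
    // (waiting for completion is omitted here)
    const thread_id = run.thread_id;
    const threadMessages = await openai.beta.threads.messages.list(thread_id);
    // Messages are listed newest first, so index 0 is the latest message
    return threadMessages.data[0].content;
  } catch (error) {
    console.error("Failed to get a response from OpenAI", error);
    throw error;
  }
}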
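The post does not show how `getCurrentDateTime()` is implemented. As a minimal sketch, assuming an ISO 8601 UTC timestamp is an acceptable format, the helper and the message assembly could look like this. `buildUserContent` is a hypothetical name introduced only for illustration, and the optional `now` parameter exists so the output can be tested deterministically.

```typescript
// Assumed implementation: the real getCurrentDateTime() is not shown in the post.
function getCurrentDateTime(now: Date = new Date()): string {
  // ISO 8601 in UTC, the same format as the "dueDate" values in the task JSON
  return now.toISOString();
}

// Hypothetical helper that appends the date/time context to the user's message.
function buildUserContent(message: string, now: Date = new Date()): string {
  return `${message.trim()} - context: the current date/time is ${getCurrentDateTime(now)}`;
}
```

With this context appended, a message such as "remind me to clean my car tomorrow" gives the model a concrete reference point for "tomorrow".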
This is an example of what the JSON array returned by the API should look like:
[
{
"taskName": "Book flight to Bangkok",
"taskDescription": "Arrange for a flight booking to BKK next Monday.",
"priority": "High",
"category": "Personal",
"dueDate": "2024-03-05T16:31:00.000Z",
"tags": ["travel", "flight", "booking", "Bangkok"]
}
]
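Because the model's output is free text that is only expected to be JSON, it can help to validate it before saving anything. The following is a sketch under the assumption that every task has exactly the fields shown in the example above; `ParsedTask`, `isParsedTask`, and `parseTasks` are hypothetical names, not part of the WhatTask code.

```typescript
// Hypothetical type mirroring the example JSON above.
interface ParsedTask {
  taskName: string;
  taskDescription: string;
  priority: string;
  category: string;
  dueDate: string; // ISO 8601 timestamp
  tags: string[];
}

// Type guard: checks each field's type and that dueDate is a parseable date.
function isParsedTask(value: unknown): value is ParsedTask {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const dueDate = v["dueDate"];
  const tags = v["tags"];
  return (
    typeof v["taskName"] === "string" &&
    typeof v["taskDescription"] === "string" &&
    typeof v["priority"] === "string" &&
    typeof v["category"] === "string" &&
    typeof dueDate === "string" &&
    !Number.isNaN(Date.parse(dueDate)) &&
    Array.isArray(tags) &&
    tags.every((t) => typeof t === "string")
  );
}

// Parses the model output and keeps only the entries that pass validation.
function parseTasks(json: string): ParsedTask[] {
  const data: unknown = JSON.parse(json);
  if (!Array.isArray(data)) {
    throw new Error("Expected a JSON array of tasks");
  }
  return data.filter(isParsedTask);
}
```

Dropping invalid entries silently is one possible design choice; a stricter version could reject the whole message and ask the user to rephrase.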
Task Management with Next.js and Firestore
The JSON array returned by the GPT-4 API is parsed by another endpoint, which acts as the heart of the application's logic. The task information is then stored in Firestore. This allows users to efficiently manage tasks, including adding, editing, and removing them.
// Simplified code sample for saving data to Firestore
// The full code is available on GitHub
// data, user, taskId, taskName and db are defined in the surrounding code (omitted)
// for...of is used because forEach does not wait for async callbacks
for (const item of data) {
  const parsedItem = JSON.parse(item?.text?.value)[0];
  const taskData = {
    // Accept the key names from the example JSON as well as shorter variants
    name: parsedItem.taskName || parsedItem.name || parsedItem.title,
    description: parsedItem.taskDescription || parsedItem.description,
    priority: parsedItem.priority,
    category: parsedItem.category,
    tags: parsedItem.tags,
    dueDate: parsedItem.dueDate,
    createdAt: Timestamp.now(),
    updatedAt: Timestamp.now(),
    taskId: taskId,
    status: "open",
  };
  taskName = taskData.name;
  // Each task is stored under the owning user's "tasks" subcollection
  const taskRef = doc(db, `users/${user.uid}/tasks/${taskId}`);
  // Save the data
  await setDoc(taskRef, taskData);
}
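Since the model can return several tasks in one array, each task needs its own document path. The sketch below shows one way to build those paths without touching Firestore. It is an assumption about how this could be done, not the author's implementation: `TaskInput`, `TaskDoc`, and `buildTaskDocs` are hypothetical names, and the ID generator is passed in so the function stays pure.

```typescript
// Hypothetical shapes used only for this illustration.
interface TaskInput {
  taskName: string;
  dueDate?: string;
}

interface TaskDoc {
  path: string; // Firestore document path under the user's "tasks" subcollection
  data: { name: string; dueDate: string | null; taskId: string; status: "open" };
}

// Builds one document (path plus data) per task, each with its own ID.
function buildTaskDocs(uid: string, tasks: TaskInput[], makeId: () => string): TaskDoc[] {
  return tasks.map((task): TaskDoc => {
    const taskId = makeId();
    return {
      path: `users/${uid}/tasks/${taskId}`,
      // null is used for a missing due date because Firestore rejects undefined values by default
      data: { name: task.taskName, dueDate: task.dueDate ?? null, taskId, status: "open" },
    };
  });
}
```

Each resulting `path` could then be passed to `doc(db, path)` and saved with `setDoc`, as in the sample above.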
This is what is shown to the user on the frontend:
Challenges and Solutions
Natural Language Processing
One of the significant challenges was ensuring that the GPT-4 API accurately interpreted the wide variety of natural language inputs from users. I addressed this by thoroughly testing the system prompt with diverse task descriptions and refining the query parameters sent to the API, which improved the accuracy of task interpretation. A major issue to begin with was that the API would sometimes return "dueDate" in the JSON and other times return "due_date". I resolved this through trial and error with the system prompt, clarifying the exact object keys, and I also created a function that checks for both keys and returns the correct value regardless.
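The fallback function itself is not shown in this post. A minimal sketch of the idea, assuming the due date is always a string when present, might look like this (`getDueDate` is a hypothetical name):

```typescript
// Hypothetical fallback: accepts either key spelling returned by the model.
function getDueDate(task: Record<string, unknown>): string | undefined {
  // Prefer the camelCase key, then fall back to the snake_case variant
  const candidate = task["dueDate"] ?? task["due_date"];
  return typeof candidate === "string" ? candidate : undefined;
}
```

Keeping this check even after fixing the prompt means an occasional key variation from the model does not silently drop the due date.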
User Interface
Creating an intuitive and user-friendly interface for task management was crucial. I focused on a clean design and user experience on the Next.js frontend, allowing users to easily manage their tasks. Implementing features like task editing and deletion required careful consideration of the user flow to ensure a smooth experience, although some work remains before the interface is production-ready.
Successes
The development of WhatTask successfully integrated cutting-edge technology like the GPT-4 API into a real-world application, enabling natural language processing in task management. The use of Next.js and Firestore has resulted in a scalable, efficient application.
Conclusion
The technical development of WhatTask showcases the power of combining NLP technology with modern web technologies to solve real-world problems. Leveraging these technologies has resulted in a tool that simplifies task management in a novel and accessible way.