Open Access 2024 | Original Paper | Book Chapter

Drug Recommendation System for Healthcare Professionals’ Decision-Making Using Opinion Mining and Machine Learning

Authors: Pantea Keikhosrokiani, Katheeravan Balasubramaniam, Minna Isomursu

Published in: Digital Health and Wireless Solutions

Publisher: Springer Nature Switzerland

Abstract

Concerns have been raised regarding errors in drug prescription and medical diagnostics, which need to be carefully addressed. Both patient diagnosis and medication prescription are the responsibilities of healthcare providers. As the number of people with health issues rises, the burden on healthcare professionals increases. Medical errors may occur in the healthcare sector when healthcare professionals prescribe drugs based on inadequate information about patient history and drug side effects. Therefore, this study aims to propose a drug recommender system to assist healthcare providers in decision making when prescribing drugs for patients depending on their diagnoses. The sentiments of drug reviews are analyzed to determine drug effectiveness among users. Furthermore, the most suitable recommender algorithm for recommending drugs based on data from healthcare professionals is selected for this study. Opinion mining is applied to drug reviews, and a hybrid method is implemented to overcome the limitations of content-based and collaborative filtering methods, such as the cold start problem and increasing client preference. The system is developed and tested successfully. The proposed system can assist healthcare professionals in drug decision making and sustain the whole digital care pathway for various diseases.

1 Introduction

Various issues related to medical diagnostics and drug prescriptions have been raised, which require careful consideration. Healthcare professionals are responsible for diagnosing patients and prescribing drugs. There is a lack of healthcare providers as the number of people with health issues rises [1‐6]. When a patient is given the incorrect medication for their condition, a medical error may have occurred. In the worst circumstances, this can result in patient health risks or even death. There are around 99,000 deaths caused by medical errors in hospitals each year [7]. This type of error occurs when healthcare professionals prescribe drugs without checking patient history and drug side effects [8]. A healthcare professional's experience may be limited since they might be unaware of all of the numerous types of drugs on the market, which leads to medical errors. Consequently, a drug recommendation system would be crucial in order to address this problem.

Recommender systems are information systems that predict user preferences and give personalized and subjective product or service recommendations [9‐11]. These systems are used in many industries, particularly e-commerce, to provide customers with a personalized experience. The medical industry can also employ this kind of recommender system, particularly for the recommendation of drugs [8, 12]. A drug recommendation system may not be appropriate for patients, because they should not take drugs without consulting healthcare professionals, but it can be very useful for healthcare professionals because it may help them select the best drugs to prescribe to their patients. New drugs are developed continually, so selecting a proper drug for a patient is a major burden for healthcare professionals. This burden can be reduced by adopting a drug recommendation system. Techniques such as machine learning and opinion mining can increase the effectiveness and dependability of a drug recommendation system. Despite having similar chemical features and characteristics, many drugs react differently in different patients. As a result, selecting a drug to recommend to patients for a certain health issue is quite difficult for healthcare professionals. Most of the time, healthcare professionals purchase drugs based on a health representative’s recommendation. Therefore, they might not be aware of other drugs on the market that could be better than those offered by the representative.

Product reviews are crucial for any type of product. We may learn more about a product by analyzing its reviews, and the same applies to drugs. Patients who take a certain drug give feedback based on their personal experiences with it. Opinion mining can aid in the subsequent analysis of these reviews to obtain useful information. In short, opinion mining is a natural language processing technique that extracts opinions and sentiments from text. It is possible to conduct opinion mining on drug reviews to determine the reviews’ sentiments. Sentiment analysis will assist healthcare professionals in determining whether a review is positive, neutral, or negative [13]. Healthcare professionals can therefore gain a better grasp of overall drug effectiveness in various patients by applying sentiment analysis to drug reviews.

Last but not least, cold start issues and rising customer preference are the two key issues that need to be addressed when discussing recommendation systems [11, 12]. To solve the cold start issue and rising customer preference, a suitable algorithm will be needed. This is where machine learning can be helpful. In order for healthcare professionals to have a personalized experience, hybrid content-based and collaborative filtering methods can be used. These filtering methods will enable better drug recommendation for healthcare professionals for a particular disease.

Therefore, this study aims to propose a drug recommendation system for healthcare professionals using opinion mining and machine learning. Opinion mining is used to extract side effects from drug reviews and identify drug review sentiments. Machine learning is utilized for recommendation purposes, with a hybrid content-based and collaborative filtering method. The secondary objectives of this paper are:

To classify drug reviews by sentiment using sentiment analysis, to help healthcare professionals better understand drug effectiveness.
To develop a web-based drug recommendation system for healthcare professionals using hybrid filtering technique.
To facilitate active involvement and knowledge sharing among healthcare professionals.

2 Methodology

2.1 System Architecture

This study aims to develop a web-based drug recommendation system for healthcare professionals using opinion mining and machine learning. The proposed system architecture is depicted in Fig. 1. A three-layer architecture design is implemented for this system. The two main users of this system are healthcare professionals and admins. For the frontend of the web application, React JS, a widely used JavaScript library, was used together with the Material UI (MUI) library to develop the user interface. React’s component-based architecture, virtual DOM, and support for responsive design make it well suited to frontend development. The Material UI library offers prebuilt components adhering to material design guidelines, simplifying UI development by eliminating the need to create components from scratch. React and Material UI integrate well, allowing components to be customized to meet specific UI requirements. TypeScript, a superset of JavaScript, was employed as the programming language for frontend development.

To fulfil the web application’s API functionalities and incorporate machine learning models, FastAPI was selected. It serves as the backend implementation for all the system’s functionality. Furthermore, the database used for this project is MongoDB, which is a NoSQL database. MongoDB stores data in BSON format, which is compatible with the structure of the drugs, users, and forums data in the system. FastAPI acts as an intermediary between the frontend, the machine learning models, and the database. When users request access to data, the frontend makes API requests to FastAPI. Similarly, to use the machine learning models, API requests are made to FastAPI, passing the necessary parameters. The backend then retrieves the model results and returns them to the frontend for display to the user.

There are four main modules in the proposed system consisting of: (1) user management module, (2) drug review analytics module, (3) recommendation module, and (4) drugs management module. The target customers for this proposed system are healthcare professionals in hospitals. They can use this system to find the best drugs for a certain diagnosis and recommend them to their patients. The admins manage the overall system modules. All healthcare professionals and admins who use this system must be registered. The user management module is important in collecting user behavior data for the drug recommendation module.

For the drug review analytics module, patient drug reviews are classified as positive, neutral, or negative based on sentence polarity. Sentiment analysis is used to identify key features from reviews and then use them to assess the polarity of the review. Feature extraction and feature orientation identification are used in the sentiment analysis process. Depending on the orientation of the feature, the polarity of a review is classified as positive or negative. The features from the reviews are extracted using Term Frequency–Inverse Document Frequency (TF-IDF) and BioBERT. Several algorithms are employed to determine the reviews’ sentiment, including perceptron, logistic regression, and long short-term memory networks, each combined with the BioBERT and TF-IDF feature extraction algorithms. Following the No Free Lunch (NFL) theorem, which implies that no single model is best for every problem, all of these models are evaluated using appropriate metrics and the best-performing model is selected based on its metric scores. Following the sentiment analysis, each drug receives an effectiveness score. This effectiveness score is used to rank drugs in the system when healthcare professionals search for drugs. Drugs with higher effectiveness scores are ranked higher. Sentiment analysis may not be relevant for patients because they do not decide which type of drug they need to take. But it is crucial for healthcare professionals to gain a better understanding of drugs and their effectiveness in order to select the best drug to prescribe to a patient for a given diagnosis.
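The text does not state how the effectiveness score is computed from the review sentiments. As a minimal sketch only, the example below assumes the score is the mean sentiment label (1, 0, −1) rescaled to the range [0, 1], and ranks drugs by it. The function names and the toy drug identifiers are hypothetical.

```python
from statistics import mean

def effectiveness_score(labels):
    """Map sentiment labels (1, 0, -1) to a score in [0, 1].

    Assumption: the score is the mean label rescaled from [-1, 1] to [0, 1];
    the source does not state the actual formula.
    """
    if not labels:
        return 0.5  # no reviews: treat the drug as neutral (assumption)
    return (mean(labels) + 1) / 2

def rank_drugs(reviews_by_drug):
    """Return drug names sorted by descending effectiveness score."""
    scores = {drug: effectiveness_score(labels)
              for drug, labels in reviews_by_drug.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy sentiment labels per drug (hypothetical data).
example = {
    "drug_a": [1, 1, 0, -1],
    "drug_b": [1, 1, 1, 0],
    "drug_c": [-1, -1, 0],
}
ranking = rank_drugs(example)
```

Any monotone aggregation of the labels would produce a usable ranking; the rescaled mean is chosen here only because it is easy to interpret.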

The next module is the drug recommendation module. Cold start issues and overspecialization issues might arise in a recommender system [12]. To address these issues in the proposed system, a hybrid content-based and collaborative filtering recommender system is implemented. The cold start problem is solved by using content-based filtering techniques to recommend drugs to healthcare professionals based on how similar the drugs are to one another. This filtering approach concentrates on the characteristics of each drug, captured in an item profile, and looks for similarities to drugs that healthcare professionals previously liked. Healthcare professionals receive recommendations regardless of their user profiles. The term frequency-inverse document frequency (TF-IDF) algorithm is first used to weight the features from the dataset, and cosine similarity tables are then generated step by step. Drug side effects are one of the features considered for content-based filtering. As a result, the lack of drug reviews and ratings is no longer an issue. Furthermore, collaborative filtering is used to tackle the overspecialization problem by broadening healthcare professionals’ preferences. Drug recommendations are made by looking at the preferences of a group of users who are similar to the target user. By grouping healthcare professionals who access or buy similar products in the system, the system converts the behaviors of healthcare professionals into implicit rating weightage. After that, a collaborative rating is used to determine the user’s rating. The ratings of healthcare professionals are then matched to those of the target user in order to identify those who share the same preferences. This approach might not be very effective at first, but once many healthcare professionals start using the system, it becomes a very reliable recommendation method.
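The text does not specify how the content-based and collaborative outputs are combined. One common option, shown here purely as an assumed illustration, is a weighted blend whose weight shifts from content-based similarity toward the collaborative prediction as a user accumulates interactions. The function name and the parameter `k` are hypothetical.

```python
def hybrid_score(content_sim, collab_rating, n_interactions, k=10):
    """Blend a content-based similarity and a collaborative prediction.

    alpha grows from 0 (cold start, content only) toward 1 as the user
    accumulates interactions; k controls how fast (hypothetical parameter).
    Both inputs are assumed to be scaled to [0, 1].
    """
    alpha = n_interactions / (n_interactions + k)
    return (1 - alpha) * content_sim + alpha * collab_rating

# A new user relies entirely on content similarity ...
cold = hybrid_score(0.8, 0.2, n_interactions=0)
# ... while an active user is dominated by the collaborative prediction.
warm = hybrid_score(0.8, 0.2, n_interactions=90)
```

With no interactions the blend reduces to the content-based score, which mirrors how content-based filtering is described above as the remedy for the cold start problem.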

The final module is the drug management module. Users can search for any drugs in the system related to a particular disease. Once a user views a particular drug, the drug’s details are displayed to them. Furthermore, users have additional features such as drug comparison, a wish list, and a forum page. The drug comparison feature allows healthcare professionals to compare two drugs and helps them choose the better one. The forum page, on the other hand, facilitates communication between healthcare professionals. This feature allows healthcare professionals to share knowledge about drugs and prescriptions and to expand their network.

2.2 Development Methodology

Agile methodology [14, 15] is used as the development methodology in this study since it provides a stable system delivery in a short development time. Figure 2 depicts the agile development methodology diagram used in this study. Agile development involves six main phases: planning, design, development, testing, maintenance, and deployment. In agile, the system development tasks are broken down into smaller parts, and each part is handled over a number of iterations. Each iteration involves all six phases of agile development, and several iterations are required to complete an entire system. If there are any errors, debugging is done during development. For every iteration, a set of system requirements is listed and followed. Therefore, requirement changes can be made easily before a new iteration starts. This allows the agile development methodology to adapt to changes quickly and integrate them into the system without extensive rework. The system cannot be released after only a few iterations, as it might not yet have enough functionality. After all the modules are developed completely, they are integrated into one module. This module is then tested to make sure it is ready for final release. Implementing the agile methodology helps to minimize development process risks and focuses on getting products to market fast.

2.3 Overall System Flowchart

Figure 3 illustrates the overall system flowchart for this study. The main users, who are healthcare professionals, must log in before using the system, and new users must first create an account. After logging in, users can search for drugs by name or condition. All drugs relevant to the search are displayed in the search results, ranked by effectiveness score so that drugs with higher scores appear first. The user can then click on any drug in the search results to view it. When a user clicks on a specific drug, details about the drug, sentiment analysis results, and recommended drugs are shown. After performing sentiment analysis on drug reviews, the system displays the results, including general statistics for the user’s preferences.

Moreover, based on the content-based filtering model, the system will identify drugs that are similar to the drug being viewed or displayed to users. On the search page, another recommendation is available for users, and they can navigate to the drug comparison page. On this page, users can select any two drugs available in the system and view their comparison results. Furthermore, users can navigate to the wish list page. There is also a forum page where users can create, edit, and delete posts, view forums created by other users, and comment on them. The main purpose of this feature is to facilitate communication between healthcare professionals. The final page is the add drugs data page, where more drug data can be added to the system or existing drug data deleted. This page and feature are available only to admins. Eventually, users can either search for more drugs or leave the system.

2.4 Sentiment Analysis

This study used sentiment analysis to classify drug reviews as positive, neutral, or negative based on their sentiment. A machine learning-based approach is utilized to implement sentiment analysis on drug reviews, as shown in Fig. 4. This is due to the fact that machine learning-based approaches generally outperform lexicon-based approaches and have higher accuracy scores [16‐20]. The dataset being utilized for sentiment analysis is labeled, making it appropriate for this method. Positive reviews are labeled as 1, neutral reviews as 0, and negative reviews as −1. This dataset is divided into 80% training data and 20% testing data. Data pre-processing includes two steps which are data cleaning and feature extraction. Data cleaning involves removing duplicate data and unnecessary contents like repetitive words, symbols and stop words. Then, text tokenization is performed on the cleaned data where reviews are broken down to a set of words and each word undergoes lemmatization.
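The cleaning steps described above can be sketched in plain Python. This is an illustrative sketch rather than the authors' pipeline: the stop-word list is a tiny hypothetical sample, tokenization is a simple whitespace split, and lemmatization is omitted because it normally relies on an external NLP library.

```python
import re

# Small illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "it", "and", "or", "of", "to",
              "i", "my", "for", "this"}

def clean_review(text):
    """Lowercase, strip symbols, drop stop words and immediate repeats."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # remove digits and symbols
    tokens = text.split()                   # whitespace tokenization
    cleaned = []
    for tok in tokens:
        if tok in STOP_WORDS:
            continue
        if cleaned and cleaned[-1] == tok:  # collapse repeated words
            continue
        cleaned.append(tok)
    return cleaned

def deduplicate(reviews):
    """Remove exact duplicate reviews while preserving order."""
    return list(dict.fromkeys(reviews))

reviews = deduplicate([
    "This drug is very very good!!! It helped my migraine.",
    "This drug is very very good!!! It helped my migraine.",
    "Made me dizzy.",
])
tokens = clean_review(reviews[0])
```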

Following the data cleaning step, the TF-IDF and BioBERT algorithms are used for feature extraction. Feature extraction is performed to select the most important features from reviews. These features are then used to predict the polarity of drug reviews. The classification process is done in the next step. For classification, supervised machine learning models, namely Perceptron, Logistic Regression, and Long Short-Term Memory Network, are utilized. The models are then evaluated using precision, F1-score, and accuracy metrics. Finally, these three algorithms are compared, and the best algorithm is selected to implement sentiment analysis on drug reviews.

2.5 Recommendation Algorithm

Recommendation systems can be built using different types of algorithms, such as association rules, content-based filtering, collaborative filtering, and knowledge-based filtering. This study utilized a hybrid content-based and collaborative algorithm [12] to build a drug recommendation system. Figure 5 depicts the proposed hybrid filtering algorithm framework for this study, which is adopted from [12]. This study aims to resolve some recommendation system issues, such as the cold start problem, increasing customer preference, the large number of drugs on the market, and overspecialization [12]. Identifying problems in a system is the first step of the development process. Then, data collection and pre-processing are done. The datasets used in this study are drug reviews data, drug information data, and healthcare professionals’ data, collected from the AskaPatient database, the UCI ML Drug Review dataset, Drugs.com, and DrugBank. After data pre-processing, the proposed recommendation model, which combines content-based filtering and collaborative filtering, is applied.

For content-based filtering, drugs are recommended to healthcare professionals based on drug similarities without taking the user profile into account. The TF-IDF algorithm is used to extract features from data and assign weights to them. This algorithm determines the significance of each word from its frequency. The weights are then used to create a step-by-step cosine similarity table. There are a few steps involved in creating a cosine similarity table. First, TF scores are computed, and the table is normalized. Then, IDF is calculated to find the number of items for each user. Finally, the importance of items is ranked for each user by multiplying TF and IDF scores. Recommendations can be made by picking the top N similar products, where N stands for the number of recommendations for a user. Content-based algorithms are chosen because they help to solve cold start problems. Therefore, drugs can be recommended without obtaining any user data.
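The TF and IDF steps above can be illustrated with a small plain-Python sketch. It uses a length-normalized term frequency and a smoothed IDF, which is one common variant and is assumed here; the source does not state which formulation was used. The function name and the toy token lists are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF so terms present in every document keep a small weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1
        # Normalized term frequency multiplied by IDF.
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors

# Toy item profiles (hypothetical tokens, not taken from the datasets).
docs = [
    ["triptan", "migraine", "nausea"],
    ["triptan", "migraine", "dizziness"],
    ["antibiotic", "infection", "nausea"],
]
vecs = tfidf_vectors(docs)
```

In the second profile, "dizziness" appears in only one document and so receives a higher weight than "triptan", which appears in two. This is the behavior that lets rarer, more distinctive attributes dominate the similarity computation.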

For collaborative filtering, drugs are recommended to healthcare professionals based on user behavior. First, user behavior such as viewing a drug or adding a drug to a wish list is converted into implicit rating weightage. This weightage is then used to generate user ratings using collaborative filtering. When user behavior is converted to ratings, the resulting matrices are sparse because much of the data is missing. To predict the missing values, matrix factorization is utilized. Matrix factorization helps to recommend less popular products to users using the newly calculated ratings. Collaborative filtering was selected because it helps to solve the overspecialization and increasing customer preference problems. Therefore, healthcare professionals are able to gain more knowledge of the different types of drugs available for different diseases, and at the same time less popular products are recommended to them. After the proposed hybrid model is developed, it is evaluated using error metrics such as mean absolute error and root-mean-square error, and ranking metrics such as precision and recall. Finally, the overall recommendation system is developed and tested before integrating it into the web application.
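As an illustration of how matrix factorization fills in a sparse rating matrix, the sketch below fits user and item factors by stochastic gradient descent and reports MAE and RMSE on the observed entries. The factorization variant, hyperparameters (`k`, `lr`, `reg`, `epochs`), and the toy ratings are all assumptions; the source does not specify them.

```python
import math
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def rmse(P, Q, ratings):
    se = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))

def mae(P, Q, ratings):
    return sum(abs(r - predict(P, Q, u, i)) for u, i, r in ratings) / len(ratings)

def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=1000, seed=0):
    """Fit factors to observed (user, item, rating) triples by SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization on both factors.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Observed implicit ratings as (user index, drug index, weight); toy data.
observed = [(0, 0, 3.0), (0, 1, 1.0), (1, 0, 3.0), (1, 2, 2.0),
            (2, 1, 1.0), (2, 2, 2.0), (3, 0, 1.0), (3, 3, 3.0)]

P0, Q0 = factorize(observed, 4, 4, epochs=0)   # untrained factors
P, Q = factorize(observed, 4, 4)               # trained factors
missing_estimate = predict(P, Q, 0, 2)         # user 0 never rated drug 2
```

The prediction for an unobserved user-drug pair, such as `missing_estimate`, is what allows drugs a user has never interacted with to be ranked and recommended.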

2.6 Description of Data Source

In this study, two sets of data are required: drug reviews data for developing sentiment analysis models and drug details data for the system. Drug review data are obtained mainly from two sources, the AskaPatient database and the UCI ML Drug Review dataset. The UCI ML Drug Review dataset has a total of 232,000 reviews of different drugs for different conditions. This dataset has attributes such as drug name, health condition, drug review, date of review, drug rating, and useful count. The dataset from AskaPatient contains drug reviews as well, but its attributes are rating, reason, side effects, comments, sex, age, dosage, and date. This dataset is large, containing thousands of drug reviews for many types of drugs on the market. Neither dataset has positive or negative labels for the reviews. Therefore, the drug rating is used to label each review as positive, neutral, or negative for training the model. Positive reviews are labelled as 1, neutral reviews as 0, and negative reviews as −1.

Furthermore, another important dataset for this study is the drug details data from Drugs.com and DrugBank Online. Both datasets include all the necessary information about drugs, which is extracted and shown to healthcare professionals for their reference. Data from both of these sources are combined into a single drug dataset to be used for the Dr. Drugs website. Examples of the data that can be retrieved from these datasets are generic names, dosing information, drug characteristics, associated conditions, and adverse effects.

3 Results and Analysis

3.1 Sentiment Analysis Model

The sentiment analysis model is developed to classify drug reviews of patients into positive, neutral, and negative classes. There are several steps involved in developing this model. The first step is data retrieval from the selected databases. For the sentiment analysis model, only the review and rating attributes were used. The preprocessing tasks performed on the reviews are stop word removal, special character removal, white space removal, conversion to lower case, and stemming. A machine learning approach is employed to create this model, which requires a labeled dataset. The existing dataset had no sentiment labels, but it did contain a rating for each review. The ratings were transformed into labels as follows: ratings higher than 7 were classified as positive and given a label of 1, ratings lower than 4 were classified as negative and assigned a label of −1, and ratings from 4 to 7 were considered neutral and labeled as 0. Table 1 shows the ratings with their respective labels.

Table 1.

Drug Ratings and Labels

Rating	Label
8–10	Positive
4–7	Neutral
1–3	Negative
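The mapping in Table 1 amounts to two threshold checks. A minimal sketch follows; the function name is hypothetical, while the thresholds and numeric labels are those stated in the text.

```python
def rating_to_label(rating):
    """Convert a 1-10 drug rating into a sentiment label.

    Ratings above 7 are positive (1), ratings below 4 are negative (-1),
    and ratings from 4 to 7 are neutral (0), following Table 1.
    """
    if rating > 7:
        return 1
    if rating < 4:
        return -1
    return 0

labels = [rating_to_label(r) for r in (1, 3, 4, 7, 8, 10)]
```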

To handle the large dataset, it was downsized to 60,000 rows. Of these, 20,000 rows were labelled as Positive, 20,000 as Neutral, and the remaining 20,000 as Negative. This selection ensured a balanced representation of the sentiment categories. Only this reduced dataset was used for the subsequent modelling tasks. The dataset was then split into two parts: 80% for training and 20% for testing. This division allowed the model to be trained on the majority of the data while preserving a separate subset for evaluating its performance.

After the data were split, feature extraction was applied to both the training and testing datasets. Two distinct algorithms were employed for this purpose: TF-IDF and BioBERT. BioBERT is an adapted version of the BERT model designed specifically for biomedical text analysis. It is pre-trained on a vast collection of biomedical literature and clinical text to acquire a deep understanding of word and sentence contexts. By learning from this extensive corpus, BioBERT gains the ability to generate contextualized representations of biomedical terms and phrases.

Once the dataset has been transformed into vectors, the next step involves training the sentiment analysis models. For this task, three distinct algorithms were employed: Logistic Regression, Perceptron, and LSTM (Long Short-Term Memory). A total of six models were developed for sentiment analysis by combining the two feature extraction algorithms with the three training algorithms. This approach allowed the different feature extraction techniques and training algorithms to be compared and evaluated for sentiment analysis. Grid search was applied to determine the optimal parameters for each model across all the training algorithms. The resulting models were then evaluated, and their accuracy, macro-average F1-score, and macro-average precision scores were recorded. Table 2 summarizes and compares the performance metrics of the trained models.
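For reference, macro-averaged precision and F1 are computed per class and then averaged with equal weight, so each of the three sentiment classes counts the same regardless of its size. A plain-Python sketch with hypothetical function and variable names:

```python
def macro_scores(y_true, y_pred, labels=(-1, 0, 1)):
    """Return (accuracy, macro F1, macro precision) for the given labels."""
    precisions, f1s = [], []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        f1s.append(f1)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    return accuracy, sum(f1s) / len(f1s), sum(precisions) / len(precisions)

# Toy labels and predictions, for illustration only.
y_true = [1, 1, 0, 0, -1, -1]
y_pred = [1, 0, 0, 0, -1, 1]
accuracy, macro_f1, macro_precision = macro_scores(y_true, y_pred)
```

Because the dataset here is balanced at 20,000 reviews per class, macro and class-weighted averages would be close; on an imbalanced dataset they can differ substantially.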

Table 2.

Performance Comparison for Different Sentiment Analysis Algorithms

Algorithm	BioBERT Accuracy	BioBERT F1-Score	BioBERT Precision	TF-IDF Accuracy	TF-IDF F1-Score	TF-IDF Precision
Logistic Regression	0.5719	0.5701	0.5708	0.7441	0.7449	0.7444
Perceptron	0.5452	0.5499	0.5252	0.7274	0.7261	0.7264
LSTM	0.5651	0.5659	0.5677	0.7319	0.7310	0.7314

Based on the results shown in Table 2, the models using the TF-IDF vectorizer for feature extraction outperformed the BioBERT-based models on every metric. Among the TF-IDF models, Logistic Regression demonstrated the highest performance compared to Perceptron and LSTM. It achieved an accuracy of 0.7441, an F1-score of 0.7449, and a precision of 0.7444. Based on these findings, TF-IDF is chosen for feature extraction and Logistic Regression is selected as the training algorithm for the sentiment analysis model.
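The selection step can be reproduced directly from the values in Table 2. The sketch below picks the combination with the highest F1-score; using F1 as the deciding metric is an assumption, although Logistic Regression with TF-IDF also leads on accuracy and precision.

```python
# Scores copied from Table 2, keyed by (feature extractor, classifier).
results = {
    ("BioBERT", "Logistic Regression"): {"accuracy": 0.5719, "f1": 0.5701, "precision": 0.5708},
    ("BioBERT", "Perceptron"): {"accuracy": 0.5452, "f1": 0.5499, "precision": 0.5252},
    ("BioBERT", "LSTM"): {"accuracy": 0.5651, "f1": 0.5659, "precision": 0.5677},
    ("TF-IDF", "Logistic Regression"): {"accuracy": 0.7441, "f1": 0.7449, "precision": 0.7444},
    ("TF-IDF", "Perceptron"): {"accuracy": 0.7274, "f1": 0.7261, "precision": 0.7264},
    ("TF-IDF", "LSTM"): {"accuracy": 0.7319, "f1": 0.7310, "precision": 0.7314},
}

# Select the combination with the highest F1-score (assumed criterion).
best = max(results, key=lambda combo: results[combo]["f1"])
```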

The best trained sentiment analysis model and vectorizer are saved as .pkl files and incorporated into the backend, which uses FastAPI. This model serves two purposes within the system. Firstly, when an administrator adds a new drug to the system, all the accompanying reviews uploaded for that drug are automatically classified according to their sentiment using the model. This allows reviews to be categorized as positive, neutral, or negative. Secondly, healthcare professionals are given the option to contribute new reviews for any specific drug, which are also subjected to sentiment classification using the trained model. Before prediction, each review undergoes pre-processing similar to the data cleaning process applied to the dataset. In this way, the system ensures that all newly added reviews, whether uploaded by administrators or healthcare professionals, undergo sentiment analysis. This enables the assignment of a sentiment rating to each drug. A sample interface for the sentiment analysis is illustrated in Fig. 6.

3.2 Content-Based Filtering Model

To provide drug recommendations based on similarity, the system incorporates a content-based filtering model. This model analyses the characteristics of drugs to identify similarities and make recommendations. When a user views a specific drug, the model suggests other drugs that are similar to the one being viewed. The development of this model involves several steps, starting with the acquisition of a dataset containing drug information and details. The drug data used for this model include the following attributes: Generic Name, Brand Name, Common Side Effects, Description, Dosage, Food Interaction, Indications, Ingredients, Manufacturer, Price, Rating, and Things to Avoid. All these attributes are used to determine the similarity between drugs. After being read, the data undergo several text pre-processing steps. Since all data are in string format, these steps include conversion to lower case, removal of stop words, removal of special characters, and lemmatization. After pre-processing, the individual attributes of each drug are joined together to create a corpus.

Each drug in the database has its own corpus consisting of its attributes. The TF-IDF algorithm is then used to perform feature extraction on these corpora, converting each of them to a vector. To find similar drugs for a given drug, the attributes of that drug undergo text preprocessing and are then combined to create a corpus as well. This corpus is subsequently vectorized using the TF-IDF algorithm, transforming it into a numerical representation. After completing these steps, the similarity between drugs is determined using cosine similarity. Both the vectors of the drugs in the database and the vector of the drug being viewed are passed to the cosine similarity computation. The corpus of the drug being viewed is compared with the corpora of all other drugs in the database, and a similarity score is computed for each of them. The drugs are then ranked by similarity score using the argsort() function, and the indices of the top 5 drugs are retrieved. Using these indices, the corresponding drugs are retrieved from the database. Essentially, this approach constitutes a content-based recommender system, where the drug being viewed by a user is compared with the rest of the drugs in the database. The top five most similar drugs are recommended to the user based on this comparison.
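The ranking step can be sketched as follows, assuming each drug's TF-IDF vector is available as a sparse `{term: weight}` dictionary. Plain Python `sorted` stands in for the `argsort()` call mentioned in the text, and the drug identifiers and weights are toy values, not taken from the system's database.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_similar(query_name, vectors, n=5):
    """Return the n drugs most similar to the viewed drug, best first."""
    query = vectors[query_name]
    scores = {name: cosine(query, vec)
              for name, vec in vectors.items() if name != query_name}
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Toy TF-IDF vectors (hypothetical weights).
vectors = {
    "drug_a": {"migraine": 1.0, "triptan": 1.0},
    "drug_b": {"migraine": 1.0, "triptan": 0.8},
    "drug_c": {"migraine": 1.0, "nausea": 1.0},
    "drug_d": {"infection": 1.0, "antibiotic": 1.0},
}
recommended = top_similar("drug_a", vectors, n=2)
```

Note that the top-n list is always filled when enough drugs exist, even if the remaining candidates share few terms with the viewed drug; this is consistent with the behavior described for conditions that have only a few drugs in the database.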

When the model is integrated into the system, TF-IDF vectors are generated for all the drugs beforehand. The resulting vectors are stored in a .pkl file for easy retrieval. When a user interacts with the system’s user interface and selects a specific drug, the TF-IDF vectors of all drugs in the database are loaded from the .pkl file. These vectors, along with the vector representation of the drug being viewed, are then passed to the cosine similarity computation. Finally, the recommended drugs are retrieved from the model and displayed to the user. Whenever a new drug is added to the database, the TF-IDF vectors associated with the drug data are regenerated, and the existing .pkl file is updated to include the new vectors. Figures 7 and 8 depict the user interface of the content-based filtering model.
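The caching pattern described above can be sketched with Python's standard `pickle` module. The function names, the file name, and the placeholder vectorizer (simple term counts) are hypothetical; in the described system the vectorizer would be the TF-IDF step.

```python
import os
import pickle
import tempfile
from collections import Counter

def refresh_cache(path, corpora, vectorize):
    """Recompute vectors for every drug and overwrite the cache file."""
    vectors = {name: vectorize(tokens) for name, tokens in corpora.items()}
    with open(path, "wb") as fh:
        pickle.dump(vectors, fh)
    return vectors

def load_vectors(path):
    """Load the cached vectors; only unpickle files the system wrote itself."""
    with open(path, "rb") as fh:
        return pickle.load(fh)

# Placeholder vectorizer: raw term counts instead of TF-IDF weights.
count_terms = lambda tokens: dict(Counter(tokens))

cache_path = os.path.join(tempfile.mkdtemp(), "drug_vectors.pkl")
corpora = {"drug_a": ["migraine", "triptan"]}
refresh_cache(cache_path, corpora, count_terms)

# Adding a drug triggers regeneration and an updated cache file.
corpora["drug_b"] = ["infection", "antibiotic"]
refresh_cache(cache_path, corpora, count_terms)
loaded = load_vectors(cache_path)
```

Regenerating every vector when a drug is added is a deliberate choice in this sketch: with TF-IDF, a new document changes the document frequencies, so previously computed weights are no longer exact.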

As shown in Fig. 7, the focus is on the drug ‘Eletriptan’, commonly used for migraines. The top three recommended drugs are ‘Rizatriptan,’ ‘Sumatriptan,’ and ‘Venlafaxine,’ which are also used for migraine treatment. The remaining two drugs are used for different purposes but are the most similar options available to ‘Eletriptan’ within the limited database of only four drugs for migraine treatment. Another example is demonstrated in Fig. 8, where the drug under consideration is ‘Cephalexin,’ utilized for treating ‘bladder infection.’ The top two recommended drugs are ‘Nitrofurantoin’ and ‘Ciprofloxacin,’ which are also commonly used for ‘bladder infection.’ As there are only three drugs in the database for ‘bladder infection,’ the remaining displayed drugs represent the closest alternatives compared to the others in the database. This successful outcome illustrates the effective functionality of the content-based filtering model.

3.3 Collaborative Filtering Model

The collaborative filtering model was developed to recommend drugs to users based on their similarity to other users. Data for this model is collected directly from the system, capturing user behavior such as drug views, reviews, and additions to the wish list. These interactions, which can be treated as user ratings, are stored in the database and used to construct the collaborative filtering model. The initial steps involve loading the user behavior data and drug data, followed by preprocessing and encoding. To ensure that a comprehensive set of drugs is considered for recommendations, the unique drugs from both datasets are combined. A Label Encoder is then used to assign a numeric ID to each drug, which makes the categorical drug names efficient to process. The core element of the collaborative filtering approach is the user-item matrix. An empty matrix is created to store user ratings for drugs. User interactions with drugs are assigned different weights: viewing a drug is given a weight of 1.0, adding a drug to the wish list a weight of 2.0, and reviewing a drug a weight of 3.0. These weights reflect the importance placed on each type of user behavior. Using the drug encodings and the assigned weights, the user ratings are populated in the user-item matrix. Some drugs may have missing ratings, meaning that no user has rated them. Handling these missing ratings is crucial for addressing the sparsity of the user-item matrix. If a drug has not been rated by any user, its rating is replaced with the average rating of all drugs. This step allows the model to provide reasonable recommendations for unrated drugs based on collective user behavior.
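The construction of the user-item matrix can be illustrated as follows. The weights are those given above; everything else is an assumption made for this sketch: the interaction log and catalogue are toy data, users are encoded with a second LabelEncoder, repeated interactions keep the largest weight, and "the average rating of all drugs" is read as the mean of the observed ratings.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Interaction weights as stated in the chapter.
WEIGHTS = {"view": 1.0, "wishlist": 2.0, "review": 3.0}

# Hypothetical interaction log: (user_id, drug_name, interaction_type).
interactions = [
    ("u1", "Eletriptan", "view"),
    ("u1", "Sumatriptan", "review"),
    ("u2", "Eletriptan", "wishlist"),
    ("u3", "Cephalexin", "view"),
]
# Hypothetical drug catalogue; Ciprofloxacin has no interactions at all.
catalogue = ["Eletriptan", "Sumatriptan", "Cephalexin", "Ciprofloxacin"]

# Combine the unique drugs from both sources and assign numeric IDs.
all_drugs = sorted(set(catalogue) | {drug for _, drug, _ in interactions})
drug_encoder = LabelEncoder().fit(all_drugs)
user_encoder = LabelEncoder().fit([user for user, _, _ in interactions])

# Start from an empty matrix and populate it with the weighted interactions.
matrix = np.zeros((len(user_encoder.classes_), len(drug_encoder.classes_)))
for user, drug, kind in interactions:
    row = user_encoder.transform([user])[0]
    col = drug_encoder.transform([drug])[0]
    # Assumption: keep the strongest interaction per user-drug pair.
    matrix[row, col] = max(matrix[row, col], WEIGHTS[kind])

# Drugs nobody has rated receive the mean of all observed ratings.
observed_mean = matrix[matrix > 0].mean()
unrated_columns = np.where(matrix.sum(axis=0) == 0)[0]
matrix[:, unrated_columns] = observed_mean
```

Other readings of the imputation step are possible, for example averaging per user instead of globally; the text does not settle this.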

Once the user-item matrix is prepared, it is transformed using Singular Value Decomposition (SVD), a dimensionality reduction technique. The TruncatedSVD class from scikit-learn is used to perform SVD on the user-item matrix, yielding a lower-dimensional representation of each user. Next, the similarity between users is calculated: cosine similarity is applied to the transformed representations of different users, and the resulting similarity matrix captures how alike users' drug-rating behaviors are. In the recommendation process, the similarity scores between the user in need of recommendations and the other users in the system are used to identify the users most similar to the target user. Drug ratings from these similar users are collected and form the basis for the recommendations, which consist of drugs that similar users have rated but the target user has not yet rated. Finally, the inverse_transform function of the encoder is applied to retrieve the recommended drug names from their IDs. The code snippet for the recommendation process is shown below.
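The chapter's own snippet is not reproduced in this text. As an independent illustration of the steps just described, the following sketch applies TruncatedSVD and cosine similarity to a small, invented user-item matrix and decodes the result with LabelEncoder.inverse_transform. The matrix values, the number of components, the use of a single nearest user, and the helper name recommend_for are assumptions.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import LabelEncoder

drug_encoder = LabelEncoder().fit(
    ["Cephalexin", "Ciprofloxacin", "Eletriptan", "Rizatriptan", "Sumatriptan"]
)

# Hypothetical user-item matrix (rows: users, columns: encoded drug IDs).
user_item = np.array([
    [0.0, 0.0, 3.0, 2.0, 0.0],  # user 0: migraine drugs
    [0.0, 0.0, 3.0, 2.0, 1.0],  # user 1: migraine drugs, including Sumatriptan
    [3.0, 2.0, 0.0, 0.0, 0.0],  # user 2: antibiotics
])

# Reduce each user's rating row to a low-dimensional representation.
svd = TruncatedSVD(n_components=2, random_state=0)
user_factors = svd.fit_transform(user_item)

# Pairwise cosine similarity between the reduced user representations.
user_similarity = cosine_similarity(user_factors)


def recommend_for(user_index, top_n=3):
    """Recommend drugs rated by the most similar user but not by the target."""
    scores = user_similarity[user_index].copy()
    scores[user_index] = -np.inf  # a user is not their own neighbour
    nearest = int(np.argmax(scores))
    unseen = (user_item[user_index] == 0) & (user_item[nearest] > 0)
    candidates = np.where(unseen)[0]
    order = np.argsort(user_item[nearest, candidates])[::-1]
    # Map numeric drug IDs back to drug names.
    return list(drug_encoder.inverse_transform(candidates[order][:top_n]))


recommended = recommend_for(0)
```

In this toy matrix, users 0 and 1 have nearly identical rating rows, so the only drug user 1 has rated that user 0 has not is the one recommended.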

The collaborative filtering model is integrated into the system in a way that streamlines the data loading and user-item matrix transformation steps. The transformed user-item matrix is stored in a .pkl file for efficient retrieval. When a recommendation is requested for a particular user, only the user index is needed to retrieve that user's similarity scores. Using these similarity scores, the model identifies users who exhibit similar preferences and behaviors. It then collects the drug ratings of these similar users and aggregates them into an overall rating for each drug. The drugs are sorted by these aggregated ratings so that the most highly recommended ones come first. The recommendations are displayed on the main page of the system in a dedicated section titled "Recommended Drugs for You," as depicted in Fig. 9. Each user receives a personalized set of recommended drugs based on their individual behavior. For newly registered users, recommendations are not available initially. Once they start to interact with the system and the model is trained again, they begin to receive personalized recommendations. It is important to note that the drugs recommended to a user adapt dynamically over time, reflecting the user's own behavior as well as that of other users in the system.
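The aggregation step is not specified in detail, so the sketch below assumes one common choice: a similarity-weighted sum over the k most similar users. The input matrices are invented stand-ins for the precomputed data that would be loaded from the .pkl file, the similarity values are illustrative rather than computed from the ratings, and the helper name recommend and its parameters are hypothetical.

```python
import numpy as np

drug_names = np.array(
    ["Cephalexin", "Ciprofloxacin", "Eletriptan", "Rizatriptan", "Sumatriptan"]
)

# Hypothetical precomputed inputs (rows: users, columns: drugs).
user_item = np.array([
    [0.0, 0.0, 3.0, 0.0, 0.0],  # target user
    [0.0, 0.0, 3.0, 2.0, 1.0],
    [0.0, 1.0, 2.0, 3.0, 0.0],
    [3.0, 2.0, 0.0, 0.0, 0.0],
])
# Illustrative user-user similarity scores.
user_similarity = np.array([
    [1.0, 0.9, 0.8, 0.0],
    [0.9, 1.0, 0.7, 0.0],
    [0.8, 0.7, 1.0, 0.2],
    [0.0, 0.0, 0.2, 1.0],
])


def recommend(user_index, k=2, top_n=5):
    """Aggregate the ratings of the k most similar users into a ranked list."""
    if user_index >= user_item.shape[0]:
        return []  # new user: not part of the trained matrix yet
    sims = user_similarity[user_index].copy()
    sims[user_index] = -np.inf  # exclude the target user
    neighbours = np.argsort(sims)[::-1][:k]
    # Similarity-weighted sum of the neighbours' ratings for every drug.
    aggregated = user_similarity[user_index, neighbours] @ user_item[neighbours]
    aggregated[user_item[user_index] > 0] = 0.0  # drop drugs already rated
    order = np.argsort(aggregated)[::-1]
    return [str(drug_names[i]) for i in order if aggregated[i] > 0][:top_n]


ranked_drugs = recommend(0)
new_user_drugs = recommend(10)
```

The empty result for an unknown index mirrors the cold-start behavior described above, where a newly registered user has no recommendations until the model is retrained.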

4 Discussion and Conclusion

The goal of this study was to develop a drug recommendation system for healthcare professionals to aid them in selecting the best drug for their patients. For this purpose, sentiment analysis and hybrid content-based and collaborative filtering algorithms were implemented. Few recommender systems have been developed specifically for healthcare professionals, which makes the proposed system distinctive. All three objectives of the study were fulfilled: to classify drug reviews by sentiment so that health professionals can better understand drug effectiveness; to develop a web-based drug recommendation system for healthcare professionals using a hybrid filtering technique; and to facilitate active involvement and knowledge sharing among healthcare professionals. The system was successfully implemented and tested.

The implementation strategy adopted for this study followed a bottom-up approach. The entire system was divided into smaller sub-parts, which were built and tested independently before being integrated into a complete system. Each module represents a distinct feature of the system, and developing the modules separately reduces the likelihood of errors and keeps the complexity of the overall system manageable. Development was carried out methodically, with a focus on finishing, testing, and debugging every component before integrating it into the main application. The first step was the development of the main user interface (UI).

Three machine learning models were then developed and integrated into the system, with thorough testing carried out at each step. This methodical integration process ensured that each model's usability and functionality were carefully evaluated before continuing. Further features were developed individually so that they could be tested and improved before being integrated into the overall system. The main benefit of this bottom-up approach was that it reduced the risks involved in developing a large, complex system. Because the system was broken down into smaller, manageable components, any concerns or issues that arose could be readily addressed. Throughout the development process, frequent testing and debugging ensured that each module met the required criteria for performance and usability. This approach not only enhanced the overall functionality of the system but also allowed greater flexibility in adjusting and refining individual components before integration.

The first part of the system development covered three intelligent computing models. The first is a sentiment analysis model, developed using BioBERT, TF-IDF, Logistic Regression, Perceptron, and LSTM algorithms. This model was used to classify each drug review by polarity: positive, neutral, or negative. The second is a content-based filtering model, developed using the TF-IDF and cosine similarity algorithms. This model is used to recommend similar drugs to users based on the drug they are currently viewing. The third is a collaborative filtering model, developed using cosine similarity and Singular Value Decomposition (SVD). This model was used to recommend drugs to users based on similar users, where similarity is determined from user behavior in the system. All three models worked without any issues, which means that the first and second objectives were met.

The second part of the system development focused on a web application called Dr.Drugs. The web application was developed using the FARM (FastAPI, React, MongoDB) stack and the Material UI library, with a three-layer architecture. Overall, the developed system meets the requirements planned during the system requirement and analysis phase. As noted above, a bottom-up methodology was used, in which the small components of each module were developed first and then integrated into larger modules. The entire system underwent unit testing, integration testing, and system testing; it passed all the tests without any issues and the expected results were achieved. This indicates that the web application is fully functional. Additional features, namely drug comparison, forums, and a wish list, were added to the web application to give users more assistance when selecting a drug. The forums feature helps to achieve the third objective: it allows healthcare professionals to communicate with each other and to expand their knowledge, including their knowledge of drugs.

Unit testing, integration testing, and system testing were done to ensure that the proposed system is free of bugs and functions as intended. Based on the test results, it can be concluded that the proposed system meets the requirements established during the system requirement and analysis phase. The system is able to function under different input conditions without any major errors, because unit testing and integration testing were carried out carefully for every module. Thus, the Dr.Drugs web application is a fully functional web application that meets its objectives and requirements.

In terms of strengths, the system's combination of content-based and collaborative filtering recommendation is its primary strength. Few recommender systems focus on recommending drugs to healthcare professionals [12]. The collaborative filtering model is able to recommend personalized drugs to users based on their behavior, while the content-based filtering model is able to accurately recommend drugs similar to the one being viewed. Sentiment analysis is another strength of the system. Drug reviews are usually not used to their full potential. By applying sentiment analysis in this system, healthcare professionals are able to understand drug effectiveness in various patients. This is another distinctive feature of the proposed system. In addition, the system includes a forum page, which is implemented to facilitate communication between healthcare professionals so that they can share drug knowledge with each other.

In terms of limitations, the system does not provide specific information based on the healthcare professional's specialization. Thus, personalized recommendations based on healthcare professionals' specialization could not be made. Another limitation is that drug data added to the system cannot be edited: once admins have added drug data to the system, they can only delete it, not edit it. Additionally, when adding drug data, admins can upload reviews, but the file must be a CSV file in a specific format. Otherwise, the review data will not be added correctly.

As future work, several improvements to the system can be suggested. The first is to implement personalized, patient-centric, or disease-specific drug recommendation. Currently, the system recommends drugs based on similar drugs and similar user behavior. This recommendation can be extended to include patient-based or disease-based drug recommendation as well. This would help healthcare professionals to select drugs that are specific to a patient, as different patients react differently to different drugs and experience different levels of effectiveness.

Another feature that can be included is drug availability. This feature would allow healthcare professionals to find out where they can purchase the drugs they need. The system would need to show all the pharmacy locations that currently sell a particular drug.

Disclosure of Interests

There is no conflict of interest.

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.


References

1. Keikhosrokiani, P.: Perspectives in the Development of Mobile Medical Information Systems: Life Cycle, Management, Methodological Approach and Application (2019)

2. Keikhosrokiani, P., Kamaruddin, N.S.A.B.: IoT-based in-hospital-in-home heart disease remote monitoring system with machine learning features for decision making. In: Mishra, S., González-Briones, A., Bhoi, A.K., Mallick, P.K., Corchado, J.M. (eds.) Connected e-Health. Studies in Computational Intelligence, vol. 1021, pp. 349–369. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-97929-4_16

3. Augustine, C.A., Keikhosrokiani, P.: A hospital information management system with habit-change features and medial analytical support for decision making. Int. J. Inf. Technol. Syst. Approach (IJITSA) 15, 1–24 (2022). https://doi.org/10.4018/IJITSA.307019

4. Keikhosrokiani, P., Mustaffa, N., Zakaria, N.: Success factors in developing iHeart as a patient-centric healthcare system: a multi-group analysis. Telematics Inform. 35, 753–775 (2018). https://doi.org/10.1016/j.tele.2017.11.006

5. Keikhosrokiani, P., Mustaffa, N., Zakaria, N., Abdullah, R.: Assessment of a medical information system: the mediating role of use and user satisfaction on the success of human interaction with the mobile healthcare system (iHeart). Cogn. Technol. Work 22, 281–305 (2020). https://doi.org/10.1007/s10111-019-00565-4

6. Jinjri, W.M., Keikhosrokiani, P., Abdullah, N.L.: Machine learning algorithms for the classification of cardiovascular disease- a comparative study. In: Proceedings of the 2021 International Conference on Information Technology, ICIT 2021 (2021)

7. Silvén, H., Savukoski, S.M., Pesonen, P., et al.: Association of genetic disorders and congenital malformations with premature ovarian insufficiency: a nationwide register-based study. Hum. Reprod. 38, 1224–1230 (2023). https://doi.org/10.1093/humrep/dead066

8. Garg, S.: Drug recommendation system based on sentiment analysis of drug reviews using machine learning. In: 2021 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence), pp. 175–181 (2021)

9. Zhao, X., Keikhosrokiani, P.: Sales prediction and product recommendation model through user behavior analytics. Comput. Mater. Continua 70, 3855–3874 (2022). https://doi.org/10.32604/cmc.2022.019750

10. Xian, Z., Keikhosrokiani, P., XinYing, C., Li, Z.: An RFM model using k-means clustering to improve customer segmentation and product recommendation. In: Keikhosrokiani, P. (ed.) Handbook of Research on Consumer Behavior Change and Data Analytics in the Socio-Digital Era, pp. 124–145. IGI Global, Hershey, PA, USA (2022)

11. Bhat, S., Aishwarya, K.: Item-based hybrid recommender system for newly marketed pharmaceutical drugs. In: 2013 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 2107–2111 (2013)

12. Keikhosrokiani, P., Fye, G.M.: A hybrid recommender system for health supplement e-commerce based on customer data implicit ratings. Multimed. Tools Appl. (2023). https://doi.org/10.1007/s11042-023-17321-6

13. Cavalcanti, D., Prudêncio, R.: Aspect-based opinion mining in drug reviews. In: Oliveira, E., Gama, J., Vale, Z., Cardoso, H.L. (eds.) Progress in Artificial Intelligence, pp. 815–827. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-65340-2_66

14. Chugh, M., Chugh, N.: A deep drive into software development agile methodologies for software quality assurance. In: Agile Software Development, pp. 235–255 (2023)

15. Kayanda, A., Busagala, L., Oyelere, S., Tedre, M.: The use of design science and agile methodologies for improved information systems in the Tanzanian Higher Education context. Electron. J. Inf. Syst. Dev. Countries 89, e12241 (2023). https://doi.org/10.1002/isd2.12241

16. Keikhosrokiani, P.: Big Data Analytics for Healthcare: Datasets, Techniques, Life Cycles, Management, and Applications (2022)

17. Keikhosrokiani, P.: Handbook of Research on Consumer Behavior Change and Data Analytics in the Socio-Digital Era (2022)

18. Keikhosrokiani, P., Asl, M.P.: Handbook of Research on Artificial Intelligence Applications in Literary Works and Social Media (2022)

19. Suhendra, N.H.B., Keikhosrokiani, P., Asl, M.P., Zhao, X.: Opinion mining and text analytics of literary reader responses: a case study of reader responses to KL noir volumes in goodreads using sentiment analysis and topic. In: Keikhosrokiani, P., Pourya, A.M. (eds.) Handbook of Research on Opinion Mining and Text Analytics on Literary Works and Social Media, pp. 191–239. IGI Global, Hershey, PA, USA (2022)

20. Keikhosrokiani, P., Asl, M.P.: Handbook of Research on Opinion Mining and Text Analytics on Literary Works and Social Media. IGI Global (2022)

Title: Drug Recommendation System for Healthcare Professionals’ Decision-Making Using Opinion Mining and Machine Learning
Authors: Pantea Keikhosrokiani, Katheeravan Balasubramaniam, Minna Isomursu
Publisher: Springer Nature Switzerland
Book: Digital Health and Wireless Solutions
Print ISBN: 978-3-031-59090-0
Electronic ISBN: 978-3-031-59091-7
Copyright year: 2024
DOI: https://doi.org/10.1007/978-3-031-59091-7_15
