Artificial intelligence (AI) has transformed every industry worldwide, including dentistry. While ever more dental professionals adopt AI in their studies and daily routines, most AI researchers come from engineering backgrounds, leaving many clinicians with knowledge gaps around these techniques. This article aims to bridge that gap by presenting AI’s fundamental principles and their applications to clinicians. With a minimum of essential engineering knowledge, the article will help dentists recognize the basic mechanisms, benefits, and limitations of modern AI. Clinicians can thereby harness the power of this technological tool to enhance their daily practice and research effectively while avoiding potential pitfalls.
AI fundamentals: what, why, and how
Artificial intelligence, machine learning, and deep learning
Artificial intelligence (AI), in its literal sense, represents a methodology in which humans employ computers and machines (artificial) that emulate or simulate human thinking processes (intelligence) to address problems. Although there is no exact agreement on its definition, AI is not a new idea; it can be traced back to the 1950s. While AI encompasses all programs capable of executing or simulating human tasks, traditional AIs are predominantly “rule-based”: humans provide computer programs with a defined set of rules or instructions that the programs follow to execute specific tasks.
Take, for example, a simple if-else program designed for a specific decision-making process. A program can be designed to assign diagnoses according to a periodontal disease diagnostic guideline, taking clinical examinations as inputs, such as probing depths, bleeding conditions, and tooth mobility. By strictly adhering to the given instructions, such a program provides diagnoses as outputs. In its simple replication of the clinician’s decision-making process, this program qualifies as a basic form of AI.
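Such a rule-based program can be sketched in a few lines of Python. The thresholds and diagnostic categories below are illustrative placeholders chosen for this sketch, not values taken from any published periodontal guideline:

```python
def diagnose_periodontal(probing_depth_mm: float,
                         bleeding_on_probing: bool,
                         mobility_grade: int) -> str:
    """Return a coarse periodontal status from clinical findings.

    Every rule is written by hand; the program never learns anything
    beyond what its authors encoded. All cutoffs are illustrative.
    """
    if probing_depth_mm <= 3 and not bleeding_on_probing:
        return "healthy"
    if probing_depth_mm <= 3 and bleeding_on_probing:
        return "gingivitis"
    if probing_depth_mm <= 5 and mobility_grade <= 1:
        return "mild-to-moderate periodontitis"
    return "severe periodontitis"
```

Because every branch is written by hand, the program cannot handle cases its authors did not anticipate; this limitation is what machine learning addresses.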
Machine learning (ML) is a subset of AI. Rather than relying on explicit, predetermined logic between input-output relationships, ML algorithms go one step further by enabling computers to learn from and make decisions or predictions based on data. ML comes into play when AI algorithms are not governed by manually crafted rules for every possible input; instead, the algorithms are designed to discern hidden information, patterns, or correlations within data and establish the desired input-output behavior (Jordan and Mitchell 2015). Unlike traditional rule-based methods, ML algorithms are often referred to as “model-based” algorithms. Such approaches have proven powerful across various tasks; thus, modern AI generally refers to ML algorithms or methods incorporating at least one ML algorithm.
For example, ML algorithms can discern the correlations between raw clinical data, various risk factors, and treatment prognoses. In this scenario, the inputs are medical and dental records, and the program starts from one or more algorithmic structures (models) whose parameters are not predefined. Instead of relying on preset conditions, the program learns from the data, optimizes the parameters, and evolves into a predictive model. This model can perform classification, regression, clustering, or decision-making tasks. A strong mathematical foundation is typically crucial for designing an effective approximation model.
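As a minimal sketch of this model-based idea, the snippet below fits a straight line to invented data; the two fitted coefficients play the role of learned parameters. The numbers, and the framing of a "prognosis score," are purely hypothetical:

```python
import numpy as np

# Hypothetical data: number of risk factors per patient (x) and an
# invented prognosis score (y). No rule linking them is written into
# the program; the parameters are estimated from the data itself.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([9.8, 8.1, 6.2, 3.9, 2.2, 0.1])

# Fit a line y ≈ a*x + b; a and b are the model's learned parameters.
a, b = np.polyfit(x, y, deg=1)

# The fitted model can now predict a score for an unseen patient.
predicted = a * 2.5 + b
```

Nothing in the code states how x relates to y; the relationship is discovered by optimizing the parameters against the data, which is the essence of the "model-based" approach.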
Deep learning (DL), a subset of machine learning, has emerged as a significant group of algorithms driven by advances in computational power. Inspired by the cognitive systems of biological brains, DL models are crafted to autonomously learn and discover abstract representations from data by training on vast amounts of information (Goodfellow et al. 2016; LeCun et al. 2015). This learning is facilitated by unique algorithmic structures known as deep artificial neural networks: series of interconnected layers of nodes, or “neurons,” each performing specific linear and non-linear operations as part of the overarching computation. The term “deep” in deep learning denotes the number of layers within these neural networks. Due to their complex non-linear nature, DL models can approximate a very broad class of functions by adjusting algorithmic parameters based on given data. Consequently, DL has proven successful at processing unstructured data such as images, videos, and texts.
For instance, in a medical image recognition task, the inputs are typically clinical images paired with their corresponding diagnoses, known as “labels.” When an image is introduced into the network, it traverses multiple layers, each identifying distinct features or patterns. This neural network can be envisioned as a sophisticated algorithm that processes image features progressively, layer by layer. The initial layer might recognize basic features like edges and curves. The next discerns more intricate shapes formed by these foundational elements, such as circles or squares. Subsequent layers can then identify combinations of these shapes that form specific patterns. The terminal layers consolidate the information from all preceding layers to interpret these features, ultimately determining the content of the image. Possible output can be the classification or interpretation of various pathological or radiological patterns.
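The early-layer behavior described above, detecting edges, can be illustrated with a single hand-built convolution filter. In a trained network such filters are learned from data rather than fixed by hand; this is a minimal numpy sketch on a toy image, not a diagnostic model:

```python
import numpy as np

# A tiny 5x5 grayscale "image": bright left columns, dark right columns.
image = np.zeros((5, 5))
image[:, :2] = 1.0

# A vertical-edge filter of the kind early convolutional layers often
# learn (here fixed by hand purely for illustration).
kernel = np.array([[1.0, -1.0]])

# Valid convolution: slide the filter over the image, recording responses.
h, w = image.shape
kh, kw = kernel.shape
response = np.zeros((h - kh + 1, w - kw + 1))
for i in range(response.shape[0]):
    for j in range(response.shape[1]):
        response[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)

# The response is nonzero only where brightness changes, i.e. at the
# edge between the bright and dark regions.
```

Stacking many such filtered responses, with non-linearities in between, is what lets deeper layers assemble edges into shapes and shapes into diagnostic patterns.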
Regardless of the specific subgroups of AI approaches (AI/ML/DL), their primary purpose remains to approximate real-world situations and observed data through statistical and mathematical techniques. In summary, AI encompasses all methodologies that employ computer programs to address problems. Traditional algorithms, which are hand-engineered, rely on predetermined logic, features, and parameters, while machine-learning algorithms adapt and fine-tune these features and parameters based on targeted datasets. When a program incorporates deep neural networks as a component of its algorithm, it is classified as a deep learning program.
Training policies: supervised, unsupervised, and reinforcement learning
As previously outlined, AI can be understood as the algorithmic or mathematical approach designed to simulate real-world problems. Within this context, it is worth introducing a fundamental term used in the AI/ML/DL realm: “training” the algorithm. This refers to the process wherein an algorithm is “trained” to perform a task by optimizing the parameters within its inherent algorithmic structure.
The learning methods can be broadly categorized into three main types – supervised, unsupervised, and reinforcement learning – based on their unique learning processes and use cases (Alloghani et al. 2020). Datasets are also divided for different purposes within a given program: the portion used in the training phase is termed the “training dataset,” while the portion used to test the algorithm’s efficacy is called the “testing dataset.” The specifics of the training procedure depend on the model’s learning method.
Supervised learning is similar to learning under the guidance of a teacher or supervisor: the correct answers (labels) are provided alongside the data during the learning phase. This approach trains algorithms on datasets that include both the inputs and their corresponding labeled outputs. In every training cycle, the parameters are updated with the widely employed backpropagation technique. This technique leverages the gradients, or derivatives, of the loss function, which quantifies the discrepancy between the predicted output and the actual label. After training, the algorithm has discerned the relationship between the inputs and the outputs, allowing it to predict the outcome for new instances (Mahesh 2015).
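The gradient-based update at the heart of backpropagation can be shown in miniature with a one-parameter model; a real network applies the same idea across millions of parameters at once. The data, the true relationship y = 3x, and the learning rate are all invented for this sketch:

```python
import numpy as np

# Toy supervised task: labels y are generated by y = 3x, a relationship
# hidden from the learner. We train a single weight w to minimize the
# mean squared loss between predictions and labels.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

w = 0.0    # initial parameter value
lr = 0.01  # learning rate (step size)
for _ in range(200):
    pred = w * x
    # Gradient of the mean squared loss with respect to w.
    grad = np.mean(2 * (pred - y) * x)
    # Step against the gradient: the core of gradient-based training.
    w -= lr * grad
```

After training, w has converged close to the hidden value 3; each cycle nudges the parameter in the direction that reduces the discrepancy between prediction and label.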
For instance, a supervised learning algorithm can be trained on a dataset of dental radiographs. In this scenario, each periapical image (the input) comes labeled with a specific dental disease diagnosis (the output). By studying this dataset, the algorithm learns to correlate particular features within the X-ray images with their corresponding diagnoses, enabling it to predict diagnoses for new, unlabeled images.
Unsupervised learning, in contrast, resembles the learning process without explicit guidance or supervision (Hastie et al. 2009). Algorithms are provided with unlabeled datasets. Their primary objective is to identify underlying patterns and correlations within the data, making these methods suitable for clustering, grouping similar data, and mining associations. Unsupervised learning is also often used to categorize unlabeled data, setting the stage for subsequent supervised learning tasks. In a dental context, unsupervised learning algorithms can analyze datasets of patient records and cluster the patients based on similarities within their records, helping dentists identify patterns or trends in diverse oral health conditions.
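The clustering idea can be sketched with a minimal k-means loop on synthetic "patient records." The two numeric features, the group locations, and the hand-picked starting centers are all assumptions made for this illustration; no labels are ever shown to the algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic records reduced to two features, e.g. mean probing depth (mm)
# and percentage of bleeding sites. Two hidden groups are simulated.
group_a = rng.normal(loc=[2.0, 10.0], scale=0.3, size=(20, 2))
group_b = rng.normal(loc=[6.0, 60.0], scale=0.3, size=(20, 2))
data = np.vstack([group_a, group_b])

# Minimal k-means with k = 2; starting centers are fixed by hand
# to keep the sketch deterministic.
centers = np.array([[0.0, 0.0], [10.0, 100.0]])
for _ in range(10):
    # Assign each record to its nearest center.
    dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Move each center to the mean of its assigned records.
    centers = np.array([data[labels == k].mean(axis=0) for k in range(2)])
```

Although the data carried no labels, the final `labels` separate the two simulated patient groups, which is exactly the pattern-discovery role described above.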
Reinforcement learning, the third category, is more like learning through trial-and-error search. Like a child exploring its environment, an agent in reinforcement learning makes decisions and takes actions by interacting with its environment (Sutton and Barto 2018). Each action results in either a reward or a penalty. Without direct supervision, the agent optimizes its policy, or strategy, over time in an attempt to maximize the cumulative reward of its actions.
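The reward-driven loop can be sketched with a two-armed bandit, one of the simplest reinforcement learning settings. The reward probabilities, the exploration rate, and the trial count are arbitrary choices for this sketch:

```python
import random

random.seed(42)

# A two-armed bandit: action 1 pays off more often than action 0.
# These true probabilities are hidden from the agent.
reward_prob = [0.3, 0.8]

estimates = [0.0, 0.0]  # the agent's running value estimate per action
counts = [0, 0]
epsilon = 0.1           # chance of exploring a random action

for _ in range(2000):
    # Epsilon-greedy policy: mostly exploit the best-looking action,
    # occasionally explore to keep gathering information.
    if random.random() < epsilon:
        action = random.randrange(2)
    else:
        action = 0 if estimates[0] >= estimates[1] else 1
    # The environment answers with a reward (1) or nothing (0).
    reward = 1 if random.random() < reward_prob[action] else 0
    # Incremental value update: learning purely from trial and error.
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]
```

With no labels and no supervisor, the agent's estimate for the better action rises above the other, so its policy comes to prefer it, maximizing cumulative reward over time.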
Thanks to its innate ability to perceive and interact with the environment, reinforcement learning has been adopted in multiple real-world interactive decision-making situations. Notable examples include AlphaGo and AlphaZero, which integrated reinforcement learning into their training policies. The realm of autonomous vehicles is another burgeoning field with a few commercial applications (Kiran et al. 2021).