Understanding Artificial Intelligence: A Comprehensive Glossary of Terms and Definitions

August 24, 2019

Artificial Intelligence Glossary

It is no surprise that the world is moving ahead at a fast pace thanks to the marvels of artificial intelligence. The technology has added new value and innovation to our personal and professional lives.

The sudden change can be daunting at times, but on the optimistic side, AI has given humankind some new capabilities. It has added terms to our daily vocabulary that we had not heard before, and it has given new meanings to existing terms altogether.

The benefits AI already delivers to the current generation are a clear sign that the technology is here to transform lives and is here to stay. Here is a glossary of trending AI terms and frameworks, broken down into much simpler words. It will help you explore the various aspects of this technology and how it delivers its benefits to its customers.


Accuracy – Refers to the percentage of correct predictions the classifier made.
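
As a rough illustration of this definition, the sketch below counts matching labels and divides by the total. The function name and the sample labels are assumptions made up for this example, not part of any particular library.

```python
def accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual labels."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical labels, for illustration only.
actual = ["cat", "dog", "cat", "cat", "dog"]
predicted = ["cat", "dog", "dog", "cat", "dog"]
print(f"{accuracy(actual, predicted):.0%}")  # 4 of 5 predictions are correct
```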

Adversarial Machine Learning – A research field that lies at the intersection of machine learning and computer security. It aims to enable the safe adoption of machine learning techniques in adversarial settings like spam filtering, malware detection, and biometric recognition.

Adversarial Example – A very specific transformation of an image, typically featuring very small, deliberate changes to an image that can completely disrupt a previously tuned classifier.
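
To make the idea concrete, here is a minimal sketch that uses a tiny linear classifier in place of a real image model. The weights, the four "pixel" values, and the sign-based perturbation rule are all assumptions chosen for illustration. Real attacks on image classifiers are more involved.

```python
def score(weights, x):
    """Linear classifier: a positive score means class 'yes', negative means 'no'."""
    return sum(w * xi for w, xi in zip(weights, x))


def sign(value):
    return (value > 0) - (value < 0)


def perturb(weights, x, epsilon):
    """Nudge every input value by at most epsilon in the direction that lowers the score."""
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical model and input (four "pixel" values), for illustration only.
weights = [1.0, -1.0, 1.0, -1.0]
x = [0.6, 0.5, 0.6, 0.5]
adversarial = perturb(weights, x, epsilon=0.1)

print(score(weights, x) > 0)            # original input is classified 'yes'
print(score(weights, adversarial) > 0)  # slightly changed input flips to 'no'
```

No single value moves by more than 0.1, yet the predicted class changes.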

Application Programming Interface (API) – A set of commands, functions, protocols, and objects that programmers can use to create software or interact with an external system.

Artificial General Intelligence (AGI) – AGI is a computational system that can perform any intellectual task a human can. Also called “Strong AI.” At this point, AGI is fictional.

Artificial Intelligence (or Weak AI) – A computational system that simulates parts of human intelligence but focuses on one narrow task. Also called narrow AI, in contrast to AGI.

Artificial Neural Network – A model for AI and machine learning inspired by the neural network configurations of the human central nervous system, especially the brain.


Brute Force Search – A search that isn’t limited by clustering or approximations; it searches across all inputs. Often more time-consuming and expensive but more thorough.
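
One common case is a nearest-neighbour lookup. The sketch below compares the query against every stored point without any index or approximation. The function name and the sample points are hypothetical.

```python
import math


def brute_force_nearest(query, points):
    """Compare the query against every point and return the closest one."""
    best_point, best_distance = None, math.inf
    for point in points:
        distance = math.dist(query, point)  # Euclidean distance (Python 3.8+)
        if distance < best_distance:
            best_point, best_distance = point, distance
    return best_point, best_distance


# Hypothetical 2-D points, for illustration only.
points = [(0, 0), (3, 4), (1, 1)]
print(brute_force_nearest((1, 2), points))
```

The cost grows linearly with the number of points, which is the price paid for never missing the true best match.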


Content Moderation – The practice of monitoring user-generated submissions and applying a predetermined set of rules and guidelines to determine whether the content is permissible.

Convolutional Neural Network – Convolutional neural networks are deep artificial neural networks that are used primarily to classify images (e.g. name what they see), cluster them by similarity (photo search), and perform object recognition within scenes.

CPU (Central Processing Unit) – The electronic circuitry within a computer that carries out the instructions of a computer program by performing the basic arithmetic, logical, control, and input/output (I/O) operations specified by the instructions.

Custom Model – A small artificial neural network which takes inputs particular to a user, such as images or videos of their products, and returns predicted concepts, based on what the model is trained to see in the inputs.

Custom Training – The process of teaching a model to make certain predictions.


Data – Any collection of information converted into a digital form.

Data Mining – The process by which patterns are discovered within large sets of data with the goal of extracting useful information from it.

Deep Learning – The general term for machine learning using layered (or deep) algorithms to learn patterns in data. It is most often used for supervised learning problems.

Deep Neural Network – An artificial neural network (ANN) with multiple layers between the input and output layers. It uses sophisticated mathematical modelling to process data in complex ways.

Detection – To discover an event or object.

Domain Adaptation – Learning a discriminative classifier or other predictor in the presence of a shift between training and test distributions.


Explorer – A web application that allows you to preview applications.


F Score – A weighted average (more precisely, a weighted harmonic mean) of precision and recall, where recall is the true positive rate.
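
A minimal sketch of the usual F-beta formula follows. The function name is an assumption. With beta equal to 1 this is the familiar F1 score, and larger beta values weight recall more heavily.

```python
def f_score(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta)."""
    beta_squared = beta ** 2
    denominator = beta_squared * precision + recall
    if denominator == 0:
        return 0.0  # convention used here when the score is undefined
    return (1 + beta_squared) * precision * recall / denominator


print(round(f_score(0.5, 1.0), 3))          # F1
print(round(f_score(0.5, 1.0, beta=2), 3))  # F2 leans toward recall
```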

Facial Recognition – A computer application capable of identifying or verifying a person from a digital image or a video frame from a video source. One of the ways to do this is by comparing selected facial features from the image and a face database.

False Negatives – An error where a model falsely predicts an input as not having a desired outcome, when one is actually present. (Actual Yes, Predicted No).

False Positives – An error where a model falsely predicts the presence of the desired outcome in an input, when in reality it is not present (Actual No, Predicted Yes).

Feature Extraction

1) When image features at various levels of complexity are extracted from the image data. Typical examples of such features are:

  • Lines, edges, and ridges.
  • Localized interest points such as corners, blobs, or points.
  • More complex features may be related to texture, shape, or motion.

2) The process by which data that is too large to be processed is transformed into a reduced representation set of features such as texture, shape, lines, and edges.
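
As a toy illustration of both senses, the sketch below extracts a simple edge feature from a tiny grayscale image stored as a list of rows, then reduces it to one number per row. The image values and function names are made up for this example. Real systems use far richer filters.

```python
def horizontal_gradient(image):
    """Absolute difference between each pixel and its right-hand neighbour.

    Large values mark places where brightness changes sharply, i.e. edges.
    """
    return [
        [abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
        for row in image
    ]


def edge_strength_per_row(image):
    """Reduce the image to one feature per row: its strongest edge response."""
    return [max(row) for row in horizontal_gradient(image)]


# Hypothetical 3x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 0, 0],
]
print(horizontal_gradient(image))
print(edge_strength_per_row(image))
```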


Generative Adversarial Networks (GANs) – A class of artificial intelligence algorithms used in unsupervised machine learning, implemented by a system of two neural networks contesting with each other in a zero-sum game framework. This technique can generate photographs that look at least superficially authentic to human observers, having many realistic characteristics (though in tests people can tell real from generated in many cases).

GPU (Graphics Processing Unit) – A specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mobile phones, personal computers, workstations, and game consoles.


Human Workforce (“Labelers”) – Workers who can help to complete work on an as-needed basis, which for AI purposes usually means labelling data (such as images).


Image Recognition – The ability of software to identify objects, places, people, writing, and actions in images.

Image Segmentation – The process of dividing a digital image into multiple segments/fragments, with the goal of simplifying or changing the representation of an image into something that is easier to analyze. Segmentation divides whole images into pixel groupings, which can then be labelled and classified. Put simply, segmentation means locating the desired object in an image and tracing a pixel-by-pixel outline of that object, separating it from the background.

ImageNet – A large visual database designed for use in visual object recognition software research. Over 14 million URLs of images have been hand-annotated by ImageNet to indicate what objects are pictured; in at least one million of the images, bounding boxes are also provided.

ImageNet Challenge – A competition where research teams evaluate their algorithms on the given data set and compete to achieve higher accuracy on several visual recognition tasks.

Input – Any form of data – text, audio, code, music notation, essentially anything that can be encoded digitally.


Machine Learning (ML) – A general term for algorithms that can learn patterns from existing data and use these patterns to make predictions or decisions with new data.

Misclassification Rate – Rate used to gauge how often a model’s predictions are wrong.

Model – A processing block that takes inputs, such as images or videos, and returns predicted concepts.


Natural Language Processing (NLP) – A branch of artificial intelligence that helps computers understand, interpret, and manipulate human language. This field of study focuses on helping machines to better understand human language in order to improve human-computer interfaces with use cases like moderation, information extraction, summarization, etc.

Noise – Signals with no causal relation to the target function.

Not Suitable for Work (NSFW) – Shorthand tag used to mark certain content as being profane, offensive, and/or otherwise potentially disturbing, which a platform may not wish to have posted on its site or may want to mark as mature.

Null Error Rate – How often one would be wrong if one always predicted the majority class (e.g. if, out of 100 cases, 60 are actually “yes” and 40 are actually “no”, the null error rate would be 40/100 = 0.40, because if you always predicted “yes”, you would only be wrong for the 40 “no” cases).
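
The arithmetic in that example can be sketched as follows. The function name is an assumption, and the labels reproduce the 60/40 split from the definition.

```python
from collections import Counter


def null_error_rate(actual):
    """Error rate of a baseline that always predicts the most common class."""
    counts = Counter(actual)
    majority_count = counts.most_common(1)[0][1]
    return (len(actual) - majority_count) / len(actual)


# 60 actual "yes" cases and 40 actual "no" cases, as in the example above.
actual = ["yes"] * 60 + ["no"] * 40
print(null_error_rate(actual))
```

A classifier is only useful if its error rate beats this baseline.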


Object Detection – A computer technology related to computer vision and image processing that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos. This technique also involves localizing the object in question, which differentiates it from classification, which only tells the type of object.

Object Recognition (or Object Classification) – A computer vision technique for identifying objects in images or videos.

Object Tracking – The process of following a specific object of interest, or multiple objects, in a given scene. It traditionally has applications in video and real-world interactions where observations are made following an initial object detection.

One Shot Classification – A model that only requires that you have one training example of each class you want to predict on. The model is still trained on several instances, but they only have to be in a similar domain as your training example.

On-premises Software – Software that is installed and runs on computers located on the premises of the organization using that software versus at a remote facility such as a server farm or on the cloud.

Optical Character Recognition (OCR) – A computer system that takes images of typed, handwritten, or printed text and converts them into machine-readable text.

Output – Predictions made after the input uploaded to or fed into a model is processed by the model.

Overfitting – A machine learning problem where an algorithm is unable to discern information that is relevant to its assigned task from information which is irrelevant within training data. Overfitting inhibits the algorithm’s predictive performance when dealing with new data.


Parameter – Any characteristic that can be used to help define or classify a system. In AI, they are used to clarify exactly what an algorithm should be seeking to identify as important data when performing its target function.

Precision (Recognition) – A rate that measures how often a model is correct when it predicts ‘yes.’

Predictive Model – A model that uses observations measured in a sample to gauge the probability that a different sample or remainder of the population will exhibit the same behaviour or have the same outcome.

Positive Predictive Value (PPV) – Very similar to precision, except that it takes prevalence into account. In the case where the classes are perfectly balanced (meaning the prevalence is 50%), the positive predictive value is equivalent to precision.

Prevalence – The rate of how often the “yes” condition actually occurs in a sample.

Python – An interpreted high-level programming language for general-purpose programming.


Recall (Sensitivity) – The fraction of relevant instances that have been retrieved over the total amount of relevant instances.

Recurrent Neural Network – A type of artificial neural network with loops in it, allowing recorded information, like data and outcomes, to persist by being passed from one step of the network to the next. It can be thought of as multiple copies of the same network, each passing information to its successor.
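
The "multiple copies of the same network" picture can be sketched with a single scalar hidden state. The weights below are arbitrary assumptions rather than trained values. The point is only that the same step function is reused and that each step receives the previous step's state.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: the new state depends on the input and the previous state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Apply the same step (same weights) to every input, carrying the state forward."""
    h = 0.0
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# A single non-zero input followed by zeros: the state keeps a fading memory of it.
print(run_rnn([1.0, 0.0, 0.0]))
```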

Regression – A statistical method used to estimate the strength of the relationships between a dependent variable and one or more independent variables.
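
For the simplest case, one independent variable and a straight line, the relationship can be estimated with ordinary least squares. This is a minimal sketch with made-up data points. The function name is an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that lies exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
print(fit_line(xs, ys))
```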

Reinforcement Learning – A type of machine learning in which machines are “taught” to achieve their target function through a process of experimentation and reward: the machine receives positive reinforcement when its processes produce the desired result and negative reinforcement when they do not. This is differentiated from supervised learning, which would require an annotation for every individual action the algorithm would take.

ROC (Receiver Operating Characteristic) Curve – This is a commonly used graph that summarizes the performance of a classifier over all possible thresholds. It is generated by plotting the True Positive Rate (y-axis) against the False Positive Rate (x-axis) as you vary the threshold for assigning observations to a given class.
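
The points on such a curve can be computed by sweeping the threshold over the classifier's scores. The sketch below assumes binary labels (1 for positive, 0 for negative) and uses hypothetical scores. It returns the points and leaves the plotting aside.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    # Start above the highest score (nothing predicted positive), then lower the bar.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for threshold in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical classifier scores and true labels, for illustration only.
scores = [0.9, 0.8, 0.4, 0.2]
labels = [1, 0, 1, 0]
for fpr, tpr in roc_points(scores, labels):
    print(fpr, tpr)
```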


Search Query – A query that a user feeds into a search engine to satisfy his or her information needs. If the query itself is a piece of visual content, then that is what is known as a “visual search query.”

Selective Filtering – When a model ignores “noise” to focus on valuable information.

Siamese Networks – A different way of classifying images: instead of training one model to classify image inputs directly, it trains two neural networks that learn simultaneously to measure the similarity between images.

Signal – Inputs, information, data.

Software Development Kit (SDK) – A set of software development tools that allows for the creation of applications on a specific platform.

Specificity – The rate of how often a model predicts “no,” when it’s actually “no.”

Standard Classification – The process by which an input is assigned to one of a fixed set of categories. In machine learning, this is often achieved by learning a function that maps an input to a score for each potential category.
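
A minimal sketch of that idea: compute one score per category, then pick the highest. The weighted-sum scoring and the weights themselves are assumptions for illustration. In practice the scoring function is learned from data.

```python
def category_scores(features, weights_by_category):
    """Score the input for every category with a simple weighted sum."""
    return {
        category: sum(w * f for w, f in zip(weights, features))
        for category, weights in weights_by_category.items()
    }


def classify(features, weights_by_category):
    """Assign the input to the category with the highest score."""
    scores = category_scores(features, weights_by_category)
    return max(scores, key=scores.get)


# Hypothetical, hand-picked weights for three categories.
weights_by_category = {
    "cat": [1.0, 0.0],
    "dog": [0.0, 1.0],
    "bird": [-1.0, -1.0],
}
print(classify([0.2, 0.9], weights_by_category))
```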

Supervised Learning –

1) A type of machine learning in which human input and supervision are an integral part of the machine learning process on an ongoing basis. In supervised learning, there is a clear outcome to the machine’s data mining and its target function is to achieve this outcome, nothing more.

2) A class of machine learning algorithms that learn patterns from outcome data. Supervised learning algorithms make predictions based on a set of examples.


Target Function – The end goal of an algorithm.

Taxonomy – The formal structure of all the types of objects within a particular domain. They can follow either a flat or hierarchical format and provide names for each object in relation to the other objects, often capturing the membership properties of each. There are usually specific, complete, consistent, and definitive rules for classifying all objects in the domain. This ensures any newly discovered object fits into one and only one category of the structure.

TensorFlow – An open-source software library used for machine learning applications such as neural networks. It is used for both research and production at Google and was released under the Apache 2.0 open source license in 2015.

Test Data Set – In machine learning, the test data set is the data given to the machine after the training and validation phases have been completed. This data set is used to check the performance characteristics of the algorithms produced after the completion of the first two phases when presented with unknown data. This will give a good indication of the accuracy, sensitivity, and specificity of the algorithm’s predictive powers.

Torch – A scientific computing framework with wide support for machine learning algorithms, written in C and Lua. The main author is Ronan Collobert, and it is now used at Facebook AI Research and Twitter.

Training Data Set – In machine learning, the training data set is the data given to the machine during the initial “learning” or “training” phase. From this data set the machine is meant to gain some insight into options for the efficient completion of its assigned task through identifying relationships between the data.

True Negatives – Actual negatives that are correctly identified as such (Actual No, Predicted No).

True Positives – Actual positives that are correctly identified as such (Actual Yes, Predicted Yes).
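
The four outcome types in this glossary (true positives, false positives, true negatives, false negatives) can be counted together, and several of the rates defined elsewhere follow directly from the counts. The function name and the labels below are hypothetical.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts


# Hypothetical labels, for illustration only.
actual = ["yes", "yes", "yes", "no", "no", "no", "no", "yes"]
predicted = ["yes", "no", "yes", "no", "yes", "no", "no", "yes"]

c = confusion_counts(actual, predicted)
precision = c["TP"] / (c["TP"] + c["FP"])    # how often a predicted "yes" is right
recall = c["TP"] / (c["TP"] + c["FN"])       # how many actual "yes" cases are found
specificity = c["TN"] / (c["TN"] + c["FP"])  # how many actual "no" cases are found
print(c, precision, recall, specificity)
```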

Turing Test – A test developed by Alan Turing in 1950, used to identify true artificial intelligence. It tests a machine’s ability to exhibit intelligent behaviour equivalent to, or indistinguishable from, that of a human.


Unsupervised Learning – A class of machine learning algorithms that learns patterns in data without knowing outcomes. Here, the machine is presented with totally unlabelled data, then asked to find the intrinsic patterns in or draw its own conclusions from the data.
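
Clustering is one familiar example. The sketch below runs a bare-bones k-means on one-dimensional values, with no labels involved. The starting centers and the data are assumptions, and k-means is used here only as an illustrative technique.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group unlabelled numbers around the given number of centers."""
    for _ in range(iterations):
        # Assign each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabelled data with two obvious groups.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, clusters = kmeans_1d(values, centers=[0.0, 6.0])
print(centers)
print(clusters)
```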


Validation Data Set – The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters. The evaluation becomes more biased as skill on the validation dataset is incorporated into the model configuration.
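
A simple way to obtain training, validation, and test sets is to shuffle the data once and cut it into three parts. The 60/20/20 proportions, the fixed seed, and the function name below are assumptions for illustration.

```python
import random


def split_dataset(items, train=0.6, validation=0.2, seed=0):
    """Shuffle once, then cut into training, validation, and test portions."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_train = round(len(items) * train)
    n_validation = round(len(items) * validation)
    training = items[:n_train]
    validating = items[n_train:n_train + n_validation]
    held_out = items[n_train + n_validation:]  # the test set, untouched until the end
    return training, validating, held_out


training, validating, held_out = split_dataset(range(10))
print(len(training), len(validating), len(held_out))
```

Keeping the three parts disjoint is what lets the validation set guide tuning while the test set stays unseen.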

Vision Processing Unit (VPU) – As of 2016, it is an emerging class of microprocessor and a specific type of AI accelerator, designed to accelerate machine vision tasks.

Visual Recognition – The ability of software to identify objects, places, people, writing, and actions in images and videos.

Visual Search – The ability of software to find visually similar content based on an image or video query.


Web Crawler (Spider) – An internet bot that systematically browses the World Wide Web, typically for the purpose of Web indexing, copying pages for processing by a search engine which indexes the downloaded pages, allowing users to search more efficiently.

Web Scraper – An automated process implemented using a bot or web crawler. It is a form of copying, in which specific data is gathered and copied from the web, typically into a central local database or spreadsheet, for later retrieval or analysis.
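
As a small sketch of the extraction step, the example below pulls link targets out of an HTML snippet with Python's standard `html.parser` module. The page content is a hard-coded, made-up string, so nothing is fetched from the network. A real scraper would first download pages and should respect each site's terms.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page content, for illustration only.
page = '<p>See <a href="https://example.com/a">A</a> and <a href="/b">B</a>.</p>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)
```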


About Analytics Insight:

Analytics Insight is an influential platform dedicated to insights, trends, and opinion from the world of data-driven technologies. It monitors developments, recognition, and achievements made by AI, big data and analytics companies across the globe. The Analytics Insight Magazine features opinions and views from top leaders and executives in the industry who share their journey, experiences, success stories, and knowledge to grow profitable businesses.

For More Information Visit: http://www.analyticsinsight.net