Glossary

Assortativity

Quantifies the degree to which an individual’s network consists of individuals who are similar to, or different from, that individual.
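One common way to put a number on this is Newman's attribute assortativity coefficient, which compares the share of connections between same-class individuals with the share expected by chance. The sketch below is a minimal illustration under assumptions: the function name, the edge list, and the group labels are all hypothetical and do not come from the original text.

```python
from collections import Counter

def attribute_assortativity(edges, label):
    """Newman-style assortativity for a categorical attribute.

    edges: list of (u, v) pairs of an undirected network (hypothetical input format).
    label: dict mapping each individual to its attribute value.
    Returns a value near 1 when similar individuals connect, near -1 when
    dissimilar ones do. Undefined (division by zero) if only one class
    appears on the edges.
    """
    pair_counts = Counter()
    for u, v in edges:
        # Count every undirected edge in both directions.
        pair_counts[(label[u], label[v])] += 1
        pair_counts[(label[v], label[u])] += 1
    total = sum(pair_counts.values())
    classes = {c for pair in pair_counts for c in pair}
    # Observed fraction of edge ends that join two members of the same class.
    same = sum(pair_counts[(c, c)] for c in classes) / total
    # Fraction of edge ends attached to each class.
    end_frac = {
        c: sum(n for (first, _), n in pair_counts.items() if first == c) / total
        for c in classes
    }
    # Fraction of same-class edges expected if connections ignored the attribute.
    expected = sum(f * f for f in end_frac.values())
    return (same - expected) / (1 - expected)

# Hypothetical example: two groups that only connect internally.
groups = {"a": "x", "b": "x", "c": "y", "d": "y"}
print(attribute_assortativity([("a", "b"), ("c", "d")], groups))
```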

Classifier

A function that takes data items described by specific features and assigns each item to a category based on those features.

Decision Tree

A machine learning classifier. It classifies items in a dataset using a tree model constructed from their features. An item is classified by making a decision at each node of the tree on whether the item meets a certain condition -- the item receives its class when the decisions lead to a leaf (the end of a branch) of the model.
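The node-by-node decision process described above can be sketched as follows. The tree is written by hand rather than learned from data, and the feature names, thresholds, and class labels are hypothetical, chosen only for illustration.

```python
def classify(node, item):
    """Walk a decision tree until a leaf (a plain class label) is reached."""
    while isinstance(node, dict):
        # Decision at this node: does the item meet the condition?
        branch = "yes" if item[node["feature"]] >= node["threshold"] else "no"
        node = node[branch]
    return node

# Hypothetical hand-built tree; a real tree would be constructed from training data.
tree = {
    "feature": "followers",
    "threshold": 1000,
    "yes": "organisation",
    "no": {
        "feature": "posts_per_day",
        "threshold": 20,
        "yes": "bot",
        "no": "person",
    },
}

print(classify(tree, {"followers": 10, "posts_per_day": 50}))
```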

Dynamic Feature

A feature derived as a result of measurement.

Features

Descriptive parameter(s) that enable the separation of a dataset into categories. It is hard to be specific about features because any property of an object (its physics, geography, biology, etc.) can be a potential feature.

Homophily

The propensity of individuals to connect with those who are similar to them. The concept is perhaps best captured by the popular idiom “birds of a feather flock together”. Many credit this social principle with shaping interpersonal relationships such as marriages and friendships.

K-Nearest Neighbour

A machine learning classifier. Based on previously mapped classified items, this classification method classifies new items by mapping them on the same graph and assigning each the class of its nearest neighbour(s). The “k” indicates the number of neighbours considered when determining a class.
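A minimal sketch of this procedure is shown below. It assumes Euclidean distance and a simple majority vote among the k nearest items; the function name and the sample points are hypothetical.

```python
import math
from collections import Counter

def knn_classify(labelled_points, new_point, k=3):
    """Assign new_point the most common class among its k nearest neighbours.

    labelled_points: list of (coordinates, class) pairs already "mapped".
    Distance is Euclidean, which is an assumption; other metrics are possible.
    """
    nearest = sorted(
        labelled_points,
        key=lambda pair: math.dist(pair[0], new_point),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical, already-classified items with two features each.
points = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
]
print(knn_classify(points, (0.5, 0.5), k=3))
```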

Latent Attribute

An attribute of an individual that is not overtly stated, either biographical (such as age, gender, geography) or personal (dietary choices, political affiliation).

Naive Bayes

A machine learning classifier. A probabilistic classification method that calculates the probability of an item in a dataset belonging to a certain class given its features.
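The probability calculation can be illustrated with a small word-based sketch. It assumes each item's features are a list of words, treats the words as independent given the class (the "naive" assumption), and uses add-one smoothing so unseen words do not zero out a class. The function names and training data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(items):
    """items: list of (words, label) pairs. Returns the counts needed for prediction."""
    class_counts = Counter(label for _, label in items)
    word_counts = defaultdict(Counter)
    for words, label in items:
        word_counts[label].update(words)
    vocab = {w for words, _ in items for w in words}
    return class_counts, word_counts, vocab

def predict(model, words):
    """Return the class with the highest (log) probability for the given words."""
    class_counts, word_counts, vocab = model
    total_items = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        # Start from the prior probability of the class.
        score = math.log(count / total_items)
        total_words = sum(word_counts[label].values())
        for w in words:
            # Add-one smoothed probability of the word given the class.
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training data.
training = [
    (["goal", "match"], "sport"),
    (["match", "team"], "sport"),
    (["vote", "election"], "politics"),
    (["election", "party"], "politics"),
]
model = train(training)
print(predict(model, ["match", "goal"]))
```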

Neural Network

A machine learning classifier. A layered classification method with each layer being composed of nodes. A function is applied to an input data item at each of these layers, the output feeding into subsequent layers as input. Depending on the input data, certain nodes will be signaled at each layer. Depending on the final node signaled, a class will be assigned.
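The layer-by-layer flow can be sketched with a tiny forward pass. The weights below are picked by hand purely for illustration (a real network would learn them during training), and the sigmoid activation is an assumption, since the definition does not name a specific function.

```python
import math

def layer(inputs, weights, biases):
    """Apply one layer: each node takes a weighted sum of the inputs, then a sigmoid."""
    return [
        1 / (1 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) + b)))
        for row, b in zip(weights, biases)
    ]

def predict(inputs, layers, classes):
    """Feed the inputs through every layer; the strongest final node decides the class."""
    activations = inputs
    for weights, biases in layers:
        activations = layer(activations, weights, biases)
    best = max(range(len(activations)), key=activations.__getitem__)
    return classes[best]

# Hypothetical hand-picked weights: one hidden layer and one output layer, two nodes each.
layers = [
    ([[4.0, -4.0], [-4.0, 4.0]], [0.0, 0.0]),
    ([[4.0, -4.0], [-4.0, 4.0]], [0.0, 0.0]),
]
print(predict([1.0, 0.0], layers, ["A", "B"]))
```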

N-grams

Prominent in natural language processing, n-grams are sequences of “n” consecutive words (bigrams contain 2 consecutive words, trigrams 3 consecutive words, etc.).
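As a brief illustration, the sketch below extracts word n-grams from a sentence. It assumes words are separated by whitespace, which is a simplification of real tokenization, and the function name is hypothetical.

```python
def ngrams(text, n):
    """Return all sequences of n consecutive words in text, as tuples."""
    words = text.split()  # whitespace tokenization is a simplifying assumption
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

print(ngrams("birds of a feather", 2))
# [('birds', 'of'), ('of', 'a'), ('a', 'feather')]
```

A text with fewer than n words yields an empty list.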

Static Feature

A feature that reflects inner properties of a data item.

Support Vector Machine

A machine learning classifier. This binary classification method plots items in a dataset on a graph based on their features. The SVM then tries to find the best way to separate the items in each class on the graph -- items plotted on certain parts of the graph fall in a certain class.
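For a rough sense of how such a separation can be found, the sketch below trains a linear SVM by repeatedly nudging a straight-line boundary whenever an item falls on the wrong side or too close to it (sub-gradient descent on the hinge loss). This is one simple training approach among several, not a production solver; the function names, parameter values, and data points are hypothetical.

```python
def train_linear_svm(points, labels, epochs=100, rate=0.1, reg=0.01):
    """Learn weights and a bias for a linear boundary; labels must be +1 or -1."""
    weights = [0.0] * len(points[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
            if margin < 1:
                # Item is misclassified or inside the margin: move the boundary.
                weights = [w + rate * (y * xi - reg * w) for w, xi in zip(weights, x)]
                bias += rate * y
            else:
                # Item is safely classified: only apply the regularisation shrinkage.
                weights = [w - rate * reg * w for w in weights]
    return weights, bias

def predict(model, x):
    """Items on one side of the boundary get class +1, the others class -1."""
    weights, bias = model
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias >= 0 else -1

# Hypothetical, linearly separable items with two features each.
model = train_linear_svm([(2, 2), (3, 3), (-2, -2), (-3, -3)], [1, 1, -1, -1])
print(predict(model, (4, 4)), predict(model, (-4, -4)))
```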