This contains brief descriptions of the key terms used in this survey.
Quantifies the degree to which an individual’s network exclusively consists of other similar or different individuals.
Fundamentally, it is a function that takes data with specific features and categorizes them based on these features.
A machine learning classifier. Used to classify items in a dataset with a tree model constructed from features. An item is classified by making a decision at each node of the tree on whether the item meets certain criteria; the item is classified when the decisions lead to a leaf of the tree.
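As a rough illustration of the node-by-node decisions described above, the following minimal sketch walks a small hand-built tree. The feature names (`posts_per_day`, `followers`), thresholds, and class labels are hypothetical and chosen only for the example; in practice the tree would be learned from data.

```python
# Hypothetical hand-built tree: internal nodes test one feature against a
# threshold, leaves carry a class label.
tree = {
    "feature": "posts_per_day", "threshold": 5,
    "below": {"label": "casual"},
    "above": {
        "feature": "followers", "threshold": 1000,
        "below": {"label": "active"},
        "above": {"label": "influencer"},
    },
}

def classify(node, item):
    """Walk from the root to a leaf, making one decision per node."""
    while "label" not in node:
        branch = "above" if item[node["feature"]] > node["threshold"] else "below"
        node = node[branch]
    return node["label"]
```

Calling `classify(tree, {"posts_per_day": 9, "followers": 10})` follows the "above" branch at the root and the "below" branch at the second node.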
A feature derived as a result of measurement.
Descriptive parameter(s) that enable the separation of a dataset into categories. It is hard to be specific about features because any property of an object (its physics, geography, biology, etc.) can be a potential feature.
The propensity of individuals to connect with those who are similar to them. Perhaps the concept is best captured by the popular idiom “birds of a feather flock together”. Many have credited this social principle with shaping various interpersonal relationships such as marriages and friendships.
A machine learning classifier. Given previously classified items mapped on a graph, this method classifies a new item by mapping it on the same graph and assigning it the class of its nearest neighbour(s). The “k” indicates the number of neighbours it considers when determining a class.
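The nearest-neighbour vote can be sketched in a few lines. This is only an illustration under simple assumptions: items are 2D points, distance is Euclidean, and the helper name `knn_classify` and the toy data are made up for the example.

```python
from collections import Counter
import math

def knn_classify(labelled, new_point, k=3):
    """labelled: list of ((x, y), class_label) pairs that are already classified."""
    # Sort the known items by their distance to the new item.
    by_distance = sorted(labelled, key=lambda pair: math.dist(pair[0], new_point))
    # The k closest items vote; the most common class wins.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

labelled = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
            ((5.0, 5.0), "B"), ((5.2, 4.9), "B")]
```

With this data, a new point near (1, 1) is assigned class "A" and a point near (5, 5) is assigned class "B".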
An attribute of an individual that is not overtly stated, either biographical (such as age, gender, geography) or personal (dietary choices, political affiliation).
A machine learning classifier. A probabilistic classification method that calculates the probability of an item in a dataset belonging to a certain class based on its features.
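One common way to compute such class probabilities is a naive Bayes calculation, which treats features as independent given the class. The sketch below assumes that approach, categorical features with two possible values each, and add-one smoothing; the function names and toy rows are hypothetical.

```python
from collections import Counter, defaultdict

def train(rows):
    """rows: list of (features_tuple, label). Returns the counts used for scoring."""
    class_counts = Counter(label for _, label in rows)
    feature_counts = defaultdict(Counter)  # (label, position) -> value counts
    for features, label in rows:
        for i, value in enumerate(features):
            feature_counts[(label, i)][value] += 1
    return class_counts, feature_counts

def posteriors(model, features):
    """Probability of each class given the item's features."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        p = n / total  # prior probability of the class
        for i, value in enumerate(features):
            # Add-one smoothing, assuming two possible values per feature.
            p *= (feature_counts[(label, i)][value] + 1) / (n + 2)
        scores[label] = p
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}

rows = [(("yes", "urban"), "A"), (("yes", "urban"), "A"),
        (("no", "rural"), "B"), (("no", "urban"), "B")]
model = train(rows)
```

Here `posteriors(model, ("yes", "urban"))` returns one probability per class, and the item would be assigned the class with the larger value.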
A machine learning classifier. A layered classification method with each layer being composed of nodes. A function is applied to an input data item at each of these layers, the output feeding into subsequent layers as input. Depending on the input data, certain nodes will be signaled at each layer. Depending on the final node signaled, a class will be assigned.
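The layer-by-layer flow described above can be sketched as a forward pass. This is a minimal illustration only: the weights are fixed by hand rather than learned, the sigmoid activation is an assumption, and the names `layer`, `classify`, and `layers` are hypothetical.

```python
import math

def layer(inputs, weights, biases):
    """One layer: each node takes a weighted sum of its inputs and applies a sigmoid."""
    return [1 / (1 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) + b)))
            for row, b in zip(weights, biases)]

def classify(item, layers, classes):
    activations = item
    for weights, biases in layers:
        # The output of one layer is the input of the next.
        activations = layer(activations, weights, biases)
    # The most strongly activated output node determines the class.
    best = max(range(len(activations)), key=activations.__getitem__)
    return classes[best]

layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),  # hidden layer, 2 nodes
    ([[2.0, -2.0], [-2.0, 2.0]], [0.0, 0.0]),  # output layer, one node per class
]
```

With these hand-set weights, `classify([3.0, 0.0], layers, ["first", "second"])` activates the first output node most strongly, while `[0.0, 3.0]` activates the second.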
Prominent in natural language processing, these are sequences of “n” consecutive words (bigrams contain 2 consecutive words, trigrams 3 consecutive words, etc.).
A feature that reflects inner properties of a data item.
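For a concrete picture, the sketch below extracts word sequences of length n from a sentence. It assumes words are separated by whitespace; the function name `ngrams` is chosen for the example.

```python
def ngrams(text, n):
    """Return every run of n consecutive words in text, in order."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

bigrams = ngrams("birds of a feather flock together", 2)
trigrams = ngrams("birds of a feather flock together", 3)
```

The six-word sentence yields five bigrams, starting with `("birds", "of")`, and four trigrams.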
A machine learning classifier. This binary classification method plots items in a dataset on a graph based on their features. The SVM then tries to find the best way to separate the items of each class on the graph, so that items plotted on a given part of the graph fall in a given class.
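A rough sketch of the idea, assuming a linear separator in two dimensions trained by subgradient descent on the regularised hinge loss (one of several ways to fit an SVM). The function names, learning rate, regularisation strength, and toy points are all hypothetical choices for the example.

```python
def train_linear_svm(points, labels, lr=0.01, lam=0.01, epochs=2000):
    """Subgradient descent on the regularised hinge loss; labels must be +1 or -1."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            # Shrink w slightly (regularisation), which favours a wide margin.
            w = [wi * (1 - lr * lam) for wi in w]
            if margin < 1:
                # Item is on the wrong side or too close to the boundary: push it away.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """The side of the line w.x + b = 0 on which x falls decides its class."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

points = [(1, 1), (2, 1), (1, 2), (5, 5), (6, 5), (5, 6)]
labels = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(points, labels)
```

After training, `predict(w, b, x)` assigns a class according to which side of the learned line the point `x` lies on.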