Data Sets and Their Types

Data Set - a collection of data objects.

The attributes of a data object can be of different types: quantitative or qualitative.

Other names for a data object are record, point, vector, pattern, event, case, sample, observation, or entity.

Data objects are described by a number of attributes that capture the basic characteristics of an object.

Other names for an attribute are variable, characteristic, field, feature, or dimension.

An attribute is a property or characteristic of an object that may vary, either from one object to another or from one time to another.

A measurement scale is a rule (function) that associates a numerical or symbolic value with an attribute of an object.

The properties of an attribute need not be the same as the properties of the values used to measure it.
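A short illustration of this point, as a sketch with made-up values: employee IDs and ages can both be stored as integers, but only the ages support arithmetic that means something. The variable names and numbers below are hypothetical.

```python
# Both attributes are measured with integers, but they have different properties.
employee_ids = [101, 205, 309]  # made-up IDs: the integers only act as names
ages = [25, 40, 31]             # made-up ages: the integers are true quantities

# The average age is a meaningful summary of the objects.
mean_age = sum(ages) / len(ages)

# The average ID can be computed, because integers allow it,
# but the result says nothing about the employees.
mean_id = sum(employee_ids) / len(employee_ids)

print(mean_age, mean_id)
```

The integers allow the second average, but the attribute itself does not: the values have properties (order, addition) that the employee ID attribute lacks.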

Four types of attributes:

Categorical (Qualitative)

Nominal

Values are just different names; they only distinguish one object from another.

E.g. zip codes, employee IDs

Ordinal

Values provide enough information to order objects.

E.g. grades, ratings

Numeric (Quantitative)

Interval

Differences between values are meaningful.

E.g. calendar dates, temperatures in Celsius or Fahrenheit

Ratio

Both differences and ratios are meaningful.

E.g. age, length
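One way to keep the four types apart is by the operations that are meaningful on their values, where each type also supports the operations of the types before it. The sketch below is only an illustration; the names `AttributeType`, `_NEW_OPS`, `meaningful_ops`, and `supports` are hypothetical and do not come from any library.

```python
from enum import Enum


class AttributeType(Enum):
    # Defined in order of increasing "power".
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Operations that each type adds on top of the previous types.
_NEW_OPS = {
    AttributeType.NOMINAL: {"==", "!="},  # distinctness
    AttributeType.ORDINAL: {"<", ">"},    # order
    AttributeType.INTERVAL: {"+", "-"},   # meaningful differences
    AttributeType.RATIO: {"*", "/"},      # meaningful ratios
}


def meaningful_ops(attr_type):
    """Collect the operations of attr_type and of every type before it."""
    ops = set()
    for t in AttributeType:  # Enum members iterate in definition order
        ops |= _NEW_OPS[t]
        if t is attr_type:
            break
    return ops


def supports(attr_type, op):
    """Return True if op is meaningful for values of attr_type."""
    return op in meaningful_ops(attr_type)
```

For example, `supports(AttributeType.INTERVAL, "/")` is `False`: a ratio of two calendar dates has no meaning, while a difference does.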

Values of an Attribute

Discrete

A discrete attribute has a finite or countably infinite set of values. Discrete attributes are often represented using integer variables.

Binary attributes are a special case of discrete attributes and assume only two values (e.g. yes/no, 0/1).

Continuous

A continuous attribute has real numbers as its values. Continuous attributes are typically represented as floating-point variables.

Types of Datasets

Record data
Graphical data
Ordered data

Characteristics of Datasets

Dimensionality

The number of attributes that the objects in the dataset possess.

The difficulties associated with analyzing high-dimensional data are sometimes referred to as the curse of dimensionality.

Sparsity

In a sparse dataset most attribute values are zero; in many cases, fewer than 1% of the entries are non-zero.

Resolution

It is often possible to obtain data at different levels of resolution, and the properties of the data are often different at different resolutions.
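Dimensionality and sparsity can both be measured directly on a small data matrix. The sketch below assumes the dataset is stored as a list of equal-length rows; the function names `dimensionality` and `nonzero_fraction` and the sample matrix are hypothetical.

```python
def dimensionality(data_matrix):
    """Number of attributes (columns) per object, assuming equal-length rows."""
    return len(data_matrix[0]) if data_matrix else 0


def nonzero_fraction(data_matrix):
    """Fraction of entries that are non-zero; a small value means sparse data."""
    total = sum(len(row) for row in data_matrix)
    if total == 0:
        return 0.0
    nonzero = sum(1 for row in data_matrix for value in row if value != 0)
    return nonzero / total


# Made-up data: 3 objects, 4 attributes, 2 non-zero entries.
sample = [
    [0, 0, 5, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]

print(dimensionality(sample))    # number of attributes
print(nonzero_fraction(sample))  # share of non-zero entries
```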

Record Data

A collection of records (data objects), each of which consists of a fixed set of data fields (attributes).

Types of Record data

Transaction or Market Basket data
Data Matrix
Sparse Data Matrix
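The three forms of record data are closely related: transaction data can be rewritten as a data matrix with one binary column per item, and that matrix can in turn be stored sparsely by keeping only the non-zero entries. The sketch below uses made-up transactions; `to_binary_matrix` and `to_sparse` are hypothetical helper names.

```python
# Made-up market basket data: each transaction is a set of purchased items.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk"],
]


def to_binary_matrix(transactions):
    """Turn transactions into a data matrix with one 0/1 column per item."""
    items = sorted({item for t in transactions for item in t})
    column = {item: j for j, item in enumerate(items)}
    matrix = []
    for t in transactions:
        row = [0] * len(items)
        for item in t:
            row[column[item]] = 1
        matrix.append(row)
    return items, matrix


def to_sparse(matrix):
    """Keep only the non-zero entries, keyed by (row, column)."""
    return {
        (i, j): value
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value != 0
    }


items, matrix = to_binary_matrix(transactions)
sparse = to_sparse(matrix)
```

With many distinct items and short transactions, most entries of the matrix are zero, which is why the sparse form is usually the more economical one.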

Graphical data

Data with relationships among objects: the graph captures relationships among data objects.

Data with objects that are graphs: if the objects contain sub-objects that have relationships, then such objects are frequently represented as graphs.
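As a minimal sketch of the first case, relationships among data objects can be stored as an adjacency list that maps each object to the objects it is linked to. The object names and the helper `build_adjacency` are hypothetical, and the links are treated as undirected for simplicity.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency list from (object, object) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


# Made-up relationships between three data objects.
edges = [("page_a", "page_b"), ("page_b", "page_c")]
graph = build_adjacency(edges)

print(sorted(graph["page_b"]))  # neighbours of page_b
```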

Ordered data

Data in which the attributes have relationships that involve order in time or space.

Types of Ordered data

Sequential data (Temporal data)

Record data in which each record has a time associated with it.

Sequence data

A data set that is a sequence of individual entities, such as a sequence of words or letters.

There are no time stamps; instead, there are positions in an ordered sequence.

Time Series data

A special type of sequential data in which each record is a time series, i.e. a series of measurements taken over time.
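The difference between these three kinds of ordered data is easiest to see side by side. The sketch below uses made-up values throughout; the variable names are hypothetical.

```python
from datetime import date

# Sequential (temporal) data: ordinary records, each with a time attached.
purchases = [
    (date(2024, 1, 3), "customer_1", ["bread", "milk"]),
    (date(2024, 1, 1), "customer_2", ["beer"]),
]
by_time = sorted(purchases, key=lambda record: record[0])

# Sequence data: no time stamps, only positions in an ordered sequence.
sequence = "GGTTCCGC"
positions = list(enumerate(sequence))  # (position, entity) pairs

# Time series data: a single record is itself a series of measurements over time.
monthly_temperature = [(1, 2.5), (2, 4.0), (3, 8.5)]  # (month, value) pairs
change = monthly_temperature[-1][1] - monthly_temperature[0][1]
```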

Spatial data

Some objects have spatial attributes, such as positions or areas, as well as other types of attributes.

The Data Ilm
