Introduction to Data Structures

  • A data structure is a particular way of organizing data in a computer so that it can be used effectively. The idea is to reduce the space and time complexities of different tasks.
  • The data structure is a storage that is used to store and organize data. It is a way of arranging data on a computer so that it can be accessed and updated efficiently.
  • Depending on your requirement and project, it is important to choose the right data structure for your project. For example, if you want to store data sequentially in the memory, then you can go for the Array data structure.

Types of Data Structure

Linear data structures

  • In linear data structures, the elements are arranged in sequence one after the other. Since elements are arranged in a particular order, they are easy to implement.
  • However, when the complexity of the program increases, the linear data structures might not be the best choice because of operational complexities.
  • Types of Linear Data Structures are

  • Array Data Structure

    • In an array, elements in memory are arranged in continuous memory. All the elements of an array are of the same type. And, the type of elements that can be stored in the form of arrays is determined by the programming language.
    • Note:
      • Array indices always start from 0. That is, the first element of an array is at index 0.
      • If the size of an array is n, then the last element of the array will be at index n-1.

  • Stack Data Structure

    • In the stack data structure, elements are stored in the LIFO principle. That is, the last element stored in a stack will be removed first.
    • It works just like a pile of plates where the last plate kept on the pile will be removed first. To learn more, visit Stack Data Structure.
    • There are some basic operations that allow us to perform different actions on a stack.
      • Push: Add an element to the top of a stack
      • Pop: Remove an element from the top of a stack
      • IsEmpty: Check if the stack is empty
      • IsFull: Check if the stack is full
      • Peek: Get the value of the top element without removing it

  • Queue Data Structure

    • Unlike stack, the queue data structure works in the FIFO principle where the first element stored in the queue will be removed first.
    • It works just like a queue of people at the ticket counter where the first person on the queue will get the ticket first. To learn more, visit Queue Data Structure.
    • A queue is an object (an abstract data structure - ADT) that allows the following operations:
      • Enqueue: Add an element to the end of the queue
      • Dequeue: Remove an element from the front of the queue
      • IsEmpty: Check if the queue is empty
      • IsFull: Check if the queue is full
      • Peek: Get the value of the front of the queue without removing it
    • There are four different types of queues:
      • Simple Queue: - In a simple queue, insertion takes place at the rear and removal occurs at the front. It strictly follows the FIFO (First in First out) rule.
      • Circular Queue: - In a circular queue, the last element points to the first element making a circular link. The main advantage of a circular queue over a simple queue is better memory utilization. If the last position is full and the first position is empty, we can insert an element in the first position. This action is not possible in a simple queue.
      • Priority Queue: - A priority queue is a special type of queue in which each element is associated with a priority and is served according to its priority. If elements with the same priority occur, they are served according to their order in the queue.
      • Double Ended Queue: - In a double-ended queue, insertion, and removal of elements can be performed from either the front or rear. Thus, it does not follow the FIFO (First In First Out) rule.

  • Linked List Data Structure

    • In the linked list data structure, data elements are connected through a series of nodes. And, each node contains the data items and address to the next node.
    • You have to start somewhere, so we give the address of the first node a special name called HEAD. Also, the last node in the linked list can be identified because its next portion points to NULL.
    • Here's a list of basic linked list operations
      • Traversal - access each element of the linked list
      • Insertion - adds a new element to the linked list
      • Deletion - removes the existing elements
      • Search - find a node in the linked list
      • Sort - sort the nodes of the linked list
    • There are three common types of Linked List.
      • Singly Linked List: - It is the most common. Each node has data and a pointer to the next node.
      • Doubly Linked List: - We add a pointer to the previous node in a doubly-linked list. Thus, we can go in either direction: forward or backward.
      • Circular Linked List: - A circular linked list is a variation of a linked list in which the last element is linked to the first element. This forms a circular loop. A circular linked list can be either singly linked or doubly linked.
        • for a singly linked list, the next pointer of the last item points to the first item
        • In the doubly linked list, the prev pointer of the first item points to the last item as well.

Non-linear data structures

  • Unlike linear data structures, elements in non-linear data structures are not in any sequence. Instead, they are arranged in a hierarchical manner where one element will be connected to one or more elements.
  • Non-linear data structures are further divided into a graph and tree-based data structures.

  • Graph Data Structure

    • In the graph data structure, each node is called vertex and each vertex is connected to other vertices through edges.
    • Graph Terminology
      • Adjacency: A vertex is said to be adjacent to another vertex if there is an edge connecting them. Vertices 2 and 3 are not adjacent because there is no edge between them.
      • Path: A sequence of edges that allows you to go from vertex A to vertex B is called a path. 0-1, 1-2, and 0-2 are paths from vertex 0 to vertex 2.
      • Directed Graph: A graph in which an edge (u,v) doesn't necessarily mean that there is an edge (v, u) as well. The edges in such a graph are represented by arrows to show the direction of the edge.
    • Graph Operations
      • Check if the element is present in the graph
      • Graph Traversal 
      • Add elements(vertex, edges) to graph
      • Finding the path from one vertex to another
    • Graph Representation: - Graphs are commonly represented in two ways:

    • Adjacency Matrix

      • An adjacency matrix is a 2D array of V x V vertices. Each row and column represent a vertex.
      • If the value of any element a[i][j] is 1, it represents that there is an edge connecting vertex i and vertex j.
      • Since it is an undirected graph, for edge (0,2), we also need to mark edge (2,0); making the adjacency matrix symmetric about the diagonal.
      • Edge lookup(checking if an edge exists between vertex A and vertex B) is extremely fast in adjacency matrix representation but we have to reserve space for every possible link between all vertices(V x V), so it requires more space.

    • Adjacency List

      • An adjacency list represents a graph as an array of linked lists.
      • The index of the array represents a vertex and each element in its linked list represents the other vertices that form an edge with the vertex.
      • The adjacency list for the graph we made in the first example is as follows: 
      • An adjacency list is efficient in terms of storage because we only need to store the values for the edges. For a graph with millions of vertices, this can mean a lot of saved space.

  • Trees Data Structure

    • Similar to a graph, a tree is also a collection of vertices and edges. However, in the tree data structure, there can only be one edge between two vertices.
    • Tree Terminologies
      • Node: - A node is an entity that contains a key or value and pointers to its child nodes. The last nodes of each path are called leaf nodes or external nodes that do not contain a link/pointer to child nodes. The node having at least a child node is called an internal node.
      • Edge: - It is the link between any two nodes. 
      • Root: - It is the topmost node of a tree.
      • Height of a node: - The height of a node is the number of edges from the node to the deepest leaf (ie. the longest path from the node to a leaf node).
      • Depth of a Node: - The depth of a node is the number of edges from the root to the node.
      • Height of a Tree: - The height of a Tree is the height of the root node or the depth of the deepest node.
      • Degree of a Node: - The degree of a node is the total number of branches of that node.
      • Forest: - A collection of disjoint trees is called a forest.
    • Types of Tree
      • Binary Tree
      • Binary Search Tree
      • AVL Tree
      • B-Tree

Linear Vs Non-linear Data Structures

What are Algorithms?

  • The word Algorithm means “a process or set of rules to be followed in calculations or other problem-solving operations”. Therefore Algorithm refers to a set of rules/instructions that step-by-step define how a work is to be executed upon in order to get the expected results.
  • Characteristics of an Algorithm
    • Clear and Unambiguous: The algorithm should be clear and unambiguous. Each of its steps should be clear in all aspects and must lead to only one meaning.
    • Well-Defined Inputs: If an algorithm says to take inputs, it should be well-defined inputs.
    • Well-Defined Outputs: The algorithm must clearly define what output will be yielded and it should be well-defined as well.
    • Finite-ness: The algorithm must be finite, i.e. it should not end up in infinite loops or similar.
    • Feasible: The algorithm must be simple, generic, and practical, such that it can be executed upon with the available resources. It must not contain some future technology or anything.
    • Language Independent: The Algorithm designed must be language-independent, i.e. it must be just plain instructions that can be implemented in any language, and yet the output will be the same, as expected.

Space Complexity

  • The space complexity of an algorithm refers to the amount of memory that this algorithm requires to execute and get the result. This can be for inputs, temporary operations, or outputs.
  • How to calculate Space Complexity?
    • The space complexity of an algorithm is calculated by determining the following 2 components:
    • Fixed Part: This refers to the space that is definitely required by the algorithm. For example, input variables, output variables, program size, etc.
    • Variable Part: This refers to the space that can be different based on the implementation of the algorithm. For example, temporary variables, dynamic memory allocation, recursion stack space, etc.

Time Complexity

  • The time complexity of an algorithm refers to the amount of time that this algorithm requires to execute and get the result. This can be for normal operations, conditional if-else statements, loop statements, etc.
  • How to calculate Time Complexity?
    • The time complexity of an algorithm is also calculated by determining the following 2 components:
    • Constant time part: Any instruction that is executed just once comes in this part. For example, input, output, if-else, switch, etc.
    • Variable Time Part: Any instruction that is executed more than once, say n times, comes in this part. For example, loops, recursion, etc.

Comments

Popular posts from this blog

Advanced Data Structures

Sorting Algorithms