Data Structures
Almost all programing languages have primitive datatypes like: Boolean, String, Integer, Float, Null/None, etc.
But what if you want to store lots of booleans or lots of strings in just one structure? For example: If you have a list of todo's (strings) you would want to put them together in a list like this:
1todos = ['make the bed', 'clean the house', 'pay taxes']
For these types of situations the world of computer science has created more complicated data-types that are called Data Structures
.
There are many publicly known data-structures and you can also create your own, but lets start by reviewing the most popular ones:
Type | What is it | Purpuse and Efficency |
---|---|---|
Array or List | It's a way to have many values or the same type under just one variable | Grouping values, relating them together, to make them more accesible like we did with the todo list above. The computer will reserve successive spaces in Memory for all values making them faster to access |
Matrix | It's a two (or more) dimensional array where you access elements using 2 positions like this: print(matrix[0][1]) | Matrices are ideal for coordinates; you can easily do a cartesian graph |
Stack | Stack is a linear data structure, you can think of it like an Array but with two additional methods: stack.pop() that removes the last element and stack.push() that adds one element at the end. LIFO(Last In First Out)/FILO(First In Last Out) order. | Being linear, makes it very efficient in memory because the computer only writes on the edges of the structure |
Queue | The queue is like a stack but it can get the element at the first position: FIFO(First in First Out) | It's a little less efficient that the stack, but still pretty fast and used in many real life scenarios |
Hash Table | Hash table is like an array but you can use letters to identify the positions in the list. E.g: print(person["name"]) . In Python we call them Dictonaries, in Javascript we call them Object Literals. | Being able to access elements using a string key is ideal for some situations. For example: we could have one structure for each language and print both languages like this: print(german["Hello World"]) and print(spanish["Hello World"]) |
Graph | Graphs are data structures where any element (called node) can have a pointer to any other node, you can use graphs to represent hiearchies, streets in a city, etc. | Graphs are very efficient because they let you point directly to other nodes instead of having to loop in a sequence, for example Google Maps Traffic uses graphs to calculate the Estimated Time of Arrival |
Tree | Trees are a type of Graph with hierarchy (parents and childs), everything starts at the top node and you can use node.childs() to get the childs of any element and keep drilling down the hierarchy. | Trees are efficient in many scenarios where real life hierarchies are present like: Representing a family, The File Directory of any computer, Website Menu, etc. |
☝️ There are other types of data structures and you can also create your own structures but we are covering these cases because they represent the strong mayority of the real-life situations you are going to encouncer while coding.
Every programming language has a different array implementation but the most basic and original idea was to have a very low level efficient way to store related values. Arrays are an efficient way of storing because:
Here is how you can manipulate arrays in almost any programing language:
1# retrieving a value 2first_todo = todos[0] 3 4# reseting the value on the 3rd position 5todos[2] = "buy food" 6 7# adding a new value to the end of the list 8todos.append("Call mom")
If you store arrays within arrays you get a Matrix, e.g:
1# tic tac toe board 2board = [ 3 [ 0,0,0 ], 4 [ 0,0,0 ], 5 [ 0,0,0 ] 6]
In the example above we replicated a Tic Tac Toe board using a matrix (two dimensional array), if we wanted to re-set any of its value you would have to do somethign like this:
1board[0][1] = "x" 2 3# the board will then look like this: 4board = [ 5 [ 0,1,0 ], 6 [ 0,0,0 ], 7 [ 0,0,0 ] 8]
There are plenty of things you can represent in a matrix: Maps, Cartesian Graphs, Game Boards, Blueprints, etc.
Stacks are great because they are also very fast and efficient. Stack follow the FILO rule: First In Last Out.
In real life there are plenty of situations that replicate a stack:
1my_stack = [] 2 3my_stack.append("A") 4my_stack.append("W") 5my_stack.append("F") 6 7# If I want to retrieve A I will have to pop (remove from the end) F and W first 8f = my_stack.pop() # removes F and returns it 9w = my_stack.pop() # removes W and returns it 10a = my_stack.pop() # removes A and returns it 11
Just like the stack, the queue has many real life scenarios like:
The Queue its like a Stack but with a FIFO behavior: First In First Out.
1my_queue = [] 2 3my_queue.append("A") 4my_queue.append("W") 5my_queue.append("F") 6 7# A will be the first one to be retrieved because it was the first to enter 8a = my_queue.pop(0) # dequeue A and returns it 9w = my_queue.pop(0) # dequeue W and returns it 10f = my_queue.pop(0) # dequeue F and returns it 11
Some of the most popular hash table implementations in hash tables are:
1language_hash_table = { 2 "hello world": { 3 "german": "Hallo Welt", 4 "spanish": "Hola Mundo" 5 } 6} 7 8# Now we can get any language instantly based on english: 9print(language_hash_table["hello world"]["german"]) 10print(language_hash_table["hello world"]["spanish"]) 11
Graphs are a whole new way of storing accessing data, now instead of having an array or list in a sequence, you can have elements that point to each other.
Graphs are ideal to represent relationships, hierarchies, and connections, for example:
A graph can be represented using a Hash Table like this:
1graph = { 2 'A': ['B', 'C'], 3 'B': ['C', 'D'], 4 'C': ['D'], 5 'D': ['C'], 6 'E': ['F'], 7 'F': ['C'] 8}
You can think about Trees like a subtype of Graph, because tree nodes are conected to each other, but the only type of conection that can exist is Parent -> Child relationthip.
Trees are used in a variety of situations like:
A graph can be represented using a Hash Table like this:
1small_family = { 2 'A': { 3 "children": ['C', 'D'] 4 }, 5 'B': { 6 "children": ['C', 'D'] 7 } 8 'C': { 9 "children": ['E'] 10 } 11}
Note: A and B are probably spouses, C is A & B children and E is C's child.