Things to Consider Before choosing a Data structure

Things to Consider Before choosing a Data structure

In our last article, we emphasize the important influence of data organization on the performance of algorithms. From the article, we understood that all programs operate on data and consequently the way the data is organized can have a profound effect on every aspect of the final solution. We also listed some of the data structures. But it is easier to get confused about when and how to use these data structures. This episode seeks to clarify that.

One of the most important decisions we have to make in formulating a computer solution to a problem is the choice of appropriate data structures. In particular, an inappropriate choice of data structure might lead to clumsy, inefficient, and difficult implementation. Meanwhile, an appropriate choice usually leads to a more simple, transparent, and efficient implementation. This means a key to effectively solving many problems boils down to making appropriate choices about the associated data structure.

A small change in data organization can have a significant influence on the algorithm required to solve the problem. Hence, the need to re-analyze to get to know our data structure more. That being said, let take a look at some major factors that determine when to use this data structure.

To have an idea of when you should use any of these data structures, you need to understand the approach to do some major Traversal to them. This information is useful when deciding the appropriate data structure for your algorithm.

For instance, In linear data structures, each element is connected to either one or two more elements (the next and previous). Traversal in these structures is linear. This means that insertion, deletion, and search work is in O(n). Arrays, linked lists, stacks, and queues are all examples of linear data structures. Non-linear data structures, the exact opposite of linear data structures, each element can be connected to several other data elements. Traversal is not linear; hence, search, insertion, and deletion can work in O(log n) or even O(1) time. Trees, graphs, and hash tables are all non-linear data structures.

When selecting a data structure to solve a problem, you should follow these steps.

  1. Analyze your problem to determine the basic operations that must be supported. What are you going to do often? Examples of basic operations include inserting a data item into the data structure, deleting a data item from the data structure, and finding a specified data item.

  2. Quantify the resource constraints for each operation. i.e., consider the amount of data you'd have to store or if it is predictable.

  3. Next, consider what is more important? For instance, if the search speed is more important than the insertion speed, you'd likely use an ordered array and vice-versa.

  4. Select the data structure that best meets these requirements.

Now, let's consider an example, using the same list of numbers in our previous article; 27, 31, 42, 26, 10, 44, 35, 19, 33, 14. Let's try to find the maximum difference between the numbers.

To choose a data structure, while following the steps above; the basic operation would be to find the maximum and minimum numbers to know their difference.

Now, let's say we know that the amount of data cannot be more than 100, then our data is predictable. We can use a data structure like an array or linked list.

Considering what is more important will further help sieve our choice of data structure. In this case, since we are looking for the minimum and maximum value, searching is more important. So an ordered array will be the correct data structure to use.

When next you need to use a data structure, remember that: only when a data structure is symbolically linked with an algorithm can we expect high performance.

I don't mean for this particular episode to be too long. In our subsequent article, we shall, in no particular order, have a look at some traversal operations on one of our data structures as well as their pros and cons.