Data for the Masses Part 3: Do I NEED a Server?
This is the final part of a series that introduces, in a non-technical way, what databases are and offers some opinions on the available database technologies, why you should use them, and when. In the previous post, we compared different SQL technologies and explored why we would choose one over another. In this blog post, we tackle questions of scale and answer the question “Do I need a server?”
The goal of this blog post is to give you an idea of scale, so you can narrow down your database decisions. To answer the titular question: No. You do not need a server. You can run MySQL on a Raspberry Pi and get as much performance as, or more than, you would from some 10U mega-server whose monthly maintenance costs more than the Pi itself. I say that not only for shock value, but because it’s probably true.
The questions you must ask yourself in the context of your project/product are:
- How much data do I need to save?
- How much data do I need to access?
- How often do I need to access that data?
- How many connections need to edit the same data simultaneously?
The answers to these questions can provide you with an idea of scale.
To give some examples of what can be considered a lot of data, here are some possible scenarios:
- You capture rotational data triggered by encoder counts for 10 inputs at varying speeds, and the motor runs 24/7.
  - Getting instantaneous data for an average of 100 rotations once a day for the maintenance period of the motor is not a lot of data.
  - Streaming all the data to a database for the duration of the maintenance period is a great deal of data.
- You have a quality control system that performs tests on about 50 products a day and stores information for 100 or so inputs.
  - Storing the data is easy. For this use case, there is generally no upper limit, and because product information is often archived and rarely accessed once the product has been sold, most of the data sees very little access.
  - If you have a highly distributed company that handles millions of products a day, where data must be accessed both by logistics and by a number cruncher running efficiency statistics, a server may be required to serve all those connections.
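To put rough numbers on the scenarios above, here is a back-of-the-envelope sketch in Python. The post does not give an encoder resolution, a motor speed, or a storage size per reading, so the constants below (1000 counts per rotation, 1000 RPM, 8 bytes per value) are purely illustrative assumptions. Swap in your own figures.

```python
# All constants below are illustrative assumptions, not figures from the post.
BYTES_PER_VALUE = 8          # assume each reading is stored as a 64-bit number
COUNTS_PER_ROTATION = 1000   # hypothetical encoder resolution
MOTOR_RPM = 1000             # hypothetical average motor speed
MINUTES_PER_DAY = 24 * 60


def bytes_per_day(samples_per_day, inputs, bytes_per_value=BYTES_PER_VALUE):
    """Raw payload per day, ignoring timestamps, indexes, and storage overhead."""
    return samples_per_day * inputs * bytes_per_value


def human(n_bytes):
    """Format a byte count using decimal units (1 kB = 1000 B)."""
    size = float(n_bytes)
    for unit in ("B", "kB", "MB", "GB"):
        if size < 1000:
            return f"{size:.1f} {unit}"
        size /= 1000
    return f"{size:.1f} TB"


def scenarios():
    """Daily raw data volume for the three scenarios, under the assumptions above."""
    return {
        # 100 rotations captured once a day, 10 inputs
        "motor snapshot": bytes_per_day(100 * COUNTS_PER_ROTATION, inputs=10),
        # every encoder count streamed around the clock, 10 inputs
        "motor streaming": bytes_per_day(
            MOTOR_RPM * MINUTES_PER_DAY * COUNTS_PER_ROTATION, inputs=10
        ),
        # 50 products a day, 100 inputs each
        "quality control": bytes_per_day(50, inputs=100),
    }


if __name__ == "__main__":
    for name, size in scenarios().items():
        print(f"{name}: {human(size)} per day")
```

Under these assumed figures, the daily snapshot comes to about 8 MB, the quality control system to about 40 kB, and continuous streaming to roughly 115 GB per day. The exact numbers matter less than the orders of magnitude between them.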
Once you have an idea of the data, it’s important to have a clear idea of how you plan to use it. While opening 100 MB worth of ASCII data in an Excel spreadsheet is sometimes asking a lot, the same process in a database is a walk in the park.
A database can generally serve data fast enough that caching or buffering the data is unnecessary. The crux is when multiple people (or applications) need write access to the same data, something that isn’t a normal use case for Excel and is often a basic limitation of file-based data storage. SQL databases can handle this, but to make the SQL service accessible remotely, a server may be the most reasonable solution.
Databases can significantly complicate the software development effort, especially the first time you use one, but they offer significant long-term benefits. When implemented early in the project life cycle, they can easily fall into the category of over-developing or future-proofing an application.
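As a minimal sketch of several writers sharing one database on a single machine, the Python example below uses SQLite. SQLite is chosen only because it ships with Python; it is a stand-in, not a recommendation from this series. The table name `readings`, the helper names, and the writer counts are all hypothetical. Each thread opens its own connection, the way a separate application would.

```python
import os
import sqlite3
import tempfile
import threading


def writer(db_path, source, n_rows):
    """Insert n_rows rows through a private connection (hypothetical helper)."""
    # isolation_level=None lets us control transactions explicitly;
    # timeout=30 makes a writer wait up to 30 s for the lock instead of failing.
    conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
    try:
        for i in range(n_rows):
            # BEGIN IMMEDIATE takes the write lock up front, so writers queue up.
            conn.execute("BEGIN IMMEDIATE")
            conn.execute(
                "INSERT INTO readings (source, value) VALUES (?, ?)",
                (source, float(i)),
            )
            conn.execute("COMMIT")
    finally:
        conn.close()


def run_demo(n_writers=4, n_rows=25):
    """Run several concurrent writers and return (total rows, rows per writer)."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")

        setup = sqlite3.connect(db_path)
        setup.execute(
            "CREATE TABLE readings (id INTEGER PRIMARY KEY, source TEXT, value REAL)"
        )
        setup.commit()
        setup.close()

        threads = [
            threading.Thread(target=writer, args=(db_path, f"writer-{k}", n_rows))
            for k in range(n_writers)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        check = sqlite3.connect(db_path)
        total = check.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
        per_source = dict(
            check.execute("SELECT source, COUNT(*) FROM readings GROUP BY source")
        )
        check.close()
        return total, per_source


if __name__ == "__main__":
    print(run_demo())
```

Every row from every writer should end up in the table, with the database doing the coordination that a shared spreadsheet or flat file would leave to you. This only covers writers on the same machine; reaching the data from other machines is where a database service on a server comes in.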
This concludes the three-part blog series on databases. We hope you have learned something new!
- By Evan-Amos - Own work, Public Domain, https://commons.wikimedia.org/w/index.php?curid=56262833