Reliable data storage is a challenge every business faces, and as the volume of information grows, so do the demands placed on that storage. To get the most out of your information, you need a storage system.
This article explains what storage systems are and how they work, what problems they solve, how they’re categorised, and what features to look for first if you’re new to the industry.
What is a data storage system, and what problems does it solve?
A data storage system (also called a storage server) is a device for storing, managing and backing up data. It is designed to solve the typical problems caused by the growing volume of information in any organisation.
In the past, all of an organisation’s data could literally be stored on a single hard drive. Nowadays, each functional system requires its own storage: e-mail servers, DBMS, domain services and so on. As a result, information becomes decentralised (dispersed across different storage locations), which is one of the problems a storage system addresses.
The volume of information is also growing like an avalanche. On the one hand, increased competition has led to more regulation and a requirement to retain more and more business-related information. On the other hand, stricter competition demands more in-depth detail on the market, customers, preferences, orders and competitors’ actions. But the number of hard drives you can install in a particular server may not cover the capacity the system needs. This, too, is where a storage system can help.
Storing data is not the only function of modern storage systems. They also save storage space through deduplication and compression. Compression shrinks files by eliminating redundant information inside them, while deduplication eliminates redundant copies of files, keeping a single copy and leaving only references to it.
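To make the difference between the two techniques concrete, here is a minimal sketch in Python. It is only an illustration: the `DedupStore` class, the file names and the choice of SHA-256 and zlib are assumptions made for this example, not a description of how any particular storage product implements these features.

```python
import hashlib
import zlib


class DedupStore:
    """Toy store (hypothetical) that keeps one compressed copy of each unique payload."""

    def __init__(self):
        self.chunks = {}  # content hash -> compressed bytes (stored once)
        self.files = {}   # file name -> content hash (a reference)

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.chunks:
            # First time this content is seen: compress it and store it.
            self.chunks[digest] = zlib.compress(data)
        # A duplicate only adds a reference to the existing copy.
        self.files[name] = digest

    def get(self, name):
        return zlib.decompress(self.chunks[self.files[name]])

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())


store = DedupStore()
report = b"quarterly report " * 1000
store.put("a/report.txt", report)
store.put("b/report_copy.txt", report)  # identical content, stored only once

print(len(store.files), "files,", len(store.chunks), "stored copy")
print(len(report) * 2, "logical bytes ->", store.stored_bytes(), "physical bytes")
```

Deduplication is what keeps the second file from taking any extra space, and compression is what makes the single stored copy smaller than the original data.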
Some companies find it difficult to control and restrict access in line with their security policy. This concerns both access to data over existing channels (the local network) and physical access to the media.
Another problem is the high cost of the resources needed to keep the entire enterprise information system up and running, from maintaining a large staff of qualified personnel to buying numerous costly hardware solutions.
The main components of a typical storage system are a disk array (HDDs or SSDs), a cache, a disk array controller, an external enclosure and several power supplies.
The critical thing about storage is the speed of the disk system. For example, disks installed inside a server will generally not deliver the same performance as a dedicated storage system connected to that server.
What kinds of storage systems exist?
Storage systems are classified as file-based, block-based or object-based. The type determines how data is stored and accessed and, as a result, how easy it is to manage and how quickly it can be retrieved.
File storage keeps information as files assembled into directories (folders). Files are organised and retrieved using metadata that records where each file is located. Conceptually, such a system can be thought of as a directory tree.
Block storage splits data into blocks that are stored independently of each other. Each block is assigned an identifier, which lets the system place it wherever is most convenient. Unlike file storage, block storage does not rely on a single data path.
Object storage keeps data as ‘objects’ that reside in a single, shared repository. The repository can be divided into volumes, and each object can carry a unique identifier and detailed metadata that allow it to be found quickly. This approach lends itself to distributed systems.
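The three access models can be contrasted with a small in-memory sketch. Everything in it (the `BlockDevice` and `ObjectStore` classes, the 4-byte block size, the sample paths and metadata) is hypothetical and chosen only to show how data is addressed in each case. Real systems are far more involved.

```python
import uuid

# File storage: data is located by a path inside a directory hierarchy.
file_store = {"/docs/2024/report.txt": b"hello"}

BLOCK_SIZE = 4  # illustrative block size in bytes


class BlockDevice:
    """Block storage: fixed-size blocks addressed only by their number."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks

    def write_block(self, index, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block must be exactly BLOCK_SIZE bytes")
        self.blocks[index] = data

    def read_block(self, index):
        return self.blocks[index]


class ObjectStore:
    """Object storage: flat namespace, each object has an ID and metadata."""

    def __init__(self):
        self.objects = {}  # object ID -> (data, metadata)

    def put(self, data, **metadata):
        object_id = str(uuid.uuid4())
        self.objects[object_id] = (data, metadata)
        return object_id

    def get(self, object_id):
        return self.objects[object_id][0]

    def find(self, **criteria):
        # Return the IDs of all objects whose metadata matches every criterion.
        return [
            oid
            for oid, (_, meta) in self.objects.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]


disk = BlockDevice(num_blocks=8)
disk.write_block(3, b"DATA")

bucket = ObjectStore()
photo_id = bucket.put(b"...image bytes...", kind="photo", owner="alice")

print(file_store["/docs/2024/report.txt"])  # addressed by path
print(disk.read_block(3))                   # addressed by block number
print(bucket.find(kind="photo"))            # found via metadata
```

The point of the sketch is the addressing: a path for files, a number for blocks, and an identifier plus searchable metadata for objects.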
How storage works – NAS, SAN and DAS
Several hardware components, software, and protocols ultimately give storage solutions their special properties.
Based on the classification above, there are two main types of storage solution, which differ in the level at which data is stored, read and written.
- The first type works with data at the file level. Such storage essentially functions as a server with its own file system. The client sends commands such as “write X bytes to this file” or “read X bytes from this file”. This type of storage is called NAS.
- The second type accesses data at the block level. This speeds up data exchange between the server and the storage because access is direct, i.e. “write block X” or “read block X”. Such storage is connected to the server either directly (DAS) or via a storage network (SAN).
We will talk about each of them in more detail.
NAS stands for Network Attached Storage. Because data is handled at the file level, a NAS appears on the network as a server with its own file system.
To put it simply, imagine a desktop computer connected to a home router. It stores photos, videos, documents and other data, and all users on the network are allowed to access it. That is roughly what a NAS looks like.
NAS storage can take many forms. For instance, a production server can be connected to other servers, virtual machines, or so-called disk stations that house several removable hard disks.
The benefits of NAS:
- Low cost.
- Easy to connect and manage.
- Flexibility: storage capacity can be increased quickly.
- Client versatility: computers running any operating system can access the files.
Disadvantages of NAS:
- Data is stored only as files.
- Access to information over network protocols is slower than on a local system.
- Some applications cannot work with network drives.
DAS stands for Direct Attached Storage: storage connected directly to a server or workstation. For example, an external drive connected via USB can be called DAS.
The basic simplicity of the DAS architecture gives it its main advantages: an affordable price and relative ease of implementation. In addition, this configuration is easier to manage, if only because the number of system elements is small.
Inside the system are the power supply, cooling and a RAID controller, which provides the reliability and fault tolerance of the storage. The system is managed using an integrated operating system.
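One common way a RAID controller achieves fault tolerance is parity. The article does not say which RAID level is meant, so the sketch below simply assumes a parity-based scheme in the spirit of RAID 5: an extra block holds the XOR of the data blocks, and any single lost block can be rebuilt from the survivors. The `xor_blocks` helper and the sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data "disks", each holding one block of a stripe (illustrative data).
data_disks = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data_disks)

# Simulate the loss of disk 1 and rebuild it from the survivors plus parity.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_disks[1])
```

This works because XOR is its own inverse: combining the parity with the surviving blocks cancels them out and leaves exactly the missing block.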
The advantages of DAS:
- Easy to deploy and administer.
- High-speed data transmission.
- Low cost of equipment.
Disadvantages of DAS:
- Requires a dedicated server.
- Limited connections (no more than two servers).
SAN stands for Storage Area Network. A SAN usually presents external storage as multiple network block devices and is implemented using the FC (Fibre Channel) or iSCSI (Internet Small Computer System Interface) protocol. It provides block access directly to a storage device: a disk, or sets of disks in the form of RAID groups or logical devices.
DAS can be powerful and is often cheaper than a SAN. However, a drawback of DAS is that it cannot easily be expanded: the number of connected computers is limited by the number of physical SAS ports on the DAS (usually only four). Therefore, many companies and institutions prefer block storage connected via a SAN.
The advantages of the SAN:
- High operating speed, low latency.
- Flexible and scalable.
- Data storage in blocks.
- High reliability of data exchange and storage.
- The local network is relieved of storage traffic.
The disadvantages of SAN:
- Difficult to design.
- High cost.
- The inability of some applications and systems to work with the iSCSI protocol.
How to choose a storage system?
The first thing to do is to understand what tasks it will perform. It is vital to decide on a few basic parameters.
Different data types require different access speeds, processing techniques, compression, etc. For example, a virtual storage system that handles large media files is a different system from one that handles unstructured data for a neural network.
The choice of disk drives depends on this. Sometimes a consumer-grade SSD is fine – if you know that even in the worst case, the storage capacity will not exceed 300GB and the access speed is not critical.
You need to know the cost of losing data over a certain period. This will help you calculate the RPO (Recovery Point Objective) and RTO (Recovery Time Objective) and avoid unnecessary backup costs. And remember: backups, backups and more backups.
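As a rough illustration of how these figures relate, the sketch below estimates the worst-case cost of data lost between backups and checks a backup plan against the chosen objectives. All numbers and both function names are hypothetical, and the model is deliberately simplistic: it assumes the loss grows linearly with the hours of data lost.

```python
def worst_case_loss(backup_interval_hours, cost_per_hour_of_lost_data):
    """Cost if a failure hits just before the next scheduled backup."""
    return backup_interval_hours * cost_per_hour_of_lost_data


def meets_objectives(backup_interval_hours, restore_hours, rpo_hours, rto_hours):
    """RPO limits how much data may be lost; RTO limits how long recovery may take."""
    return backup_interval_hours <= rpo_hours and restore_hours <= rto_hours


# Hypothetical business figures.
cost_per_hour = 500  # cost of losing one hour of data
rpo_hours = 4        # at most 4 hours of data may be lost
rto_hours = 8        # service must be restored within 8 hours

daily_plan_loss = worst_case_loss(24, cost_per_hour)
frequent_plan_loss = worst_case_loss(4, cost_per_hour)

print(daily_plan_loss, meets_objectives(24, 3, rpo_hours, rto_hours))
print(frequent_plan_loss, meets_objectives(4, 3, rpo_hours, rto_hours))
```

With these assumed figures, a daily backup exposes the business to six times the loss of a four-hourly one and misses the RPO, which is the kind of comparison that helps justify, or rule out, a more expensive backup schedule.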
If a storage system is being purchased for a new project (the load is difficult to predict), it is best to talk to colleagues who have already solved the problem or test the storage system.
Sometimes a budget or mid-range solution (StarWind, Huawei, Fujitsu) is suitable even for a resource-intensive service. However, the top vendors – NetApp, HPE, and Dell EMC – have a broad enough product line that relatively inexpensive storage solutions can also be found. It is advisable not to expand the number of vendors on the same infrastructure.