Managing storage array risk when space runs low

Anyone who manages a storage area network has probably already learned (perhaps the hard way) that performance and protection are almost mutually exclusive, particularly in an older storage array. But choosing performance over protection always comes with great risk. On the surface, many storage arrays offer great features, like deduplication, which preserves disk space by eliminating duplicate data. Likewise, almost all support some implementation of snapshots, which capture the state of your system at set intervals throughout the day. It all sounds great in principle, but both of these technologies often disappoint in real-world situations.

Dedupe can be a clock killer

Data deduplication (dedupe) eliminates duplicate data on the storage system. For instance, if everyone in your company receives an email with the same file attached, only a single copy of the file would be saved to the deduped storage, and index pointers would be used to connect everyone’s email message to that one copy. Obviously, this can save a lot of storage capacity, which is why dedupe is so desirable.
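The email example can be sketched in a few lines of Python. This is a toy illustration only, not how any particular array implements dedupe: the `DedupStore` class, its method names, and the mailbox names are all hypothetical, and a SHA-256 hash of the file contents stands in for the index pointer.

```python
import hashlib


class DedupStore:
    """Toy file-level dedupe store: one stored copy per unique content."""

    def __init__(self):
        self.blobs = {}     # content hash -> file bytes (stored once)
        self.pointers = {}  # (mailbox, filename) -> content hash

    def save(self, mailbox, filename, data):
        digest = hashlib.sha256(data).hexdigest()
        # Write the bytes only if this content has not been seen before.
        self.blobs.setdefault(digest, data)
        # Every mailbox still gets its own pointer to the shared copy.
        self.pointers[(mailbox, filename)] = digest

    def load(self, mailbox, filename):
        return self.blobs[self.pointers[(mailbox, filename)]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


store = DedupStore()
attachment = b"quarterly report " * 1000  # hypothetical attachment contents

# Three people receive the same attachment.
for user in ("alice", "bob", "carol"):
    store.save(user, "report.pdf", attachment)

logical_bytes = len(attachment) * 3
print(f"logical: {logical_bytes} bytes, stored: {store.stored_bytes()} bytes")
```

Each mailbox can still read its own copy through `load`, but the bytes are held only once, which is where the space saving comes from.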

Deduplication comes in two main implementations, called “file-level” and “block-level” deduplication. As their names imply, file-level dedupe eliminates duplicate files, while block-level dedupe eliminates duplicate blocks. Of the two, block-level is the more space-efficient because a file can be divided into many blocks, allowing deduplication to work at a finer granularity. Another distinction among deduplication methods involves when the actual process occurs. “Inline deduplication” analyzes data en route to the storage, so duplicate data never even gets written to disk. “Post-processing deduplication” runs periodically on a fixed schedule, but only after all data has initially been written to disk.
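To see why finer granularity helps, here is a minimal sketch that compares the two approaches on a pair of files that are almost, but not quite, identical. The 4 KB fixed block size and the function names are assumptions made for illustration; real arrays choose their own block sizes and may use variable-length chunking.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def file_level_bytes(files):
    """Bytes stored when only whole identical files are deduplicated."""
    unique = {hashlib.sha256(f).digest(): len(f) for f in files}
    return sum(unique.values())


def block_level_bytes(files, block_size=BLOCK_SIZE):
    """Bytes stored when identical fixed-size blocks are deduplicated."""
    unique = {}
    for f in files:
        for offset in range(0, len(f), block_size):
            block = f[offset:offset + block_size]
            unique[hashlib.sha256(block).digest()] = len(block)
    return sum(unique.values())


# Two files that share their first three blocks and differ only in the last.
shared = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE
file_1 = shared + b"D" * BLOCK_SIZE
file_2 = shared + b"E" * BLOCK_SIZE

file_total = file_level_bytes([file_1, file_2])
block_total = block_level_bytes([file_1, file_2])
print(f"file-level stores {file_total} bytes, block-level stores {block_total} bytes")
```

File-level dedupe sees two different files and keeps both in full. Block-level dedupe keeps the three shared blocks once, plus the two blocks that differ.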

Inline dedupe vs post-processing

Inline deduplication is the more desirable of the two because it consumes less storage capacity, but it also taxes the heck out of the CPU (central processing unit), often driving latency up to an intolerable level, even hundreds of milliseconds! The performance impact is so severe that most storage vendors recommend disabling the feature on primary storage in order to maintain peak performance.

Post-processing dedupe isn’t much better. Let’s say you schedule dedupe to run at the top of every hour. Like clockwork, the network will experience huge latency increases starting at the top of every hour and lasting “as long as it takes” to clean up the storage. This will not sit well with your employees or your customers!

The final word on deduplication seems to be, “It’s a great space-saving feature… just don’t use it during business hours.”

The storage array snapshot snafu

Storage snapshot technology is nearly three decades old and has long been the de facto method used to protect storage against data corruption or other disasters. The idea is fairly ingenious. Rather than running a full backup (which takes a very long time), you take only periodic “snapshots” that contain the changes made to storage since the previous snapshot. Storage vendors like to tout that a snapshot interval of 15 minutes will ensure that you never lose more than 15 minutes of productivity if a catastrophic data-loss event occurs.

Again, that sounds really good in principle. But in practice, when you enable snapshots at a 15-minute interval, you soon realize that after a month you have nearly 3,000 snapshots that are quickly consuming your free storage! Snapshots eat up space, and unless you want to commit 25-40% of your total storage capacity to hold them, you find yourself where many storage administrators often do: trading a larger recovery point objective for more free storage in a storage array.

For instance, dialing back snapshot frequency to 1-2 times per day will create fewer snapshots and consume less free space. But now you are at risk of losing as much as a full day’s productivity should it become necessary to restore from those snapshots. It really doesn’t make sense to dial back your data protection during the most productive hours of the day, but sadly, it often becomes necessary.
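The arithmetic behind this trade-off is easy to check. The sketch below assumes every snapshot is retained for a 30-day month and that the worst case is a failure just before the next snapshot. The function names are hypothetical, and the sketch deliberately does not estimate the space consumed, since that depends on how much data changes between snapshots.

```python
def snapshots_retained(interval_minutes, retention_days):
    """Number of snapshots on hand if every one is kept for retention_days."""
    per_day = (24 * 60) // interval_minutes
    return per_day * retention_days


def worst_case_loss_minutes(interval_minutes):
    """Work at risk if a failure hits just before the next snapshot."""
    return interval_minutes


RETENTION_DAYS = 30  # assumed retention window ("a month")

schedules = [("every 15 min", 15), ("twice a day", 720), ("once a day", 1440)]
for label, interval in schedules:
    count = snapshots_retained(interval, RETENTION_DAYS)
    loss_hours = worst_case_loss_minutes(interval) / 60
    print(f"{label:>13}: {count:5d} snapshots, up to {loss_hours:g} h of work at risk")
```

At a 15-minute interval this gives 96 snapshots a day, or 2,880 over the month, which matches the "nearly 3,000" figure. Dropping to one or two snapshots a day cuts the count to 30 or 60 but stretches the worst-case loss to 24 or 12 hours.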

The final word on snapshots is, “Great feature, but you probably don’t have enough free storage to use it to its full potential.”

What’s a storage admin to do?

Recent innovations have found ways to solve the shortcomings of both deduplication and snapshot storage within a storage array. In the case of dedupe, the solution came from the introduction of powerful multi-core CPUs and new software designed to take full advantage of them. This allows modern storage arrays to perform inline deduplication (and even compression!) on the fly, so redundant data never makes it to disk, and the data that does is already space-optimized. That combination keeps free storage space at a maximum without taking clock cycles away from I/O processes.

If you are responsible for managing a storage array and find you are constantly fighting the battle for free space, the advantages of going with cloud storage start to stand out in sharper relief. Your cloud provider needs to offer more types of storage workloads than you’d want to buy on your own. To learn more about the latest innovations in modern storage and what they can mean to your business, contact the storage professionals at Ozone IT Services.

