External Hard Drive Storage 101

 

Overview

In today’s world of digital media, storage is paramount. The countless hours you have spent shooting, editing, finishing, and exporting your project can all be lost in a millisecond in the event of a catastrophic incident such as mechanical failure, theft, or even accidental damage from dropping your hard drive. The purpose of this article is to give you a better understanding of how storage drives work and how to select the one that works best for storing and accessing your media project.

Capacity

When buying a drive, storage capacity is one of the first things you must consider, but in the world of media production one size does not necessarily fit all. In today’s digital age, almost everything you generate will need to be stored. That includes HD video dailies, project files, documents, images, and tons of other digital information. To get the best bang for your buck when shopping for new storage, it pays to have a command of the language of capacity.

Terms (Symbols)

Here are the terms and symbols you will run into:

  • Bit (b) – Short for binary digit, it is the basic unit of information in computing. A bit represents either a zero (0) or a one (1) only.
    • Relative size: 1 bit = a single yes/no or on/off value
  • Byte (B) – A unit of digital information that consists of 8 bits.
    • Relative size: 1 byte = about 1 character of text
  • Kilobyte (kB) – Approximately* 1,000 bytes of digital information.
    • Relative size: 1 kB = about a half page of text
  • Megabyte (MB) – Approximately* 1,000,000 bytes or 1,000 kB of digital information.
    • Relative size: 1 MB = about 500 pages of text
  • Gigabyte (GB) – Approximately* 1,000,000,000 bytes or 1,000 MB of digital information.
    • Relative size: 1 GB = about 1 min of 1080p ProRes 422 video footage
  • Terabyte (TB) – Approximately* 1,000,000,000,000 bytes or 1,000 GB of digital information.
    • Relative size: 1 TB = about 16 hrs 30 mins of 1080p ProRes 422 video footage
  • Petabyte (PB) – Approximately* 1,000,000,000,000,000 bytes or 1,000 TB of digital information.
    • Relative size: 1 PB = about 2 years of 1080p ProRes 422 video footage

* A kilobyte is a kilobyte, right? Not exactly. To help further confuse us all, there are actually two common standards by which data capacity is measured. Hard disk drive (HDD) manufacturers and Mac OS X (version 10.6 Snow Leopard and later) use a unit of 1,000 bytes, called a metric (decimal) prefix, to equal 1 kilobyte, while the Windows operating system uses a unit of 1,024 bytes, called a binary prefix, to equal 1 kilobyte. Both recognize that 8 bits = 1 byte, so they are fundamentally the same, but every multiple of that unit differs. This seems like a small issue, but the discrepancy compounds as you move from kilobytes to megabytes, gigabytes, terabytes, petabytes, and so on: about 2.4% at the kilobyte level, about 7.4% at the gigabyte level, and about 10% at the terabyte level. Computer scientists do have separate terms for the binary units (e.g. kibibyte (KiB), mebibyte (MiB), and gibibyte (GiB)), but they are not commonly used by consumers.

This difference in measurement is why that 500 GB HDD you bought appears as approximately 465 GB on Windows computers and approximately 500 GB on a Mac OS X (10.6+) computer. Remember that you are not losing any storage capacity by using a Windows-based system. Your HDD always holds the same amount of data; only the unit of measurement used by each operating system and by HDD manufacturers differs. It is just like measuring the same distance as 1 yard or as 0.9144 meters.
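The arithmetic behind the 500 GB example can be sketched in a few lines of Python. This is only an illustration of the two unit systems, not a tool from any drive maker or operating system; the names `reported_capacity` and `prefix_gap` are made up for this example.

```python
# Compare how the same number of bytes is reported under decimal (metric)
# and binary prefixes.

DECIMAL_GB = 1000 ** 3  # 1 GB  = 1,000,000,000 bytes (drive makers, Mac OS X 10.6+)
BINARY_GIB = 1024 ** 3  # 1 GiB = 1,073,741,824 bytes (what Windows labels "GB")


def reported_capacity(advertised_gb: float) -> tuple[float, float]:
    """Return (decimal GB, binary GiB) for a drive advertised in decimal GB."""
    total_bytes = advertised_gb * DECIMAL_GB
    return total_bytes / DECIMAL_GB, total_bytes / BINARY_GIB


def prefix_gap(power: int) -> float:
    """Fractional gap between binary and decimal units at a given prefix.

    power is 1 for kilo, 2 for mega, 3 for giga, 4 for tera, and so on.
    """
    return (1024 / 1000) ** power - 1


decimal_gb, binary_gib = reported_capacity(500)
print(f"Decimal units (Mac OS X 10.6+): {decimal_gb:.0f} GB")
print(f"Binary units (Windows):         {binary_gib:.1f} GB")

for name, power in [("kilo", 1), ("mega", 2), ("giga", 3), ("tera", 4)]:
    print(f"{name}: binary unit is {prefix_gap(power):.1%} larger")
```

The drive holds the same number of bytes in both cases; only the divisor changes, which is why the gap widens with each larger prefix.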

The Precision of Language

The language, characters, and symbols used to describe digital information are very particular. For instance, when someone writes TB it means terabyte, but when Tb is written (with a lower case b), that symbol represents a terabit. The main difference is that a terabyte represents 1,000,000,000,000 bytes of digital information (8,000,000,000,000 bits), while a terabit represents 1,000,000,000,000 binary digits (bits), which is one-eighth as much.

As you may have already noticed, there are common variations in the symbols you will see. One pair that you will deal with regularly is kB/s and kbps. It is important to recognize the difference between them, as kB/s represents kilobytes per second and kbps represents kilobits per second. So 1 kB/s is equal to 8 kbps. That means if a website such as Vimeo asks you to make sure that the HD video you are uploading is set at a bitrate of 5000 kbps, that is equivalent to asking for a video with a bitrate of 625 kB/s. If you uploaded a 5000 kB/s video by mistake, you just tried to deliver a video with a bitrate of 40,000 kbps. That is 8 times the data rate they asked for. The key to all of this is to remember that bits have a constant 8:1 ratio with bytes.

Pro Tip: When performing data rate conversions such as the Vimeo video example given in the above paragraph, the trend has been to use the decimal prefix standard of 1,000 bytes = 1 kilobyte. Use a Bit Calculator to ensure conversion accuracy.
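If you would rather check the conversion yourself, the 8:1 ratio is all you need. The following is a minimal sketch; the function names `kbps_to_kBps` and `kBps_to_kbps` are invented for this example.

```python
BITS_PER_BYTE = 8


def kbps_to_kBps(kilobits_per_second: float) -> float:
    """Convert kilobits per second (kbps) to kilobytes per second (kB/s)."""
    return kilobits_per_second / BITS_PER_BYTE


def kBps_to_kbps(kilobytes_per_second: float) -> float:
    """Convert kilobytes per second (kB/s) to kilobits per second (kbps)."""
    return kilobytes_per_second * BITS_PER_BYTE


# The requested upload bitrate from the Vimeo example, expressed in kB/s.
print(kbps_to_kBps(5000))  # 625.0

# The mistaken 5000 kB/s export, expressed in kbps.
print(kBps_to_kbps(5000))  # 40000
```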

Calculating Storage Demand

Now that you know how to read the capacity of any given storage device, and you are aware that media files come with varying data rates, you have the basic skills required to calculate what size drive you will need to purchase for properly storing all of your project media and documents. Multiply the data rate of your footage by its total running time, then leave room for project files, renders, and future growth.
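As a rough sketch of that calculation, the snippet below estimates drive space from a data rate and a running time. The 153 Mbps figure is the Apple ProRes 422 average quoted later in this article; the 25% headroom and the function name `storage_needed_gb` are assumptions made for illustration, so adjust them to your own workflow.

```python
def storage_needed_gb(data_rate_mbps: float, hours: float, headroom: float = 0.25) -> float:
    """Estimate storage in decimal GB for footage at a given average data rate.

    data_rate_mbps: average data rate in megabits per second.
    hours:          total running time of the footage.
    headroom:       assumed extra fraction for project files, renders, and growth.
    """
    bytes_per_second = data_rate_mbps * 1_000_000 / 8  # bits to bytes
    total_bytes = bytes_per_second * hours * 3600       # hours to seconds
    return total_bytes * (1 + headroom) / 1_000_000_000


# Ten hours of footage at 153 Mbps, with and without the assumed headroom.
print(f"Footage only:  {storage_needed_gb(153, 10, headroom=0):.1f} GB")
print(f"With headroom: {storage_needed_gb(153, 10):.1f} GB")
```

Under these assumptions, ten hours of footage lands between a 500 GB and a 1 TB drive, which is the kind of answer you want before you start shopping.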

File System

Also casually known as a format, a file system is how information is stored and retrieved on a drive. Before you begin filling your hard drive with your precious data, you should take the time to understand file system formats and which one works best for you. Many drives come preformatted with a file system that will probably work just fine, but you can always reformat a drive to best suit your needs. When deciding on a file system format there are several variables to consider. Here is a list of the most commonly used file systems on personal computers and in media production:

Hierarchical File System Plus (HFS+)

Also known as Mac OS Extended, HFS+ is a proprietary file system developed by Apple, Inc. for use in computer systems running Mac OS. It replaced HFS, also known as Mac OS Standard. As of Mac OS X 10.6, Apple has dropped support for formatting or writing to HFS drives and images; they are now supported as read-only volumes. HFS+ formatted drives and images can only be read and written to by Mac OS X operating systems unless you install a third-party utility such as MacDrive for read and write ability in Windows NT operating systems.

  • Maximum individual file size limit – 8 Exabytes (8 Billion Gigabytes).

New Technology File System (NTFS)

NTFS is a proprietary file system developed by Microsoft. Starting with Windows NT 3.1, it is the default file system of the Windows NT family. An NTFS-formatted drive can be read by Mac OS X, but it cannot be written to unless a third-party utility such as NTFS for Mac is installed.

  • Maximum individual file size limit – 16 Exabytes (16 Billion Gigabytes).

File Allocation Table 32 (FAT32)

Also known as MS-DOS (FAT), FAT32 is a simple and robust legacy file system. (It is not the same as exFAT, a newer Microsoft file system that does not share FAT32’s 4 GB file size limit.) It offers good performance even in lightweight implementations, but cannot deliver the same performance, reliability, and scalability as more modern file systems. It is supported, for compatibility reasons, by virtually all existing operating systems for personal computers, which makes it well suited for data exchange between computers and devices of almost any type and age. It is great for formatting flash drives to move small files between workstations, but certainly not ideal for formatting a hard drive you plan to use for editing video media, due to its rather small file size limit.

  • Maximum individual file size limit – 4 Gigabytes.
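To see why that limit matters for video, here is a small sketch that estimates how long a single clip can run before it reaches the FAT32 ceiling. The exact ceiling used here is 4 GiB minus one byte; the data rates are the averages quoted later in this article, and the function name `max_clip_seconds` is invented for this example.

```python
# FAT32 stores the file size in a 32-bit field, so one file tops out
# just under 4 GiB (4,294,967,295 bytes).
FAT32_MAX_FILE_BYTES = 2 ** 32 - 1


def max_clip_seconds(data_rate_mbps: float, max_file_bytes: int = FAT32_MAX_FILE_BYTES) -> float:
    """Seconds of footage that fit in one file at a given average data rate."""
    bytes_per_second = data_rate_mbps * 1_000_000 / 8
    return max_file_bytes / bytes_per_second


for codec, mbps in [("Apple ProRes 422 (HD)", 153), ("DV NTSC (SD)", 48)]:
    minutes = max_clip_seconds(mbps) / 60
    print(f"{codec}: about {minutes:.1f} minutes per file")
```

At HD data rates a single file hits the limit after only a few minutes, which is why FAT32 is a poor fit for an editing drive.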

Enclosure Type

Portable hard drives come in a variety of enclosures that allow for diverse drive configurations. Depending on your needs, there are different advantages to each type and its features. Most external drives are for expanding the capacity of a workstation, providing an on-the-go solution, or both.

Desktop

A larger, fully functional hard drive enclosure that typically contains a 3.5″ hard drive. The largest hard drive capacities are available in these enclosures. All of these enclosures can be transported to another computer, but some are not as convenient as others.

Portable

Smaller, lightweight, and easy to transport. The ideal storage hardware for someone who has to transport their media projects on a regular basis. Capable of all the functionality of a regular desktop external drive, except for a lower maximum drive capacity. Typically contains a 2.5″ hard drive.

A Variety of Hard Drive Enclosure Types from G-Technology

Drive Type

In principle, hard drives have been fundamentally the same for quite a while. However, there are new technologies emerging that have created significant changes to digital storage. Here are your options:

Hard Disk Drive

Hard disk drives (HDD) are currently the de facto standard of data storage for desktop computing. They store data magnetically and are composed of one or more non-magnetic flat circular platters (disks) that are coated with a layer of magnetic material. Information is read from and written to specific areas of a platter by the read-and-write head, which sits on a very precise actuator arm.

  • Pros: Cheap capacity, reliable when handled properly, and a proven technology.
  • Cons: HDDs have become very reliable equipment that most of the world’s data depends on, but they can be quite fragile when mishandled. The platters inside spin at thousands of revolutions per minute (rpm), which creates a lot of momentum. Read-and-write heads operate very close to the surface of the platter, so you should take special care not to move the drive when it is in use. Sudden impacts and changes in orientation may jar the head, causing accidental contact with the platter surface. This can cause irreparable damage to the hardware and the permanent loss of your data. HDDs can also be loud, and they have lengthy start-up times when the disks have to spin up from a stationary position (which typically happens after the drive has gone to sleep when not in use).

Hard Disk Drive (HDD) Diagram

Solid-State Drive

Solid-State Drives (SSD) have been around for a while, but are still considered an emerging technology. Data is stored electronically in integrated circuit assemblies with no moving parts. They work very similarly to the USB flash drives you may be familiar with. This allows for reliable, fast drives in small form factors, but currently comes at a significant cost compared to traditional HDDs.

  • Pros: High read-and-write speeds and no disk spin-up wait times mean very little latency and high performance. SSDs are quiet and are not sensitive to drive orientation or movement when operating (as is the case with traditional rotating HDDs). Because they are so fast and reliable, higher than average data rates can be achieved using these drives, allowing effective throughput to climb closer to an interface’s theoretical maximum speed.
  • Cons: They are expensive. Right now, in 2013, the price per GB is 3 to 4 times higher than HDDs. Capacities are growing but are limited. There are 512 GB SSDs available on the market, but few portable drives have them as a built-in option at this time. They do pose an increased chance of catastrophic failure. Due to the way they handle information, SSDs are very hard to recover in the event of a failure. Because HDDs write in a linear fashion on the surface of a physical disk, it is simply easier to go in and retrieve the data in a worst-case scenario.

Solid-State Drive (SSD) Diagram

Form Factor

Form factor refers to the rigid housing in which the drive components are assembled. There are many sizes when it comes to drive form factor, but let’s just stick to the two most commonly used with modern desktop computing and digital media storage:

3.5″

A size you will typically find in most desktop computers such as your Mac Pro or PC tower. It’s a form factor typically associated with traditional HDDs and optical drives like DVD-ROMs. HDDs at this form factor size are capable of rotational speeds of 10,000 rpm or more (for premium drives), but typically operate at 7,200 rpm.

2.5″

A size you will typically find in laptops and smaller electronic devices like your MacBook Pro, iMac, or PlayStation 3. It’s a form factor that houses both smaller HDDs and SSDs. HDDs at this form factor size tend to max out at 7,200 rpm (for premium drives), but typically operate at 5,400 rpm.

2.5″ HDD Resting Comfortably on Top of 3.5″ HDD

Drive Quantity

Single

The majority of external drive enclosures are built with just one hard drive. This simple setup is usually cheap, compact, and reliable.

Multiple

There are some external drive enclosures that are built with two, three, four, or even five hard drives in a single enclosure. Many of these multi-drive enclosures feature easy-to-remove drives that can be replaced in the event of an individual drive failure or a capacity upgrade.

Drive Configuration

Most external drive enclosures have a static drive configuration, but some allow for modification to improve performance or to create data redundancy. Here are the most common configurations you will run across in the world of video editing:

Normal

Simple. Just one formatted hard drive.

Redundant Array of Independent Disks (RAID)

Complex. RAID is a storage technology that combines multiple drives into a logical unit for the purposes of data redundancy and/or performance improvement. Data is distributed across the drives in one of several ways, referred to as RAID levels, depending on the specific level of redundancy and performance required. Here is a list of the RAID levels most commonly used in media production:

  • RAID 0 – Data striping without parity or mirroring. Provides improved performance and additional storage but no fault tolerance. Any drive failure destroys the array and any data stored on it, and the likelihood of failure increases with more drives in the array. This setup provides no additional protections and is purely for improving drive read/write performance to increase data rates.
    • Example: Two 1 TB drives set up as RAID 0 will have improved performance, no redundant data, and an operating capacity of 2 TB.

  • RAID 1 – Data mirroring without parity or striping. Data is written identically to two drives, thereby producing a mirrored set to protect your data against mechanical failure.
    • Example: Two 1 TB drives set up as RAID 1 will have no improvement in performance, redundant data, and an operating capacity of 1 TB.

  • RAID 5 – Data redundancy with striping and distributed parity. The array distributes parity along with the data and requires that all drives but one be present to operate. The array is not destroyed by a single drive failure. Requires at least three disks for setup. The advantage of this setup is an increase in drive read/write performance along with efficient redundancy that does not require a 1-to-1 capacity ratio.
    • Example: Four 2 TB drives set up as RAID 5 will have an improvement in performance, redundant data, and an operating capacity of 6 TB.

Use this handy RAID Disk Space Calculator to determine your total storage capacity after a RAID level is applied to your drives.
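If you want to see the arithmetic behind such a calculator, the three levels described above can be sketched as follows. This is a simplified illustration that assumes every drive in the array is the same size and that drives fail independently of one another; the function names `raid_usable_tb` and `raid0_failure_chance`, and the 3% failure chance in the example, are hypothetical and are not taken from any real RAID tool or drive specification.

```python
def raid_usable_tb(level: int, drive_count: int, drive_tb: float) -> float:
    """Usable capacity in TB for an array of identical drives."""
    if level == 0:      # striping: every drive holds unique data
        minimum, usable_drives = 2, drive_count
    elif level == 1:    # mirroring: every drive holds the same data
        minimum, usable_drives = 2, 1
    elif level == 5:    # striping with parity: one drive's worth goes to parity
        minimum, usable_drives = 3, drive_count - 1
    else:
        raise ValueError(f"RAID level {level} is not covered in this sketch")
    if drive_count < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} drives")
    return usable_drives * drive_tb


def raid0_failure_chance(per_drive_chance: float, drive_count: int) -> float:
    """Chance that at least one drive fails, which destroys a RAID 0 array.

    Assumes each drive fails independently with the same probability.
    """
    return 1 - (1 - per_drive_chance) ** drive_count


# The three examples from the list above.
print(raid_usable_tb(0, 2, 1))  # two 1 TB drives, RAID 0
print(raid_usable_tb(1, 2, 1))  # two 1 TB drives, RAID 1
print(raid_usable_tb(5, 4, 2))  # four 2 TB drives, RAID 5

# Hypothetical 3% chance of failure per drive over some period.
for drives in (1, 2, 4):
    print(f"{drives} drive(s): {raid0_failure_chance(0.03, drives):.1%} chance of losing the array")
```

The second function shows why RAID 0 gets riskier as you add drives: any single failure takes the whole array with it.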

Important Note Concerning Backing Up Your Data

Redundant RAID arrays do not protect your data from file corruption, physical theft, or catastrophic damage such as a house fire or being submerged in water. They only protect your data from loss in the event of mechanical failure of a drive in the array.

Just a Bunch of Disks (JBOD)

A JBOD is an array of drives not set up in a RAID. These drives can be set up as separate, independent volumes or they may be combined (using software such as Mac OS X’s Disk Utility) to create one single volume. The advantage of this configuration, inside of a single enclosure, is that you are able to interface several hard drives to your computer without having to connect multiple external drive enclosures.

Interface

In layman’s terms, a hardware interface is how you connect your drive (or other device) to your computer. It is the point of interaction between a hard drive and the computer. It is comprised of the mechanical, electrical and logical signals at the interface and the protocol for sequencing them.

Throughput

After you determine how much storage you will need, the next thing you have to figure out is how much throughput (or data rate) the media you will be storing, and eventually editing, will require. Throughput is the average rate of successful data delivery over a connection. That means it is not a constant number; it keeps moving due to variations in data location on the disk, file size, traffic on the connection, and other variables. A drive’s interface plays a crucial role in this average.

There are many types of interfaces that allow you to connect a portable external drive to your computer system. Here is a breakdown of the most commonly used in computerized video editing:

Universal Serial Bus (USB)

USB was initially developed in the mid-1990s and has since become the dominant bus connection used for most computer peripheral devices such as keyboards, printers, and hard drives, and for linking to electronic devices such as phones and iPods.

  • USB 2.0 (which most modern peripherals use) has an effective throughput to an external hard disk drive (HDD) of about 289 Mbps with a theoretical maximum interface speed of 480 Mbps. Because effective throughput is only an average of what is being transmitted, this is too limited for editing HD video footage (such as Apple ProRes 422, with an average data rate of 153 Mbps). SD video footage (such as DV NTSC, with an average data rate of 48 Mbps) can be supported, but in practice we find that USB 2.0 tends to bottleneck any video editing workflow, as the throughput will occasionally drop below the data rate demanded during video playback. USB 2.0 is best used for everyday computing needs and archiving digital media.

USB 2.0 Connectors

  • USB 3.0 (which launched at the end of 2008 and is now beginning to show up as the new standard in modern peripherals) is a different story. It has an effective throughput to an external HDD of about 920 Mbps with a theoretical maximum interface speed of 5 Gbps. This means its average data rate is more than enough to handle 2K, 1920×1080 HD, and 720×480 SD video workflows.

Firewire (IEEE 1394)

Primarily developed by Apple in the late 1980s and early 1990s and released to the public in 1995, Firewire had a major impact on the world of digital video editing when it was introduced as a peripheral interface, as its significant throughput allowed for inexpensive real-time video capture on nonlinear editing workstations.

  • Firewire 400 – FW 400 has an effective throughput to an external HDD of about 320 Mbps with a theoretical maximum interface speed of 393 Mbps. Because effective throughput is only an average of what is being transmitted, this is too limited for editing most HD video footage (such as Apple ProRes 422 at 153 Mbps), as the throughput will occasionally drop below the data rate demanded during playback. SD video footage (such as DV NTSC, with an average data rate of 48 Mbps) can be properly supported.
  • Firewire 800 – FW 800 has an effective throughput to an external HDD of about 600 Mbps with a theoretical maximum interface speed of 786 Mbps. Editing most 2K and HD video footage (such as Apple ProRes 422 at 153 Mbps) can be properly supported with this interface.

Firewire Connector Variants

eSATA

Standardized in 2004, eSATA (e standing for external) provides a variant of the traditional SATA interface (used inside desktop computers) meant for external connectivity with portable drives. It has an effective throughput to an external HDD of about 984 Mbps with a theoretical maximum interface speed of 3 Gbps. Going by the effective figures in this article, that is roughly three times as fast as USB 2.0 and Firewire 400 and about one and a half times as fast as Firewire 800. It is robust enough to easily handle 2K, 1920×1080 HD, and 720×480 SD video workflows. The major drawback with eSATA is that most major computer and portable hard drive manufacturers do not include it as a standard port.

eSATA Connectors

Thunderbolt (1 & 2)

Developed by Intel and first released by Apple in 2011, Thunderbolt is a peripheral interface that combines traditional PCIe and DisplayPort (DP) functionality in a Mini DisplayPort (MDP) connector. This merger of functionality has created a fast, powerful, and flexible peripheral interface that is used to connect everything from external monitors and component expansion slots to portable hard drives. The technology was presented as having an initial theoretical maximum interface speed of 10 Gbps, with the promise of eventually reaching 100 Gbps. The current effective throughput average is about 920 Mbps due to hard disk drive rotation limits and other drive-side operational limitations. Solid-State Drives should allow for a faster average data rate of about 1.65 Gbps as their costs drop and they are eventually adopted into video editing. Thunderbolt is robust enough to easily handle 2K, 1920×1080 HD, and 720×480 SD video workflows, and it supports some 4K workflows. Thunderbolt 2 is expected to have enough bandwidth to transfer 4K video files while simultaneously playing back 4K video on an external video monitor.

Thunderbolt Connectors
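To put the interface figures side by side, here is a rough sketch that uses the effective throughput numbers quoted in this section. It estimates how long a 1 TB copy would take at each rate and whether the interface leaves enough room for ProRes 422 playback. The 2.5x headroom factor is purely an assumption chosen for illustration (effective throughput is an average that dips during playback), and the function names `transfer_hours` and `can_sustain` are invented for this example.

```python
# Effective throughput figures (Mbps) quoted in this article for external HDDs.
EFFECTIVE_MBPS = {
    "USB 2.0": 289,
    "USB 3.0": 920,
    "Firewire 400": 320,
    "Firewire 800": 600,
    "eSATA": 984,
    "Thunderbolt": 920,
}

PRORES_422_MBPS = 153  # average data rate quoted in this article


def transfer_hours(size_gb: float, effective_mbps: float) -> float:
    """Hours needed to copy size_gb (decimal GB) at a steady effective rate."""
    total_bits = size_gb * 1_000_000_000 * 8
    return total_bits / (effective_mbps * 1_000_000) / 3600


def can_sustain(effective_mbps: float, stream_mbps: float, headroom: float = 2.5) -> bool:
    """True if the average throughput covers the stream with room for dips.

    headroom is an assumed safety factor, not a figure from any specification.
    """
    return effective_mbps >= stream_mbps * headroom


for name, mbps in EFFECTIVE_MBPS.items():
    verdict = "ok" if can_sustain(mbps, PRORES_422_MBPS) else "too tight"
    hours = transfer_hours(1000, mbps)
    print(f"{name:<13} 1 TB copy: {hours:4.1f} h   ProRes 422 playback: {verdict}")
```

Real-world results will vary with the drive, the cable, and what else is sharing the bus, so treat the output as a ballpark comparison only.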

Power Supply

AC Power Adapter

A simple AC-to-DC power supply that allows you to power your external drive from a traditional 110 V Edison wall outlet (US).

Hub Powered

Many smaller portable drives can be powered by an interface hub such as USB, Firewire, or Thunderbolt. They do not always require an external power source, which makes them easier to transport and connect to computer workstations. However, interface hubs can sometimes be overloaded by power demands (such as multiple hub-powered drives drawing from the same interface hub). In this situation an external power supply for the drive is usually required. If you plan on operating multiple hard drives at one time, it is best not to over-invest in hard drives that can only be hub powered.

Summary

Hard drives are the primary vessel for editing and storing your precious data. An entire project’s production value and potential exists on just a few metallic disks. It is crucial that the new generation of media producers develop an understanding and have a strong command of the nomenclature and technologies used in data storage to ensure that their projects are safely completed and are archived successfully for future use.

A data backup guide will be produced in the near future to review successful practices in protecting your data.