petabyte (PB)

Contributor(s): Erin Sullivan

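A petabyte (PB) is a measure of memory or data storage capacity equal to 2 to the 50th power (1,125,899,906,842,624) bytes. There are 1,024 terabytes (TB) in a petabyte, and 1,024 PB make up one exabyte (EB). Under the decimal (SI) convention that many storage vendors use, a petabyte is instead 10 to the 15th power bytes; the binary quantity is formally called a pebibyte (PiB).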
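The arithmetic behind this definition can be checked with a short Python sketch. The constant and function names are illustrative only, and the decimal comparison at the end is an added assumption for context, not part of the definition above.

```python
# Sizes in bytes under the binary (power-of-two) convention used in this definition.
TERABYTE = 2**40
PETABYTE = 2**50
EXABYTE = 2**60

# Decimal (SI) petabyte, included only for comparison (an assumption, see lead-in).
DECIMAL_PETABYTE = 10**15


def to_petabytes(num_bytes: int) -> float:
    """Convert a raw byte count to binary petabytes."""
    return num_bytes / PETABYTE


print(f"{PETABYTE:,} bytes in one PB")
print(PETABYTE // TERABYTE, "TB per PB")
print(EXABYTE // PETABYTE, "PB per EB")
print(f"A binary PB is {PETABYTE / DECIMAL_PETABYTE:.4f} times a decimal PB")
```

The gap of roughly 12.6% between the two conventions is worth keeping in mind when comparing vendor capacity figures.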

Petabyte storage vendors

Barely a decade ago, data storage vendors would boast of selling an aggregate of a petabyte or two in all of their storage systems sold. Due to the continued rapid increase in storage capacity requirements, it's now common to see individual companies and even single storage systems with more than a PB of storage capacity.

In 2015, Fujitsu released its Eternus DX S3 block storage devices, which can scale from 4.6 PB to 13.8 PB of raw capacity. The HGST Active Archive System, also released in 2015, scales to 4.7 PB of raw capacity. DataDirect Networks released EXAScaler storage arrays with up to 14 PB of capacity across two racks. And the latest EMC Isilon network-attached storage (NAS) arrays can scale up to 50 PB.

Petabyte storage and backups

Petabyte-scale data sets are poorly suited to traditional backups, which have to scan the entire system every time a backup or archiving job runs. Traditional NAS is scalable and capable of holding petabytes of data, but working through the system's organized storage index can take too much time and consume too many resources. However, a number of other data storage technologies can back up and archive at petabyte scale:

  • Snapshots and other disk-based backup technologies provide a local copy of the data, enabling a rapid restore.
  • Tape and the cloud provide relatively low-cost backup options for petabytes of data, but are more often used as off-site archival storage than as primary storage.
  • Solid-state storage can scan petabytes of data at a much higher speed without sacrificing data integrity.
  • Object storage assigns each object a unique identifier, allowing the system to search large amounts of data in a flat space as opposed to examining a complete storage index to find a specific file.
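To see why a full scan on every backup job becomes impractical at this scale, a rough estimate helps. The sketch below is a back-of-the-envelope calculation, not a model of any particular product: the throughput figures are hypothetical values chosen for illustration, and the function name is made up for this example.

```python
PETABYTE = 2**50  # bytes, binary convention
GIB = 2**30       # bytes


def full_scan_hours(capacity_bytes: int, throughput_bytes_per_sec: float) -> float:
    """Hours needed to read every byte once at a sustained throughput."""
    if throughput_bytes_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return capacity_bytes / throughput_bytes_per_sec / 3600


# Hypothetical sustained read rates (assumptions, not vendor figures).
for label, rate in [("1 GiB/s", 1 * GIB), ("10 GiB/s", 10 * GIB)]:
    hours = full_scan_hours(PETABYTE, rate)
    print(f"{label}: {hours:.1f} hours ({hours / 24:.1f} days)")
```

Under these assumptions, reading a single petabyte end to end at 1 GiB/s takes about 12 days, which is why approaches that avoid rescanning everything, such as snapshots or direct lookup by object identifier, matter at this scale.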

Petabytes and big data

There is no specific quantity of data that qualifies as big data, but the term often refers to information in the petabyte, or even exabyte, range. Mining for information across petabytes of data is a time-consuming task. Organizations working with big data often use the Hadoop Distributed File System because it facilitates rapid data transfer and allows a system to operate uninterrupted while working with petabytes of data.
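As a rough illustration of what a petabyte means inside the Hadoop Distributed File System (HDFS), the sketch below estimates how many blocks a data set occupies and how much raw capacity it consumes. It assumes the stock HDFS settings of 128 MB blocks and three replicas, neither of which is stated above, and it treats the data as one contiguous file, so the block count is a lower bound. The function name is hypothetical.

```python
PETABYTE = 2**50  # bytes, binary convention
MIB = 2**20       # bytes


def hdfs_footprint(data_bytes: int,
                   block_size: int = 128 * MIB,
                   replication: int = 3) -> tuple[int, int]:
    """Return (block_count, raw_bytes_stored) for a data set.

    The defaults mirror stock HDFS settings (an assumption); real
    clusters commonly override both values.
    """
    blocks = -(-data_bytes // block_size)  # ceiling division
    raw_bytes = data_bytes * replication   # every block is stored `replication` times
    return blocks, raw_bytes


blocks, raw = hdfs_footprint(PETABYTE)
print(f"{blocks:,} blocks, {raw / PETABYTE:.0f} PB of raw capacity")
```

With these defaults, one petabyte of data splits into more than 8 million blocks and occupies 3 PB of raw disk, which gives a sense of how much metadata and capacity a petabyte-scale cluster has to manage.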

Also see petaflop.

This was last updated in August 2016
