Definition

petabyte

By Rodney Brown, TechTarget, and Erin Sullivan, Senior Site Editor

What is a petabyte?

A petabyte (PB) is a measure of memory or data storage capacity equal to 2 to the 50th power bytes, or 1,125,899,906,842,624 bytes. There are 1,024 terabytes (TB) in a petabyte, and 1,024 PB make up one exabyte (EB).
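The relationships between these units can be checked with a few lines of Python. This is a minimal sketch that assumes the binary (power-of-two) definition used in this article; the constant names and the `convert` helper are illustrative only.

```python
# Binary (power-of-two) storage units, expressed as byte counts.
GB = 2**30
TB = 2**40
PB = 2**50
EB = 2**60


def convert(value, from_unit, to_unit):
    """Convert a quantity between two units given as byte counts."""
    return value * from_unit / to_unit


print(PB)                  # 1125899906842624
print(convert(1, PB, TB))  # 1024.0
print(convert(1, EB, PB))  # 1024.0
```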

Traditional network-attached storage (NAS) is scalable and capable of handling petabytes of data, but searching the system's hierarchical storage index at that scale can take too much time and consume too many resources.

In terms of memory, a typical laptop or desktop computer contains 16 GB of random access memory (RAM). A top-end server can contain as much as 6 TB of RAM. That means it would take about 171 top-end servers -- or 65,536 desktops -- to add up to a single petabyte of RAM.

For another example of how large a petabyte is, a typical single-layer DVD holds 4.7 GB of data. That means a single terabyte of storage could hold about 217 DVD-quality movies, while a single petabyte of storage could hold 223,101 DVD-quality movies.
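The RAM and DVD comparisons come down to simple division. The sketch below works them out under the binary definition of a petabyte, using the capacities quoted in the text. Treating the 4.7 GB DVD figure as binary gigabytes is an assumption made for consistency with the rest of the article, and the variable names are illustrative.

```python
import math

GB = 2**30
TB = 2**40
PB = 2**50

DESKTOP_RAM = 16 * GB     # typical laptop or desktop, per the text
SERVER_RAM = 6 * TB       # top-end server, per the text
DVD_CAPACITY = 4.7 * GB   # assumption: 4.7 GB treated as binary gigabytes

# Round up: a fraction of a server still means one more machine.
servers_per_pb = math.ceil(PB / SERVER_RAM)
desktops_per_pb = PB // DESKTOP_RAM

# Round down: only whole discs count.
dvds_per_tb = math.floor(TB / DVD_CAPACITY)
dvds_per_pb = math.floor(PB / DVD_CAPACITY)

print(servers_per_pb)   # 171
print(desktops_per_pb)  # 65536
print(dvds_per_tb)      # 217
print(dvds_per_pb)      # 223101
```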

Figure: Petabyte comparison

Petabyte storage vendors

Barely a decade ago, data storage vendors would boast of selling an aggregate of a petabyte or two across all of their storage systems. Due to the continued rapid increase in data storage capacity requirements, it's now common to see individual companies and even single storage systems with more than a petabyte of storage capacity.

Storage vendors that offer petabyte-level storage include the following:

Fujitsu
QNAP
Spectra Logic
StoneFly
Vast Data

Petabyte backups and storage

Other data storage technologies can back up and archive at a petabyte scale.

Snapshots and other disk-based backup technologies provide a local copy of the data, enabling a rapid restore.
Tape and the cloud provide relatively low-cost backup options for petabytes of data, but they are more often used for off-site archival storage than for primary storage.
Solid-state storage can scan petabytes of data at a much higher speed than disk-based systems without sacrificing data integrity.
Object storage assigns each object a unique identifier, enabling the system to search large amounts of data in a flat space as opposed to examining a complete storage index to find a specific file.
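The difference between a flat object lookup and a walk through a storage index can be sketched in Python. This is a toy illustration only: `FlatObjectStore` and `find_in_tree` are hypothetical names, not any vendor's API, and a real object store distributes its key space across many nodes.

```python
import uuid


class FlatObjectStore:
    """Toy object store: every object lives in one flat namespace."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # Assign each object a unique identifier and return it to the caller.
        object_id = str(uuid.uuid4())
        self._objects[object_id] = data
        return object_id

    def get(self, object_id: str) -> bytes:
        # A single keyed lookup, regardless of how many objects are stored.
        return self._objects[object_id]


def find_in_tree(tree: dict, path: str) -> bytes:
    """Hierarchical lookup for contrast: one step per directory level."""
    node = tree
    for part in path.split("/"):
        node = node[part]
    return node


store = FlatObjectStore()
object_id = store.put(b"sensor readings")
print(store.get(object_id))  # b'sensor readings'

tree = {"projects": {"2022": {"readings.bin": b"sensor readings"}}}
print(find_in_tree(tree, "projects/2022/readings.bin"))  # b'sensor readings'
```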

Figure: Storage capacity measurements

Petabytes and big data

There is no specific quantity of data that qualifies as big data, but the term often refers to information in the petabyte, or even exabyte, range. Mining for information across petabytes of data is a time-consuming task. Organizations working with big data often use the Hadoop Distributed File System (HDFS) because it facilitates rapid data transfer between nodes and enables a system to keep operating without interruption if a node fails while working with petabytes of data.

To get a sense of how big some data warehouse stores have become, in July 2017, the European research center CERN announced that its data center had 200 PB archived in its tape library.

With the increased use of 4K video and the advent of the internet of things, IDC predicted that by 2025 there will be 175 zettabytes -- or approximately 175,000,000 PB -- of data that needs storage.
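The petabyte figure for the forecast depends on which prefix convention is used. The sketch below converts 175 ZB to petabytes both ways; the round 175,000,000 figure matches the decimal (SI) convention, while the binary convention used elsewhere in this article gives a somewhat larger number. The constant names are illustrative.

```python
# Decimal (SI) prefixes: each step up is a factor of 1,000.
PB_DECIMAL = 10**15
ZB_DECIMAL = 10**21

# Binary prefixes: each step up is a factor of 1,024.
PB_BINARY = 2**50
ZB_BINARY = 2**70

forecast_zb = 175  # forecast quoted in the text

pb_decimal = forecast_zb * ZB_DECIMAL // PB_DECIMAL
pb_binary = forecast_zb * ZB_BINARY // PB_BINARY

print(pb_decimal)  # 175000000
print(pb_binary)   # 183500800
```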

Editor's note: This article was revised in 2022 by TechTarget editors to improve the reader experience.

This was last updated in December 2022
