What is Object Storage?
Object storage, also known as object-based storage, is a data storage architecture designed to handle unstructured data at scale. While block storage breaks data into blocks with unique identifiers, and file systems manage data with a file hierarchy, object storage stores data as objects. Each object acts as a self-contained discrete data repository that has three components: a globally unique identifier that is used to address the object, metadata, and the data payload (or raw data). Metadata contains information about the data payload but can also contain system data such as protection policies or custom metadata.
Objects are stored in a flat organization without any folders or hierarchical structures typically used in a file system. Similar to a key-value pair in databases, object storage uses every object’s unique identifier (key) to refer to the data payload (value). The metadata specific to each object is customizable and can include any user-defined attributes that will help to streamline file search, query, index, and analytics.
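The object model described above can be sketched in a few lines. This is only an illustrative in-memory model, not any product's implementation: the class names `StoredObject` and `FlatObjectStore` are hypothetical, and a UUID is assumed as a stand-in for the globally unique identifier.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: identifier, metadata, and data payload."""
    object_id: str
    payload: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Hypothetical in-memory store with a flat key space (no folders)."""

    def __init__(self):
        self._objects = {}  # object_id -> StoredObject

    def put(self, payload, **metadata):
        # A UUID stands in for the globally unique identifier (an assumption).
        object_id = str(uuid.uuid4())
        self._objects[object_id] = StoredObject(object_id, payload, dict(metadata))
        return object_id

    def get(self, object_id):
        # Lookup is by key only; there is no directory path to walk.
        return self._objects[object_id]


store = FlatObjectStore()
oid = store.put(b"scan data", content_type="image/dicom", department="radiology")
obj = store.get(oid)
print(obj.metadata["department"], len(obj.payload))
```

The identifier is the only handle needed to retrieve both the payload and its metadata, which is what makes the key-value comparison apt.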
Object storage creates namespaces that provide an easy way to access the data stored as objects. Typically, in an object storage cluster, a global namespace forms a single logical access portal for users and applications. Object storage allows data to be accessed through a variety of protocols including S3 and HTTP(S).
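As a rough illustration of HTTP(S) addressing, the sketch below builds a path-style URL of the form endpoint/bucket/key, a layout commonly used with S3-compatible storage. The endpoint `storage.example.com`, the bucket name, and the helper `object_url` are assumptions made for the example.

```python
from urllib.parse import quote


def object_url(endpoint, bucket, key):
    """Build a path-style URL for an object (endpoint/bucket/key)."""
    # Slashes in the key are kept: they are part of the key name,
    # not folders, but they still appear literally in the URL path.
    return f"https://{endpoint}/{quote(bucket)}/{quote(key, safe='/')}"


url = object_url("storage.example.com", "media-archive", "2024/raw footage/clip 01.mov")
print(url)
```

Any client that can issue an HTTP GET against such a URL (with suitable credentials) can read the object, which is why a single namespace can serve many kinds of applications.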
According to IDC, 80% of all the data in the world will be unstructured by 2025. All of this data can be stored in either file or object storage. Object storage is usually chosen over file storage when data needs to be preserved in an archive for a long period of time, such as cold/inactive data, backups and replicas, compliance data, analytical data, and media files. Object storage is generally more cost-efficient and scalable than block and file storage and is typically implemented on-premises, as a private cloud, or on a pay-per-use basis as a public cloud service.
Key Capabilities of Object Storage
Scalability
One of the main reasons organizations use object storage is its scalability. Object storage clusters can scale from terabytes to exabytes and store billions of files accessed by millions of tenants. Instead of being stored in a file system, files are stored as objects in a flat address space.
Because file management is decoupled from low-level block management, every disk in the object storage cluster participates in the namespace, and data services are layered on top of that. This level of separation provides excellent manageability at scale. In addition, extensible metadata-based data management simplifies data accessibility. Together, these properties let object storage scale to practically any capacity.
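The text does not say how objects are spread across the disks or nodes of a cluster. One common technique for a flat address space is rendezvous (highest-random-weight) hashing, swapped in here purely for illustration; the node names and the helper `pick_node` are hypothetical.

```python
import hashlib


def pick_node(object_id, nodes):
    """Choose a node for an object with rendezvous (highest-random-weight) hashing."""
    def score(node):
        digest = hashlib.sha256(f"{node}:{object_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(nodes, key=score)


nodes = ["node-a", "node-b", "node-c"]
ids = [f"object-{i}" for i in range(1000)]
before = {oid: pick_node(oid, nodes) for oid in ids}

# Adding a node only moves the objects that now score highest on the new node.
after = {oid: pick_node(oid, nodes + ["node-d"]) for oid in ids}
moved = [oid for oid in ids if before[oid] != after[oid]]
print(len(moved), "of", len(ids), "objects moved")
```

Because placement is computed from the identifier alone, no central directory tree has to grow with the data, and adding capacity relocates only a fraction of the objects.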
Distributed Access
Object storage gives users and applications access to content through protocols and access methods such as S3, HTTP(S), and REST APIs. Distributed access allows multiple users across distributed locations to access files at the same time.
When a user changes an object, a new version of the object is created instead of the existing one being edited. The number of object versions that can be created is often configurable by the administrator. Policy-based object locking can additionally restrict modification of an object or group of objects for a specific period of time or indefinitely. This flexibility in access, editing, and locking enables true distributed access for end users and control for administrators.
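Versioning and locking can be sketched as follows. This is a minimal in-memory model under stated assumptions: the class `VersionedStore`, its `max_versions` setting (assumed to be at least 1), and the lock semantics are hypothetical and not taken from any specific product.

```python
import time


class VersionedStore:
    """Hypothetical store where every write adds a version instead of editing in place."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions   # administrator-configurable limit (>= 1)
        self._versions = {}                # key -> list of payloads, oldest first
        self._locked_until = {}            # key -> epoch seconds, or None for indefinite

    def lock(self, key, seconds=None):
        # seconds=None models an indefinite lock.
        self._locked_until[key] = None if seconds is None else time.time() + seconds

    def is_locked(self, key):
        if key not in self._locked_until:
            return False
        until = self._locked_until[key]
        return until is None or time.time() < until

    def put(self, key, payload):
        if self.is_locked(key):
            raise PermissionError(f"{key} is locked")
        versions = self._versions.setdefault(key, [])
        versions.append(payload)
        # Drop the oldest versions beyond the configured limit.
        del versions[:-self.max_versions]
        return len(versions)

    def get(self, key, version=-1):
        # version=-1 returns the newest version; 0 returns the oldest retained one.
        return self._versions[key][version]


store = VersionedStore(max_versions=2)
store.put("report.pdf", b"v1")
store.put("report.pdf", b"v2")
store.put("report.pdf", b"v3")   # v1 is dropped; v2 and v3 remain
store.lock("report.pdf")         # indefinite lock
try:
    store.put("report.pdf", b"v4")
except PermissionError as err:
    print(err)
```

Earlier versions stay readable after each write, and once the lock is set, further writes are refused until it expires or is lifted.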
Custom Metadata
In object storage solutions, the metadata stores information about the data and is used for data retrieval and governance. Object storage allows users to enrich a file’s metadata with any number of custom attributes.
This simplifies content search, indexing, and analytics. Metadata can also be used to set object-level or bucket-level policies that define how data should be protected and stored.
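A small sketch of metadata-driven search, assuming a hypothetical catalog that maps object identifiers to their custom metadata. The attribute names (`project`, `type`, `replicas`) are invented for the example; `replicas` stands in for a protection policy expressed as metadata.

```python
def find_objects(catalog, **criteria):
    """Return ids of objects whose metadata matches every given attribute."""
    return [
        object_id
        for object_id, metadata in catalog.items()
        if all(metadata.get(name) == value for name, value in criteria.items())
    ]


# Hypothetical catalog: object id -> custom metadata.
catalog = {
    "obj-001": {"project": "apollo", "type": "video", "replicas": 3},
    "obj-002": {"project": "apollo", "type": "image", "replicas": 2},
    "obj-003": {"project": "zeus", "type": "video", "replicas": 2},
}

print(find_objects(catalog, project="apollo", type="video"))
print(find_objects(catalog, replicas=2))
```

The same lookup serves both purposes mentioned above: finding content by descriptive attributes and selecting the objects to which a given protection policy applies.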
Data Durability
While data availability refers to system uptime, data durability is focused on ensuring long-term data protection. Object storage solutions offer strong data protection capabilities to ensure data durability and eliminate single points of failure.
- Replication keeps copies of data on one or more other nodes, either synchronously or asynchronously, within the system or at an external DR (disaster recovery) site. If a copy is lost, the object storage system automatically creates additional copies according to the defined policy to maintain data redundancy. In the event of a site failure, data copies can also be recovered from the DR site.
- Erasure coding combines data with parity information and then segments and distributes it across the object storage cluster. In the event of data loss, data is immediately rebuilt from erasure-coded segments and restored to its original state.
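The idea behind erasure coding can be illustrated with the simplest possible scheme: k data segments plus a single XOR parity segment, which can rebuild any one lost segment. This is a deliberately reduced sketch; real systems generally use stronger codes that tolerate more than one loss, and the function names here are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, k):
    """Split data into k equal segments and append one XOR parity segment."""
    size = -(-len(data) // k)                      # ceiling division
    padded = data.ljust(size * k, b"\0")           # pad so segments are equal length
    segments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, segments)
    return segments + [parity]


def rebuild(segments):
    """Recover the single missing segment (marked None) by XOR-ing the rest."""
    missing = segments.index(None)
    survivors = [s for s in segments if s is not None]
    repaired = list(segments)
    repaired[missing] = reduce(xor_bytes, survivors)
    return repaired


data = b"object payload to protect"
stored = encode(data, k=4)
stored[2] = None                                   # simulate a failed disk or node
recovered = rebuild(stored)
original = b"".join(recovered[:4])[:len(data)]
print(original == data)
```

Compared with keeping full replicas, this layout stores only one extra segment for every four data segments, which is the capacity trade-off that makes erasure coding attractive for large archives.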
Immutability
This is the capability of the object storage platform to make objects immutable, i.e., not able to be deleted or modified. Immutability helps preserve records and maintain data integrity. When a user tries to edit a file, a new object version gets created and the original object is preserved as-is.
In the event of an attack (such as ransomware) or accidental deletion or overwrite of data, administrators can easily roll back to the original data. Compliance regulations often require that data be stored on non-erasable, non-rewritable media and that legal holds be applied to files for specific periods of time. Immutability helps meet all these requirements and makes object storage WORM (Write-Once-Read-Many)-compliant.
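Write-once behavior with retention periods and legal holds might be modeled as below. This is a sketch under assumptions: `WormStore` and its methods are hypothetical, the clock is injected so the example runs deterministically, and overwrites are rejected outright rather than versioned to keep the code short.

```python
class WormStore:
    """Hypothetical write-once store with retention periods and legal holds."""

    def __init__(self, clock):
        self._clock = clock          # callable returning the current time in seconds
        self._objects = {}           # key -> (payload, retain_until)
        self._legal_holds = set()

    def put(self, key, payload, retention_seconds):
        if key in self._objects:
            raise PermissionError(f"{key} already exists and cannot be overwritten")
        self._objects[key] = (payload, self._clock() + retention_seconds)

    def set_legal_hold(self, key, on=True):
        if on:
            self._legal_holds.add(key)
        else:
            self._legal_holds.discard(key)

    def delete(self, key):
        _, retain_until = self._objects[key]
        if key in self._legal_holds:
            raise PermissionError(f"{key} is under legal hold")
        if self._clock() < retain_until:
            raise PermissionError(f"{key} is under retention")
        del self._objects[key]

    def get(self, key):
        return self._objects[key][0]


now = [0]                                        # fake clock for the example
store = WormStore(clock=lambda: now[0])
store.put("invoice-2024.pdf", b"original", retention_seconds=60)

blocked = []
for attempt in (lambda: store.put("invoice-2024.pdf", b"tampered", 60),
                lambda: store.delete("invoice-2024.pdf")):
    try:
        attempt()
    except PermissionError as err:
        blocked.append(str(err))
        print("blocked:", err)

still_intact = store.get("invoice-2024.pdf")
now[0] = 61                                      # retention period has expired
store.delete("invoice-2024.pdf")                 # now allowed
```

A legal hold set with `set_legal_hold` would keep blocking deletion even after the retention period ends, mirroring the distinction between time-based retention and holds.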
Intelligent Data Management
One of the primary use cases for object storage is as an economical archive; however, it is more than “cheap and deep” storage. Object storage can help manage the data lifecycle across creation, change, deletion, access, collaboration, and protection. Object storage supports creating copies of data through replication, locking and encrypting files, protecting against deletion, applying access controls, searching files with rich metadata, integrating with applications through REST APIs, and more.
Some object storage solutions allow content streaming directly from the storage layer without having to download it to a local repository. An intuitive user interface usually serves as the content portal for users to access files over the internet. This expands the use case of object storage from an archive to a full-fledged content management and delivery platform.
Multi-Tenancy
A multi-tenant object storage architecture can be used to share resources across a cluster with multiple tenants (internal users or external subscribers). This is a popular implementation by service providers who are hosting storage resources for their clients from a centralized and shared storage cluster – either in a public cloud or private cloud. Tenants are given different levels of access permissions and controls to access data based on their level of subscription. Even within an organization, when different departments need to store and protect their data based on specific policies, they can adopt a multi-tenant deployment within their data center.
Key characteristics of object storage multi-tenancy include:
- Integration with identity management systems such as AD and LDAP systems to implement access controls and security policies
- Capacity usage quotas and metering to provide the ability to define and manage utilization across different tenants
- Auditing and reporting capabilities to help service providers with billing and account management
- End-user self-servicing options to allow tenant-level management of data with granular access controls, user permissions, quotas, etc.
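Capacity quotas and metering, two of the characteristics listed above, can be sketched with a small bookkeeping class. `TenantQuotas`, the tenant names, and the byte figures are all hypothetical and chosen only for illustration.

```python
class TenantQuotas:
    """Hypothetical per-tenant capacity quota and usage metering."""

    def __init__(self):
        self._quota = {}   # tenant -> allowed bytes
        self._used = {}    # tenant -> bytes currently stored

    def add_tenant(self, tenant, quota_bytes):
        self._quota[tenant] = quota_bytes
        self._used[tenant] = 0

    def record_write(self, tenant, size_bytes):
        # Refuse the write if it would push the tenant past its quota.
        if self._used[tenant] + size_bytes > self._quota[tenant]:
            raise RuntimeError(f"{tenant} would exceed its quota")
        self._used[tenant] += size_bytes

    def usage_report(self):
        # Fraction of quota used per tenant, e.g. as input for billing.
        return {t: self._used[t] / self._quota[t] for t in self._quota}


quotas = TenantQuotas()
quotas.add_tenant("finance", 1000)
quotas.add_tenant("marketing", 500)
quotas.record_write("finance", 250)
quotas.record_write("marketing", 500)
try:
    quotas.record_write("marketing", 1)
except RuntimeError as err:
    print(err)
report = quotas.usage_report()
print(report)
```

Each tenant is metered independently, so one tenant reaching its limit does not affect writes from another tenant sharing the same cluster.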
Popular Use Cases for Object Storage
Here are some popular use cases of object storage catering to different industry requirements. There are numerous other applications of object storage in organizations and service provider environments.
- Active Archive: Offloads data from primary NAS storage
- Immutable Storage for Backups: Defends against data loss and threat vectors
- Nearline Archive: Supports digital media workflows both in-facility and on-set (edge)
- Origin Storage: For OTT/VOD services and content delivery
- Medical Imaging Archive: Stores medical images, PACS, and VNA for healthcare sectors
- Archive for Digital Asset Management: Protects assets, enabling low-latency access
- Data Lake Storage: Handles massive workloads in research, big data, IoT, and AI/ML
- Multi-tenant Storage: Facilitates various cloud service offerings (e.g., StaaS)
- Long-term Data Preservation: Future-proofs content protection with no forklift upgrades
- Alternative to Public Cloud and LTO Tape: Best suited for online, on-premises data storage
Differences Between Block, File, and Object Storage
Let’s look at some of the key differences between the architecture, characteristics, and applicability of block storage, file storage, and object storage.
| | Block Storage | File Storage | Object Storage |
|---|---|---|---|
| Best suited for | Highly transactional data, low-latency applications, databases, and VM workloads | Distributed file sharing and collaboration, scale-out file system | Distributed content access, delivery, and archive |
| How data is stored | Data broken into blocks within disk tracks and sectors | Files organized in folders with a directory/path | Self-contained objects |
| Data structure | Volume, LUN | Hierarchical | Key/Value |
| Popular protocols used for data access | iSCSI and Fibre Channel | NFS and SMB | S3 and HTTP(S) |
| Metadata support | No | Yes, fixed file system metadata. Stored separately from the file. | Yes, custom metadata. Stored with the object itself. |
| Namespace support | No | Yes | Yes |
| Performance | Very high | High | High (optimized for throughput) |
| Scalability | Comparatively low | High | Very high |
| Cost | Very high | High | Low |
Benefits of DataCore Swarm Object Storage
DataCore Swarm provides an on-premises object storage solution that radically simplifies the ability to manage, store, and protect data while allowing S3/HTTP(S) access to any application, device, or end-user. Swarm transforms your data archive into a flexible and immediately accessible content library that enables remote workflows, on-demand access, and massive scalability.
Swarm provides a platform for data protection, management, organization, and search at enterprise scale. You no longer need to migrate data into disparate solutions for long-term preservation, delivery, and analysis. Consolidate all files on Swarm, find the data you are looking for quickly, and reduce total cost of ownership by continuously evolving hardware and optimizing resources.
“Scalability is one of the slickest things about Swarm. One sweet thing with the platform is when you drop a new node in, it builds itself from the ground up. If you want to add a 72 TB or PB node you just drop it in.”