|
> Application consistency
> Appliance based replication
> Array based replication
> Consistency groups
> Continuous data protection (CDP)
> Continuous remote replication (CRR)
> Crash consistency
> De-duplication
> Five 9s availability
> Input/Outputs per second (IOPS)
> Point-in-time data recovery
> Redundant Array of Independent Nodes (RAIN)
> Recovery Point Objective (RPO)
> Recovery Time Objective (RTO)
> Replication
> Snapshot
> Write Splitter
Guarantees that the application's data is in a consistent
state at the point-in-time when it is replicated,
backed-up or snapped.
The application is placed into a quiescent state,
which commits in-memory transactions and then halts
writes to the database and log files.
For example, Microsoft Volume Shadow Copy Service (VSS)
provides a framework for application-consistent backup,
snapshot and replication of Microsoft Exchange and SQL
Server. The alternative is crash consistency.
Replication is performed by an appliance connected either to the
hosts, SAN fabric or storage array. Two options are
available:
Out-of-band: The appliance resides outside of the primary data
path, so an application's I/O does not
flow through the appliance. An out-of-band
appliance delivers replication without impacting the
application's I/O operations (i.e. performance
and reliability).
In-band: The appliance resides in the primary data path, so
an application's I/O flows through the appliance.
An in-band appliance delivers replication
that can have an impact on the application's
I/O operations (i.e. performance and reliability).
Replication is performed within the storage processors of the array;
no additional hardware or host resources are required.
A collection of volumes that, when replicated, remain
in a consistent state with respect to each other. All
data is synchronised to exactly the same point-in-time.
For example, you can protect data from one or more applications
that use multiple volumes.
Automatically saves a copy of every change made to a
volume locally, essentially capturing every version
of the data that the user saves, allowing the administrator
to restore data to any point-in-time.
Application consistency points can be periodically scheduled
to avoid having to recover from a crash-consistent
image. The RPO is zero.
CDP is different from traditional backups in that there
are no backup schedules and you don't have to specify
the point-in-time
to which you would like to recover until you are ready
to perform a restore. Traditional backups can only restore
data to the point at which the backup was taken.
Automatically saves a copy of every change made
to a volume remotely, essentially capturing every version
of the data that the user saves, allowing the administrator
to restore data to any point-in-time.
Application consistency points can be periodically scheduled
to avoid having to recover from a crash-consistent
image. The RPO
is typically seconds or greater and it supports unlimited
distances between storage devices. Two options are available:
Continuous asynchronous: Each write transaction is acknowledged locally at
the source side and then sent to the target side.
The primary advantage of continuous asynchronous replication
is its ability to provide synchronous-like replication
without degrading the performance of host applications.
Snapshot-based: Transfers data that has changed between one consistent
image of the storage subsystem and the next. The use
of high-frequency snapshots largely overcomes the
shortcoming of the snapshot not being up-to-date.
Typically, powerful bandwidth-reduction and compression
technologies can be applied, resulting in significant
savings in bandwidth.
The application's data is not put into a consistent
state when it is replicated,
backed-up or snapped.
The data is in the same state as if there had been a
power outage, hardware failure or software crash. Most
applications have a built-in crash recovery mechanism
that allows them to recover from a crash-consistent copy
of their data. The alternative is application
consistency.
Enterprise data is highly redundant, with identical
files and sub-file data segments stored within systems.
De-duplication solutions assign each data segment a
unique ID, based on its content, which is used to compare
it with other data segments that have already been backed
up. Only new, unique data segments are stored and typically
de-duplication occurs across sites and servers, hence
the term 'global de-duplication'.
De-duplication can occur at the data source or the
backup target. With source-based de-duplication, data
is de-duplicated as the backup process begins and before
the data is sent over the network. This provides the
benefit of shorter backup windows and lowered bandwidth
requirements, making it ideal for remote or WAN-based
backup, VMware, large file servers, and other environments
where the backup process is hampered by network or other
resource bottlenecks.
Target de-duplication mainly addresses the growth
of back-end storage. The backup
application sends data to the target storage device
and the data is de-duplicated at the device, either
immediately or at a scheduled time. It is found in VTLs
and LAN backup-to-disk appliances or platforms and provides
the benefit of plug and play with existing backup applications.
Unlike source-based de-duplication, this will not remove
bottlenecks in getting the data to the backup storage
device.
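The content-based segment IDs described above can be illustrated with a minimal sketch. It assumes fixed-size segments and SHA-256 as the ID function; real products may use variable-size segments and other hashes, and the names `deduplicate`, `restore` and `segment_size` are hypothetical.

```python
import hashlib


def deduplicate(data, store, segment_size=4096):
    """Split data into fixed-size segments and store only unseen ones.

    `store` maps segment ID -> segment bytes and may be shared across
    backups (the 'global' case). Returns the ordered list of segment IDs
    needed to rebuild `data`.
    """
    recipe = []
    for offset in range(0, len(data), segment_size):
        segment = data[offset:offset + segment_size]
        # The ID is derived purely from the segment's content.
        segment_id = hashlib.sha256(segment).hexdigest()
        if segment_id not in store:
            store[segment_id] = segment  # new, unique segment
        recipe.append(segment_id)
    return recipe


def restore(recipe, store):
    """Rebuild the original data from its list of segment IDs."""
    return b"".join(store[segment_id] for segment_id in recipe)
```

Backing up 12 KB made of two identical 4 KB segments and one different segment would place only two segments in the store, while the returned list still holds three IDs.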
The equivalent of an average of 5.26 minutes of unplanned
downtime per year, or 99.999% system availability. Nowadays
this level of enterprise-class availability is required
for most critical business data.
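The 5.26-minute figure follows directly from the availability percentage. A small sketch of the arithmetic, assuming a 365.25-day year (the function name is hypothetical):

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year length


def annual_downtime_minutes(availability_percent):
    """Return the unplanned downtime per year implied by an availability level."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * MINUTES_PER_YEAR


print(round(annual_downtime_minutes(99.999), 2))
```

The same function can be used to compare other levels, for example 99.9% or 99.99%.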
The total number of reads (typically around 70%) and
writes (typically around 30%) per second provided by
a disk system.
As a rough guide, Fibre Channel/SAS disks can provide about twice the IOPS
provided by SATA disks, and Flash disks can provide about thirty
times the IOPS provided by Fibre Channel/SAS disks.
Journals all data changes to a dedicated volume, allowing
recovery to any point-in-time. For example, if a volume
becomes corrupt, it can be recovered to a point prior
to the corruption occurring. See also CDP.
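Journal-based recovery can be sketched as replaying recorded writes up to the chosen moment. This is an illustration only: it assumes block-level writes logged in time order, and the class and method names are hypothetical.

```python
class JournaledVolume:
    """Toy volume that journals every write instead of overwriting in place."""

    def __init__(self, num_blocks):
        self.baseline = [b""] * num_blocks  # state before any journaled write
        self.journal = []  # (timestamp, block_index, data), in write order

    def write(self, timestamp, block_index, data):
        # Assumption: timestamps never decrease between calls.
        self.journal.append((timestamp, block_index, data))

    def image_at(self, point_in_time):
        """Rebuild the volume as it was at `point_in_time` (inclusive)."""
        image = list(self.baseline)
        for timestamp, block_index, data in self.journal:
            if timestamp > point_in_time:
                break  # later writes (e.g. the corruption) are ignored
            image[block_index] = data
        return image
```

If a corrupting write lands at time 3, calling `image_at(2)` yields the volume as it stood immediately before it.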
RAIN works in a similar fashion to RAID to deliver
high availability, but rather than protecting against
disk failure it protects against server failure. It
uses a grid architecture, which allows for online expansion
for increased scalability.
The acceptable amount of data, measured in time and defined
by the organisation, that can be lost in the event of a
disaster. For example, an RPO of 2 hours requires that
data can be restored to a point-in-time no earlier than
2 hours prior to the disaster occurring.
The RPO, in conjunction with the RTO,
is the basis on which a business continuity strategy
is developed.
The duration of time, as defined by an organisation, within
which a business process must be restored after a disaster.
For example, an RTO of 2 hours requires systems to
be back up and running and accessible within 2 hours
of the disaster occurring.
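The RPO and RTO examples can be expressed as two simple time comparisons. A minimal sketch, with hypothetical function and parameter names:

```python
from datetime import datetime, timedelta


def meets_rpo(last_recovery_point, disaster_time, rpo):
    """True if the newest recoverable copy is no older than the RPO allows."""
    return disaster_time - last_recovery_point <= rpo


def meets_rto(disaster_time, service_restored, rto):
    """True if service came back within the RTO after the disaster."""
    return service_restored - disaster_time <= rto


disaster = datetime(2024, 1, 1, 12, 0)
two_hours = timedelta(hours=2)
print(meets_rpo(datetime(2024, 1, 1, 10, 30), disaster, two_hours))  # copy is 90 minutes old
print(meets_rto(disaster, datetime(2024, 1, 1, 15, 0), two_hours))   # restored after 3 hours
```

In this example the 90-minute-old copy satisfies a 2-hour RPO, while a 3-hour restoration misses a 2-hour RTO.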
The process of copying or mirroring data from one storage
device to another, within the same storage array, or
to a different array located locally or remotely. Replication
typically protects only the most recent copy of the data; if
that copy becomes corrupted, it will simply "protect"
the corrupt data. CDP
will protect against the effects of data corruption
by allowing a restore to a previous, uncorrupted version.
Two options are available:
Synchronous: Guarantees zero data loss by mirroring writes
to a secondary storage device. A write is not considered
complete until acknowledged by both storage devices.
Performance drops in proportion to distance as latency
increases, so it is only suitable when there
is limited distance (100 km or less) between storage
devices. The RPO
is zero.
Asynchronous: The write is considered complete as soon as the primary
storage device acknowledges it. The secondary storage
device is updated, but lags behind the primary. Performance
is not impacted, so it supports unlimited distances
between storage devices. The RPO
is typically 30 minutes or greater.
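The difference between the two options comes down to when a write counts as complete. The sketch below models that with in-memory lists standing in for storage devices; the class and method names are hypothetical and no real network or array behaviour is simulated.

```python
from collections import deque


class ReplicatedPair:
    """Toy primary/secondary pair showing sync vs. async acknowledgement."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = []
        self.secondary = []
        self.pending = deque()  # acknowledged writes not yet on the secondary

    def write(self, data):
        """Return once the write counts as complete for the host."""
        self.primary.append(data)
        if self.synchronous:
            # Not complete until the secondary also holds the write.
            self.secondary.append(data)
        else:
            # Complete after the primary alone; the secondary lags behind.
            self.pending.append(data)

    def drain(self, count=None):
        """Ship queued writes to the secondary (background transfer)."""
        n = len(self.pending) if count is None else min(count, len(self.pending))
        for _ in range(n):
            self.secondary.append(self.pending.popleft())

    def writes_at_risk(self):
        """Writes that would be lost if the primary failed right now."""
        return len(self.primary) - len(self.secondary)
```

With `synchronous=True`, `writes_at_risk()` is always zero, matching an RPO of zero. With `synchronous=False`, it grows until `drain()` catches up, which is the lag that determines the RPO.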
A copy of a set of files and directories as they were
at a particular point-in-time. Snapshots can be mounted
read-only, or read-write, used to instantly restore
the current data to a given point-in-time, and can be
used for parallel processing such as accelerated backups,
reporting and testing. Two options are available:
Maintains a log of changes and combines the production
volume with these changes to create a logical point-in-time
volume. Takes seconds to create and requires significantly
less space than a clone.
Physically independent full copy of the production
volume. Can take a considerable amount of time to
initially create and requires the same space as the
production volume.
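The space difference between the two options can be seen in a small sketch. It assumes a copy-on-first-write scheme for the change-based snapshot, which is one common way to realise the description above and not necessarily what any given product does; all class and method names are hypothetical.

```python
class Snapshot:
    """Logical point-in-time view: production blocks plus saved originals."""

    def __init__(self, volume):
        self.volume = volume
        self.saved = {}  # block index -> content at snapshot time

    def preserve(self, index):
        # Keep the original only the first time a block is overwritten.
        if index not in self.saved:
            self.saved[index] = self.volume.blocks[index]

    def read(self, index):
        # Unchanged blocks are read straight from the production volume.
        return self.saved.get(index, self.volume.blocks[index])


class Volume:
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        """Create a snapshot instantly; no data is copied up front."""
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def clone(self):
        """Create a physically independent full copy of every block."""
        return Volume(self.blocks)

    def write(self, index, data):
        for snap in self.snapshots:
            snap.preserve(index)  # save the old content before overwriting
        self.blocks[index] = data
```

After one block is overwritten, the snapshot holds a single saved block, whereas a clone holds a copy of every block regardless of how much has changed.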
Replicates data to a secondary storage device by intercepting
application writes. Options include:
Host-based: Requires an agent to be installed on each server and
therefore has a small impact on CPU utilisation
on the host.
Fabric-based: Provided within the FC switch, from vendors such as
Brocade and Cisco, and therefore has no impact
on host performance and does not require the installation
of a host agent.
Array-based: Provided within the array's storage processor, and therefore
has no impact on host performance and does
not require the installation of a host agent.
|