Reading Notes on Foreign-Language Literature (2)

2018-12-20 22:05

Dynamic replica placement and selection strategies in data grids: A comprehensive survey

--- Journal of Parallel and Distributed Computing

merit, demerit, tedious, namely, whereas, various, literature, facilitate, suitable, comparative, optimum, retrieve, rapid, evacuate, invoke, identical, prohibitive, drawback, periodically, with respect to, in particular, in general

as the name indicates, far apart

consist of, consist in

Data replication techniques are used in data grids to reduce makespan, storage consumption, access latency and network bandwidth consumption.

Data replication enhances data availability and thereby increases the system reliability.

Managing the dynamic architecture of the grid, deciding where to place replicas, storage space, and the cost of replication and selection are some of the issues that impact the performance of the grid.

Benefits of data replication strategies include availability, reliability, scalability, adaptability and improved performance.

As the name indicates, in a dynamic grid, nodes can join and leave the grid at any time.

Any replica placement and selection strategy tries to improve one or more of the following parameters: makespan, quality assurance, file missing rate, byte missing rate, communication cost, response time, bandwidth consumption, access latency, load balancing, maintenance cost, job execution time, fault tolerance and strategic replica placement.

Identifying Dynamic Replication Strategies for a High-Performance Data Grid

--- Grid Computing 2001

identify, comparative, alternative, preliminary, envision, hierarchical, tier, above-mentioned, interpret, exhibit, defer, methodology, pending, scale, solely, churn out

large amounts of, pose new problems, denoted as, adapt to

concentrate on doing, conduct experiments, send it off

in the order of petabytes, as of now

Dynamic replication can be used to reduce bandwidth consumption and access latency in high performance “data grids” where users require remote access to large files.

A data grid connects a collection of geographically distributed computer and storage resources that may be located in different parts of a country or even in different countries, and enables users to share data and other resources.

The main aims of using replication are to reduce access latency and bandwidth consumption. Replication can also help in load balancing and can improve reliability by creating multiple copies of the same data.
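To make the latency and bandwidth point concrete, here is a minimal sketch of replica selection: given several copies of the same file, pick the one with the smallest estimated fetch time. The `Replica` fields, the site names, and the cost formula (latency plus size divided by bandwidth) are assumptions made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Replica:
    site: str              # hypothetical site name
    latency_s: float       # assumed latency to the site, in seconds
    bandwidth_mbps: float  # assumed available bandwidth, in megabits per second


def estimated_fetch_time(replica: Replica, file_size_mb: float) -> float:
    """Latency plus transfer time; file_size_mb is in megabytes."""
    return replica.latency_s + (file_size_mb * 8) / replica.bandwidth_mbps


def select_replica(replicas: List[Replica], file_size_mb: float) -> Replica:
    """Return the replica with the smallest estimated fetch time."""
    if not replicas:
        raise ValueError("no replica available")
    return min(replicas, key=lambda r: estimated_fetch_time(r, file_size_mb))
```

With this cost model, a small file tends to favor the low-latency copy and a large file tends to favor the high-bandwidth copy, which is one reason a grid may keep replicas at several sites.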

Group-Based Management of Distributed File Caches

--- Distributed Computing Systems, 2002

mechanism, exploit, inherent, detrimental, preempt, incur, mask, fetch, likelihood, overlapping, subtle, in spite of, contend with

far enough in advance, take sth for granted, (be) superior to

Dynamic file grouping is an effective mechanism for exploiting the predictability of file access patterns and improving the caching performance of distributed file systems.

With our grouping mechanism we establish relationships by observing file access behavior, without relying on inference from file location or content.

We group files to reduce access latency. By fetching groups of files, instead of individual files, we increase cache hit rates when groups contain files that are likely to be accessed together.

Further experimentation against the same workloads demonstrated that recency was a better estimator of per-file succession likelihood than frequency counts.
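The notes do not record the paper's actual grouping algorithm, so the following is only a sketch of the recency idea: for each file, remember the file most recently accessed right after it, then build a group by following that chain of successors. The class name, method names, and the `max_size` parameter are hypothetical.

```python
from typing import Dict, List, Optional


class LastSuccessorGrouper:
    """Remembers, for each file, the file most recently accessed right after it."""

    def __init__(self) -> None:
        self.last_successor: Dict[str, str] = {}
        self._previous: Optional[str] = None

    def record_access(self, name: str) -> None:
        # Recency, not frequency: a newer successor overwrites the older one.
        if self._previous is not None and self._previous != name:
            self.last_successor[self._previous] = name
        self._previous = name

    def group_for(self, name: str, max_size: int = 3) -> List[str]:
        """Follow the successor chain from `name`, stopping at a cycle or at max_size."""
        group = [name]
        current = name
        while len(group) < max_size:
            nxt = self.last_successor.get(current)
            if nxt is None or nxt in group:
                break
            group.append(nxt)
            current = nxt
        return group
```

A cache using such a predictor would fetch every file in `group_for(name)` on a miss for `name`, instead of fetching `name` alone.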

Job scheduling and data replication on data grids

--- Future Generation Computer Systems

throttle, hierarchical, authorized, indicate, dispatch, assign, exhaustive, revenue, aggregate, trade-off, mechanism, kaleidoscopic, approximately, plentiful, inexact, anticipated, mimic, depict, exhaust, demonstrate, superiority, namely, consume

to address this problem, data resides on the nodes, a variety of, aim to

in contrast to, for the sake of, by means of

play an important role in, have no distinction between, in terms of

on the contrary, with respect to, and so forth, by virtue of

referring back to

A cluster represents an organization unit which is a group of sites that are geographically close.

Network bandwidth between sites within a cluster will be larger than across clusters.

Scheduling jobs to suitable grid sites is necessary because data movement between different grid sites is time consuming.

If a job is scheduled to a site where the required data are present, the job can process data in this site without any transmission delay for getting data from a remote site.
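A minimal sketch of this data-aware scheduling idea follows: send the job to the site that would have to fetch the fewest megabytes of missing input, and break ties by queue length. The function names, the tie-breaking rule, and the input dictionaries are illustrative assumptions, not the scheduler described in the paper.

```python
from typing import Dict, Set


def missing_megabytes(required: Set[str], held: Set[str],
                      sizes_mb: Dict[str, float]) -> float:
    """Total size of the required files that the site does not already hold."""
    return sum(sizes_mb[f] for f in required - held)


def pick_site(required: Set[str],
              site_files: Dict[str, Set[str]],
              sizes_mb: Dict[str, float],
              queue_length: Dict[str, int]) -> str:
    """Choose the site with the least data to transfer, then the shortest queue."""
    if not site_files:
        raise ValueError("no sites available")
    return min(
        site_files,
        key=lambda s: (
            missing_megabytes(required, site_files[s], sizes_mb),
            queue_length.get(s, 0),  # assumed tie-breaker: fewer waiting jobs
            s,                       # final tie-breaker keeps the choice deterministic
        ),
    )
```

A site that already holds every required file has a transfer cost of zero, so it wins unless another such site has a shorter queue.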

RADPA: Reliability-aware Data Placement Algorithm for large-scale network storage systems

--- High Performance Computing and Communications, 2009

ever-going, oblivious, exponentially, confront, as a consequence, that is to say

subject to the constraint, it doesn't make sense to do

Most replica data placement algorithms are concerned with the following two objectives: fairness and adaptability.

In large-scale network storage systems, device reliability differs depending on the manufacturer and type of the device.

It can fairly distribute data among devices and reorganize a near-minimum amount of data to preserve the balanced distribution as devices change.
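The notes do not describe how RADPA itself works, so the sketch below uses a different, well-known technique, rendezvous (highest-random-weight) hashing, purely to illustrate what fairness and adaptability mean for a placement function. Device names and block names are hypothetical, and device reliability is not modeled here.

```python
import hashlib
from typing import Iterable


def _score(key: str, device: str) -> int:
    """Pseudo-random but repeatable score for a (device, key) pair."""
    digest = hashlib.sha256(f"{device}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def place(key: str, devices: Iterable[str]) -> str:
    """Place `key` on the device with the highest score for that key."""
    candidates = list(devices)
    if not candidates:
        raise ValueError("no devices available")
    return max(candidates, key=lambda d: (_score(key, d), d))
```

Fairness comes from the hash spreading blocks roughly evenly across devices. Adaptability comes from the fact that adding a device relocates only the blocks whose highest score now belongs to the new device; every other block stays where it was.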
