HANDLING UNCERTAINTY IN DATA INTENSIVE APPLICATIONS ON A DISTRIBUTED COMPUTING ENVIRONMENT (CLOUD COMPUTING)

Abstract: 

Advances in memory capabilities have made it possible to store larger data sources. With the spread of the World Wide Web, these data sources (and many new ones) are rapidly becoming available. Often the information that such data provide is not precise or certain. Nevertheless, various communities, in the sciences and elsewhere, recognize the need to access data that provide useful yet inconclusive information.

We name a few such applications:

  • molecular biology and genomics,
  • creation of value-added services for a mobile workforce and populace,
  • diagnostic processes in radiology and other medical subspecialties,
  • communities and interaction (Web 2.0),
  • creation of "Web 3.0" applications that aim to transform the Web into a database, the Data Web.

Given the vast amount of resources that require consolidation, and the dynamic nature of the needs that we try to satisfy, it is also desirable to manage data efficiently. The great success of Google's MapReduce paradigm (and its open-source equivalent, Hadoop) in a few but very important applications offers a promising computational environment for fault-tolerant processing of huge data sets on clusters of low-end machines. Many researchers have noted, however, that such systems have limited applicability, and a new trend is to extend them to broader classes of problems (e.g., by using functions other than Map and Reduce, or by handling recursive algorithms).
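For readers unfamiliar with the paradigm, the following is a minimal single-machine sketch of the Map and Reduce steps, written in Python. It is only an illustration of the general idea, not the project's technique and not the Hadoop API: the names map_reduce, word_mapper and count_reducer are hypothetical, and distribution across machines and fault tolerance are deliberately left out.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Run a map step, group the emitted values by key, then run a reduce step.

    Illustrative in-memory version only: a real system such as Hadoop would
    distribute the records across a cluster and handle machine failures.
    """
    groups = defaultdict(list)
    for record in records:
        # Map: each input record may emit any number of (key, value) pairs.
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce: combine all values that share a key into one result per key.
    return {key: reducer(key, values) for key, values in groups.items()}

def word_mapper(line):
    # Hypothetical mapper: emit (word, 1) for every word in a line of text.
    for word in line.lower().split():
        yield word, 1

def count_reducer(key, values):
    # Hypothetical reducer: sum the counts collected for one word.
    return sum(values)

lines = ["uncertain data on clusters", "data on the web"]
counts = map_reduce(lines, word_mapper, count_reducer)
```

The restriction to one map function and one reduce function per job is what makes the model easy to parallelize, and it is also the source of the limited applicability mentioned above.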

In this project we will introduce a new model of uncertain databases and define query answering mechanisms suited to this uncertain framework. New algorithmic techniques and programming languages will be developed so that answers meet a specific quality threshold. To implement our techniques efficiently, systems such as Hadoop will be extended so that the full potential of the new distributed computation environment can be exploited.

 

MIS 380153

Project info

Acronym: ntua121
Scientific Coordinator: Afrati Foto
Research Team 2 Leader: Rondogiannis Panagiotis
Research Team 3 Leader: Gergatsoulis Manolis

Stats

I.D.: 807
MIS: 380153
Duration (months): 44
Budget: 600 000.00
Diavgeia: ΑΔΑ: Β4139-3ΙΤ
