The Global Distributed Diary (GD2)

1996 - 2000

 

Terje Fallmyr, Gunnar Hartvigsen and Tage Stabell-Kulø

Department of Computer Science,
Institute of Mathematical and Physical Sciences,
University of Tromsø, N-9037 Tromsø

September 16, 1996

 

Summary

The Global Distributed Diary project is an experimental project aimed at gaining knowledge of how systems can support integration of mobile nodes, and how distributed applications can be built that behave properly under the very variable conditions that may occur in such integrated systems.

The project has two main scientific goals. First, to address and devise solutions to the data consistency problems that arise when small, personal, variably connected mobile machines are integrated into an infrastructure for data sharing dominated by stationary, well connected computers. Second, to devise a distributed application (a distributed diary), along with appropriate system support (which includes security), that handles consistent read/write sharing of replicated data between a large set of heterogeneous, variably connected computers.

The Global Distributed Diary will be used as our research vehicle, and has few scientific merits per se. However, by exploiting the possibilities provided by the Global Distributed Diary, a wide range of problems can be investigated.



1.0 Problem domain

The increasing integration of mobile (and/or portable) machines into existing statically connected infrastructures will make it increasingly common for each person to use a set of machines to view and modify data. At any moment, the most convenient or productive machine will be chosen. However, the set of machines is heterogeneous in terms of processing power, storage capacity, user interface capabilities, and network services. In particular, the portable machines are relatively resource poor, and they are disconnected most of the time.

Since the whole set of machines is almost never connected, some required subset of the data will be replicated on each machine in order to support basic operations in the face of varying connectivity. Not all data can be replicated on all machines, since portable or mobile machines often do not have sufficient resources. One might expect that some additional data can be fetched on demand, provided that sufficient network services are available. In other cases, data will never be accessible from a mobile machine due to limited processor power, storage capacity, or user interface capabilities.

Updates performed on one machine must be made available to the other machines in the set in order to satisfy data consistency constraints. The tolerated degree of inconsistency may be application dependent, which would render useless a consistency policy based on a single copy. Variable connectivity makes replication management hard, so it should not be left to the applications alone either. Instead, the system should provide a selected set of strategies to the applications, a set that can satisfy the needs of many different applications. The applications, on the other hand, must be flexible enough to select one of the policies and utilise it in the best way.
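
As a rough illustration of this division of labour, the sketch below shows a system that offers a small, fixed set of consistency policies, with the application choosing one per kind of data. The policy names (STRICT, TENTATIVE, LAZY), the function may_update and the mapping diary_policies are hypothetical and are not part of the project design; they only indicate what such an interface might look like.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical consistency policies offered by the system."""
    STRICT = "strict"        # update only while connected to the other replicas
    TENTATIVE = "tentative"  # update locally, confirm on reconnection
    LAZY = "lazy"            # update locally, propagate whenever convenient


def may_update(policy: Policy, connected: bool) -> bool:
    """Return True if a local update is allowed under the chosen policy."""
    if policy is Policy.STRICT:
        return connected
    # TENTATIVE and LAZY both accept updates while disconnected.
    return True


# The application, not the system, decides which policy fits each kind of data.
# This mapping is an assumed example, not a proposed configuration.
diary_policies = {
    "appointment": Policy.TENTATIVE,
    "shared_meeting": Policy.STRICT,
    "private_note": Policy.LAZY,
}
```

Under these assumptions, a disconnected palmtop could still record a private note or a tentative appointment, while a change to a shared meeting would have to wait for connectivity.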

We believe that mobile machines will be integrated in future information systems. It is therefore important that we gain knowledge of how systems can support such integration, and how we can build large scale distributed applications that behave properly under the very variable conditions that may occur in such integrated systems.

The integration also requires solutions, at both the system and application levels, to technical issues of security, authentication and privacy, and to the ethical aspects of these issues. These issues will also be addressed in the project.

 

2.0 Goals and method

2.1 Main scientific goals

The main scientific goals are to:

  1. Address and devise solutions to the data consistency problems that arise when small, personal, variably connected mobile machines are integrated into an infrastructure for data sharing dominated by stationary, well connected computers.
  2. Devise a distributed application (a distributed diary), along with appropriate system support (which includes security), that handles consistent read/write sharing of replicated data between a large set of heterogeneous, variably connected computers.

2.2 Sub-goals

  1. State the application requirements (including availability and consistency requirements and trade-offs) necessary to construct the Global Distributed Diary (GDD).
  2. Develop protocols (with associated system support) that capture the consistency issues, allow programmers to express the application’s consistency constraints easily, and have suitable implementations in a variably connected environment with many resource-poor nodes.
  3. Devise an infrastructure that can safeguard security and privacy for this environment in general and for this application in particular.
  4. Explore the integration of intelligent agents into the distributed diary. This sub-goal includes examining how intelligent agents can extend and/or replace the distributed diary functions, and how such agents can meet the system security requirements.

2.3 Method

The method in the project is primarily experimental. The research is therefore driven by a distributed application—the Global Distributed Diary (GDD)—that will exercise all of the important problem areas.

Diary activities (meetings, appointments, to-do items) consist of core data and attachments. Core data includes items like title, start time, stop time, location, etc. Attachments are optional information that the user associates with each activity, such as notes, minutes from the last meeting, and related information (including multimedia). Core data is compact and can be fully replicated, but the attachments cannot, due to their size and general resource requirements. The diary will include replication management, security, authentication, authorisation, and privacy, all of which allow experimentation with key problem areas.
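
The split between core data and attachments could be modelled roughly as follows. This is only a sketch: the class names Activity and Attachment, the field names, and the helper replicable_attachments are assumptions made for illustration, and the simple first-fit selection stands in for whatever partial replication strategy the project eventually adopts.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Attachment:
    """Optional, possibly large item; replicated only where resources allow."""
    name: str
    size_bytes: int


@dataclass
class Activity:
    """Core data is compact enough to be replicated on every machine."""
    title: str
    start: datetime
    stop: datetime
    location: str
    attachments: List[Attachment] = field(default_factory=list)


def replicable_attachments(activity: Activity, free_bytes: int) -> List[Attachment]:
    """Pick attachments, in order, that still fit in a node's free storage."""
    chosen = []
    for att in activity.attachments:
        if att.size_bytes <= free_bytes:
            chosen.append(att)
            free_bytes -= att.size_bytes
    return chosen
```

With this shape, a resource-poor node would always hold the Activity fields themselves and only those attachments that fit its remaining storage.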

 

3.0 Project overview

The project contains the parts described below. The description outlines the central issues of each part, and what our expected results from the parts are.

Each of the parts below is either a research area or is needed for the realisation of the diary application. Together they constitute the core parts of the GDD, and they exercise all of the important problem areas. We believe that together the parts will contribute to answering how mobile nodes can be integrated into well connected computer systems.

3.1 The GDD application

This part will develop the necessary requirements and fundamental design choices prior to the construction of the Global Distributed Diary, including requirements and trade-offs for:

The expected result from this part is the requirements specification, design and implementation of core parts of the GDD.

3.2 GDD distributed data repository

The distributed data repository for the GDD is a core research area, and is vital for the operation of the diary application.

Replicated systems are usually based on the assumption that they are connected. In case of partition and conflicting updates, correctness is based on progress in the majority partition. In the type of system we are considering, desired progress may well be initiated from a disconnected machine, which may be (part of) a minority partition. Our solution will take this fundamental difference into account.
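
To make the contrast concrete, the sketch below places the classic majority rule next to one conceivable alternative, in which an update made in a minority partition is kept as tentative instead of being refused. The names in_majority and Replica and the tentative log are hypothetical; the text above does not say how the project's solution will treat minority partitions, so this should not be read as that solution.

```python
def in_majority(reachable: int, total: int) -> bool:
    """Classic rule: progress only if more than half of the replicas are reachable."""
    return 2 * reachable > total


class Replica:
    """Hypothetical replica that logs updates made outside a majority partition."""

    def __init__(self, total_replicas: int):
        self.total = total_replicas
        self.committed = []   # updates accepted under the majority rule
        self.tentative = []   # updates awaiting later reconciliation (assumption)

    def update(self, item: str, reachable: int) -> str:
        # `reachable` counts this replica itself.
        if in_majority(reachable, self.total):
            self.committed.append(item)
            return "committed"
        self.tentative.append(item)
        return "tentative"
```

A disconnected palmtop (reachable = 1 out of, say, 5 replicas) would be blocked by the majority rule alone, whereas here it can still record the update and defer the question of consistency until it reconnects.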

The results obtained may contribute directly to the main problem area of the project.

3.2.1 Replication management for fully replicated data

Fully replicated data in GDD is typically the diary core data. It consists of small data items of discrete data types.

Full replication ensures availability, hence the major issues are update strategies in the face of various GDD consistency demands and system consistency policies, and how to handle conflicts.
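
One well-known way to detect the kind of conflict mentioned above is to tag each copy of a data item with a version vector, that is, a per-machine update counter. The sketch below is an illustration of that general technique, not a mechanism proposed in this document; the function name compare and the machine names used in the usage note are assumptions.

```python
def compare(a: dict, b: dict) -> str:
    """Compare two version vectors mapping machine name -> update counter."""
    machines = set(a) | set(b)
    a_ahead = any(a.get(m, 0) > b.get(m, 0) for m in machines)
    b_ahead = any(b.get(m, 0) > a.get(m, 0) for m in machines)
    if a_ahead and b_ahead:
        return "conflict"   # both copies saw updates the other has not seen
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"
```

For example, if a desktop copy carries the vector {"desk": 2, "palm": 1} and a palmtop copy carries {"desk": 1, "palm": 2}, neither dominates the other, so the two updates were made independently and must be reconciled.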

The expected result is a replication management scheme for fully replicated small-size, discrete data items in the variably connected environment of the GDD.

3.2.2 Replication management for partially replicated data

Partially replicated data in GDD is typically attachments. They may be large data items of a wide range of types, including continuous types (multimedia).

Partial replication brings challenges to availability, scalability, and update strategies in the face of various GDD consistency demands and system consistency policies. The handling of conflict resolution will also be of importance.

The expected result is a replication management scheme for partially replicated large-size, discrete and continuous data items suitable for the variably connected environment of the GDD.

3.3 GDD—system interface

The interface between the GDD and system services must be designed such that it allows co-operation between the application and the system. It is our hypothesis that such co-operation is necessary to achieve the application and system flexibility that is required in the type of system we are investigating.

At the core of the problem is the variable amount of services and resources that the system services on mobile computers can supply to their applications.

The results obtained may contribute directly to how to construct flexible applications that behave well over a system that provides varying resources and services, including dynamic and seamless switching between wired and wireless networks.

3.3.1 GDD design for flexibility and co-operation with system

In this part of the project we will investigate:

The expected result is a requirements specification, design and implementation of a GDD that is able to exhibit flexibility in the face of variable resource provisions.

3.3.2 System design for flexibility and co-operation with application

In this part of the project we will investigate:

The expected result is a system that is able to exhibit flexibility in the face of variable resource provisions.

3.4 Security and privacy

Little has been published about security and privacy in mobile personal computing. In particular, related contemporary systems tend to assume the existence, and availability, of a trusted third party. This assumption is convenient in a well-connected environment, but does not hold when the system is blended with mobile computers.

Small, personal machines will increasingly contain personal information, and they will be seen as the manifestation of the user in security demanding applications. Privacy and security of communication and storage are therefore paramount. This part of the project will investigate issues related to striking the balance between privacy and usability.

We envision a scenario where the user depends on the proper functioning of a small personal machine. The user is the principal figure in this scenario, and access to information in the personal machine must not be possible without the user’s permission (e.g., if a personal machine is lost or stolen and then used). On the other hand, the user must be granted access to his or her resources even if he/she does not have a particular personal machine at hand, for example after losing it.

In this part of the project we will investigate:

The expected result is a security and privacy system for the GDD that will provide a balance between ensuring access to services and the required privacy.

3.5 Integration of intelligent agents into the GDD

The agent paradigm represents another interesting approach to some of the tasks the Global Distributed Diary is intended to solve. Of specific interest to us is how agents may behave and possibly improve the usability of the GDD.

In this part of the project we will investigate:

The expected result is a study of how agents may behave and possibly improve the usability of the GDD. If the approach seems promising, we also expect to specify, design, and implement rudimentary agents in the GDD.

 

4.0 Approach and resource usage

The approach to reaching the expected goals for each part of the project is described below. The description outlines how the central issues of each part are to be addressed, including estimated resource usage. The resource usage is preliminary.

The equipment will be Unix/NT workstations, portable computers and palmtop computers, all connected to the network via wired and wireless connections. Each researcher will have a set of computers at his/her disposal. (See the Appendix.)

4.1 The GDD application

The basic approach in this part is to carry out the requirements specification, design, and implementation of the main parts of the GDD application.

Estimated personnel resource usage: 1-3 man-years

4.2 GDD distributed data repository

Both of the issues below will methodologically build upon and modify previous and contemporary work on replication and consistency. The project will partly be evaluated by leading international researchers.

4.2.1 Replication management for fully replicated data

Based on the general comments above, the replication management for fully replicated data in GDD will be specified, designed and implemented.

Estimated personnel resource usage: 1-2 man-years

4.2.2 Replication management for partially replicated data

Based on the general comments above, the replication management for partially replicated data in GDD will be specified, designed and implemented.

Estimated personnel resource usage: 1-2 man-years

4.3 GDD—system interface

Both of the issues below will methodologically build upon previous and contemporary work on co-operation over application and system interfaces, combined with work done on Quality of Service (QoS).

4.3.1 GDD design for flexibility and co-operation with system

Based on the general comments above, the GDD application interface will be specified, designed and implemented.

Estimated personnel resource usage: 1 man-year

4.3.2 System design for flexibility and co-operation with application

Based on the general comments above, the system interface will be specified, designed and implemented. We will aim towards modifying an existing operating system and its interface instead of building a new operating system.

Estimated personnel resource usage: 1 man-year

4.4 Security and privacy

A prototype security and privacy system for the GDD will be designed and implemented.

Estimated personnel resource usage: 2-3 man-years

4.5 Integration of intelligent agents into the GDD

We will study fundamental properties of software agents, and how they may improve the functioning and usability of the GDD. If the results of the study turn out positive, we will seek to specify, design, and implement secure software agents in the GDD.

Estimated personnel resource usage: 1-3 man-years

 

5.0 External contacts

This project will build on, and is linked to, ongoing activities: the Pasta project and the MobyDick project.

5.1 The Pasta project

The Pasta project seeks simple but working solutions to whole-file caching between variably connected heterogeneous “workplace” machines. The project studies suitable consistency models and cache management schemes for this environment, and implements a prototype system based on one of them. Pasta is a joint project with the University of Pisa, Italy.

5.2 The MobyDick project

The overall goal of the MobyDick project is to design an architecture that is capable of releasing the full potential of a small hand-held computer, the Pocket Companion. The design challenges lie primarily in the creation of a single architecture that integrates: security functions (e.g., payment), externally offered services (e.g., airline ticket reservation), personality (i.e., these devices know what their owners want), and communication services.

MobyDick is a joint project with Universiteit Twente, the Netherlands, and the University of Pisa, Italy, and has been granted 80.000 ECU for a one-year feasibility study as ESPRIT-IV LTR 20422 (“MobyDick, The Pocket Companion”).

 

6.0 Overall project deliverables

6.1 Expected deliverables 1996-2000

The expected deliverables in the project period (1996-2000) are:

In addition, the project will be presented through popular science articles in newspapers, etc.

6.2 Milestones

December 1996:

December 1997:

December 1998:

December 1999:

December 2000:

 

7.0 Total costs

Since part of the equipment is in-house, the major cost in the project will be operational costs (travelling costs for the project members, conference presentations, etc.). The total costs for the University of Tromsø, Department of Computer Science and researchers at Hedmark College, are presented in Table 1.

TABLE 1. The costs below do not include the expenditures of the foreign partners.

                               1996    1997    1998    1999    2000     Sum
Equipment                       160      80      80       -       -     320
Operational costs                75     150     150     150       -     525
Research Scholar 1            162,5     325     325   162,5       -     975
Visit abroad (res. scholar)       -       -      90       -       -      90
Total NFR                     397,5     555     645   312,5       0    1910
Internal Funding                 50     100     100     100     100     450
Total costs                   447,5     655     745   412,5     100    2360

All local funds (provided by the University of Tromsø) available to the faculty members involved will be tied to this project.

7.1 Operational costs

The main activity financed through grants from the Research Council of Norway (Norges Forskningsråd) will be within the part “Security and privacy” (sections 3.4 and 4.4). However, solutions to the problems addressed in this area depend on results obtained in the other parts of the project. In addition, we expect publishable results to appear in all parts of the project.

7.1.1 General cost estimates per year

Operational costs per year (from 1997):

Conference travel requires an accepted paper.

 

[Detailed annual budget deleted]

 

8.0 Institutions and Personnel

The project will involve personnel from four different institutions. This project will further expand co-operation that is already in place.

8.1 University of Tromsø (principal institution)

The following personnel will work on the project at the Department of Computer Science, University of Tromsø:

  1. Professor dr. Gunnar Hartvigsen, project leader
    Responsibility: Supervision (D.Sc., M.Sc., siv.ing.), Research
    Hours/year: 1,000
  2. Assistant professor Terje Fallmyr
    Responsibility: Supervision (M.Sc., siv.ing.), Research
    Hours/year: 1,000
  3. Research Scholar (universitetsstipendiat) Tage Stabell-Kulø
    Responsibility: Research, Co-supervision (M.Sc., siv.ing.)
    Hours/year: 1,500

8.2 Participating institutions (Norway)

Personnel at the Hedmark College {Høgskolen i Hedmark} (Rena, Norway)

Personnel at the Stavanger College {Høgskolen i Stavanger} (Stavanger, Norway) include:

  1. Associate professor Stig Johansen
    Responsibility: Research
    Hours/year: 1,000

8.3 Participating institutions (Abroad)

Personnel at the University of Pisa:

  1. Researcher dr. Marco Avvenuti
  2. Researcher dr. Alberto Bartoli
  3. Researcher dr. Gianluca Dini
  4. Assistant Professor dr. Luigi Rizzo

Personnel at the University of Twente:

  1. Professor dr. Sape Mullender
  2. Associate professor dr. Gerard Smith
  3. Research Scholar (AIO) Arne Helme

At all institutions, numerous students will take part in the project.

 

Appendix: Equipment

The investigations we propose require modern computers and networking facilities. Today, we have at our disposal

4++ Unix/NT workstations (HP)

8++ Portable computers (IBM/HP)

3 Palm-top computers (HP)

connected by Ethernet, wireless radio-based (Xircom), wireless infra-red (HP/IBM) and GSM telephones.

We will require that students and staff have access to laptop-sized and palm-top sized computers for their everyday use. We estimate the need to be 1-2 laptops and 1-3 palmtops each year, plus continuous upgrade of the oldest machines.