CHAPTER 1 PROBLEMS 1. Q: What is the role of middleware in a distributed system?
A: To enhance the distribution transparency that is missing in network operating systems. In other words, middleware aims at improving the single-system view that a distributed system should have.
2. Q: Explain what is meant by (distribution) transparency, and give examples of different types of transparency.
A: Distribution transparency is the phenomenon by which distribution aspects in a system are hidden from users and applications. Examples include access transparency, location transparency, migration transparency, relocation transparency, replication transparency, concurrency transparency, failure transparency, and persistence transparency.
3. Q: Why is it sometimes so hard to hide the occurrence and recovery from failures in a distributed system?
A: It is generally impossible to detect whether a server is actually down, or that it is simply slow in responding. Consequently, a system may have to report that a service is not available, although, in fact, the server is just slow.
4. Q: Why is it not always a good idea to aim at implementing the highest degree of transparency possible?
A: Aiming at the highest degree of transparency may lead to a considerable loss of performance that users are not willing to accept.
5. Q: What is an open distributed system and what benefits does openness provide?
A: An open distributed system offers services according to clearly defined rules. An open system is capable of easily interoperating with other open systems but also allows applications to be easily ported between different implementations of the same system.
6. Q: Describe precisely what is meant by a scalable system.
A: A system is scalable with respect to either its number of components, geographical size, or number and size of administrative domains, if it can grow in one or more of these dimensions without an unacceptable loss of performance.
7. Q: Scalability can be achieved by applying different techniques. What are these techniques?
A: Scaling can be achieved through distribution, replication, and caching.
8. Q: What is the difference between a multiprocessor and a multicomputer?
A: In a multiprocessor, the CPUs have access to a shared main memory. There is no shared memory in multicomputer systems. In a multicomputer system, the CPUs can communicate only through message passing.
9. Q: A multicomputer with 256 CPUs is organized as a 16 × 16 grid. What is the worst-case delay (in hops) that a message might have to take?
A: Assuming that routing is optimal, the longest optimal route is from one corner of the grid to the opposite corner. The length of this route is 30 hops. If the end processors in a single row or column are connected to each other, the length becomes 15.
10. Q: Now consider a 256-CPU hypercube. What is the worst-case delay here, again in hops?
A: With a 256-CPU hypercube, each node has a binary address, from 00000000 to 11111111. A hop from one machine to another always involves changing a single bit in the address. Thus from 00000000 to 00000001 is one hop. From there to 00000011 is another hop. In all, eight hops are needed.
11. Q: What is the difference between a distributed operating system and a network operating system?
A: A distributed operating system manages multiprocessors and homogeneous multicomputers. A network operating system connects different independent computers that each have their own operating system so that users can easily use the services available on each computer.
12. Q: Explain how microkernels can be used to organize an operating system in a client-server fashion.
A: A microkernel can separate client applications from operating system services by enforcing each request to pass through the kernel. As a consequence, operating system services can be implemented by (perhaps different) userlevel servers that run as ordinary processes. If the microkernel has networking capabilities, there is also no principal objection in placing those servers on remote machines (which run the same microkernel).
13. Q: Explain the principal operation of a page-based distributed shared memory system.
A: Page-based DSM makes use of the virtual memory capabilities of an operating system. Whenever an application addresses a memory location that is currently not mapped into the current physical memory, a page fault occurs, giving the operating system control. The operating system can then locate the referred page, transfer its content over the network, and map it to physical memory. At that point, the application can continue.
14. Q: What is the reason for developing distributed shared memory systems?
What do you see as the main problem hindering efficient implementations?
A: The main reason is that writing parallel and distributed programs based on message-passing primitives is much harder than being able to use shared memory for communication. Efficiency of DSM systems is hindered by the fact, no matter what you do, page transfers across the network need to take place. If pages are shared by different processors, it is quite easy to get into a state similar to thrashing in virtual memory systems. In the end, DSM systems can never be faster than message-passing solutions, and will generally be slower due to the overhead incurred by keeping track of where pages are.
15. Q: Explain what false sharing is in distributed shared memory systems. What possible solutions do you see?
A: False sharing happens when data belonging to two different and independent processes (possibly on different machines) are mapped onto the same logical page. The effect is that the page is swapped between the two processes, leading to an implicit and unnecessary dependency. Solutions include making pages smaller or prohibiting independent processes to share a page. 16. Q: An experimental file server is up 3/4 of the time and down 1/4 of the time, due to bugs. How many times does this file server have to be replicated to give an availability of at least 99%?
A: With k being the number of servers, we have that (1/4)kobject 3id]*)(req->size, req->data, &size, results); res = malloc(sizeof(message)+size); /* create response message */ res->object3id = object3id; /* identify object */ res->method3id = req.method3id; /* identify method */ res->size = size; /* set size of invocation results */ memcpy(res->data, results, size); /* copy results into response */ put3msg(root, sizeof(res), res); /* append response to buffer */ free(req); /* free memory of request */ free(*results); /* free memory of results*/
}
} void invoke3adapter(long oid, message *request) { put3msg(adapter3thread, sizeof(request), request);
}
14. Q: Is a server that maintains a TCP/IP connection to a client stateful or stateless?
A: Assuming the server maintains no other information on that client, one could justifiably argue that the server is stateless. The issue is that not the server, but the transport layer at the server maintains state on the client. What the local operating systems keep track of is, in principle, of no concern to the server.
15. Q: Imagine a Web server that maintains a table in which client IP addresses are mapped to the most recently accessed Web pages. When a client connects to the server, the server looks up the client in its table, and if found, returns the registered page. Is this server stateful or stateless?
A: It can be strongly argued that this is a stateless server. The important issue with stateless designs is not if any information is maintained by the server on its clients, but rather how accurate that information has to be. In this example, if the table is lost for what ever reason, the client and server can still properly interact as if nothing happened. In a stateful design, such an interaction would be possible only after the server had recovered from a possible fault.
16. Q: To what extent does Java RMI rely on code migration?
A: Considering that object references are actually portable proxies, each time an object reference is passed, we are actually migrating code across the network. Fortunately, proxies have no execution state, so that support for simple weak mobility is all that is needed.
17. Q: Strong mobility in UNIX systems could be supported by allowing a process to fork a child on a remote machine. Explain how this would work.
A: Forking in UNIX means that a complete image of the parent is copied to the child, meaning that the child continues just after the call to fork. A similar approach could be used for remote cloning, provided the target platform is exactly the same as where the parent is executing. The first step is to have the target operating system reserve resources and create the appropriate process and memory map for the new child process. After this is done, the parent’s image (in memory) can be copied, and the child can be activated. (It should be clear that we are ignoring several important details here.)
18. Q: In Fig. 3-13 it is suggested that strong mobility cannot be combined with executing migrated code in a target process. Give a counterexample.
A: If strong mobility takes place through thread migration, it should be possible to have a migrated thread be executed in the context of the target process.
19. Q: Consider a process P that requires access to file F which is locally available on the machine where P is currently running. When P moves to another machine, it still requires access to F. If the file-to-machine binding is fixed, how could the systemwide reference to F be implemented?
A: A simple solution is to create a separate process Q that handles remote requests for F. Process P is offered the same interface to F as before, for example in the form of a proxy. Effectively, process Q operates as a file server.
20. Q: Each agent in D’Agents is implemented by a separate process. Agents can communicate primarily through shared files and by means of message passing. Files cannot be transferred across machine boundaries. In terms of the mobility framework given in Sec. 3.4, which parts of an agent’s state, as given in Fig. 3-19, comprise the resource segment?
A: The resource segment contains all references to local and global resources. As such, it consists of those variables that refer to other agents, local files, and so on. In D’Agents, these variables are primarily contained in the part consisting of global program variables. What makes matters simple, is that virtually all resources in D’Agents are nontransferrable. Only agents can move betweeen machines. Because agents are already named by global references, namely an (address, local-id) pair, transforming references to resources in the presence of migration is relatively simple in D’Agents.
21. Q: Compare the architecture of D’Agents with that of an agent platform in the FIPA model.
A: The main distinction between the two is that D’Agents does not really have a separate directory service. Instead, it offers only a low-level naming service by which agents can be globally referenced. The management component in the FIPA architecture corresponds to the server in D’Agents, whereas the ACC is implemented by the communication layer. The FIPA model provides no further details on the architecture of an agent, in contrast to D’Agents.
22. Q: Where do agent communication languages (ACLs) fit into the OSI model?
A: Such languages are part of the application layer.
23. Q: Where does an agent communication language fit into the OSI model, when it is implemented on top of a system for handling e-mail, such as in D’Agents? What is the benefit of such an approach?
A: It would still be part of the application layer. An important reason for implementing ACLs on top of e-mail, is simplicity. A complete, worldwide communicaton infrastructure is available for handling asynchronous message-passing between agents. Essentially, such an approach comes close to message-queuing systems discussed in Chap. 2.
24. Q: Why is it often necessary to specify the ontology in an ACL message?
A: In this context, an ontology can best be interpreted as a reference to a standard interpretation of the actual data contained in an ACL message. Usually, data in message-passing systems is assumed to be correctly interpreted by the sender and receiver of a message. Agents are often considered to be highly independent from each other. Therefore, it can not always be assumed that the receiving side will interpret the transferred data correctly. Of course, it is necessary that there is common agreement on the interpretation of the ontology field.
SOLUTIONS TO CHAPTER 4 PROBLEMS
1. Q: Give an example of where an address of an entity E needs to be further resolved into another address to actually access E.
A: IP addresses in the Internet are used to address hosts. However, to access a host, its IP address needs to be resolved to, for example, an Ethernet address.
2. Q: Would you consider a URL such as http://www.acme.org/index.html to be location independent? What about http://www.acme.nl/index.html?
A: Both names can be location independent, although the first one gives fewer hints on the location of the named entity. Location independent means that the name of the entity is independent of its address. By just considering a name, nothing can be said about the address of the associated entity.
3. Q: Give some examples of true identifiers.
A: Examples are ISBN numbers for books, identification numbers for software and hardware products, employee numbers within a single organization, and Ethernet addresses (although some addresses are used to identify a machine instead of just the Ethernet board).
4. Q: How is a mounting point looked up in most UNIX systems?
A: By means of a mount table that contains an entry pointing to the mount point. This means that when a mounting point is to be looked up, we need to go through the mount table to see which entry matches a given mount point.
5. Q: Jade is a distributed file system that uses per-user name spaces (see In other words, each user has his own, private name space. Can names from such name spaces be used to share resources between two different users?
A: Yes, provided names in the per-user name spaces can be resolved to names in a shared, global name space. For example, two identical names in different name spaces are, in principle, completely independent and may refer to different entities. To share entities, it is necessary to refer to them by names from a shared name space. For example, Jade relies on DNS names and IP addresses that can be used to refer to shared entities such as FTP sites.
6. Q: Consider DNS. To refer to a node N in a subdomain implemented as a different zone than the current domain, a name server for that zone needs to be specified. Is it always necessary to include a resource record for that server’s address, or is it sometimes sufficient to provide only its domain name?
A: When the name server is represented by a node NS in a domain other than the one in which N is contained, it is enough to give only its domain name. In that case, the name can be looked up by a separate DNS query. This is not possible when NS lies in the same subdomain as N, for in that case, you would need to contact the name server to find out its address.
7. Q: Is an identifier allowed to contain information on the entity it refers to?
A: Yes, but that information is not allowed to change, because that would imply changing the identifier. The old identifier should remain valid, so that changing it would imply that an entity has two identifiers, violating the second property of identifiers.
8. Q: Outline an efficient implementation of globally unique identifiers.
A: Such identifiers can be generated locally in the following way. Take the network address of the machine where the identifier is generated, append the local time to that address, along with a generated pseudo-random number. Although, in theory, it is possible that another machine in the world can generate the same number, chances that this happens are negligible.
9. Q: Give an example of how the closure mechanism for a URL could work.
A: Assuming a process knows it is dealing with a URL, it first extracts the scheme identifier from the URL, such as the string ftp:. It can subsequently look up this string in a table to find an interface to a local implementation of the FTP protocol. The next part of the closure mechanism consists of extracting the host name from the URL, such as www.cs.vu.nl, and passing that to the local DNS name server. Knowing where and how to contact the DNS server is an important part of the closure mechanism. It is often hard-coded into the URL name resolver that the process is executing. Finally, the last part of the URL, which refers to a file that is to be looked up, is passed to the identified host. The latter uses its own local closure mechanism to start the name resolution of the file name.
10. Q: Explain the difference between a hard link and a soft link in UNIX systems.
A: A hard link is a named entry in a directory file pointing to the same file descriptor as another named entry (in possibly a different directory). A symbolic link is a file containing the (character string) name of another file.
11. Q: High-level name servers in DNS, that is, name servers implementing nodes in the DNS name space that are close to the root, generally do not support recursive name resolution. Can we expect much performance improvement if they did?
A: Probably not: because the high-level name servers constitute the global layer of the DNS name space, it can be expected that changes to that part of the name space do not occur often. Consequently, caching will be highly effective, and much long-haul communication will be avoided anyway. Note that recursive name resolution for low-level name servers is important, because in that case, name resolution can be kept local at the lower-leveldomain in which the resolution is taking place.
12. Q: Explain how DNS can be used to implement a home-based approach to locating mobile hosts.
A: The DNS name of a mobile host would be used as (rather poor) identifier for that host. Each time the name is resolved, it should return the current IP address of the host. This implies that the DNS server responsible for providing that IP address will act as the host’s name server. Each time the host moves, it contacts this home server and provides it with its current address.
Note that a mechanism should be available to avoid caching of the address. In other words, other name servers should be told not to cache the address found.
13. Q: A special form of locating an entity is called anycasting, by which a service is identified by means of an IP address (see, for example, Sending a request to an anycast address, returns a response from a server implementing the service identified by that anycast address. Outline the implementation of an anycast service based on the hierarchical location service described in Sec. 4.2.4.
A: Each service has a unique identifier associated with it. Any server implementing that service, inserts its network-level address into the location service at the directory node of the leaf domain in which the server resides. Lookup requests use the identifier of the service, and will automatically return the nearest server implementing that service.
14. Q: Considering that a two-tiered home-based approach is a specialization of a hierarchical location service, where is the root?
A: The root is formed jointly by all home locations, but is partitioned in such a way that each mobile entity has its own root server.
15. Q: Suppose it is known that a specific mobile entity will almost never move outside domain D, and if it does, it can be expected to return soon. How can this information be used to speed up the lookup operation in a hierarchical location service?
A: Simply encode the domain D in the identifier for the entity that is used for the lookup operation. The operation can then be immediately forwarded to the directory node dir(D), from where the search continues.
16. Q: In a hierarchical location service with a depth of k, how many location records need to be updated at most when a mobile entity changes its location?
A: Changing location can be described as the combination of an insert and a delete operation. An insert operation requires that at worst k +1 location records are to be changed. Likewise, a delete operation also requires changing k +1 records, where the record in the root is shared between the two operations. This leads to a total of 2k +1 records.
17. Q: Consider an entity moving from location A to B, while passing several intermediate locations where it will reside for only a relatively short time. When arriving at B, it settles down for a while. Changing an address in a hierarchical location service may still take a relatively long time to complete, and should therefore be avoided when visiting an intermediate location. How can the entity be located at an intermediate location?
A: Combine the hierarchical location service with forwarding pointers. When the entity starts to move, it leaves behind a forwarding pointer at A to its next (intermediate) location. Each time it moves again, a forwarding pointer is left behind. Upon arrival in B, the entity inserts its new address into the hierarchical location service. The chain of pointers is subsequently cleaned up, and the address in A is deleted.
18. Q: When passing a remote reference from process P1 to P2 in distributed reference counting, would it help to let P1 increment the counter, instead of P2 ?
A: No, because in that case, process P2 may decide to immediately remove its reference before P1 has a chance to increment the counter at the object. Consequently, when P2 sends its decrement message, the counter may drop to zero and the object may be incorrectly removed.
19. Q: Make clear that weighted reference counting is more efficient than simple reference counting. Assume communication is reliable.
A: Creation or removal of reference can be done with only a single message, as is also the case with simple reference counting. Passing a reference is much cheaper, as it can be done with only one message containing the copied reference with its partial weight set to half of the weight of the original reference. There is thus no need to contact the object, in contrast to simple reference counting.
20. Q: Is it possible in generation reference counting, that an object is collected as garbage while there are still references, but which belong to a generation the object does not know of?
A: No. Whenever a reference of generation k is removed (so that possibly the associated entry in G[k ] becomes zero), the object is always informed about any outstanding references of generation k +1. If that generation was not yet known to the object, it will be at the time a reference of generation k is removed.
21. Q: Is it possible in generation reference counting, that an entry G[i ] becomes smaller than zero?
A: Yes. Suppose a reference of generation k is copied, leading to a reference of generation k +1. If the latter is removed before the reference from which it 22 PROBLEM SOLUTIONS FOR CHAPTER 4 was copied, G[k +1] will be decremented by one. If no reference of generation k had yet been removed, G[k +1] will then drop below zero.
22. Q: In reference listing, if no response is received after sending a ping message to process P, the process is removed from the object’s reference list. Is it always correct to remove the process?
A: No. It may be possible that the process is temporarily unreachable due to a network partition, for example, caused by a failing router or gateway. In that case, the reference is lost and the skeleton may falsely decide that the object can be removed if the list becomes empty.
23. Q: Describe a very simple way to decide that the stabilization step in the tracing-based garbage collector of Lang et al. has been reached.
A: Assume that each process can decide whether its local garbage collector has changed any of the markings on proxies or skeletons during the local propagation step. A very simple, nonscalable solution, is to report whether or not any changes have been made to a central coordinator. As soon as that coordinator has recorded that no more changes have occurred, it tells each process to start reclaiming garbage.
SOLUTIONS TO CHAPTER 5 PROBLEMS
1. Q: Name at least three sources of delay that can be introduced between WWV broadcasting the time and the processors in a distributed system setting their internal clocks.
A: First we have signal propagation delay in the atmosphere. Second we might have collision delay while the machines with the WWV receivers fight to get on the Ethernet. Third, there is packet propagation delay on the LAN. Fourth, there is delay in each processor after the packet arrives, due to interrupt processing and internal queueing delays.
2. Q: Consider the behavior of two machines in a distributed system. Both have clocks that are supposed to tick 1000 times per millisecond. One of them actually does, but the other ticks only 990 times per millisecond. If UTC updates come in once a minute, what is the maximum clock skew that will occur?
A: The second clock ticks 990,000 times per second, giving an error of 10 msec per second. In a minute this error has grown to 600 msec. Another way of looking at it is that the second clock is one percent slow, so after a minute it is off by 0.01 × 60 sec, or 600 msec.
3. Q: Add a new message to Fig. 5-7 that is concurrent with message A, that is, it neither happens before A nor happens after A.
A: The solution cannot involve 0 or it would be ordered. Thus it must be a message from 1 to 2 or from 2 to 1. If it departs or arrives from 1 after 16, it will be ordered with respect to A, so it must depart or arrive before 16. The possibilities are a message leaving process 2 at 0 and arriving at process 1 at 8, or a message leaving process 1 at 0 and arriving at process 2 at 10. Both of these are concurrent with A.
4. Q: To achieve totally-ordered multicasting with Lamport timestamps, is it strictly necessary that each message is acknowledged?
A: No, it is sufficient to multicast any other type of message, as long as that message has a timestamp larger than the received message. The condition for delivering a message m to the application, is that another message has been received from each other process with a large timestamp. This guarantees that there are no more messages underway with a lower timestamp.
5. Q: Consider a communication layer in which messages are delivered only in the order that they were sent. Give an example in which even this ordering is unnecessarily restrictive.
A: Imagine the transfer of a large image which, to that end, has been divided into consecutive blocks. Each block is identified by its position in the original image, and possibly also its width and height. In that case, FIFO ordering is not necessary, as the receiver can simply paste each incoming block into the correct position.
6. Q: Suppose that two processes detect the demise of the coordinator simultaneously and both decide to hold an election using the bully algorithm. What happens?
A: Each of the higher-numbered processes will get two ELECTION messages, but will ignore the second one. The election will proceed as usual.
7. Q: In Fig. 5-12 we have two ELECTION messages circulating simultaneously. While it does no harm to have two of them, it would be more elegant if one could be killed off. Devise an algorithm for doing this without affecting the operation of the basic election algorithm.
A: When a process receives an ELECTION message, it checks to see who started it. If it itself started it (i.e., its number is at the head of the list), it turns the message into a COORDINATOR message as described in the text. If it did not start any ELECTION message, it adds its process number and forwards it along the ring. However, if it did send its own ELECTION message earlier and it has just discovered a competitor, it compares the originator’s process number with its own. If the other process has a lower number, it discards the message instead of passing it on. If the competitor is higher, the message is forwarded in the usual way. In this way, if multiple ELECTION messages are started, the one whose first entry is highest survives. The rest are killed off along the route.
8. Q: Many distributed algorithms require the use of a coordinating process. To what extent can such algorithms actually be considered distributed? Discuss.
A: In a centralized algorithm, there is often one, fixed process that acts as coordinator. Distribution comes from the fact that the other processes run on different machines. In distributed algorithms with a nonfixed coordinator, the coordinator is chosen (in a distributed fashion) among the processes that form part of the algorithm. The fact that there is a coordinator does not make the algorithm less distributed.
9. Q: In the centralized approach to mutual exclusion (Fig. 5-13), upon receiving a message from a process releasing its exclusive access to the critical region it was using, the coordinator normally grants permission to the first process on the queue. Give another possible algorithm for the coordinator.
A: Requests could be associated with priority levels, depending on their importance. The coordinator could then grant the highest priority request first.
10. Q: Consider Fig. 5-13 again. Suppose that the coordinator crashes. Does this always bring the system down? If not, under what circumstances does this happen? Is there any way to avoid the problem and make the system able to tolerate coordinator crashes?
A: Suppose that the algorithm is that every request is answered immediately, either with permission or with denial. If there are no processes in critical regions and no processes queued, then a crash is not fatal. The next process to request permission will fail to get any reply at all, and can initiate the election of a new coordinator. The system can be made even more robust by having the coordinator store every incoming request on disk before sending back a reply. In this way, in the event of a crash, the new coordinator can reconstruct the list of active critical regions and the queue by reading file from the disk.
11. Q: Ricart and Agrawala’s algorithm has the problem that if a process has crashed and does not reply to a request from another process to enter a critical region, the lack of response will be interpreted as denial of permission. We suggested that all requests be answered immediately, to make it easy to detect crashed processes. Are there any circumstances where even this method is insufficient? Discuss.
A: Suppose a process denies permission and then crashes. The requesting process thinks that it is alive, but permission will never come. One way out is to have the requester not actually block, but rather go to sleep for a fixed period of time, after which it polls all processes that have denied permission to see if they are still running.
12. Q: How do the entries in Fig. 5-16 change if we assume that the algorithms can be implemented on a LAN that supports hardware broadcasts?
A: Only the entries for the distributed case change. Because sending a pointto- point message is as expensive as doing a broadcast, we need only send one broadcast message to all processes requesting the entry to the critical section. Likewise, only one exit broadcast message is needed. The delay becomes 1+(n − 1): one delay coming from the broadcast request, and an additional n − 1 as we still need to receive a message from each other process before being allowed to enter the critical section.
13. Q: A distributed system may have multiple, independent critical regions. Imagine that process 0 wants to enter critical region A and process 1 wants to enter critical region B. Can Ricart and Agrawala’s algorithm lead to deadlocks? Explain your answer.
A: It depends on the ground rules. If processes enter critical regions strictly sequentially, that is, a process in a critical region may not attempt to enter another one, then there is no way that it can block while holding a resource (i.e., a critical section) that some other process wants. The system is then deadlock free. On the other hand, if process 0 may enter critical region A and then try to enter critical region B, a deadlock can occur if some other process tries to acquire them in the reverse order. The Ricart and Agrawala algorithm itself does not contribute to deadlock since each critical region is handled independently of all the others.
14. Q: In Fig. 5-17 we saw a way to update an inventory list atomically using magnetic tape. Since a tape can easily be simulated on disk (as a file), why do you think this method is not used any more?
A: The primary reason is probably that people have gotten greedier and want to do more than they used to. If the users were content to simply maintain the inventory by making a run once a day, one could do it on disk. The problem is that now everyone is demanding instantaneous online access to the data base, which makes it impossible to store the inventory as a tape image.
15. Q: In Fig. 5-25(d) three schedules are shown, two legal and one illegal. For the same transactions, give a complete list of all values that x might have at the end, and state which are legal and which are illegal.
A: The legal values are 1, 2, and 3. The illegal values are 4, 5, and 6. The 4 is achieved by first running the middle transaction, then incorrectly interleaving the other two. The 5 is achieved in schedule 3. The 6 is achieved by first setting x to 0 three times, then incrementing it three times.
16. Q: When a private workspace is used to implement transactions on files, it may happen that a large number of file indices must be copied back to the parent’s workspace. How can this be done without introducing race conditions?
A: One way is by setting a lock at the top of the system to prevent any activity at all until all the indices have been overwritten. It would probably be wise to create an intentions list before starting to guard against crashes.
17. Q: Give the full algorithm for whether an attempt to lock a file should succeed or fail. Consider both read and write locks, and the possibility that the file was unlocked, read locked, or write locked.
A: The algorithm is as follows. If the file is unlocked, the attempt always succeeds. If the file is read locked and the attempt is for another read lock, it also succeeds. In all other cases, it fails (i.e., write locking a locked file fails, as does read locking a file already locked for writing).
18. Q: Systems that use locking for concurrency control usually distinguish read locks from write locks. What should happen if a process has already acquired a read lock and now wants to change it into a write lock? What about changing a write lock into a read lock?
A: A read lock can only be converted to a write lock if there are no other readers. Doing this conversion is effectively releasing the lock, then immediately trying to reacquiring it as a write lock. This operation fails if there are other readers. If there are other processes waiting to acquire a write lock, it is a matter of policy whether it should succeed; there is no technical objection. Downgrading a lock from write to read is always legal and should always work. (Doing so, however, implicitly gives priority to waiting readers over waiting writers, but this is a legal choice.)
19. Q: With timestamp ordering in distributed transactions, suppose a write operation write(T1,x) can be passed to the data manager, because the only, possibly conflicting operation write(T2,x) had a lower timestamp. Why would it make sense to let the scheduler postpone passing write(T1,x) until transaction T2 finishes?
A: By waiting until transaction T2 finishes, the scheduler may avoid a cascaded abort in the case T2 aborts.
20. Q: Is optimistic concurrency control more or less restrictive than using timestamps? Why?
A: It is more restrictive because by using the optimistic scheme, if a transaction tries to commit and discovers that another transaction has modified a file it has used, it always aborts. With timestamps, the transaction can sometimes commit, namely, if the other transaction has a lower timestamp.
21. Q: Does using timestamping for concurrency control ensure serializability? Discuss.
A: Yes. Stronger yet, it either completely serializes the transactions in their timestamp order, or it aborts some of them.
22. Q: We have repeatedly said that when a transaction is aborted, the world is restored to its previous state, as though the transaction had never happened. We lied. Give an example where resetting the world is impossible.
A: Any situation in which physical I/O has occurred cannot be reset. For example, if the process has printed some output, the ink cannot be removed from the paper. Also, in a system that controls any kind of industrial process, it is usually impossible to undo work that has been done.
SOLUTIONS TO CHAPTER 6 PROBLEMS
1. Q: Access to shared Java objects can be serialized by declaring its methods as being synchronized. Is this enough to guarantee serialization when such an object is replicated?
A: No. The problem is that access to each replica is serialized. However, different operations at different replicas may be executed at the same time, leaving the replicated instance variables in an inconsistent state.
2. Q: Consider a monitor as discussed in Chap. 1. If threads are allowed to block in a replicated monitor, what do we need to guarantee when signaling a condition variable?
A: That the same thread is awakened in each replica, so that exactly the same flow of control takes place in all replicas. In practice, this means that the runtime system should be in full control of thread scheduling at each replica.
3. Q: Explain in your own words what the main reason is for actually considering weak consistency models.
A: Weak consistency models come from the need to replicate for performance. However, efficient replication can be done only if we can avoid global synchronizations, which, in turn, can be achieved by loosening constistency constraints.
4. Q: Explain how replication in DNS takes place, and why it actually works so well.
A: The basic idea is that name servers cache previously looked up results. These results can be kept in a cache for a long time, because DNS makes the assumption that name-to-address mappings do not change often.
5. Q: During the discussion of consistency models, we often referred to the contract between the software and data store. Why is such a contract needed?
A: If a program expects a sequentially consistent data store and cannot live with anything less, the store must provide sequential consistency. However, to improve performance, some systems provide a weaker model. It is then essential that the software agrees to abide by the rules imposed by this model. Generally, it means that programs obeying the rules will perceive what looks like a sequentially consistent data store.
6. Q: Linearizability assumes the existence of a global clock. However, with strict consistency we showed that such an assumption is not realistic for most distributed systems. Can linearizability be implemented for physically distributed data stores?
A: Yes. Linearizability assumes loosely synchronized clocks, that is, it assumes that several events may happen within the same time slot. Those events need to be ranked adhering to sequential consistency.
7. Q: A multiprocessor has a single bus. Is it possible to implement strictly consistent memory?
A: Yes. The bus serializes requests so they appear at the memory in absolute time order.
8. Q: Why is W1(x)a R2(x)NIL R3(x)a not legal for Fig. 6-7(b)?
A: It violates data coherence.
9. Q: In Fig. 6-0, is 000000 a legal output for a distributed shared memory that is only FIFO consistent? Explain your answer.
A: Yes. Suppose (a) runs first. It prints 00. Now (b) runs. If the store into (a) has not arrived yet, it also prints 00. Now (c) runs. If neither store has arrived yet, it too prints 00.
10. Q: In Fig. 6-8, is 001110 a legal output for a sequentially consistent memory? Explain your answer.
A: Yes. If the processes run in the order (a), (c), (b), this result is obtained.
11. Q: At the end of Sec. 6.2.2, we discussed a formal model that said every set of operations on a sequentially consistent data store can be modeled by a string, H, from which all the individual process sequences can be derived. For processes P1 and P2 in Fig. 6-9, give all the possible values of H. Ignore processes P3 and P4 and do not include their operations in H.
A: It is clear that P2 cannot read 1 before it is written, so all values of H will have the write of 1 before the read of 1. Program order is respected, so the write of 2 must come after the read of 1. The only possibilities that remain are these: H = W(x)a R(x)a W(x)b W(x)c
H = W(x)a R(x)a W(x)c W(x)b
12. Q: In Fig. 6-13, a sequentially consistent memory allows six possible statement interleavings. List them all.
A: The six statement interleavings are as follows:
(1) a=1; if (b==0); b=1; if (a==0);
(2) a=1; b=1; if (a==0); if (b==0);
(3) a=1; b=1; if (b==0); if (a==0);
(4) b=1; if (a==0); a=1; if (b==0)
(5) b=1; a=1; if (b==0); if (a==0);
(6) b=1; a=1; if (a==0); if (b==0);
13. Q: It is often argued that weak consistency models impose an extra burden for programmers. To what extent is this statement actually true?
A: It really depends. Many programmers are used to protect their shared data through synchronization mechanisms such as locks or transactions. The main idea is that they require a coarser grain of concurrency than one offered at the level of only read and write operations. However, programmers do expect that operations on synchronization variables adhere to sequential consistency.
14. Q: In most implementations of (eager) release consistency in distributed shared memory systems, shared variables are synchronized on a release, but not on an acquire. Why is acquire needed at all then?
A: Acquire is needed to delay a process trying to access shared variables while another process is doing so.
15. Q: Does Orca offer sequential consistency or entry consistency? Explain your answer.
A: Formally, Orca provides only entry consistency, as concurrent operations on the same object are properly serialized. However, when using totallyordered broadcasting as the means to implement ordering on operations, all operations are globally ordered, independent of the object on which they are performed. In that case, it offers sequential consistency.
16. Q: Does totally-ordered multicasting by means of a sequencer and for the sake of consistency in active replication, violate the end-to-end argument in system design?
A: Yes. The end-to-end argument states that problems should be solved at the same level in which they occur. In this case, we are dealing with the problem of totally-ordered multicasting for achieving consistency in active replication. In primary-based protocols, consistency is achieved by first forwarding all operations to the primary. Using a sequencer, we are actually doing the same but at a lower level of abstraction. In this case, it may have been better to use a primary-based protocol in which updates are propagated by sending operations.
17. Q: What kind of consistency would you use to implement an electronic stock market? Explain your answer.
A: Causal consistency is probably enough. The issue is that reactions to changes in stock values should be consistent. Changes in stocks that are independent can be seen in different orders.
18. Q: Consider a personal mailbox for a mobile user, implemented as part of a wide-area distributed database. What kind of client-centric consistency would be most appropriate?
A: All of them, actually. What it boils down to is that the owner should always see the same mailbox, no matter whether he is reading or updating it. In fact, the simplest implementation for such a mailbox may well be that of a primary-based local-write protocol, where the primary is always located on the user’s mobile computer.
19. Q: Describe a simple implementation of read-your-writes consistency for displaying Web pages that have just been updated.
A: The simplest implementation is to let the browser always check whether it is displaying the most recent version of a page. This requires sending a request to the Web server. This scheme is simple as it is already implemented by many systems.
20. Q: Give an example where client-centric consistency can easily lead to write-write conflicts.
A: In our description of client-centric consistency models, we assumed that each data item had a single owner, which had exclusive write access. Dropping this assumption is enough to generate write-write conflicts. For example, if two independent users of the same data eventually bind to the same replica, their respective updates may need to be propagated to that replica. At that point, update conflicts will become apparent.
21. Q: When using a lease, is it necessary that the clocks of a client and the server, respectively, are tightly synchronized?
A: No. If the client takes a pessimistic view concerning the level at which its clock is synchronized with that of the server, it will attempt to obtain a new lease far before the current one expires.
22. Q: Consider a nonblocking primary-backup protocol used to guarantee sequential consistency in a distributed data store. Does such a data store always provide read-your-writes consistency?
A: No. As soon as the updating process receives an acknowledgement that its update is being processed, it may disconnect from the data store and reconnect to another replica. No guarantees are given that the update has already reached that replica. In contrast, with a blocking protocol, the updating process can disconnect only after its update has been fully propagated to the other replicas as well.
23. Q: For active replication to work in general, it is necessary that all operations be carried out in the same order at each replica. Is this ordering always necessary?
A: No. Consider read operations that operate on nonmodified data or commutative write operations. In principle, such situations allow ordering to be different at different replicas. However, it can be hard or impossible to detect whether, for example, two write operations are commutative.
24. Q: To implement totally-ordered multicasting by means of a sequencer, one approach is to first forward an operation to the sequencer, which then assigns it a unique number and subsequently multicasts the operation. Mention two alternative approaches, and compare the three solutions.
A: Another approach is to multicast the operation, but defer delivery until the sequencer has subsequently multicast a sequence number for it. The latter happens after the operation has been received by the sequencer. A third approach is to first get a sequence number from the sequencer, and then multicast the operation. The first approach (send operation to sequencer), involves sending one pointto- point message with the operation, and a multicast message. The second approach requires two multicast messages: one containing the operation, and one containing a sequence number. The third approach, finally, costs one point-to-point message with the sequence number, followed by a multicast message containing the operation.
25. Q: A file is replicated on 10 servers. List all the combinations of read quorum and write quorum that are permitted by the voting algorithm.
A: The following possibilities of (read quorum, write quorum) are legal. (1, 10), (2, 9), (3, 8), (4, 7), (5, 6), (6, 5), (7, 4), (8, 3), (9, 2), and (10, 1).
26. Q: In the text, we described a sender-based scheme for preventing replicated invocations. In a receiver-based scheme, a receiving replica recognizes copies of incoming messages that belong to the same invocation. Describe how replicated invocations can be prevented in a receiver-based scheme.
A: Assume object A invokes object B. As before, each invocation is assigned the same, unique identifier by all replicas of A. Each replica subsequently multicasts its invocation request to the replicas of B. When a replica of B receives an invocation request, it checks whether it had already received that request from one of A’s replicas. If not, the invocation is carried out at the local replica, and a reply message is multicast to all replicas of A. If the invocation had already been done, the incoming request is further ignored. Likewise, a replica of A passes only the first incoming reply for an outstanding invocation request to its local copy of the object, while all subsequent replies to that same invocation request are simply ignored.
27. Q: Consider causally-consistent lazy replication. When exactly can an operation be removed from a write queue?
A: When it is known that the operation has been performed everywhere. This can be detected by keeping track of the last update (from, say Li) carried out in each copy Lj . If each Lj has performed an operation W from Li with timestamp ts (W), Li can remove all operations W′ with a smaller timestamp (i.e., with each W′[k ] ≤ W[k ]). Therefore, each Lj sends an acknowledgement to Li containing the timestamp of the most recently performed write operation.
28. Q: For this exercise, you are to implement a simple system that supports multicast RPC. We assume that there are multiple, replicated servers and that each client communicates with a server through an RPC. However, when dealing with replication, a client will need to send an RPC request to each replica. Program the client such that to the application it appears as if a single RPC is sent. Assume you are replicating for performance, but that servers are susceptible to failures.
SOLUTIONS TO CHAPTER 7 PROBLEMS
1. Q: Dependable systems are often required to provide a high degree of security. Why?
A: If, for example, the responses given by servers cannot be trusted to be correct because some malicious party has tampered with them, it hardly makes sense to talk about a dependable system. Likewise, servers should be able to trust their clients.
2. Q: What makes the fail-stop model in the case of crash failures so difficult to implement?
A: The fact that, in practice, servers simply stop producing output. Detecting that they have actually stopped is difficult. As far as another process can see, the server may just be slow, or communication may (temporarily) be failing.
3. Q: Consider a Web browser that returns an outdated cached page instead of a more recent one that had been updated at the server. Is this a failure, and if so, what kind of failure?
A: Whether or not it is a failure depends on the consistency that was promised to the user. If the browser promises to provide pages that are at most T time units old, it may exhibit performance failures. However, a browser can never live up to such a promise in the Internet. A weaker form of consistency is to provide one of the client-centric models discussed in Chap. 7. In that case, simply returning a page from the cache without checking its consistency may lead to a response failure.
4. Q: Can the model of triple modular redundancy described in the text handle Byzantine failures?
A: Absolutely. The whole discussion assumed that failing elements put out random results, which are the same as Byzantine failures.
5. Q: How many failed elements (devices plus voters) can Fig. 7-2 handle? Give an example of the worst case that can be masked.
A: In each row of circles, at most one element can fail and be masked. Furthermore, one voter in each group can also fail provided it is feeding a faulty element in the next stage. For example, if all six elements at the top of their respective columns all fail, two of the three final outputs will be correct, so we can survive six failures.
6. Q: Does TMR generalize to five elements per group instead of three? If so, what properties does it have?
A: Yes, any odd number can be used. With five elements and five voters, up to two faults per group of devices can be masked.
7. Q: For each of the following applications, do you think at-least-once semantics or at most once semantics is best? Discuss.
(a) Reading and writing files from a file server.
(b) Compiling a program.
(c) Remote banking.
A: For (a) and (b), at least once is best. There is no harm trying over and over. For (c), it is best to give it only one try. If that fails, the user will have to intervene to clean up the mess.
8. Q: With asynchronous RPCs, a client is blocked until its request has been accepted by the server (see Fig. 2-12). To what extent do failures affect the semantics of asynchronous RPCs?
A: The semantics are generally affected in the same way as ordinary RPCs. A difference lies in the fact that the server will not be processing the request while the client is blocked, which introduces problems when the client crashes in the meantime. Instead, the server simply does its work, and attempts to contact the client later on, if necessary.
9. Q: Give an example in which group communication requires no message ordering at all.
A: Multicasting images in small fragments, where each fragment contains the (x,y) coordinate as part of its data. Likewise, sending the pages of a book, with each page being numbered.
10. Q: In reliable multicasting, is it always necessary that the communication layer keeps a copy of a message for retransmission purposes?
A: No. In many cases, such as when transferring files, it is necessary only that the data is still available at the application level. There is no need that the communication layer maintains its own copy.
11. Q: To what extent is scalability of atomic multicasting important?
A: It really depends on how many processes are contained in a group. The important thing to note is, that if processes are replicated for fault tolerance, having only a few replicas may be enough. In that case, scalability is hardly an issue. When groups of different processes are formed, scalability may become an issue. When replicating for performance, atomic multicasting itself may be overdone.
12. Q: In the text, we suggest that atomic multicasting can save the day when it comes to performing updates on an agreed set of processes. To what extent can we guarantee that each update is actually performed?
A: We cannot give such guarantees, similar to guaranteeing that a server has actually performed an operation after having sent an acknowledgement to the client. However, the degree of fault tolerance is improved by using atomic multicasting schemes, and makes developing fault-tolerant systems easier.
13. Q: Virtual synchrony is analogous to weak consistency in distributed data stores, with group view changes acting as synchronization points. In this context, what would be the analogon of strong consistency?
A: The synchronization resulting from individual multicasts, be they totally, causally, or FIFO ordered. Note that view changes take place as special multicast messages, which are required to be properly ordered as well.
14. Q: What are the permissable delivery orderings for the combination of FIFO and total-ordered multicasting in Fig. 7-14?
A: There are six orderings possible:
Order 1 Order 2 Order 3 Order 4 Order 5 Order 6 m1 m1 m1 m3 m3 m3 m2 m3 m3 m1 m1 m4 m3 m2 m4 m2 m4 m1 m4 m4 m2 m4 m2 m2
15. Q: Adapt the protocol for installing a next view Gi +1 in the case of virtual synchrony so that it can tolerate process failures.
A: When a process P receives Gi +k, it first forwards a copy of any unstable message it has, regardless to which previous view it belonged, to every process in Gi +k , followed by a flush message for Gi +k . It can then safely mark the message as being stable. If a process Q receives a message m that was sent in Gj (with j < i + k), it discards it when Q was never in Gj . If the most recently installed view at Q is Gl with l > j, message m is also discarded (it is a duplicate). If l = j and m had not yet been received, process Q delivers m taking any additional message ordering constraints into account. Finally, if l < j, message m is simply stored in the communication layer until Gj has been installed.
16. Q: In the two-phase commit protocol, why can blocking never be completely eliminated, even when the participants elect a new coordinator?
A: After the election, the new coordinator may crash as well. In this case, the remaining participants can also not reach a final decision, because this requires the vote from the newly elected coordinator, just as before.
17. Q: In our explanation of three-phase commit, it appears that committing a transaction is based on majority voting. Is this true?
A: Absolutely not. The point to note is that a recovering process that could not take part in the final decision as taken by the other process, will recover to a state that is consistent with the final choice made by the others.
PROBLEM SOLUTIONS FOR CHAPTER 7 37
18. Q: In a piecewise deterministic execution model, is it sufficient to log only messages, or do we need to log other events as well?
A: More logging is generally needed: it concerns all nondeterministic events, including local I/O and, in particular, system calls.
19. Q: Explain how the write-ahead log in distributed transactions can be used to recover from failures.
A: The log contains a record for each read and write operation that took place within the transaction. When a failure occurs, the log can be replayed to the last recorded operation. Replaying the log is effectively the opposite from rolling back, which happens when the transaction needed to be aborted.
20. Q: Does a stateless server need to take checkpoints?
A: It depends on what the server does. For example, a database server that has been handed a complete transaction will maintain a log to be able to redo its operations when recovering. However, there is no need to take checkpoints for the sake of the state of the distributed system. Checkpointing is done only for local recovery.
21. Q: Receiver-based message logging is generally considered better than sender-based logging. Why?
A: The main reason is that recovery is entirely local. In sender-based logging, a recovering process will have to contact its senders to retransmit their messages.
SOLUTIONS TO CHAPTER 8 PROBLEMS
1. Q: Which mechanisms could a distributed system provide as security services to application developers that believe only in the end-to-end argument in system’s design, as discussed in Chap. 5?
A: None. Applying the end-to-end argument to security services means that developers will not trust anything that is not provided by their own applications. In effect, the distributed system as a whole is considered to be untrusted. 2. Q: In the RISSC approach, can all security be concentrated on secure servers or not?
A: No, we still need to make sure that the local operating systems and communication between clients and servers are secure.
3. Q: Suppose you were asked to develop a distributed application that would allow teachers to set up exams. Give at least three statements that would be part of the security policy for such an application.
A: Obvious requirements would include that students should not be able to access exams before a specific time. Also, any teacher accessing an exam before the actual examination date should be authenticated. Also, there may be a restricted group of people that should be given read access to any exam in preparation, whereas only the responsible teacher should be given full access.
4. Q: Would it be safe to join message 3 and message 4 in the authentication protocol shown in Fig. 8-15, into KA,B(RB,RA)?
A: Yes, there is no reason why the challenge sent by Alice for Bob cannot be sent in the same message.
5. Q: Why is it not necessary in Fig. 8-12 for the KDC to know for sure it was talking to Alice when it receives a request for a secret key that Alice can share with Bob?
A: Suppose that Chuck had sent the message ‘‘I’m Alice and I want to talk to Bob.’’ The KDC would just return KA,KDC(KA,B) which can be decrypted only by Alice because she is the only other entity holding the secret key KA,KDC.
6. Q: What is wrong in implementing a nonce as a timestamp?
A: Although a timestamp is used only once, it is far from being random. Implementations of security protocols exist that use timestamps as nonces, and which have been succusfully attacked by exploiting the nonrandomness of the nonces.
PROBLEM SOLUTIONS FOR CHAPTER 8 39
7. Q: In message 2 of the Needham-Schroeder authentication protocol, the ticket is encrypted with the secret key shared between Alice and the KDC. Is this encryption necessary?
A: No. Because Bob is the only one who can decrypt the ticket, it might as well have been sent as plaintext.
8. Q: Can we safely adapt the authentication protocol shown in Fig. 8-19 such that message 3 consists only of RB?
A: In principle, if RB is never used again, then returning it unencrypted should be enough. However, such randomness is seldom found. Therefore, by encrypting RB, it becomes much more difficult for Chuck to break in and forge message 3.
9. Q: Devise a simple authentication protocol using signatures in a public-key cryptosystem.
A: If Alice wants to authenticate Bob, she sends Bob a challenge R. Bob will be requested to return KB −(R), that is, place his signature under R. If Alice is confident that she has Bob’s public key, decrypting the response back to R should be enough for her to know she is indeed talking to Bob.
10. Q: Assume Alice wants to send a message m to Bob. Instead of encrypting m with Bob’s public key KB +, she generates a session key KA,B and then sends [KA,B(m),KB +(KA,B)]. Why is this scheme generally better? (Hint: consider performance issues.)
A: The session key has a short, fixed length. In contrast, the message m may be of arbitrary length. Consequently, the combination of using a session key and applying public-key cryptography to a short message will generally provide much better performance than using only a public key on a large message.
11. Q: How can role changes be expressed in an access control matrix?
A: Roles, or protection domains in general, can be viewed as objects with basically a single operation: enter. Whether or not this operation can be called depends on the role from which the request is issued. More sophisticated approaches are also possible, for example, by allowing changes back to previous roles.
12. Q: How are ACLs implemented in a UNIX file system?
A: Each file has three associated entries: one for the owner, one for a group that is associated with the file, and one for everyone else. For each entry, the access rights can essentially be specified as read, write, execute.
13. Q: How can an organization enforce the use of a Web proxy gateway and prevent its users to directly access external Web servers?
40 PROBLEM SOLUTIONS FOR CHAPTER 8
A: One way is to use a packet-filtering gateway that discards all outgoing traffic except that directed to a few, well-known hosts. Web traffic is accepted provided it is targeted to the company’s Web proxy.
14. Q: Referring to Fig. 8-29, to what extent does the use of Java object references as capabilities actually depend on the Java language?
A: It is independent of the Java language: references to secured objects still need to be handed out during runtime and cannot be simply constructed. Java helps by catching the construction of such references during compile time.
15. Q: Name three problems that will be encountered when developers of interfaces to local resources are required to insert calls to enable and disable privileges to protect against unauthorized access by mobile programs as explained in the text.
A: An important one is that no thread switching may occur when a local resource is called. A thread switch could transfer the enabled privileges to another thread that is not authorized to access the resource. Another problem occurs when another local resource needs to be called before the current invocation is finished. In effect, the privileges are carried to the second resource, while it may happen that the caller is actually not trusted to access that second resource. A third problem is that explicitly inserting calls to enable and disable privileges is suspect to programming errors, rendering the mechanism useless.
16. Q: Name a few advantages and disadvantages of using centralized servers for key management.
A: An obvious advantage is simplicity. For example, by having N clients share a key with only a centralized server, we need to maintain only N keys. Pairwise sharing of keys would add up to N(N − 1)/2 keys. Also, using a centralized server allows efficient storage and maintenance facilities at a single site. Potential disadvantages include the server becoming a bottleneck with respect to performance as well as availability. Also, if the server is compromised, new keys will need to be established.
17. Q: The Diffie-Hellman key-exchange protocol can also be used to establish a shared secret key between three parties. Explain how.
A: Suppose Alice, Bob, and Chuck want to set up a shared secret key based on the two publicly known large primes n and g. Alice has her own secret large number x, Bob has y, and Chuck has z. Alice sends gx mod n to Bob; Bob sends gy mod n to Chuck; and Chuck sends gz mod n to Alice. Alice can now compute gxz mod n, which she sends to Bob. Bob, in turn, can then compute gxyz mod n. Likewise, after receiving gx mod n from Alice, Bob can compute gxy mod n, which he sends to Chuck. Chuck can then compute gxyz mod n. Similarly, after Chuck receives gy mod n from Bob, he computes gyz mod n and sends that to Alice so that she can compute gxyz mod n.
18. Q: There is no authentication in the Diffie-Hellman key-exchange protocol. By exploiting this property, a malicious third party, Chuck, can easily break into the key exchange taking place between Alice and Bob, and subsequently ruin the security. Explain how this would work.
A: Assume Alice and Bob are using the publicly known values n and g. When Alice sends gx mod n to Bob, Chuck need simply intercept that message, return his own message gz mod n, and thus make Alice believe she is talking to Bob. After intercepting Alice’s message, he sends gz mod n to Bob, from which he can expect gy mod n as reply. Chuck is now in the middle.
19. Q: Give a straightforward way how capabilities in Amoeba can be revoked.
A: The object’s owner simply requests that the server discard all registered (rights, check)-pairs for that object. A disadvantage is that all capabilities are revoked. It is difficult to revoke a capability handed out to a specific process.
20. Q: Does it make sense to restrict the lifetime of a session key? If so, give an example how that could be established.
A: Session keys should always have a restricted lifetime as they are easier to break than other types of cryptographic keys. The way to restrict their lifetime is to send along the expiration time when the key is generated and distributed. This approach is followed, for example, in SESAME.
21. Q: What is the role of the timestamp in message 6 in Fig. 8-38, and why does it need to be encrypted?
A: The timestamp is used to protect against replays. By encrypting it, it becomes impossible to replay message 6 with a later timestamp. This example illustrates a general application of timestamps in cryptographic protocols.
22. Q: Complete Fig. 8-38 by adding the communication for authentication between Alice and Bob.
A: Alice sends to Bob the message M=[KB,AS(A,KA,B),KA,B(t )], where KB,AS is the secret key shared between Bob and the AS. At that point, Bob knows he is talking to Alice. By responding with KA,B(t + 1), Bob proves to Alice that he is indeed Bob.
23. Q: Consider the communication between Alice and an authentication service AS as in SESAME. What is the difference, if any, between a message m1 = KAS + (KA,AS(data)) and m2 = KA,AS(KAS + (data))?
A: Message m1 must be decrypted by the authentication service, but results in a message that both the authentication service and Alice can understand. With message m2, the authentication service or Alice must first decrypt the message, but only the authentication service can understand its content. In this respect, the two messages are quite different.
24. Q: Define at least two different levels of atomicity for transactions in electronic payment systems.
A: Money atomicity is what we defined in the text: the actual payment should take place entirely, or should fail altogether. A coarser grain of atomicity is goods atomicity. In that case, the whole transaction succeeds if and only if the purchased goods are delivered and paid for.
25. Q: A customer in an e-cash system should preferably wait a random time before using coins it has withdrawn from the bank. Why?
A: Otherwise, the bank may start noticing a relation between deposited coins and the fact that withdrawals from that specific user are taking place. This would violate anonymity.
26. Q: Anonymity of the merchant in any payment system is often forbidden. Why?
A: It is often legally required to know the payee to warrant against the purchase of stolen goods.
27. Q: Consider an electronic payment system in which a customer sends cash to a (remote) merchant. Give a table like the ones used in Figs. 8-44 and 8-0 expressing the hiding of information. SOLUTIONS TO CHAPTER 9 PROBLEMS
1. Q: Why is it useful to define the interfaces of an object in an Interface Definition Language?
A: There are several reasons. First, from a software-engineering point of view, having precise and unambiguous interface definitions is important for understanding and maintaining objects. Furthermore, IDL-based definitions come in handy for generating stubs. Finally, and related to the latter, if an IDL definition has been parsed and stored, supporting dynamic invocations becomes easier, as the client-side proxy can be automatically constructed at runtime from an interface definition.
2. Q: Give an example in which CORBA’s dynamic invocation mechanisms come in handy.
A: When a process implements a gateway between two different ORBs, it can dynamically construct an invocation for an object at the target ORB without knowing in advance what that object actually looks like. Instead, it can find out dynamically.
3. Q: Which of the six forms of communication discussed in Sec. 2.4 are supported by CORBA’s invocation model?
A: Transient asynchronous communication (by means of CORBA one-way requests) and response-based transient synchronous communication (by means of CORBA synchronous requests). Deferred synchronous request can be viewed as a combination of two one-way requests, possibly with a blocking client waiting for the response.
4. Q: Outline a simple protocol that implements at-most-once semantics for an object invocation.
A: A simple solution is to let a proxy retransmit an invocation request, but telling the server explicitly that it is a retransmission. In that case, the server may decide to ignore the request and return an error to the client. In a more sophisticated solution, the server can cache results of previous invocations and check whether it still has those results when receiving a retransmitted request.
5. Q: Should the client and server-side CORBA objects for asynchronous method invocation be persistent?
A: In general, they should be persistent, allowing the client or server to shut down and later restart and fetch the objects from disk. However, in theory, there is no hard reason to demand that these objects should be persistent.
6. Q: In a GIOP Request message, the object reference and method name are part of the message header. Why can they not be contained in the message body only? A: The server will need to demultiplex incoming messages to the appropriate object adapter. It can only look at message headers as it should know nothing about the actual invocation. Demultiplexing can be done on an object reference alone, but further demultiplexing may be needed by also looking at the method that is to be invoked.
7. Q: Does CORBA support the thread-per-object invocation policy that we explained in Chap. 3?
A: Yes, although it can be done only by associating a single object per POA, which in turn should provide the single-threading policy. In other cases, the thread-per-object policy depends on what the underlying ORB provides.
8. Q: Consider two CORBA systems, each with their own naming service. Outline how the two naming services could be integrated into a single, federated naming service.
A: Integration can be straightforward. Because naming contexts are regular objects, a naming context in a foreign name space can be registered under a specific name in the naming graph. Resolving that name will return an IOR referring to the foreign naming context. To a client, a regular object reference is returned for which it can invoke the resolve method as usual.
9. Q: In the text, we state that when binding to a CORBA object, additional security services may be selected by the client ORB based on the object reference. How does the client ORB know about these services?
A: In principle, they are encoded as part of an IOR, for example, in the Components field of a tagged profile. Such information can be made available by the object itself when its associated object server (or rather, POA) generates its IOR.
10. Q: If a CORBA ORB uses several interceptors that are not related to security, to what extent is the order in which interceptors are called important?
A: It really depends, but in general, if security is needed, most interceptors (at the client side) will be called after calling the access control interceptor, but certainly before the secure invocation interceptor. The latter is strictly required because the secure invocation interceptor produces encrypted messages in which only the destination is visible. Any modification to such a message will lead to a fault when checked for integrity.
11. Q: Illustrate how a transactional queue can be part of a distributed transaction as mentioned in the text.
A: In this case, an application may need to send and receive messages from components running on other machines, and which possibly need to access local databases. By turning the operations on a transactional queue into a nested transaction, and likewise using nested transactions for the other groups of operations, it becomes possible to construct one big distributed transaction. 12. Q: In comparing DCOM to CORBA, does CORBA provide standard marshaling or custom marshaling? Discuss.
A: CORBA provides standard marshaling, but allows customization of proxies and skeletons by means of interceptors. In this sense, custom marshaling in DCOM can be seen as a work-around for the lack of general interception techniques. However, there is no reason why a CORBA developer should not write his own custom proxies, as long as they adhere to language-dependent mapping of the IDL specifications of the object’s interfaces. The drawback of this approach, however, is that it requires these custom proxies to be stored in an interface repository from where they can be retrieved by clients. Unlike DCOM, CORBA provides no standard support for storing and retrieving customized proxies. In DCOM, such matters are handled by the registry.
13. Q: In DCOM, consider a method m that is contained in interface X, and which accepts a pointer to interface Y as input parameter. Explain what happens with respect to marshaling when a client holding an interface pointer to a proxy implementation of X, invokes m.
A: When m is invoked, the proxy implementing X will have to marshal the proxy implementing Y. If Y is based on standard marshaling, then, in principle, only the interface identifier of Y needs to be marshaled into the message comprising the marshaled invocation of m. If Y is based on custom marshaling, X will have to marshal Y using the class object associated with Y that handles the marshaling of Y. The CLSID of that class object is prepended to the marshaled version of Y in order to allow the receiver to load the appropriate code to unmarshal Y again.
14. Q: Does custom marshaling in DCOM require that the process receiving a marshaled proxy runs on the same type of machine as the sending process?
A: No, provided that a marshaled proxy can be represented in a machineindependent way. A developer of object-specific proxies will have to take this design issue into account. For example, assume the sending process runs on an Intel-based Windows machine whereas the receiver is running as a Solaris process on a SPARC station. When the sender marshals its proxy, it provides a CLSID identifying the class object that is needed to unmarshal the proxy by the receiver. When the Solaris implementation of DCOM receives this CLSID, it will have to be able to load its specific implementation of the identified class object. That class object is then used to instantiate an object to unmarshal the proxy and return an interface pointer to the receiving process. If the proxy can be unmarshaled to something that can be readily understood by the Solaris implementation of DCOM, then no problems need to occur.
15. Q: Outline an algorithm for migrating an object in DCOM to another server.
A: The main issues that need to be dealt with is migrating the state to the target object server, and having each client rebind to the migrated object. One possible approach is the following. First, a copy of the object is created on the target server using the object’s CLSID. Then, the state is marshaled and transferred to the new server, where the newly created object is instructed to unmarshal that state. The old copy is replaced by an object implementing a forwarding pointer that offers the same interface as the moved object. Each time an invocation request is forwarded, the invoking client will be notified that it should rebind to the object at its new location.
16. Q: To what extent does JIT activation violate or comply with the notion of objects?
A: JIT activation essentially assumes that objects have no state. In other words, an object is reduced to a collection of (stateless) methods. In general, such collections are hardly considered to resemble objects.
17. Q: Explain what happens when two clients on different machines use the same file moniker to bind to a single object. Will the object be instantiated once or twice?
A: It really depends. If no special measures are taken, two copies of the object will be constructed and initialized with the data from the same file. However, with a special DCOM object server and possibly adapted moniker, it should be possible to create only a single, shared instance.
18. Q: What would be the obvious way to implement events in Globe without violating the principle that objects are passive?
A: The best way would be to implement a simple layer on top of Globe objects in which a thread would occasionally poll the underlying object to see if any state changes have occurred. Such a thread could then invoke a callback interface as provided by a client application. In effect, such an implementation changes a pull-style model into a push-style model.
19. Q: Give an example in which the (inadvertent) use of callback mechanisms can easily lead to an unwanted situation.
A: If a callback leads to another invocation on the same object, a deadlock may arise if locks are needed to protect shared resources. Situations as these are very hard to control in a general way.
20. Q: In CORBA, nodes in naming graph, such as directories, are also considered as objects. Would it make sense to also model directories as distributed shared objects in Globe?
A: Yes. This is best explained by considering a root directory in a worldwide naming service. In most cases, the root directory hardly changes. Consequently, its state, which consists of a collection of object references (in the form of object handles), can be widely replicated. A similar reasoning will show that for many lower-level nodes, a different approach to replication is needed as the update-to-read ratio is very different. These differences can be dealt with by means of distributed shared objects.
21. Q: Assume a Globe object server has just installed a persistent local object. Also, assume that this object offers a contact address. Should that address be preserved as well when the server shuts down?
A: If the address is not preserved, the server should remove it from the Globe location service before shutting down. When the object is loaded again, the server should offer a new address and insert that address into the location service. For efficiency reasons, Globe also supports persistent contact addresses that remain the same even when a server shuts down.
22. Q: In our example of using Kerberos for security in Globe, would it be useful to encrypt the ticket with a key shared between the object and the ticket granting service, instead of the key KB,TGT?
A: It depends. Using a single object key would allow secure and efficient multicasting of a shared session key to all replicas. However, because the object key needs to be shared between all processes that have a replica, we need to trust that all processes will indeed keep that key a secret.
SOLUTIONS TO CHAPTER 10 PROBLEMS
1. Q: Is a file server implementing NFS version 3 required to be stateless?
A: No, but the NFS protocols are such that it is possible to provide an implementation using stateless servers.
2. Q: NFS does not provide a global, shared name space. Is there a way to mimic such a name space?
A: A global name space can easily be mimiced using a local name space for each client that is partially standardized, and letting the automounter mount the necessary directories into that name space.
3. Q: Give a simple extension to the NFS lookup operation that would allow iterative name lookup in combination with a server exporting directories that it mounted from another server.
A: If a lookup operation always returns an identifier for the server from which a directory was mounted, transparent iterative name lookups across multiple servers would be easy. Whenever a server looks up a mount point on which it mounted a remote file system, it simply returns the server’s ID for that file system. The client can then automatically mount the directory as well, and contact its associated server to continue name resolution.
4. Q: In UNIX-based operating systems, opening a file using a file handle can be done only in the kernel. Give a possible implementation of an NFS file handle for a user-level NFS server for a UNIX system.
A: The problem to be solved is to return a file handle that will allow the server to open a file using the existing name-based file system interface. One approach is to encode the file name into the file handle. The obvious drawback is that as soon as the file name changes, its file handles become invalid.
5. Q: Using an automounter that installs symbolic links as described in the text makes it harder to hide the fact that mounting is transparent. Why?
A: After the symbolic link has been followed, the user will not be in the expected directory, but in a subdirectory used by the automounter. In other words, the local home directory for Alice will be /tmp3mnt/home/alice instead of what she thinks it is, namely /home/alice. Special support from the operating system or shell is needed to hide this aspect.
6. Q: Suppose the current denial state of a file in NFS is WRITE. Is it possible that another client can first successfully open that file and then request a write lock?
A: Yes. If the second client requires read/write access (i.e., value BOTH) but no denial (i.e., value NONE), it will have been granted access to the file. However, although a write lock may actually be granted, performing an Pactual write operation will fail. Remember that share reservation is completely independent from locking.
7. Q: Taking into account cache coherence as discussed in Chap. 6, which kind of cache-coherence protocol does NFS implement?
A: Because multiple write operations can take place before cached data is flushed to the server, NFS clients implement a write-back cache.
8. Q: Does NFS implement entry consistency? What about release consistency?
A: Yes. Because share reservations and file locking are associated with specific files, and because a client is forced to revalidate a file when opening it and flush it back to the server when closing it, it can be argued that NFS implements entry consistency. Note that this scheme also comes close to eager release consistency, except that not all files need to be flushed, as would be the case with release consistency, which, by definition, works on all shared data.
9. Q: We stated that NFS implements the remote access model to file handling. It can be argued that it also supports the upload/download model. Explain why.
A: Because a server can delegate a file to a client, it can effectively support whole-file caching and putting that client in charge of further handling of the file. This model comes close to uploading a file to a client and downloading it later when when the client is finished.
10. Q: In NFS, attributes are cached using a write-through cache coherence policy. Is it necessary to forward all attributes changes immediately?
A: No. For example, when appending data to a file, the server does not really need to be informed immediately. Such information may possibly be passed on when the client flushes its cache to the server.
11. Q: To what extent will the duplicate-request cache as described in the text actually succeed in implementing at-most-once semantics?
A: Although the scheme will generally work, there are still numerous problems. For example, the duplicate-request cache is often implemented in volatile memory. Consequently, when the server crashes, the cache is lost, which may eventually lead to processing duplicate requests when the server later recovers. Another problem is that cache entries, in general, will eventually need to be removed to make space for new requests. To do this safely, the client should acknowledge previous replies, possibly by piggybacking ACKs on next requests.
12. Q: In NFS, what kind of security measures can be taken to prevent a malicious client from reclaiming a lock it never was granted when a server enters its grace period?
50 PROBLEM SOLUTIONS FOR CHAPTER 10
A: One simple solution is to pass an attribute certificate to a client when it first claims a lock. That certificate should be shown again when reclaiming the lock later.
13. Q: Fig. 10-21 suggests that clients have complete transparent access to Vice. To what extent is this indeed true?
A: All issues concerning file access and caching are handled by Venus, so in that sense, user processes can indeed remain unaware of the organization of servers and the placement of volumes. However, when disconnected, it may be impossible to achieve failure transparency if it turns out that a file is not in the cache, or that a conflict needs to be manually resolved.
14. Q: Using RPC2’s side effects is convenient for continuous data streams. Give another example in which it makes sense to use an application-specific protocol next to RPC.
A: File transfer. Instead of using a pure RPC mechanism, it may be more efficient to transfer very large files using a protocol such as FTP.
15. Q: If a physical volume in Coda is moved from server A to server B, which changes are needed in the volume replication database and volume location database?
A: The volume location database needs to be updated with the new location.
No other changes are needed.
16. Q: What calling semantics does RPC2 provide in the presence of failures?
A: Considering that the client will be reported an error when an invocation does not complete, RPC2 provides at-most-once semantics.
17. Q: Explain how Coda solves read-write conflicts on a file that is shared between multiple readers and only a single writer.
A: The problem is solved by ‘‘defining it away.’’ The semantics of transactions underlying file sharing in Coda, permit treating all readers as accessing the shared file before the writer opened it. Note that read-write conflicts within a specific time interval cannot be solved in this way.
18. Q: In Coda, is it necessary to encrypt the clear token that Alice receives from the AS when she logs in? If so, where does encryption take place?
A: Yes, otherwise the session key in the clear token can be seen by any other process. Encryption takes place as part of the secure RPC. As is also shown in
Fig. 10-32, after mutual authentication between Alice and the AS has taken place, the AS generates a session key as part of secure RPC that is used to encrypt the clear token when returning it to Alice.
19. Q: If a file on a Vice server needs its own specific access control list, how can this be achieved?
PROBLEM SOLUTIONS FOR CHAPTER 10 51
A: A solution is to create a separate directory containing only that file, and to associate the required access control list with that directory. If necessary, a symbolic link to the file can be created in the directory where that file logically belongs. 20. Q: Does it make sense for a Vice server to offer only WRITE permissions on a file? (Hint: consider what happens when a client opens a file.)
A: No. The point is that in Coda, whenever a client opens a file, it receives a copy of that file to cache. This effectively already gives it READ permission. Only a Venus process may possibly enforce write-only permission, but in that case, Vice servers will need to trust clients.
21. Q: Can union directories in Plan 9 replace the PATH variable in UNIX systems?
A: Yes. By mounting various file systems in a specific order at the same mount point, and subsequently looking only for executables at that mount point, there is no need to look in any other subdirectory.
22. Q: Can Plan 9 be efficiently used as wide-area distributed system?
A: Without modifications, probably not. An important obstacle is keeping files at a server and using a write-through caching policy. In effect, all update operations need to be passed between a client and the server, which introduces a considerable latency in the case of wide-area networks.
23. Q: What is the main advantage of using stripe groups in xFS compared to the approach in which a segment is fragmented across all storage servers?
A: The main advantage is scalability. In a system with many servers, writing a segment requires that all servers are contacted. Using stripe groups, it becomes possible to limit the size of a group and in this way also limits the amount of network traffic and servers that need to be contacted.
24. Q: Using self-certifying path names, is a client always ensured it is communicating with a nonmalicious server?
A: No. SFS does not solve naming problems. In essence, a client will have to trust that the server named in the path can actually be trusted. In other words, a client will have to put its trust in the name and name resolution process. It may very well be the case that a malicious SFS server is spoofing another server by using its IP address and passing the other server’s public key.
SOLUTIONS TO CHAPTER 11 PROBLEMS
1. Q: To what extent is e-mail part of the Web’s document model?
A: E-mail is not part of the document model underlying the Web, but rather a separate system that has been integrated with hypertext documents by means of client-side software only. For example, most Web browsers provide additional support for handling e-mail, but the actual e-mail messages are not related to hypertext documents in any way.
2. Q: The Web uses a file-based approach to documents by which a client first fetches a file before it is opened and displayed. What is the consequence of this approach for multimedia files?
A: One of the problems is that in many cases such files are first fetched and stored locally before that can be opened. What is needed, however, is to keep the file at the server and set up a data stream to the client. This approach is supported in the Web, but requires special solutions at both the client and the server.
3. Q: Why do persistent connections generally improve performance compared to nonpersistent connections?
A: The real gain is in avoiding connection setup, which requires a 3-way handshake in the case of TCP.
4. Q: Explain the difference between a plug-in, an applet, a servlet, and a CGI program.
A: A plug-in is a piece of code that can be dynamically loaded from a browser’s local file system. In contrast, an applet is dynamically downloaded from the server to the browser, for which reason security is generally more important. Note that many plug-ins can be dynamically downloaded as well, but generally not without manual interference by the user. A servlet is comparable to a plug-in, but is entirely handled at the server side. A CGI program is a piece of code that runs in a separate process on the same machine as the server.
5. Q: Outline the design of a general-purpose URN-to-URL naming service, that is, a service that resolves a URN to the URL of a copy of a document named by the URN.
A: Such a service has already been discussed in Chap. 4, namely a worldwide hierarchical location service. Instead of using object identifiers as input, the service accepts a URN and looks up the URL of the ‘‘nearest’’ copy of the named document.
6. Q: In WebDAV, is it sufficient for a client to show only the lock token to the server in order to obtain write permissions?
A: No, the client should also identify itself as the rightful owner of the token. For this reason, WebDAV not only hands over a token to a client, but also registers which client has the token.
7. Q: Instead of letting a Web proxy compute an expiration time for a document, a server could do this instead. What would be the benefit of such an approach?
A: An important benefit would be that the expiration time is consistent in a hierarchy of caches. For example, if a browser cache computes a longer expiration time than its associated Web proxy, this would mean that one user would get to see a possibly stale document, while other users that access the same proxy are returned a fresher version.
8. Q: Does the Akamai CDN follow a pull-based or push-based distribution protocol?
A: Because replicas are fetched on demand after a client has been redirected to a CDN server, Akamai is seen to follow a pull-based protocol.
9. Q: Outline a simple scheme by which an Akamai CDN server can find out that a cached embedded document is stale without checking the document’s validity at the original server.
A: A simple approach followed by Akamai is to include a hash value of the embedded document in its modified URL. It is important to realize that any client will always retrieve all modified URLs when fetching the main document. Subsequent lookups will eventually lead to a CDN server, which can then conclude that a cached document is no longer valid by comparing hash values in the modified URL of the requested and the cached document, respectively.
10. Q: Would it make sense to associate a replication strategy with each Web document separately, as opposed to using one or only a few global strategies?
A: Probably, considering the wide variety of usage patterns for Web documents. Many documents such as personal home pages, are hardly ever updated, and if so, they are updated by only a single person, making these documents suitable for caching and replication. Dynamically generated documents containing timely information require a different strategy, especially if they need to be replicated for performance. Note that the current Web can hardly differentiate such documents.
11. Q: URLs in Notes contain an identifier of a document. To what extent is this an improvement compared to the file names used in the Web?
A: The main improvement is that even if a document changes its (local) name, this change will not make its URL invalid as would be the case in the Web. The only requirement is that the database name remains the same. Note in this respect, that most URLs in the Web become invalid not because a document is moved to another server, but only because their local name is changed. 12. Q: Does the URL scheme in Notes imply that operations are uniquely named worldwide?
A: In fact, yes. Because operations are accessed only at a particular database, which is accessed through a globally unique identifier, one could argue that operations are also uniquely named.
13. Q: Consider a cluster of Domino servers and suppose a Notes client contacts a busy Domino server in this cluster for the first time. Explain what happens.
A: Because the client contacts the server for the first time, we can assume it knows nothing about the cluster the server is part of. Consequently, it cannot contact another server and the client’s request will have to be deferred until a later time.
14. Q: In the text, we described a conflict between two modified Notes documents. It is also possible that a conflict arises between a deleted copy and a modified one. Explain how this conflict arises and how it can be resolved.
A: The conflict is easily explained by considering deletion as a special modify operation, leaving only a deletion stub behind. If the other copy is subsequently modified more than once, one can argue that this ‘‘undoes’’ the deletion. However, if there is exactly one modification opposed to the deletion, it is harder to decide which operation should prevail. In this case, Notes follows the policy of removing the document.
15. Q: To what extent can write-write conflicts occur in the case of cluster replication in Notes?
A: Cluster replication differs only from the general replication scheme in the frequency by which updates are propagated. Conflict detection and resolution is exactly the same. However, because the time during which replicas are not synchronized is much smaller, there will generally be much fewer conflicts to resolve compared to general replication.
16. Q: Give an alternat ve solution to using cross-certificates in Notes.
A: Instead of using cro s-certificates, two organizations could use a trusted tird party that hands out certificates. In other word , they could make use a certification authority.
17. Q: Executing downloaded code in Notes is often based on trust. Is this safe?
A: In principle, no. However, Notes provides strong authentication and access control to shared databases. In other words, it is not as open as, for example, the Web. In this sense, it is generally safer provided trust can be put into proper system administration.
SOLUTIONS TO CHAPTER 12 PROBLEMS
1. Q: What type of coordination model would you classify the message-queuing systems discussed in Chap. 2?
A: Considering that in message-queuing systems processes are temporally uncoupled, but will otherwise have to agree on message format and destination queues, they classify as mailbox coordination models.
2. Q: Referential uncoupling in TIB/Rendezvous is supported by subject-based addressing. What else contributes to referential uncoupling in this system?
A: The fact that messages are self describing. In that way, there is no need for processes to agree in advance to the format of the messages they exchange. Instead, a receiving process can dynamically determine what an incoming message actually contains. Of course, semantical issues are still left to the application.
3. Q: Outline an implementation of a publish/subscribe system based on a message-queuing system like that of IBM MQSeries.
A: Such an implementation can be accomplished by using a message broker. All publish/subscribe messages are published to the broker. Subscribers pass their subscriptions to the broker as well, which will then take care to forward published messages to the appropriate subscribers. Note that the broker makes use of the underlying message-queuing network.
4. Q: To what is a subject name in TIB/Rendezvous actually resolved, and how does name resolution take place?
A: A name is resolved to the current group of subscribers. Name resolution takes place by properly routing message to those subscribers. In TIB/Rendezvous, routing takes place through a combination of multicasting and filtering messages at rendezvous and router daemons.
5. Q: Outline a simple implementation for totally-ordered message delivery in a TIB/Rendezvous system.
A: Use a sequencer to which all messages are sent. Subscribers pass subscriptions to this sequencer. The FIFO-ordered message delivery of the TIB/Rendezvous system will then guarantee that all messages multicast by the sequencer are delivered to every subscriber in the same order.
6. Q: When a message is delivered to a process P as part of a TIB/Rendezvous transaction, P confirms that delivery. Is it possible for the transaction layer at P to automatically send a confirmation to the transaction daemon?
A: Yes, it could. When a message is delivered to P as part of a transaction, what happens is that the message is removed from an event queue local to P.
56 PROBLEM SOLUTIONS FOR CHAPTER 12 Removal of a message can be interpreted by the transaction layer as delivery, so that it can automatically send a confirmation to the transaction daemon.
7. Q: Assume a process is replicated in a TIB/Rendezvous system. Give two solutions to avoid so that messages from this replicated process are not published more than once.
A: A first solution is to attach a message identifier to each published message, and to let subscribers discard duplicates by taking a look at the identifiers. The main drawback of this approach is the waste of network bandwidth. Another solution is to assign a coordinator among the replicas, and let only the coordinator actually publish messages. This solution is similar to assigning a coordinator in the case of invocations with replicated distributed objects.
8. Q: To what extent do we need totally-ordered multicasting when processes are replicated in a TIB/Rendezvous system?
A: Assuming that duplicate messages are either detected by subscribers, or avoided altogether, total ordering is not an issue at all. In this case, the FIFOordering delivery semantics are sufficient to let subscribers process the messages in the order they were published by the replicated process.
9. Q: Describe a simple scheme for PGM that will allow receivers to detect missing messages, even the last one in a series.
A: A simple scheme is to use sequence numbers. Whenever a sender has no more data to send, it should announce by multicasting a special control message. If that control message is lost, a receiver will start complaining in the usual way. The sender can then simply retransmit the control message. In essence, this is also the solution adopted in PGM.
10. Q: Can ledgers in TIB/Rendezvous that are implemented as files guarantee that messages are never lost, even in the presence of crashing processes?
A: No. If a ledger is not flushed to disk before the sending process crashes, messages may be lost that later require retransmission by a receiver. The idea of certified message delivery is to increase the reliability of message delivery. By no means is it a solution to 100 percent guaranteed delivery.
11. Q: How could a coordination model based on generative communication be implemented in TIB/Rendezvous?
A: This is actually not very difficult. What makes TIB/Rendezvous different from, for example, the JavaSpaces in Jini, is that processes are still temporally coupled. If published messages are temporarily stored, it would be possible for subscribers to read them even when the publisher of messages no longer exists. What is needed is for each subscriber to record a published message that it has already read, so that receiving duplicates can be avoided.
12. Q: Apart from the temporal uncoupling in Jini, in your opinion, what is the most important difference between Jini and TIB/Rendezvous when considering coordination models?
A: An important difference is the fact that TIB/Rendezvous uses characterstring naming to match publishers and subscribers, whereas JavaSpaces uses marshaled object references. Another important difference is that JavaSpaces allows tuple instances to be removed upon reading, thus providing a means to synchronize processes. A comparable facility is not available in TIB/Rendezvous.
13. Q: A lease period in Jini is always specified as a duration and not as an absolute time at which the lease expires. Why is this done?
A: Especially in wide-area systems, it may happen that clocks on different machines give very different times. In that case, specifying the expiration of a lease as an absolute time is simply too inaccurate as the holder of the lease may have a very different idea when the lease expires than the processes that handed out the lease. With durations, this difference becomes less an issue, provided some guarantees can be given that the transmission time of a lease is relatively low.
14. Q: What are the most important scalability problems in Jini?
A: One important problem is related to the fact the Jini uses a multicast protocol to locate lookup services. In a wide-area system, this protocol will have to be replaced by something different if Jini is to scale to large numbers of users and processes. A second problem is related to matching templates to tuples in JavaSpaces. Again, special measures will be needed to search a JavaSpace that may be potentially distributed across a large-scale network. No efficient solutions to these problems are yet known.
15. Q: Consider a distributed implementation of a JavaSpace in which tuples are replicated across several machines. Give a protocol to delete a tuple such that race conditions are avoided when two processes try to delete the same tuple.
A: Use a two-phase commit protocol. In phase one, the process doing the delete sends a message to all the JavaSpace servers holding the tuple asking them to lock the tuple. When all of them are locked, the delete is sent. If a second delete happens simultaneously, it can happen that some servers have locked one tuple and some have locked the other. If a server cannot grant a request because the tuple is already locked, it sends back a negative acknowledgement, which causes the initiator to abort the delete, unlock all the tuples it has locked, wait a random time, and try again later.
16. Q: Suppose a transaction T in Jini requires a lock on an object that is currently locked by another transaction T ′. Explain what happens.
A: Transaction T will continue to block until the lock is either released or until the lease on the transaction expires. In the latter case, transaction T is simply aborted.
17. Q: Suppose a Jini client caches the tuple it obtained from a JavaSpace so that it can avoid having to go to the JavaSpace the next time. Does this caching make any sense?
A: Caching is senseless because the tuple will have been removed from the JavaSpace when it was returned; it is ready for the client to keep. The main idea behind caching is to keep data local to avoid another server access. In the case of Jini, a JavaSpace is often used to explicitly synchronize processes. Caching does not play a role when process synchronization is explicitly needed.
18. Q: Answer the previous question, but now for the case that a client caches the results returned by a lookup service.
A: This is a completely different situation. The lookup service stores information on the whereabouts of services. In this case, it may indeed make sense for a client to cache previously returned results and try to contact the returned services before going to the lookup service again.
19. Q: Outline a simple implementation of a fault-tolerant JavaSpace.
A: The simplest approach is to implement a JavaSpace on a single server with stable storage. Write operations succeed only if the tuple has been safely written to storage. In a more advanced setting, a distributed JavaSpace can be used in which a server group is used to mask process failures. In that case, the servers may need to follow a two-phase commit protocol for each write operation.
1.分布式系统定义:分布式系统是若干独立计算机的集合,它们对于用户来说就像一个系统。分布式系统屏蔽系统中种类各异的计算机和网络,常常通过一个软件层(中间件)组织起来。
2. 分布式系统目标:让用户连接到资源(共享资源:降低经济成本,方便协作和信息交换:互联网、群件、电子商务)、透明性、开放性、可扩展性
分布式系统的重要目标之一是透明性,即将它的进程和资源实际上分布在多台计算机上这一事实隐藏起来。
透明性 描述
访问 隐藏数据表示形式以及访问方式的不同
位置 隐藏数据所在位置
迁移 隐藏资源是否已移动到另一个位置
重定位 隐藏资源是否在使用中已移动到另一个位置
复制 隐藏资源是否已被复制
并发 隐藏资源是否由若干相互竞争的用户共享
故障 隐藏资源的故障和恢复
持久性 隐藏资源(软件)位于内存里或在磁盘上 • 开放性定义:根据一系列准则来提供服务,这些准则描述了所提供服务的语法和语义 • 分布式系统中,服务通常通过接口指定,接口定义了可用函数的名称、参数类型、返回值以及可能出现的异常,良好的接口规范说明应具有: – 完整性 – 中立性 • 互操作性:不同厂商组件的共存和协同工作程度 • 可移植性 • 灵活性:方便的组合不同组件,添加、替换组件 – 灵活性的关键:策略与机制分离
分布式系统的可扩展性
• 规模上的扩展:更多的用户和资源 • 地域上的扩展:用户和资源相隔更远 难以扩充为局域网设计的分布式系统的原因: • 局域网的分布式系统是基于同步通信的,难以适用于广域系统 • 局域网提供高度可靠的基于广播的通信方式,而广域网的通信本质上是不可靠的,而且是点对点的;服务定位问题 • 存在集中式组件产生的性能和可靠性问题 • 管理上的扩展:跨越多个管理机构
3. 分布式系统硬件:多CPU计算机系统: • 根据是否共享存储器 – 多处理器(multiprocessors)系统:共享存储器 – 多计算机系统(multicomputers) :不共享存储器 • 同构的:相同计算机,单一互联网络 • 异构的:不同计算机,通过不同网络互连 • 根据网络互连体系结构 – 总线型(bus):使用一根主干线连接 – 交换型(switched):各机器之间用独立线路相连
同构多计算机系统:需要解决CPU之间的通信问题,信息量较少
• 基于总线 • 基于交换 a) 网状拓扑 b) 超立方体拓扑 c) MPP(massively parallel processors),COW (clusters of workstations):互联网络与容错性
异构多计算机系统:
• 计算机差异:处理器类型、存储器大小以及I/O带宽等 • 系统中的互联网络也可以是高度异构的 • 实例:校园网 • 没有整体的系统视图:应用程序不能假定在系统各处都提供相同的性能和服务 • 分布式系统的用武之地
4. 分布式系统软件: ➢ 分布式操作系统管理的是紧耦合计算机系统,包括多处理器系统和同构多计算机系统 ➢ 网络操作系统可以将不同操作系统的计算机连接起来,但不能提供单一的系统视图 ➢ 现代分布式系统通过中间件来隐藏底层计算机集合的异构性和分布性
|系统 |描述 |主要目标 |
|DOS |紧耦合的操作系统,用于多处理器系统和同构式多计算机系统,以|隐藏及管理硬件资源 |
| |一种简单的全局视图管理资源 | |
|NOS |松耦合的操作系统,用于异构式多计算机系统 |为远程客户提供本地服务 |
| |(LAN 和 WAN),一组运行各自操作系统的计算机协同 | |
|中间件 |NOS 通用服务实现层之上的附加层 |提供分布式透明性 |
• 分布式操作系统:DOS (Distributed Operating Systems) • 网络操作系统:NOS (Network Operating Systems) • Middleware(中间件):对NOS的改进,提高分布透明性
单处理器操作系统:
• 管理单CPU的计算机 • 内核模式与用户模式 • 通过微内核分隔应用程序与操作系统代码
多处理器操作系统:目标是通过多CPU支持高性能
• 数据由多个处理器访问,必须确保数据的一致性 • 信号量(semaphore)和管程(monitor) • 用于保护整数免受并发操作的管程,它将阻塞某个进程 • 具有更高的复杂性:不存在共享的存储器,使用消息通信 • 多计算机操作系统的常见结构
分布式共享内存系统(DSM) :人们试图在多计算机系统上模拟共享存储器 • 提高性能的方法:复制 • 页的大小
|项目 |分布式操作系统 |网络操作系统 |基于中间件的分布式系统 |
| |多处理器系统 |多计算机系统 | | |
|透明度 |很高 |高 |低 |高 |
|所有的节点使用的操作系统是否 |是 |是 |否 |否 |
|相同 | | | | |
|操作系统拷贝数目 |1 |N |N |N |
|通信基于的实体 |共享内存 |消息 |文件 |特定模型 |
|资源管理 |全局,集中管理 |全局,分布管理 |各节点自行管理 |各节点自行管理 |
|可扩展性 |否 |部分 |是 |各系统不同 |
|开放性 |封闭的 |封闭的 |开放的 |开放的 |
• 中间件:以中间件形式组织的分布式系统的一般结构 • 对应用程序隐藏底层平台的异构性 • 中间件模型: – 将所有东西都看作文件 – 分布式文件系统 – 基于RPC – 分布式对象 – 分布式文档 • 中间件服务: – 访问透明性 – 命名服务 – 分布式事务 – 安全功能 • 在一个基于中间件的开放分布式系统中,各中间件层所使用的协议及向应用程序提供的接口必须相同 • 客户端-服务器模型: • 服务器(server):实现某个特定服务的进程 • 客户(client):向服务器请求服务的进程 • 客户端-服务器之间的一般交互 • 小结:分布式系统的定义、分布式系统的目标、分布式系统的硬件、分布式系统的软件 – 分布式操作系统管理的是紧耦合计算机系统,包括多处理器系统和同构多计算机系统 – 网络操作系统可以将不同操作系统的计算机连接起来,但不能提供单一的系统视图 – 现代分布式系统通过中间件来隐藏底层计算机集合的异构性和分布性 客户-服务器模型 9. 某多计算机系统中的256个CPU组成了一个16×16的网格方阵。在最坏的情况下,消息的延迟时间有多长(以跳(hop)的形式给出,跳是结点之间的逻辑距离)? 10. 现在考虑包含256个CPU的超立方体,最坏情况下消息的延迟有多长(同样以跳的形式给出)? 16. 由于存在错误,某个试验性文件服务器有3/4的时间能够正常工作,而另外1/4的时间无法工作。如果要确保服务至少在99%的时间可用,需要将该文件服务器复制多少次? 17. 考虑一个进程链,该进程由进程P1, P2,… Pn构成,实现了一个多层客户-服务器体系结构。进程Pi 是Pi+1的客户, Pi 只有得到Pi+1的应答之后才能向Pi-1发出应答。如果考虑到进程P1的请求-应答性能,这种组织结构主要存在什么问题?
第二章 通信 大纲: 2.1 多层协议 2.2 远程过程调用 2.3 远程对象调用 2.4 面向消息的通信 2.5 面向流的通信 进程间的通信是一切分布式系统的基础,它基于底层网络提供的底层消息传递机制 • 分层协议 OSI 模型中的层、接口和协议 必须在不同层次制订多种协议,包括从位传输的底层细节到信息表示的高层细节: ➢ 0,1的电压表示 ➢ 消息的结束位 ➢ 检测消息的丢失或损坏及其处理 ➢ 数值、字符串及其它数据项的长度和表示方法 ➢ 面向连接的协议:电话 ➢ 无连接的协议:邮箱 • 远程过程调用RPC:Remote Procedure Call RPC是分布式系统通信处理的事实标准 ➢ 常规过程调用 ➢ 客户存根和服务器存根 ➢ 参数传递 远程过程调用步骤: 1. 客户过程以正常的方式调用客户存根 2. 客户存根生成一个消息,然后调用本地操作系统 3. 客户端操作系统将消息发送给远程操作系统 4. 远程操作系统将消息交给服务器存根 5. 服务器存根将参数提取出来,然后调用服务器 6. 服务器执行要求的操作,操作完成后将结果返回给服务器存根 7. 服务器存根将结果打包成一个消息,然后调用本地操作系统 8. 服务器操作系统将含有结果的消息发送回客户端操作系统 9. 客户端操作系统将消息交给客户存根 10. 客户存根将结果从消息中提取出来,返回给调用它的客户过程 • 远程对象调用 使用客户端代理的远程对象的一般组织结构 • 面向消息的通信 当远程过程调用和远程对象调用不适用时,需要面向消息的通信。 ➢ 消息中的持久性和同步性 ➢ 面向消息的暂时通信 ➢ 面向消息的持久通信 第三章大纲: 进程 3.1 线程、3.2 客户端 、3.3 服务器、3.4 代码迁移 、3.5 负载均衡、3.6 软件代理 进程是操作系统中极为重要的概念,通过这一概念的引入,可以更准确地把握OS对资源的管理功能和对用户的服务功能,同时更能从本质上了解OS中关于用户作业和程序的处理过程以及OS对死锁、同步、互斥、通信等内容的展开。 进程的概念:一、程序的并发执行 1、在单道程序系统中程序的执行特点: (1)顺序性:即不同程序之间是顺序执行的,一个程序的执行必须在他的前一个程序的执行完毕。 (2)封闭性:指用户程序在其运行期间所使用的系统是专用的,资源的状态只有该程序能改变,而不受外界因素的影响。 (3)无关性:指程序的运行结果与他的执行速度无关。 (4)再现性:如果一个程序在不同的时间执行,只要初始条件相同,该程序的执行轨迹和最终结果亦必相同。 2、多道程序系统中程序执行的特点: (1)异步性:每个程序均以各自独立的速度向前推进,没有时间上的规律性,程序的启动与执行时间是不确定的, 而且程序的执行一般是非连续的,且呈走走停停的状态。 (2)竞争性: (3)相互制约性 (4)与速度有关 a.举例 b.与时间相关的错误现象 (5)独立性 (6)随机性 (7)资源共享 3、程序的并发执行(P38) (1)什么是程序的并发执行? 并发:指在计算机系统中同时存在多个程序,宏观上看,这些程序是同时向前推进的,表现在以 下两个方面:第一、用户程序之间并发执行;第二、用户程序与OS程序之间并发执行。 并行:程序并行要求微观上的同时,即在绝对的同一时刻有多个程序同时向前推进,所以程序并行 只能在多处理机上才能实现。
两相邻语句S1,S2可以并发的条件(P39)
(3)程序的并发执行所带来的影响:由于资源共享和竞争,程序的执行结果将不可避免地失去封闭性和可再现性 二、进程的定义 1、进程:是一个具有独立功能的程序对某个数据集在处理机上的执行过程,是分配资源的基本单位。 2、进程与程序的区别: (1)进程是一个动态的概念,而程序则是一个静态的概念。程序是指令的有序集合,没有任何执行的 含义。而进程则强调执行过程,他动态地被创建,并被执行后消亡。 (2)进程具有并行特征,而程序没有。有进程的定义可知,进程具有特征的两个方面,即独立性和异 步性。也就是说,在不考虑资源共享的情况下,各进程的执行是独立的,执行速度是异步的。显 然,由于程序不反映执行过程,所以不具有并行特征。 (3)进程是竞争计算机系统资源的基本单位,从而其并行性受到系统自己的制约。这里,制约就是对进程独立性和异步性的限制。 (4)不同的进程可以包含同一个程序,只要该程序对应的数据集不同。 三、进程和作业的关系: (1)作业是用户向计算机提交任务的任务实体。而进程是完成用户任务的执行实体,是向系统申请分配资 源的基本单位。 (2)一个作业可由多个进程组成。 且必须至少有一个进程组成,但反过来不成立。 (3)作业的概念主要用在批处理系统中。像UNIX这样的分时系统中,则没有作业概念。而进程的概念则用在几乎所有的多道系统中。 进程的描述:从静态看,进程由三部分组成:进程控制块、有关程序段、数据结构集。 (1)PCB:包含了有关进程的描述信息、控制信息及资源信息,是进程动态特征的集中反映,OS根据PCB感知进程的存在和通过PCB中所含的各项变量的变化,掌握进程的状态已达到控制进程活动的目的。 (2)进程的程序部分:描述进程所要完成的功能。 (3)数据结构集:是程序执行时必不可少的工作区和操作对象。 一、进程控制块PCB (1)PCB集中反映一个进程的动态特征。 (2)创建进程时,就由操作系统为其建立PCB,然后根据PCB中的信息对进程进行控制。 (3)当一个进程完成时,系统释放PCB,进程也随之消亡。 1、PCB内容:OS不同,PCB内容也有差异,但所有PCB均有以下内容: (1)描述信息:a、进程名或进程标志号b、用户名或用户标志号c、家族关系 (2)控制信息:a、进程当前状态b、进程优先级c、程序开始地址d、各种计时信息e、通信信息 (3)资源管理信息:a、占用内存大小及其管理用数据结构指针 b、在某些复杂系统中,还有对换或覆盖用的有关信息,如对换程序长度,对换外存地址等。 c、共享程序段大小及起始地址。d、输入输出设备的设备号,所要传送的数据长度、缓冲区地址、缓冲区长度及所用设备的有关数据结构指针等。e、指向文件系统的指针及有关标志等。 (4)CPU现场保护结构, 当前进程因等待某个事件而进入等待状态或因某种事件的发生被中止在处理机上的执行时,为了以后该进程能在被打断处恢复执行,需要保护当前进程的CPU现场(或称进程上下文)。PCB中设有专门的CPU现场保护结构,以存储退出执行时的进程现场数据。 二、进程上下文(P44) 1、概念:是进程执行活动全过程的静态描述,包括计算机中与执行进程有关的各种寄存器的值,程序段 在经过编译后形成的机器指令代码集(正文段)、数据集及各种堆栈值及PCB结构。 2、进程上下文动态部分:指在进入和退出不同的上下文时,系统为各层上下文中相关联的寄存器值所保 存和恢复的记录。 3、进程上下文静态部分:包括PCB结构、将进程虚地址空间映射到物理空间用的有关表格和核心栈。 三、进程空间(P45) 1、进程空间:任一进程均有一个自己的地址空间,把该空间称为进程控间或虚空间。进程空间大小只与处理机位数有关。 2、进程空间的划分: 分为用户空间和系统空间。 (1)用户模式(用户态):用户程序的执行方式,只能在用户空间进行,不能访问系统空间。 (2)系统模式(系统态):系统程序执行方式,能对系统空间和用户空间访问。 进程状态及转换: 一、进程状态 1、执行态:进程获得了CPU,正在CPU上运行它的程序。在单CPU的系统中,任一时刻系统中只能有一个处于执行状态的进程。 2、就绪状态:进程获得了除CPU之外的一切当前需要的资源,准备占用CPU。一个进程一旦被建立就处于就绪状态。 3、等待状态:进程正在等待某一事件的发生或受到某种制约而暂时停止运行。 4、停止状态:进程运行终止,等待被系统撤销。 5、死锁状态:进程在无限地等待一件永远不会发生的事件,这是一种进程运行故障。 二、进程状态转换 1、转换图
2、图中各种转换的条件 (1)变换1:处于就绪状态的进程获得了CPU而变成执行态。 (2)变换2:正在运行的进程因等待某种原因(如时间片用完)而被迫放弃CPU。 (3)变换3:表示正在运行的进程因等待某时间的发生而处于等待。 (4)变换4:表示进程所等待的事件发生了,而变成就绪态。 (5)变换5:表示当前进程运行完毕,进入停止态,等待被撤销。 (6)变换6:表示若干进程无休止地等待陷入死锁。 (7)变换7:在解除死锁的有的方案中,可以将死锁态进程逐一恢复。 进程控制1、进程控制:系统使用一些具有特定功能的程序段(原语)来创建、撤销进程以及完成进程个状态间的转换,从而达到多进程高效率并发执行和协调、实现资源共享的目的。 2、原语:在系统态下执行的具有特定功能的程序段,原语分为指令级原语和功能级原语。 (1)指令级原语:执行期间不允许中断,是不可分割的单位。 (2)功能级原语:执行期间不允许并发的程序段。 原语的特点: a、具有独立的系统功能; b、在系统态运行; c、不允许中断或不允许并发。 一、进程创建与撤销 1、进程创建 (1)创建方式: a、由系统程序模块统一创建,各进程关系平等; b、由父进程创建 进程之间存在隶属关系,形成家族关系; 属于某家族的一个进程可以继承父进程的资源。 2、进程撤销: (1)引起进程撤销的原因: a、该进程已完成所要求的功能而正常终止; b、由于某种错误而导致非正常终止; c、祖先进程要求撤销某个子进程。 (2)撤销的任务: a、释放所占有的所有资源; b、释放相应的PCB或子进程相应的PCB。 (3)撤销原语流程图 二、进程的阻塞和唤醒 1、进程的阻塞(执行态 等待态) 处于执行状态的进程由于需等待外部事件的发生 而调用block原语将自己从执行态变为等待态。在实 现阻塞时,block应先中断处理机并保存CPU现场。 流程图(P49) 2、唤醒 当等待队列中的进程所等待的事件发生时,待事 件的所有进程都将被唤醒,即从等待态变为就绪态。 唤醒方法有: a、由系统进程唤醒 系统进程统一控制事件的发生 并将“事件发生”这一消息通知等待进程,从而使 进程从等待态变为就绪态; b、由事件发生进程唤醒 这种情况下,事件发生进 程和被唤醒进程之间是合作关系。 流程图(P49) 进程互斥:一、资源共享引起的制约 1、与时间相关的错误 由于兵法进程是随机发生的,如果参与并发的进程相互间直接或间接联系非常紧密,如共享变量或某些资源,可能产生与进程推进速度有关的错误,也称与时间相关的错误。 例子(略) 2、进程互斥:指两个或两个以上的进程不能同时进入关于同一组共享变量的临界区域,否则会发生与时间相关的错误。 互斥是进程间所发生的间接性相互排斥现象,根本原因在于不同进程对共享变量或资源的访问。 3、进程同步:指一组相互合作的并发进程,为协同起推进速度,需要相互等待与相互唤醒,进程间的这种直接相互作用叫同步。 4、共享资源:指两个或两个以上的进程要访问使用的资源,包括硬件、软件资源,如打印机、数据等。 5、临界资源:并发进程可以共享系统中的各种资源,但其中某些资源一次只允许一个进程访问,这样的资源叫临界资源,如打印机、磁带等。 6、临界区域:进程中涉及对临界资源访问的那一部分操作即程序段叫做关于该临界资源的临界区域。 7、间接制约:由于共享某一共有资源而引起的在临界区内不允许并发进程交叉执行的现象称为由共享资源而造成的对并发进程执行速度的间接制约,简称间接制约。 8、互斥概念的含义: (1)不允许多个进程同时进入关于同一组共享变量的相同的临界区。 (2)不允许多个进程同时进入关于同一组共享变量的不同的临界区。 9、解决临界区互斥问题应遵循的准则: (1)有空让进 (2)无空等待 (3)多中择一 (4)有限等待 (5)让权等待 10、实现互斥的基本要求: (1)互斥性要求:任一时刻最多只能有一个进程处于 关于同一组共享变量的临界区域中; (2)公平性要求:不能让一个进程无限地等待进入关 于任一组共享资源的临界区。 二、互斥的加锁实现(P52) 1、加锁原语LOCK与开锁原语UNLOCK 思想:(1)设变量X,它取值为0或1,把X作为一把所配置给临界资源,若X=0,表示资源未加锁,X=1,表示资源已加锁。 (2)互斥的进程在进入临界资源前先对X进行测试,若X=0表示进程可以进入临界区,进入后立即加锁,不允许其他进程进入;当该进程退出临界区时应立即开锁,以便其他进程进入。 (3)加锁原语LOCK定义为:a、检查x的值; b、若x=0,则表示临界资源可用,允许一个进程进入临界区,并立即加锁:x=1; c、若x=1,表示资源已被占用,返回第一步继续测试。 2、例:设进程A负责分配打印机,进程B释放打印机,
一、线程的概念: (一)概念:线程是指一个进程内的基本调度单位。也称轻权进程,这个调度单位是由OS内核控制的,也可由用户程序控制。 (二)引入线程的目的: a、提高系统的执行效率; b、减少CPU的空转时间和调度切换时间,便于系统管理 (三)进程与线程的区别:a、进程是资源分配的基本单位,所有与进程有关的资源均被记录在进程控制块PCB中,以表示该进程拥有这些资源或正在使用它们。 b、进程是抢占CPU的调度单位,它拥有一个完整的虚拟地址空间。 c、与进程相对应,线程与资源分配无关,它属于某一个进程,并与进程中的其他线程共享资源。 d、当进程发生调度时,不同的进程拥有不同的虚拟地址空间,而同一进程内的不同线程共享同一地址空间。 e、线程由相关堆栈寄存器和线程控制表TCB组成。寄存器可被用来存储线程内的局部变量,但不能存储其他线程的相关变量。 总结:发生进程调度与发生线程调度时相比较,进程切换时将涉及到有关资源指针的保存以及地址空间的变化等问题,线程切换时,由于同一进程内的线程共享资源和地址空间,将不涉及资源信息的保存和地址变化问题,从而减少了OS的开销时间,而且进程的调度与切换是由OS内核完成的,而线程的调度可由OS内核完成,也可由用户程序进行。 二、线程的适用范围 1、适用范围: (1)多处理机系统中,同一用户程序可以根据不同的功能划分为不同的线程,放在不同的上执行。 (2)在用户程序可以按功能划分为不同的小段时,单处理机系统也可因使用线程而简化程序的结构和提高执行效率。 2、线程的几种典型应用: (1)服务器中的文件管理与通信控制 (2)前后台处理: (3)异步处理 (4)数据的批处理及网络中的信息发送与接收和其他相关处理等。 三、线程的执行特性:——线程状态及同步 1、线程的三个基本状态: 执行态 就绪态 阻塞态 2、线程状态变迁: (1)派生(spawn):线程在进程内派生出来,它既可由进程派生,也可由线程派生。新派生出来的线程放入就绪队列。 (2)阻塞(Block):如果一个线程在执行过程中需要等待某个事件发生,则被阻塞,阻塞时,寄存器上下文、程序计数器以及堆栈指针都会得到保存。 (3)激活:如果阻塞线程的事件发生,则该线程被激活并进入就绪队列。 (4)调度:选择一个就绪线程进入执行状态。 (5)结束:如果一个线程执行结束,它的寄存器上下文及堆栈内容等将被释放。 3、线程的同步特性: 由于同一进程中的所有线程共享该进程的所有资源和地址空间,任何线程对资源的操作均会对其它相关线程产生影响,因此,系统必须为线程的执行提供同步控制机制,以防止因线程的执行而破坏其它数据结构和给其他线程带来不利影响。 四、线程分类 分为用户级线程、系统级线程 • Process vs. Thread – A process is often defined as a program in execution. – A thread is a sequence of instructions that can be executed in parallel with other threads. – Threads are like lightweight processes. – Multiple threads within a single process share the state information, memory and other resources directly, but they are able to execute independently – Threads collaborate with each other -- processes compete. – …… • Why Multithreading? – Allows to perform more than one task at a time. – Increases the performance of programs. • When Multithreading? – A multiprocessor computer works by using the multiple processors to handle threads simultaneously (multiprocessing). – A program that spends a great deal of time waiting for outside events (time slicing). • Problem 3 in chapter 3, page 180 – In the text, we described a multithreaded file server, showing why it is better than a single-threaded server and a finite-state machine server. Are there any circumstances in which a single-threaded server might be better? Give an example. • Problem 14 in chapter 3, page 181 – Is a server that maintains a TCP/IP connection to a client stateful or stateless?