DSC 340 Study Guide
Mick McQuaid
Spring 2014
Following is a study guide for DSC 340. It’s a living document to be updated by the instructor every week during the term based on readings, contributions from students, and updates in the rapidly changing world of business information systems.
1. PERSONAL INFORMATION SYSTEMS
Extensive research over the past twenty years, some of it by Tom Malone at MIT and Susan Dumais at Microsoft Research, has explored how people organize personal information.
One example that helps people understand the problem of personal information is the knife analogy, described below.
One important finding about personal information management has been that people are prone to do one or some combination of these two things: filing and piling.
After reviewing the following topics, conclude the study of personal information systems by doing the share best practices exercise.
1.1 Knives In the Home
Suppose you have just won a complete set of knives for your home. Where should you put them? You could have a single cabinet to store all knives, but it is more likely that you will distribute the knives to different rooms, placing them near where they will be used: steak knives in a buffet in the dining room, cooking knives in the kitchen, handyman knives in a garage workshop.
When you need a particular knife, it will be in the context of a current task, such as preparing food, setting a table, or cutting a length of rope for a clothesline. In each room, there is some place where the tools appropriate to the tasks performed in that room are stored.
1.2 Filing: Using Hierarchies to Organize Information
Filing refers to organizing items according to categories or classifications or clusters. (Researchers define these three words differently.) When a filer looks for information, it is found in a place where like information is found. That information may be in a nested structure containing more general information at the higher levels and more specific information at the lower levels.

For example, information relevant to your work as a student may be kept in files on a USB drive in a folder called schoolwork. Within that folder may be a separate folder for each course, as well as a separate folder for administrative documents not related to any given course. There might be a folder for each term containing a schedule for that term, grades for that term, and more. On the other hand, you may divide such folders differently: into group work and individual work. Or you may organize according to the types of files, with videos in one folder, music in another, and text documents in another.
1.3 Piling: Using Tags to Organize Information
Piling refers to dumping information where it is most convenient. The piler makes no effort to move information around. Instead, the piler usually uses tags of some kind to find information. The piler may create these tags or take advantage of existing tags.
For example, Mac users employ Spotlight, a local search engine, to find files using (mostly) words in the files. Those words are automatically indexed by Spotlight into a tagging system, especially while the computer is otherwise idle. As another example, the IMDB has a file containing keywords for each movie recorded, provided someone cares enough about that movie to type in keywords. You can search for a movie by entering any of these keywords.
What’s problematic about such a system? One issue is that a given user does not necessarily know what keywords are available to describe a given concept. The IMDB tries to overcome this by presenting a display of all the keywords that appear in movies that share the keyword being searched. How else could you try to overcome this limitation?
Another limitation of tags is that words have different senses so that searches for words like net and rock return results that may not be of interest depending on whether the search is for tennis or the web or music or geology. How can you try to overcome this limitation?
One way is to use context. For instance, your browser by default saves the most recent URL you visited as a referrer and makes it possible for the administrator of the next URL to identify it. If my referrer has the string wimbledon in it, am I more likely to be looking for tennis or the web?
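The idea of using the referrer as context can be sketched in a few lines of Python. This is only an illustration: the cue words, the SENSES table, and the guess_sense function are hypothetical names invented here, and a real search engine would rely on far richer signals.

```python
# Hypothetical cue words for each sense of an ambiguous search term.
SENSES = {
    "net": {
        "tennis": {"wimbledon", "tennis", "racket"},
        "web": {"internet", "browser", "webdev"},
    },
}

def guess_sense(query, referrer):
    """Return the sense of `query` whose cue words appear most in the referrer URL."""
    candidates = SENSES.get(query.lower())
    if not candidates:
        return None  # the term is not in our (tiny) table of ambiguous words
    text = referrer.lower()
    # Count how many cue words of each sense occur somewhere in the referrer.
    scores = {sense: sum(cue in text for cue in cues)
              for sense, cues in candidates.items()}
    best = max(scores, key=scores.get)
    # With no matching cue at all, the referrer gives no usable context.
    return best if scores[best] > 0 else None

sense = guess_sense("net", "https://www.wimbledon.com/en_GB/scores")
```

With a referrer containing the string wimbledon, the sketch settles on the tennis sense. With a referrer that contains none of the cue words, it returns None, which is the situation a tag-only search is always in.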
A typical business use of tags can be found in AdWords, the main way Google earns revenue. An adword is a tag associated with an advertisement. A business can pay to be advertised when an adword appears in the Google search window.
1.4 Exercise: Share Best Practices
1. Form an ad hoc group of four (four is an ideal number; three or five if you must).
2. Share a Google Doc between the four group members and the instructor.
3. Each member of the group writes three paragraphs describing their personal information process. The first paragraph describes what you do. The second paragraph describes what works well about it. The third paragraph describes what does not work well about it.
4. Discuss the resulting paragraphs.
5. Write two paragraphs as a group describing the strengths of individual members you all agree should be carried forward and the weaknesses of individual members you all agree you do not want to include in your personal information management process.
Some valuable readings can be found at the following URLs. These were all obtained by googling the expression pilers-vs-filers and appeared in the first three pages of results. These are replicated in the readings folder on Blackboard. The names are the Blackboard files, while the links are the Google result links.
01pimFilersVsPilersForbes.pdf
01pimFilersVsPilersEconomist.pdf
01pimJones2011.png
Note that each of these readings has a different form of credibility, to be discussed in a later section of the course. When you use a search engine to discover information about a topic, you must be sensitive to issues related to the search engine and to the information sources linked. Here, I will just give a brief blurb about each source. Forbes was a US magazine for decades before it created an online presence. It cultivates an image as a maverick business publication, espousing values favoring considerable social and economic freedom for individuals, and a blunt style, connecting business success to individuals rather than organizations.
The Economist is a conservative British weekly magazine. A policy advisor to US President Clinton once told me that The Economist is the most widely read publicly available weekly publication among presidents and prime ministers of nations worldwide.
The Google Books result is a page from a book available in Knight Library, called Personal Information Management, edited by William P Jones and Jaime Teevan. This book is a collection of chapters by scholars who write about business information systems. This book introduces the mainstream academic thinking about personal information management. It introduces what scholars who do studies about personal information management have concluded over the past thirty years of study.

2. INFORMATION
People studied information overload long before the Internet made it a tired phrase.
In psychology classes you may have seen videos of famous experiments such as passing the basketball. In this experiment, a person watches a group of people passing basketballs and is asked to count the number of total passes and keep them in mind. The person is told that accuracy is very important. The real purpose of the basketball passing is to overload the person with information. Meanwhile, a person wearing a gorilla costume walks through the room. After the exercise, the researchers ask the person if anything unusual happened. The person invariably replies that nothing unusual has happened. The researchers then show the person a video of the preceding few minutes, including the person in the gorilla costume.
Invariably the person is shocked. The person almost never accepts that someone in a gorilla costume walked in front of them until they see the video evidence.
This demonstrates the astounding power of information overload to shut down our perceptions of our environment.
Similar experiments abound. There is, for instance, one experiment where a researcher with a map in hand walks up to someone on a crowded London sidewalk and asks for directions. While the bystander tries to think of the best route, two uniformed workmen walk between the researcher and the bystander, carrying a large heavy mirror. Concealed behind the mirror is another researcher of a different race and sex but carrying an identical map in the same folded orientation. The original researcher moves along with the two workmen who speed past so that there is only a moment when the bystander can’t see the researcher. The new researcher, of a different race and sex, tries to carry on the same line of questioning as if nothing had happened. Video of the encounters suggests that the bystander usually does not realize the change because of the information overload resulting from trying to visualize the landmark under discussion and the route information.
Both these experiments and many similar ones are described online under information overload or cognitive overload or similar terms.
In addition to studying the effects of information, there are many scientists studying information itself and the characteristics that define information and distinguish one kind of information from another.
2.1 Unstructured Information
The term unstructured information may be a bit misleading. It typically refers to information that is indeed structured but whose content does not follow very strict rules. For example, a news article or a blog post would be considered unstructured information. An article usually has a title but not always. It does not follow a strict rule as to whether or not it is titled or subtitled. It may or may not have pictures. It may have varying numbers of paragraphs, tables, diagrams, comments from users, links to related articles, or other features. The key is that it is completely flexible about these things and the person who presents the article does not have to think about conveying the structure of the article to a computer program, only to a human reader. Human readers are much more forgiving of lapses in structure than are computer programs.
This lack of structure is really a matter of degree and is best understood by comparing it to the following terms.
2.2 Semistructured Information
Semistructured information refers to labeled information such as is found in forms filled out by people. When you fill out a form, each place where you can insert information contains a label telling you what kind of information belongs there.
There are some rules but it is often quite easy to break rules for completing forms.
Some forms enforce rules by not allowing you to type in any information that violates the rules. A good example of this can be found in income tax forms online.
Yet most online forms allow the person filling them in some flexibility and may contain instructions that can be disobeyed. For instance, you may be transgender and asked to fill in M or F in a box marked gender. You may be able to enter a T or leave the box blank, depending on how much time and money was spent on developing the form. If you do enter something unexpected, the person or program processing the form has to decide how to handle it.
It is a hallmark of semistructured information that some human intervention is required to process it because some entered information cannot be anticipated.
2.3 Structured Information
Structured information obeys strict rules and can be processed in extremely large volumes at high speeds and can be aggregated easily to determine, for instance, how many green shirts in size L were ordered on game days in the 2013 season.
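A small Python sketch shows why strict rules make aggregation easy. The order records, the field names, and the count_ordered function are all made up for illustration; the point is only that every record has the same fields with the same meaning, so a question like the green shirt example reduces to a filter and a sum.

```python
# Hypothetical order records. Every record obeys the same strict layout.
orders = [
    {"item": "shirt", "color": "green", "size": "L", "game_day": True, "qty": 2},
    {"item": "shirt", "color": "green", "size": "L", "game_day": False, "qty": 1},
    {"item": "shirt", "color": "green", "size": "M", "game_day": True, "qty": 4},
    {"item": "shirt", "color": "green", "size": "L", "game_day": True, "qty": 3},
    {"item": "hat", "color": "green", "size": "L", "game_day": True, "qty": 5},
]

def count_ordered(records, **criteria):
    """Total quantity over the records that match every field in `criteria`."""
    return sum(
        r["qty"]
        for r in records
        if all(r[field] == wanted for field, wanted in criteria.items())
    )

green_large_game_day = count_ordered(
    orders, item="shirt", color="green", size="L", game_day=True
)
```

Here green_large_game_day comes to 5, the sum of the two matching records. The same function would work unchanged on millions of records because no human judgment is needed to interpret any field.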
Structured information is often passed from one computer program to another. Systems that process structured information use various techniques to diminish the effects of human error, including bar code readers, credit card readers, QR code readers, NFC readers and similar devices to obtain information. When human input is needed, it is often restricted. For instance, when a fast-food cashier takes an order, they often press a touchscreen area with a picture of the item being ordered rather than trying to type a name or a price, both of which are supplied by a computer program reading the touchscreen.
Most structured information in business is presented in one of two main ways, as relations or as hierarchies.
2.4 Relational Data
By far, the most prevalent form of data in business today is relational data, stored in database products such as Oracle, SQL Server, and MySQL.

Relational data is presented in tables consisting of rows and columns. The rows refer to entities and the columns refer to attributes of the entities. An example of an entity is a customer. An example of an attribute is a zip code for that customer.
A key characteristic of relational data is that the rows and columns of one table are usually linked to the rows and columns of many other tables. In order to speed processing of relational data, a given table should be long (many rows) and thin (few columns). So, rather than have a table that describes a customer, the information about a customer may be spread over many tables, each with only a few columns.
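The long and thin idea can be tried out with SQLite, which ships with Python. The table names, column names, and sample rows below are assumptions chosen for illustration; they are not taken from any particular business system. Customer information is spread over three narrow tables that are linked by a shared customer_id column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Three thin tables instead of one wide customer table.
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE address  (customer_id INTEGER REFERENCES customer(customer_id),
                           zip_code TEXT);
    CREATE TABLE purchase (customer_id INTEGER REFERENCES customer(customer_id),
                           amount REAL);
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO address  VALUES (1, '97403'), (2, '02139');
    INSERT INTO purchase VALUES (1, 20.0), (1, 5.0), (2, 12.5);
""")

# A join follows the links between tables to reassemble one view of each customer.
rows = conn.execute("""
    SELECT c.name, a.zip_code, SUM(p.amount)
    FROM customer c
    JOIN address a  ON a.customer_id = c.customer_id
    JOIN purchase p ON p.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
```

Each row in the result combines an entity (a customer) with attributes that live in different tables (zip code, total purchases).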
2.5 Hierarchical Data
The second most frequent way to present business data is in a hierarchical format.
XML and JSON are among the most common hierarchical formats in use in business.
To illustrate what the term hierarchical means, consider a waybill as an example.
One international waybill used by FedEx has four main headings and some required subheadings and optional subheadings. For instance, there can be an intermediate consignee in addition to an ultimate consignee. The description of commodities to be shipped has a number of subheadings, not all of which are applicable to every kind of commodity. Hence, for each commodity there is a choice of entering number or unit in addition to quantity. Each commodity has a number, a description, a weight, and a value.
So an example hierarchy might look like this:
Sender
    Sender Name
    Sender Address
    Sender Account Number
Recipient
    Recipient Name
    Recipient Address
Commodities
    Commodity 1
        Quantity
        Unit / Number
        Weight
        Value
    Commodity 2
        Quantity
        Unit / Number
        Weight
        Value
Authorization
A key characteristic of this hierarchical form is that it can present exactly the same data as in the relational form above. In other words, the above hierarchy could be converted to tables of rows and columns. Many computer programs just translate between one form and the other, depending on immediate needs.
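The translation between the two forms can be sketched in Python. The JSON document below is a simplified, made-up waybill (not the actual FedEx layout), and to_tables and to_hierarchy are hypothetical helper names. One function flattens the hierarchy into rows; the other rebuilds the hierarchy from the rows.

```python
import json

# A simplified, hypothetical waybill in a hierarchical format (JSON).
waybill_json = """
{
  "sender": {"name": "Acme Co", "account_number": "12345"},
  "recipient": {"name": "Widget Ltd"},
  "commodities": [
    {"description": "bolts", "quantity": 100, "weight": 12.5, "value": 80.0},
    {"description": "nuts", "quantity": 200, "weight": 9.0, "value": 60.0}
  ]
}
"""

def to_tables(waybill, waybill_id):
    """Flatten the hierarchy into one waybill row plus one row per commodity."""
    waybill_row = {
        "waybill_id": waybill_id,
        "sender_name": waybill["sender"]["name"],
        "sender_account_number": waybill["sender"]["account_number"],
        "recipient_name": waybill["recipient"]["name"],
    }
    commodity_rows = [
        dict(waybill_id=waybill_id, line=i, **commodity)
        for i, commodity in enumerate(waybill["commodities"], start=1)
    ]
    return waybill_row, commodity_rows

def to_hierarchy(waybill_row, commodity_rows):
    """Rebuild the nested form from the rows."""
    return {
        "sender": {
            "name": waybill_row["sender_name"],
            "account_number": waybill_row["sender_account_number"],
        },
        "recipient": {"name": waybill_row["recipient_name"]},
        "commodities": [
            {k: v for k, v in row.items() if k not in ("waybill_id", "line")}
            for row in sorted(commodity_rows, key=lambda r: r["line"])
        ],
    }

waybill = json.loads(waybill_json)
waybill_row, commodity_rows = to_tables(waybill, waybill_id=1)
round_trip = to_hierarchy(waybill_row, commodity_rows)
```

The round trip returns a structure equal to the original, which is the sense in which the two forms hold exactly the same data.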
2.6 Big Data
The term big data refers to collections of data that cannot be processed on a single computer. The traditional tools of computing are inappropriate for big data because they require that all the data be available for processing on one computer.
A major breakthrough for business computing occurred when the founders of Google developed a way to process big data using large numbers of very inexpensive connected computers. The method they used has become the standard for working with big data. MapReduce was the name Google gave to its original framework and it has become a generic term, like Kleenex, to describe other examples of the framework, such as Hadoop, the most popular implementation.
The way in which these very inexpensive connected computers work together is in a kind of tree structure, where one computer at the top of the tree gives orders to other computers in the tree and receives results from them.
Two key characteristics of this approach are that no one computer in the tree has all the data and that no one computer is unique. These two characteristics enable scalability and fault-tolerance.
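The shape of a MapReduce job can be imitated on one machine. The sketch below is not Hadoop and not Google's framework; it is a toy word count in plain Python, with hypothetical function names, in which each chunk stands for the part of the data held by one worker computer.

```python
from collections import defaultdict

def map_phase(chunk):
    """Run on each worker: emit a (word, 1) pair for every word in its own chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]

def shuffle(mapped_outputs):
    """Group the emitted values by key so each key can be reduced independently."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Combine the values for each key into a single result."""
    return {key: sum(values) for key, values in groups.items()}

# Each chunk plays the role of the data stored on one inexpensive computer.
chunks = [
    ["the quick brown fox", "the lazy dog"],
    ["the dog barks"],
]
word_counts = reduce_phase(shuffle(map_phase(chunk) for chunk in chunks))
```

No call to map_phase ever sees all the data, and any worker could process any chunk. Those are the two characteristics named above: the first lets the job grow by adding computers, and the second lets a failed computer's chunk be handed to another.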
2.7 Supply Chain Information
Supply chains typically have a channel captain, such as Walmart in the retail world or General Motors in the US automotive world. These channel captains can dictate the flow of information through the supply chain. They typically use formal documents structured as XML or some conceptually similar format.
XML stands for eXtensible Markup Language and it belongs to the same family as HTML and SGML, as well as other such markup languages.
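To give a feel for what such a formal document looks like, here is a made-up purchase order in XML, read with Python's standard xml.etree.ElementTree module. The element and attribute names (purchaseOrder, buyer, item, sku, quantity) are invented for this sketch and do not follow any real supply chain standard.

```python
import xml.etree.ElementTree as ET

# A hypothetical purchase order of the kind a channel captain might require.
order_xml = """
<purchaseOrder number="1001">
  <buyer>Example Retailer</buyer>
  <item sku="A-17" quantity="40">Green shirt, size L</item>
  <item sku="B-02" quantity="15">Blue cap</item>
</purchaseOrder>
""".strip()

root = ET.fromstring(order_xml)

# Because the markup labels every value, a program can pick out exactly what it needs.
order_number = root.get("number")
items = [
    (item.get("sku"), int(item.get("quantity")), item.text)
    for item in root.findall("item")
]
total_units = sum(quantity for _, quantity, _ in items)
```

A supplier's program receiving this document needs no human to find the order number or add up the units, which is why channel captains insist on such formats.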
2.8 Exercise: MIT Beer Game
1. Form a group of four and create a Google Doc in which you choose a role for each group member. The roles are distributor, factory, retailer, and wholesaler. One person must play each role. Share the Google Doc with the instructor.
2. Play the MIT beer game without prior knowledge of the game except as follows. Take a snapshot of the graphics available to assess your status after playing for half an hour.
3. Do the reading about the MIT beer game and play it again for half an hour and take a corresponding snapshot of the results.
4. Write three paragraphs in the Google Doc. First, describe your experience running it without prior knowledge (except for the following paragraph). Second, describe your experience running it after learning about it. Third, say what you think might be key issues in the information flow through the supply chain that affect your success.
Minimal description: The MIT beer game is an online simulation of a supply chain. Each of the four players represents a role in the chain between raw materials and thirsty customers. For you to succeed, you must all succeed. You must work together to succeed. The game occurs in rounds. Each member of the supply chain must make a choice before a round can begin. The game determines your performance in each round based on the choices you make and your history of choices in previous rounds and external factors such as demand fluctuations. Before you learn about the game, try playing for half an hour without resetting (each round builds on the success or failure of previous rounds) just knowing this minimal description.
3. FINDING INFORMATION
Nearly half a century ago, Nobel Prize winner Herbert Simon is alleged (the origin of the quote has been the subject of some controversy) to have said that in the future, attention will be our most precious resource. He may have meant that so much information will be available that we will only be able to pay attention to a small fraction of it and that the problem of deciding what to pay attention to will become the prominent problem of the information-intensive era.
Relationships between businesses and between businesses and customers have been transformed by the ease with which information can be found online. This transformation is ongoing. As an example of current change, some retail stores in 2013 still forbid the use of smartphone cameras because they fear the use of search engines to make price comparisons. This is clearly not sustainable behavior. (To see why this is not sustainable, consider the point of view of the smartphone user making price comparisons. Is it more likely that the smartphone user will eliminate this particular store from the selection set or abandon the use of their smartphone?)
If behavior is not sustainable, then we have to ask whether a steady state will be achieved and, if so, how it will differ.
3.1 Search Engines
A person using a search engine reveals a great deal of personal information that has value for businesses. Some activists believe that individuals should be compensated for the personal information they share with search engines. They believe that the contribution made by users to search engines can be quantified. Others claim that users are compensated by search results and should quit complaining about search engines getting rich. But if the value of search results can be quantified and the value of user contributions can be quantified, policy makers may be convinced that consumers are being exploited and may seek to regulate search engines. This is one way the search engine business may change in the near future.

Reading: 03findPageRank.pdf. Page and Brin, 1998. The PageRank Citation Ranking: Bringing Order to the Web. Stanford InfoLab publication.
Also, the Wikipedia entry on PageRank has some wonderful graphics illustrating the basic concept.
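The basic concept behind PageRank can be sketched in Python: a page is important if important pages link to it. This is a simplified teaching version, not Google's production algorithm. The four-page web, the damping value of 0.85, the fixed iteration count, and the treatment of pages without outgoing links are all assumptions of this sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Repeatedly pass each page's rank along its outgoing links."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# A hypothetical four-page web: each key links to the pages in its list.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
top_page = max(ranks, key=ranks.get)
```

In this tiny web, page C ends up with the highest rank because three pages link to it, and page D ends up lowest because nothing links to it. The ranks always sum to one, so they can be read as shares of attention.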
3.2 Information Scent
The term information scent may be a little confusing because it is borrowed from anthropology. It refers to the expectation of finding information along a given path. The strongest scents represent some balance of the easiest catches and the most nutritious meals.
The root concept comes from our primitive ancestors foraging for food and using scent to choose paths. When the scent stops getting stronger, a given path is abandoned. Similarly, psychologists have found that when the expectation of finding information stops growing along a given information path, the path is abandoned.
The omnipresence of Google as a way for businesses to be found has led businesses to focus on providing an appropriate information scent online. The result is that the web presence for successful businesses fits into a predictable pattern where, to continue the food analogy, visitors can get a quick snack, the menu changes frequently and predictably, and the dishes are easy to find, understand, and digest.
Reading: 03findTrackingScent.pdf, Tracking the Scent of Information, APA Monitor, V 43, N 3, P 44.
3.3 Information Credibility
Currently, information available online can come from obscured or relatively anonymous sources, and there are low barriers to presenting information online. For many reasons, information obtained online may not be credible. Using information obtained online requires consideration of credibility, a concept which has been defined differently by various communities. Some of the concepts used in describing credibility include the following.
1. Technical knowledge, skill or expertise
2. Consistency of actions, values, methods, measures, principles, expectations, and outcomes (definition of journalistic integrity from Wikipedia)
3. Objectivity
4. Pecuniary Interest
5. Agreement with ideas and values held by the recipient
6. Community membership
7. Precision (variance)
8. Accuracy (bias)
9. Falsifiability (using scientific method)
Reading: 03findStudentCredJudgment.pdf. College Students’ Credibility Judgments in the Information-Seeking Process. Chapter 3 of Digital media, youth, and credibility, 2011.
3.4 Getting Help
In a previous era, selection of computer applications was contingent in part on the availability of help for usage of the applications. Businesses providing applications had to determine optimal expenditure of resources on help facilities to be viable in the market.
Today, in contrast, the most prevalent forms of help for the use of computer applications are YouTube videos and user communities. At this writing I am specifically referring to the YouTube service rather than video services in general, based on a 2013 study of online video traffic.
YouTube has created a low-barrier marketplace for help, allowing individuals with relatively few resources to assess the best opportunities for profiting from the provision of help for computer applications. This means that institutional adopters of computer applications can use the level of available YouTube help as a proxy for the health of a given computer application, replacing possibly tedious and expensive primary research.
User communities spring up around successful computer applications and can be used as a cheap and convenient way to evaluate those applications in much the same way as YouTube videos. Unlike individual videos, user communities may transcend individual products or individual vendors. An online help community such as Stack Overflow thrives on a reputation that transcends that of any individual product. The relative attention Stack Overflow pays to a given product may serve as an index of that product’s health.
3.5 Exercise: Comparing Help for Two Browsers
Form an ad hoc group of four. Identify two browsers, such as Chrome, Firefox, Safari, or others. Contrast the YouTube videos and user communities offering help for the two browsers.
Compare the health of the browsers.
Compare the communities supporting the two browsers and identify differences in focus, emphasis, direction, and mission.
4. NETWORKED INFORMATION
The term network is used in two main ways in the computing world. Both of these senses are critical for an understanding of business information systems but they are very different.
The first, and oldest, sense of the word is to describe the physical connections between computers. This includes the hardware and software that allows one computer to communicate with another computer and with larger groups of computers including the largest group of computers, the internet. It is this sense we will discuss in this section, after clarifying the other sense.
The second sense in which the word network is used comes from the emerging discipline of network science.
This field of study has only been named in the past twenty-five years but draws on ideas dating back much further. It is relevant to us because it explains the behavior of all kinds of networks, including computer networks. Recent advances in network science have identified phenomena like the long tail and growth and decline of online communities such as Facebook and Myspace. This sense of the word network will be discussed in the later section on information users.
4.1 Network Hardware
Computer networks are held together by a less and less diverse set of computer hardware, including routers, switches, and cables of various kinds. At one time a large industry existed for computer network hardware, but that industry consolidated as various kinds of networking were replaced by connections to the growing Internet. Understanding the Internet, therefore, is more important than understanding the wide variety of networking approaches displaced by it. Understanding the Internet, though, does not require much understanding of hardware but instead an understanding of agreements known as protocols.
4.2 Network Protocols
For any two computers to connect, they must agree on a pattern of 1s and 0s to use to transmit and receive information. This agreement is called a protocol, and it is agreement on certain protocols that determines whether computers can connect to each other and, ultimately, to the internet.
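What it means to agree on a pattern of 1s and 0s can be shown with a toy protocol in Python. The message layout below is invented for this sketch and is not part of any real Internet protocol: one byte for a message type, two bytes for the payload length, then the payload itself. Python's standard struct module does the packing.

```python
import struct

# Hypothetical agreement between sender and receiver:
#   1 byte message type, 2 bytes payload length (network byte order), then the payload.
HEADER = struct.Struct("!BH")

def encode(msg_type, payload):
    """Sender side: turn a message into the agreed pattern of bytes."""
    return HEADER.pack(msg_type, len(payload)) + payload

def decode(data):
    """Receiver side: interpret the bytes using the same agreement."""
    msg_type, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return msg_type, payload

wire = encode(1, b"hello")
# The first three bytes of the message, written out as 1s and 0s.
header_bits = "".join(f"{byte:08b}" for byte in wire[:HEADER.size])
received = decode(wire)
```

The receiver recovers the message only because it reads the bits under the same rules the sender used to write them. A receiver that expected, say, a four-byte length field would misread every message, which is why connection depends on agreement rather than on hardware.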
In the early years of computer networking, firms wrote protocols and shared enough information about them so that other firms could connect to their computers. At that time, the IBM corporation was roughly the same size as all other computer makers combined and the computer industry was known in the press as IBM and the Seven Dwarfs. Other companies published network protocols, but they did not matter too much to American business as a whole because so many of the computers available to connect to obeyed IBM’s protocols.
At the same time, the United States Department of Defense sought a network protocol not controlled by a particular company and financed the development of the protocols that later came to define the Internet. The main set was called TCP/IP, which stood for Transmission Control Protocol / Internet Protocol. This is by far the most important agreement for connecting computers in the world today and it is likely that you use it every time you use a phone, tablet, computer, or expensive devices like automobiles containing embedded computers.
Strangely, TCP/IP has been expected to die out every year since before you were born. The people who wrote it did not envision that it would come to dominate the technological world nor that the world of computing would include so many devices as it does today. Hence, they and others have often thought of better protocols in the years since. We would need a separate course to begin to understand why TCP/IP has steamrolled over all other approaches. Often, books or courses will provide a completely mistaken reference to Beta vs VHS to explain why mediocrity often triumphs over quality, but I beg you to take this author’s word that Beta vs VHS not only doesn’t explain it, it doesn’t even explain Beta vs VHS. Some analogies help and some don’t.
In addition to TCP/IP, a few other protocols were written to form the bedrock of the Internet. These include SMTP, which stands for Simple Mail Transfer Protocol, FTP, which stands for File Transfer Protocol, and a few others. These protocols remain in use today, every day, even though they were formulated decades ago for a small network of scientific computing specialists.
4.3 Protocol Development
How do network protocols happen? Each protocol that determines how the Internet works is written as an RFC, which stands for Request for Comments. An RFC lists all the rules that computing equipment would have to follow to do something, such as send or receive email or files.
Today, all the relevant RFCs are published on the network although at the beginning they had to also be printed and sent by the US Postal Service to many of the interested parties.
Interested parties at the dawn of the Internet included companies, government agencies, and universities. Each would review the RFC and recommend changes that would make life easier for itself. The RFC would be revised, a lot at first, then revisions would settle down and one version of the RFC would become the de facto standard, followed for decades.
The process of protocol development involves self-interest but a key characteristic of the development of the protocols that became the Internet was that they did not inherently favor one business over another. If they did, analysts believe that there never would have been what we perceive as the Internet, although there would have been many small, high performing networks, used only by extremely well-funded customers. Let me emphasize that the Internet’s key characteristic is that no one owned it in the early years. Every major corporation tried to take control of it, almost invariably complaining that it would be faster, cleaner, and simpler if only their version of protocols would replace the creaking, aging protocols that currently govern the Internet. From the 1970s until around 2002, no one agent really got what it wanted from the Internet to the exclusion of others. More than anything else, the growth and democracy available on the Internet is an artifact of the development of Internet protocols.
4.4 Technical Problems of Internet Protocols
Whoever obeys the Internet protocols can connect to the Internet. This was completely true at least until the decade following September 11th, during which the US government sought greater control over the Internet under the campaign known as the war on terror. In the past decade, the truth of this statement has become murkier, particularly given two developments in 2013: the Edward Snowden leaks and the debate over net neutrality. These developments remain controversial so I will simply note here that there is a lively, current, potentially society-changing debate over the truth of the introductory sentence.
Many if not most technical problems with the Internet result from incorrect or incomplete implementations as private network after private network joined the Internet. These technical problems are shockingly few given the size of the Internet and compared with any other computerized system in history. The Internet is highly fault-tolerant compared to any other network or large computer-based system. This fault tolerance is the subject of a later section on representational state transfer. For now, it is enough to credit network protocol development with highly successful fault tolerance.
4.5 Exercise: Predict Company Reactions to Internet Regulation
Some current and proposed legislation in the USA revolves around the term net neutrality. The debates over this type of legislation pit content providers such as Google and Netflix against network infrastructure providers such as Comcast, AT&T, and Verizon.
Form a group of four and identify a proposed piece of regulation from the FCC or legislation facing Congress.
Identify the impact on business of the regulation from the point of view of a content provider and a network infrastructure provider. Use evidence beyond the statements of the businesses involved. For instance, leaked documents have shown that some companies have intentionally publicly misstated their costs in developing infrastructure when passing these costs on to consumers.
5. SYSTEMS THINKING
Reading: 05Ackoff1994.pdf 15 pages; exam questions from pages 175–176,
182–186
Reading: 05Churchman1974.pdf 7 pages; exam questions from pages 7–11

The concept of systems thinking was popularized in WWII and was credited by the leaders of the USA, in part, with their victory in that conflict. As a result, USA leaders resolved at the end of the war to infuse systems thinking into businesses, universities, and other aspects of the government.
The department that offers this class, Decision Sciences, owes its name and its initial conception to the thinking of that era, which spawned university departments of operations research, operations management, management science, applied mathematics, and eventually management information systems.
Much of the evolution of business information systems can be explained as an outgrowth of a few relatively simple ideas and tools that emerged from this movement.
A few of these are explored in this section.
5.1 Automation And Feedback Loops
The concept of automation involves four things: input, processing, output, and feedback. The key characteristic of automation is that feedback is automatic and modifies processing based on monitoring output. Bear in mind that nothing in this definition specifies that an automated system needs to be an electronic system.
Even a completely mechanical system could be automated.
[Figure: input, process, output]

The above picture represents a simple system with input, processing, and output. It is missing feedback, so it is not an automated system. To illustrate the difference, imagine a copier with and without feedback.
Copier without feedback. The person making copies selects the number 4 to indicate 4 copies and puts the original into the copier to start the process. After the second copy, the third copy jams. At this point, the design of the copier could allow any number of things to happen. The key is that, without feedback, the input does not know that anything is wrong. It could just keep feeding paper in, worsening the jam, or the whole system could stop and signal a warning that paper is jammed. Suppose this copier implements the latter approach. At that point, the person could remove the jammed paper and press a restart button. Next the input initiates the fourth copy, even though the third copy was never completed. Another possibility would be that pressing the restart button actually restarts the machine with no memory of the 4 copies. It’s up to the person to determine how many copies remain.
Copier with feedback. The person making copies selects the number 4 to indicate
4 copies and puts the original into the copier to start the process. After the second copy, the third copy jams. At this point, the design of the copier could allow any number of things to happen. The key is that, with feedback, the input is aware that two copies have been completed and that the third copy has not. When it resumes, it must resume with another attempt to complete the third copy.
The system with feedback does not need a person to monitor it. It responds to problems. Even though it needs the person to remove jammed paper, it does not need a person to tell it what to do next when the person signals that the paper jam has been corrected. This is an automated system.
[Figure: input, process, and output, with a feedback loop]
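The two copiers can be contrasted in a short Python sketch. This is only an illustration under stated assumptions: the function names and the jams_on_attempt parameter (the set of attempt numbers that jam) are hypothetical, and the copier without feedback is assumed to take the "stop, then continue with the next copy" design described above.

```python
def copier_without_feedback(requested, jams_on_attempt):
    """Count attempts only; the input never learns that a copy failed."""
    good_copies = 0
    for attempt in range(1, requested + 1):
        if attempt in jams_on_attempt:
            # The person clears the jam, and the copier moves on to the next copy.
            continue
        good_copies += 1
    return good_copies


def copier_with_feedback(requested, jams_on_attempt):
    """Monitor output: only completed copies count toward the request."""
    completed = 0
    attempts = 0
    while completed < requested:
        attempts += 1
        if attempts in jams_on_attempt:
            # The person clears the jam; the copier retries the same copy.
            continue
        completed += 1
    return completed, attempts
```

With four copies requested and a jam on the third attempt, the first function yields three good copies, while the second makes five attempts and delivers all four.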

Notice that the above diagrams show processes or actions, using ellipses. Rectangles are used to show entities or things in the real world. You could think of the ellipses as verbs and the rectangles as nouns. You could even think of the diagrams as representing sentences in a language that describes systems.
5.2 Diagramming Systems
Business information systems are typically too large to be specified, designed, or built by individuals. Systems are often large enough to be divided, not only among individuals on a team, but among multiple teams. How can teams communicate with others about the information systems they develop or use? It is not practical for people to read millions of lines of computer programs. Hence, formal diagrams serve to communicate concepts being used in large information systems. Every method for developing systems features at least two types of diagrams. Most have three types of diagrams. No description of a system is considered complete without the associated diagrams. No system is thought to be completely described by a single diagram. In other words, a given type of diagram can only describe one aspect of a system and every systems development method assumes that there are at least two essential aspects to any system.
A key characteristic of different systems development methods is that they don’t agree on what the essential aspects or the essential diagram types are, but they all agree that more than one is essential.
5.3 Diagramming Data Flows
Perhaps the most enduring diagramming form and the one that appears in the most methods is the diagramming of information or data flows. A DFD, which stands for Data Flow Diagram, must contain exactly four symbols and these four symbols must obey certain rules. The four symbols are flow, process, data store, and entity.
5.3.1 Data Flow. A data flow is an arrow with a head at one end. It must not have arrows at both ends. It must be labeled with a name for the data that is flowing.
It may start or stop at a process, store, or entity, but it may not pass between two entities. In other words, if one end is an entity, the other end must be a process or data store.
5.3.2 Process. A process is a circle with a label naming a process that operates on data. It must have at least one flow entering it and at least one flow exiting it.
No process may be a magic wellspring, having only arrows coming out of it, nor a black hole, having only arrows going into it.
5.3.3 Data Store. A data store is a pair of horizontal lines with a label naming the data store. This is some place where data is stored. It need not be in a computer. It may be an inbox on a physical desk. It may be a filing cabinet. Like a process, it must have at least one flow entering it and at least one flow exiting it. No data store may be a magic wellspring, having only arrows coming out of it, nor a black hole, having only arrows going into it.
5.3.4 External Entity. An external entity is represented by a rectangle with a label naming something outside the system that is somehow connected to the system.
Like a process or data store, it must have at least one flow entering it and at least one flow exiting it. No external entity may be a magic wellspring, having only arrows coming out of it, nor a black hole, having only arrows going into it.
It may seem counterintuitive to place the same in / out restriction on external entities as on system components. After all, an external entity might be a customer.
We might send a refund to a customer with no expectation that the customer send us something in return. In practice, the restriction is often relaxed.
When that happens, it is often the source of trouble.
For instance, suppose that an unscrupulous employee notices that no feedback loop exists for customer refunds and uses that knowledge to develop an embezzlement scheme, misdirecting refunds.
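The rules for the four symbols are mechanical enough to be checked by a program. The following Python sketch is one possible way to do so, not a standard tool: the Flow record, the check_dfd function, and the refund example data are all hypothetical names chosen for illustration. It applies the strict version of the rules, so an external entity with no outgoing flow is reported.

```python
from collections import namedtuple

# A data flow: a labeled arrow with exactly one head, from source to target.
Flow = namedtuple("Flow", ["label", "source", "target"])


def check_dfd(nodes, flows):
    """Return a list of rule violations.

    nodes maps a name to its kind: "process", "store", or "entity".
    flows is a list of Flow records.
    """
    problems = []
    for flow in flows:
        if not flow.label:
            problems.append(f"flow {flow.source} -> {flow.target} has no label")
        if nodes[flow.source] == "entity" and nodes[flow.target] == "entity":
            problems.append(f"flow '{flow.label}' passes between two entities")
    for name in nodes:
        if not any(flow.target == name for flow in flows):
            problems.append(f"{name} has no incoming flow (magic wellspring)")
        if not any(flow.source == name for flow in flows):
            problems.append(f"{name} has no outgoing flow (black hole)")
    return problems


# Hypothetical refund system used only to exercise the checker.
nodes = {"customer": "entity", "issue refund": "process", "refund log": "store"}
flows = [
    Flow("refund request", "customer", "issue refund"),
    Flow("refund", "issue refund", "customer"),
    Flow("refund record", "issue refund", "refund log"),
    Flow("refund history", "refund log", "issue refund"),
]
```

Calling check_dfd(nodes, flows) finds no violations. Dropping the first flow, so that the customer only receives refunds, makes the checker flag the customer as a black hole, which is the situation the embezzlement example warns about.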
5.3.5 Leveled Data Flow Diagrams. The data flow diagram described above would not be so popular without one additional aspect, called leveling.
Every data flow diagram is assumed to occur at some level that can be exploded into lower levels, exposing more and more detail. It is typical for a set of leveled data flow diagrams to span hundreds of pages, each page with a single diagram, connected in the form of a tree with a single process, the name of the entire system, in the first diagram.
In addition to the symbols mentioned above (flow, process, store, and entity), leveled data flow diagrams have a level number, and every process circle has a level number as part of its name. The numbering functions like an atlas, where each edge of a map contains a page number of a connecting map and highlighted sections contain page numbers of detailed maps.
5.4 Diagramming State Transitions
One effective way to describe many business systems is to describe their states.
An easy way to see this is to think of the automated cashier in a grocery store.

The most frequent state in which that system finds itself is waiting. Other states include reading an item placed on its sensor, reading a swiped credit card, sending a message to a customer, and so on.
This is an example of a system with a finite number of states. It should be possible to draw a diagram or set of diagrams listing each possible state and showing which states may precede or follow any other given state.
In contrast to the data flow diagram, which mainly occurs in two forms, state transition diagrams have been proposed and used in vastly many forms in different business, scientific, and government communities.
All state transition diagrams have in common that each state represents a state no matter how that state was reached. In other words, it does not matter how a system enters a particular state. There are not different conditions within a state.
5.4.1 State Transition Diagram Symbols. The simplest state transition diagram contains only the following symbols.
1. An unlabelled dot points to the initial state.
2. Labeled circles describe each possible state the system may attain.
3. If the system has an ending state, a dot surrounded by a circle is pointed to by any state that leads to the end state.
4. Arrows, possibly labeled with actions, point from each state to each state that may be reached from that state, including the state itself if an action returns it to that state.
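The symbols above map directly onto a small data structure. The following Python sketch encodes a state transition diagram for the automated cashier as a lookup table. The state names come from the description above, but the action labels and the particular arrows are assumptions made for illustration, and this hypothetical cashier has no ending state.

```python
# The unlabelled dot: the state the system starts in.
INITIAL = "waiting"

# Each entry is one labeled arrow: (current state, action) -> next state.
TRANSITIONS = {
    ("waiting", "item placed"): "reading item",
    ("reading item", "item recognized"): "waiting",
    ("waiting", "card swiped"): "reading card",
    ("reading card", "card accepted"): "sending message",
    ("reading card", "card rejected"): "sending message",
    ("sending message", "message shown"): "waiting",
}


def run(actions, state=INITIAL):
    """Follow the arrows for a sequence of actions and return the final state."""
    for action in actions:
        key = (state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"no arrow labeled {action!r} leaves state {state!r}")
        state = TRANSITIONS[key]
    return state
```

Because the next state depends only on the current state and the action, the table has no need to record how the system arrived at a state.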
5.4.2 State Transition Diagram Examples. Following are two examples of state transition diagrams. Each example has some context about why a state transition diagram may be a useful representation. Without experience of business information systems, it may not be at all obvious why these examples are applicable. Further reading would be required to understand why. These examples just illustrate how such diagrams are constructed.
State Transition Diagram Example 1, The Farmer’s Puzzle. Many variations of the following puzzle are used to illustrate various information concepts, including artificial intelligence concepts like forward chaining and backward chaining, as well as problem representation concepts.
Puzzle statement. A farmer goes to market with a fox, a chicken, and a vegetable, hoping to sell all three. The farmer must cross a river to reach the market, using a boat that can only accommodate the farmer and one of the three items to be sold.
Unfortunately, the fox will eat the chicken if left unsupervised and the chicken will eat the vegetable if left unsupervised. How can the farmer get all three items across and continue to the market?
Solving the puzzle is a separate task from drawing the state transition diagram but the tasks are related because representing a problem is often a key to solving a problem. We’ll use a different method to solve the puzzle before demonstrating the state transition diagram. First, you have to represent the problem. To do so, you begin by deciding what aspects of the puzzle need to be represented. The candidate objects include the farmer, the fox, the chicken, the vegetable, the boat, and the two sides of the river. All the objects are on the near bank of the river at the start of the problem and all the objects are on the far bank of the river at the end of the problem. A common way for people to begin solving the problem is to make a table with all the items in the left column in the first row of the table and all the items in the right column in the last row of the table, then to start filling in intermediate rows. Following is an example of the beginning of such a table.

near | far
farmer, fox, chicken, vegetable, boat | (none)
... | ...
(none) | farmer, fox, chicken, vegetable, boat
The above table can be expanded to list all the intermediate states of the farmer’s journey. One thing that becomes obvious if you add a few rows is that there should be no entries listing the fox and the chicken on one riverbank without the farmer and that there should be no entries listing the chicken and the vegetable on one riverbank without the farmer. The following version of the table adds one additional entry from the beginning of the problem, respecting this rule.

near | far
farmer, fox, chicken, vegetable, boat | (none)
fox, vegetable | chicken, farmer, boat
... | ...
(none) | farmer, fox, chicken, vegetable, boat
The above version of the table is an example of forward chaining since you moved forward from the beginning of the problem toward the end of the problem, using the only obvious legal move. It’s the only obvious legal move because, if the farmer takes anything but the chicken across in the first trip, someone will be eaten during the unsupervised time while the farmer is away. We can also employ a complementary technique called backward chaining in the same way.
The very last thing the farmer must bring across the river before moving on must also be the chicken, since any other configuration on the far bank leads to someone being eaten. The following table shows the situation we arrive at by employing one iteration of forward chaining and one iteration of backward chaining, with the middle of the solution still incomplete.

near | far
farmer, fox, chicken, vegetable, boat | (none)
fox, vegetable | chicken, farmer, boat
... | ...
chicken, farmer, boat | fox, vegetable
(none) | farmer, fox, chicken, vegetable, boat

One reason to employ both forward chaining and backward chaining in solving a problem is the issue of combinatorial explosion. If we draw the problem from the beginning as a tree, with a new branch for every possible state, we will have to draw a vast number of branches after only a few transitions. The same is true if we begin at the end and try to trace our way back to the beginning. But if we begin at both ends, we reduce the size of the problem. The problem as shown in the above table is to get from the second state to the next-to-last state. For many problems, including this one, it is easier to find a path between these two intermediate states than from beginning to end.
Looking at the above table, a solution may become obvious. For those who have not seen it yet, let’s add one more legal step at each end and see.

near | far
farmer, fox, chicken, vegetable, boat | (none)
fox, vegetable | chicken, farmer, boat
fox, vegetable, farmer, boat | chicken
... | ...
chicken | fox, vegetable, farmer, boat
chicken, farmer, boat | fox, vegetable
(none) | farmer, fox, chicken, vegetable, boat

Looking at the above table, we can see that the farmer must take the chicken back to the near bank, which is a key to solving the problem. Now it should seem easy to move forward from the third row or to move backward from the third-to-last row. The only issue is that we have a choice of moving the vegetable across first or moving the fox across first. This choice is not as trivial as it may seem but for now, let’s just move the fox first. That move determines both the next row going forward and the corresponding row going backward, giving the following completed table.

near | far
farmer, fox, chicken, vegetable, boat | (none)
fox, vegetable | chicken, farmer, boat
fox, vegetable, farmer, boat | chicken
vegetable | chicken, fox, farmer, boat
chicken, vegetable, farmer, boat | fox
chicken | fox, vegetable, farmer, boat
chicken, farmer, boat | fox, vegetable
(none) | farmer, fox, chicken, vegetable, boat

The above table represents a complete solution but it has a couple of limitations.
First, it only represents one complete solution. The farmer could have taken the vegetable across before the fox, and this approach has no obvious way to show that except either to include a second table or to modify the structure of this table to show that some rows are optional. Besides choosing between these two orderings, the farmer can legally return to any previous state. There’s no obvious way to capture this fact using a table except by adding a separate list showing which rows can lead to which other rows. A second limitation is that the above table actually contains more symbols than are needed to represent the states of the problem. We don’t really need to see both columns since, in any row, every object that is not in one column is in the other column. Likewise, the farmer and the boat are not both needed because they are always in the same place.
Both these limitations can be overcome by representing the solution as a state transition diagram. The following diagram shows the state of the near bank only and uses the symbols F; C; V; B to represent the fox, chicken, vegetable, and boat. The solution does have a start state, pointed to by a solid dot. The solution also has a final state, pointing to a circled dot.
In addition to overcoming the above limitations, the state transition diagram has the property that it is compact enough that we can scan it quite easily for violations of the rule that the fox must not be left unsupervised with the chicken and the chicken must not be left unsupervised with the vegetable. Since systems large enough to merit state transition diagrams may contain dozens or even hundreds of states, compactness can be a crucial property.

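[Figure: state transition diagram of the near bank. State labels: F; C; V; B (start), F; V, F; V; B, F, V, F; C; B, V; C; B, C, C; B, and the final state.]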

To summarize, the above state transition diagram contains all the information in the preceding tables and more. In addition, it obeys simple, well-known rules that make it unambiguous when used to write software.
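The near-bank representation is also easy to hand to a program. The following Python sketch treats each state as the set of symbols on the near bank, using the same letters as the diagram (with B standing for the boat and the farmer together), and searches the states breadth-first. The function names are hypothetical and the search strategy is an assumption for illustration; it is neither the forward and backward chaining walked through above nor the only way to generate the diagram.

```python
from collections import deque

ITEMS = frozenset({"F", "C", "V"})         # fox, chicken, vegetable
START = frozenset({"F", "C", "V", "B"})    # everything on the near bank
GOAL = frozenset()                         # nothing left on the near bank


def safe(bank):
    """A bank without the farmer must not hold fox+chicken or chicken+vegetable."""
    if "B" in bank:
        return True
    return not ({"F", "C"} <= bank or {"C", "V"} <= bank)


def is_legal(near):
    far = (ITEMS | {"B"}) - near
    return safe(near) and safe(far)


def moves(near):
    """Yield each legal near-bank state reachable in one crossing."""
    leaving_near = "B" in near
    candidates = (near if leaving_near else ITEMS - near) & ITEMS
    for cargo in [None, *sorted(candidates)]:
        carried = {"B"} if cargo is None else {"B", cargo}
        nxt = near - carried if leaving_near else near | carried
        if is_legal(nxt):
            yield nxt


def solve(start=START, goal=GOAL):
    """Breadth-first search; returns the list of states from start to goal."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        for nxt in moves(state):
            if nxt not in parents:
                parents[nxt] = state
                queue.append(nxt)
    return None
```

Each frozenset in the returned path corresponds to one circle in the diagram, and each call to moves lists the arrows leaving a circle, including the arrow back to the state the farmer just came from.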
State Transition Diagram Example 2, A Computer Program. The most common use of state transition diagrams is so that teams working with software can discuss the software in a precise formal way even though most team members cannot read the actual computer programs under discussion. A maxim popularized by blogger Joel Spolsky is that it is easier to write computer programs than to read them. If this maxim is true, then even team members who can read a given program will find it burdensome.
The main use of Spolsky’s maxim in practice is to warn against rewriting existing programs, a strong temptation if the maxim is true. Spolsky argues that existing programs usually encode considerable business information that may not be obvious and may be lost in rewriting. Instead, Spolsky argues for identifying ways for teams to communicate about existing programs rather than rewriting them. This argument sometimes leads to the use of diagrams, including state transition diagrams.
To illustrate, here is a fragment of code, written in Python, a language named after the comedy group Monty Python. Python uses indentation to group program statements, so Python reads all the following as part of the function cyclic() and the last two lines as being inside a while loop. In addition, Python uses the = to assign values to symbols. So anything on the left side of a = is a symbol that takes on the value expressed on the right side of the symbol. cyclic () x =0 y =0 while (y
