information storage and retrieval
Information storage and retrieval is the systematic process of collecting and cataloging data so that they can be located and displayed on request. Computers and data processing techniques have made possible the high-speed, selective retrieval of large amounts of information for government, commercial, and academic purposes. There are several basic types of information-storage-and-retrieval systems.
Document-retrieval systems store entire documents, which are usually retrieved by title or by key words associated with the document. In some systems, the text of documents is stored as data. This permits full-text searching, enabling retrieval on the basis of any words in the document. In others, a digitized image of the document is stored, usually on a write-once optical disc.
Database systems store the information as a series of discrete records that are, in turn, divided into discrete fields (e.g., name, address, and phone number); records can be searched and retrieved on the basis of the content of the fields (e.g., all people who have a particular telephone area code). The data are stored within the computer, either in main storage or auxiliary storage, for ready access.
Reference-retrieval systems store references to documents rather than the documents themselves. In response to a search request, such systems provide the titles of relevant documents and frequently their physical locations. They are efficient when large amounts of different types of printed data must be stored, and they have proven extremely effective in libraries, where material is constantly changing.
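The difference between full-text searching in a document-retrieval system and field-based searching in a database system can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sample documents and records, the function names (`tokenize`, `build_index`, `full_text_search`, `search_records`), and the choice of an inverted index, which is one common way to support full-text search but is not named in the text above.

```python
import re
from collections import defaultdict

# Hypothetical sample documents: id -> full text.
DOCUMENTS = {
    "doc1": "Computers make high-speed retrieval of information possible.",
    "doc2": "Libraries store references to printed documents.",
    "doc3": "Optical discs store digitized images of documents.",
}

# Hypothetical sample records, each divided into discrete fields.
RECORDS = [
    {"name": "Ada Park", "address": "1 Elm St", "phone": "212-555-0101"},
    {"name": "Ben Ruiz", "address": "9 Oak Ave", "phone": "415-555-0102"},
    {"name": "Cy Tran", "address": "4 Pine Rd", "phone": "212-555-0103"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def full_text_search(index, query):
    """Return the ids of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def search_records(records, field, predicate):
    """Return the records whose given field satisfies the predicate."""
    return [r for r in records if predicate(r.get(field, ""))]


index = build_index(DOCUMENTS)
doc_hits = full_text_search(index, "store documents")
area_hits = search_records(RECORDS, "phone", lambda p: p.startswith("212-"))
```

Here `doc_hits` holds the documents that contain both query words anywhere in their text, while `area_hits` holds the records selected purely by the content of one field, the area-code case mentioned above.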
DEFINITION
ISRS (information storage and retrieval system)
An information storage and retrieval system (ISRS) is a network with a built-in user interface that facilitates the creation, searching, and modification of stored data. An ISRS is typically a peer-to-peer (P2P) network operated and maintained by private individuals or independent organizations, but accessible to the general public. Some, but not all, ISRSs can be accessed from the Internet. (The largest ISRS in the world is the Internet itself.)
Characteristics of an ISRS include lack of centralization, graceful degradation in the event of hardware failure, and the ability to rapidly adapt to changing demands and resources. The lack of centralization helps to ensure that catastrophic data loss does not occur because of hardware or program failure, or because of the activities of malicious hackers. Graceful degradation is provided by redundancy of data and programming among multiple computers. The physical and electronic diversity of an ISRS, along with the existence of multiple operating platforms, enhances robustness, flexibility, and adaptability. (These characteristics can also result in a certain amount of chaos.) In addition to these features, some ISRSs offer anonymity, at least in theory, to contributors and users of the information.
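The idea of graceful degradation through redundancy can be sketched in a few lines. This is an illustrative model only, not a description of any real ISRS: the `Replica` and `ReplicatedStore` classes, the `online` flag, and the write-to-all, read-from-first-available policy are all assumptions chosen to keep the example small.

```python
class ReplicaUnavailable(Exception):
    """Raised when a replica cannot serve a request."""


class Replica:
    """One hypothetical computer holding a copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.online = True  # set to False to simulate a hardware failure

    def put(self, key, value):
        if not self.online:
            raise ReplicaUnavailable(self.name)
        self.data[key] = value

    def get(self, key):
        if not self.online:
            raise ReplicaUnavailable(self.name)
        return self.data[key]


class ReplicatedStore:
    """Keeps the same data on several replicas, with no central copy."""

    def __init__(self, replicas):
        self.replicas = replicas

    def put(self, key, value):
        """Write to every reachable replica; return how many accepted it."""
        written = 0
        for replica in self.replicas:
            try:
                replica.put(key, value)
                written += 1
            except ReplicaUnavailable:
                continue
        return written

    def get(self, key):
        """Read from the first replica that is online and has the key."""
        for replica in self.replicas:
            try:
                return replica.get(key)
            except (ReplicaUnavailable, KeyError):
                continue
        raise KeyError(key)


nodes = [Replica("a"), Replica("b"), Replica("c")]
store = ReplicatedStore(nodes)
store.put("report", "stored text")

nodes[0].online = False          # one machine fails
value = store.get("report")      # the data is still served by another copy
```

In this sketch, service continues as long as at least one replica holding the data remains reachable; only when every copy is offline does a read fail.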
A significant difference between an ISRS and a database management system (DBMS) is that an ISRS is intended for general public use, while a DBMS is likely to be proprietary, with access privileges restricted to authorized entities. In addition, an ISRS, having no centralized management, is less well organized than a DBMS.
History of Information Retrieval
Almost as soon as computers were developed, information scientists suggested that the new machines had the potential to perform text processing as well as arithmetical operations. Once text was represented as ASCII characters, queries formulated as character strings could be matched against the character strings in documents. The first computer-based IR systems, which appeared in the 1950s, were based on punched cards. These were followed in the 1960s by systems that stored the database on magnetic tape.
These first systems were hampered by the limited processing power of early computers, and the limited capacity for and high cost of storage. They operated offline, in a batch processing mode. It was not until the 1970s that IR systems made it possible for users to submit their queries and obtain an immediate response, allowing them to view the results and modify their queries as needed. The development of magnetic disk storage and improvements in telecommunications networks at this time made it possible to provide access to IR systems nationwide.
Much early work in information retrieval was conducted at U.S. government institutions such as the National Aeronautics and Space Administration (NASA) and the National Library of Medicine (NLM), and included the forerunners of today's systems. Versions of the DIALOG system were first operated by NASA and the Atomic Energy Commission; it later became a commercial system. The MEDLINE system operated by NLM today originated in an experimental system for searching their medical database, MEDLARS.
The Future of Information Retrieval
Researchers continue to improve the performance of information retrieval systems. An ongoing series of experiments called TREC (Text REtrieval Conference) is conducted annually by the National Institute of Standards and Technology to encourage research in information retrieval and its use in real-world systems.