Madnick and Donovan (Operating Systems, 1974, pp. 337-338) describe Information Management as "quite simple… yet, one of the most important …" and list the following functions:
1. Keeping track of all information through various tables
2. Deciding policies on storage and access
3. Allocating information resources
4. De-allocating information resources
The term ‘file system’ is used where the concern is simple logical organization. A ‘file’ or ‘data-set’ is a single, separate collection of data. A ‘data management system’ performs some “structuring”, but NO interpretation. A ‘database system’ addresses both “structuring” and “interpretation”.
A database is therefore required to provide two things:
1. Structuring
2. Interpretation
Until both are accomplished, the collection of data remains a data-set.
i. A classical MS-Excel data-set, if organized carefully, can structure the data but cannot interpret it.
ii. A database is not defined by the quantity of data. A small data-set can become a database, and a large data-set may remain a file.
To further the understanding of databases, we need to deal with the “structure” and “interpretation” aspects. The term ‘semantics’ (the study of meaning) was later introduced to replace ‘interpretation’ as the prerequisite of a database, since meaning is fundamental to interpretation.
Database = Structure + Semantics {Data-set}
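To make the formula concrete, here is a minimal Python sketch of the same data-set with and without a semantics layer. The column names, station codes and the dictionary layout are all hypothetical, chosen only for illustration; this is not how any particular database product stores its data dictionary.

```python
# Hypothetical data-set: structured (rows and columns) but not yet interpreted.
ROWS = [
    {"stn": "A1", "t": 21.5},
    {"stn": "B7", "t": 19.0},
]

# Hypothetical data dictionary: the semantics layer that says what each column means.
DATA_DICTIONARY = {
    "stn": {"label": "weather station", "unit": ""},
    "t": {"label": "air temperature", "unit": "degrees Celsius"},
}


def interpret(row, dictionary):
    """Re-express one row using the labels and units from the dictionary."""
    described = {}
    for column, value in row.items():
        entry = dictionary[column]  # KeyError: the column has no defined meaning
        described[entry["label"]] = f"{value} {entry['unit']}".strip()
    return described
```

Calling interpret(ROWS[0], DATA_DICTIONARY) turns the bare values into statements a reader can act on. A row with a column missing from the dictionary raises KeyError, which is the sketch's way of saying the data is structured but not interpreted.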
Structure
Data structure as a distinct part of a program was identified by Niklaus Wirth (Algorithms + Data Structures = Programs). Thus, no program is possible without a data structure! Over the past 40 years, data structures have developed into a complex, specialized area, with every computer science curriculum offering courses on them. They are the backbone of every information system. Ph.D.-level research on data structures is quite popular.
Data structures emphasize the elegance, efficiency and effectiveness of storage, operations and access from a computer implementation perspective. Any data structure can become the foundation architecture for a database.
Databases also emphasize the intrinsic context and applicability of the data to the real world. This is where interpretation and semantics come in.
Semantics
Semantics and the associated developments in computer science are among the most complex and innovative areas of human discovery. They are applied in many areas, such as pattern recognition, machine learning and knowledge representation. We will limit the discussion of semantics to the common, widely used implementations of ‘database’ and to how semantics constitutes a fundamental part of a database.
Data structures like trees gave rise to hierarchical databases, in which the interpretation is captured as levels of a hierarchy, and the resulting models became database systems. IMS (IBM) was so popular in my programming days that CICS-IMS was a sure-success skill!
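As a rough illustration of the hierarchical idea, here is a small Python sketch in which every record has exactly one parent. The Segment class and the example names are hypothetical and do not reflect the actual IMS interfaces; the point is only that the position in the tree carries the meaning.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Segment:
    """One record in a hierarchy; each segment has at most one parent."""
    name: str
    children: list["Segment"] = field(default_factory=list)
    parent: Optional["Segment"] = field(default=None, repr=False)

    def add(self, child: "Segment") -> "Segment":
        # A strict hierarchy forbids a second parent for the same record.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child

    def path(self) -> str:
        """Walk up to the root; the path is the record's interpretation."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))
```

For example, adding "Sales" under "Company" and "Employee-1" under "Sales" gives the employee the path "Company/Sales/Employee-1". Trying to attach the same employee to a second parent raises ValueError, which is exactly the limitation that the network model later removed.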
Network database systems modeled interpretation as a mesh of dependencies using a generalized graph structure. This allowed a child record to have multiple parents. The model acquired standard status through CODASYL and was used to represent the interpretation of real business interfaces through datasets.
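A minimal Python sketch of the network idea follows, assuming a simple owner/member vocabulary. The NetworkModel class and the supplier and project names are hypothetical; this is a plain graph in memory, not the CODASYL data definition language.

```python
from collections import defaultdict


class NetworkModel:
    """Records linked as a graph, so a member may have several owners."""

    def __init__(self):
        self._owners = defaultdict(set)   # member -> owners
        self._members = defaultdict(set)  # owner -> members

    def link(self, owner, member):
        self._owners[member].add(owner)
        self._members[owner].add(member)

    def owners_of(self, member):
        return sorted(self._owners.get(member, ()))

    def members_of(self, owner):
        return sorted(self._members.get(owner, ()))
```

Linking the same part to both a supplier and a project gives that part two owners, something a strict tree cannot express. The graph can be navigated in either direction through owners_of and members_of.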
Relational databases developed from the work of Codd (1970) and represent datasets as ‘tables’ in certain forms called normalized tables. The business or real-world representation and use of the data is modeled through techniques like Entity-Relationship models to capture the interpretation. The resulting design is accompanied by a data dictionary and by supporting reference tables, such as master lists that provide controlled vocabulary and values for every table. Together, these have made considerable progress in capturing the semantics of data in the model.
Most modern information systems deploy relational database models.
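The role of a reference table as controlled vocabulary can be sketched with Python's built-in sqlite3 module. The table names (unit, reading), the columns and the helper functions are hypothetical, chosen for illustration; the foreign key is what lets the database itself reject values that have no defined meaning.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a reference table and a data table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.executescript("""
        CREATE TABLE unit (
            code    TEXT PRIMARY KEY,
            meaning TEXT NOT NULL
        );
        CREATE TABLE reading (
            id        INTEGER PRIMARY KEY,
            value     REAL NOT NULL,
            unit_code TEXT NOT NULL REFERENCES unit(code)
        );
        INSERT INTO unit VALUES ('C', 'degrees Celsius'), ('K', 'kelvin');
    """)
    return conn


def add_reading(conn, value, unit_code):
    """Insert a reading; return False if the unit is not in the vocabulary."""
    try:
        conn.execute(
            "INSERT INTO reading (value, unit_code) VALUES (?, ?)",
            (value, unit_code),
        )
        return True
    except sqlite3.IntegrityError:
        return False


def described_readings(conn):
    """Join each value with the meaning of its unit from the reference table."""
    rows = conn.execute(
        "SELECT r.value, u.meaning FROM reading r "
        "JOIN unit u ON u.code = r.unit_code ORDER BY r.id"
    )
    return rows.fetchall()
```

A reading in 'C' is accepted, while a reading in 'F' is rejected because 'F' is not in the master list. The join in described_readings is where the stored value picks up its interpretation.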
Database
1. A database is a model of the real-world interfaces of the data on a computer information system, providing a structure and a semantics (interpretation).
2. In the absence of semantic support, a database ceases to be one.
3. The goodness of any database depends largely on the designed model, which expresses the interpretation captured from the real world.
4. A relational database is a database structure that provides semantics through techniques like E-R models and a data dictionary.
Modern information systems and their value are completely dependent on databases. Fourth-paradigm science is essentially built on databases. For an organization, a team or the CIO, databases are the most important aspect of IS.
The contents of this post are re-created from publicly available information, as a CIO sees it. For the spirit of this read-write culture, see http://blog.ted.com/2007/11/06/larry_lessig/