In this module we will talk about data modeling. You will recall that when we introduced software engineering and software development methodology, we talked about requirements analysis, where we mentioned that we perform both data modeling and process modeling as we come to understand the application environment of the user.
So let us start by asking again why we model. We build models of complex systems because it’s difficult to understand any such system in its entirety in one shot. We therefore have to build a model and try to understand the system in terms of its components and how those components relate to each other. So one of the important reasons for modeling complex systems is to improve our understanding of such systems, because we cannot understand them entirely at one go.
We need to develop a common understanding of the problem so that we can proceed toward the solution. This common understanding is between all the people involved and also the users for whom we are trying to propose a solution. We cannot afford a trial-and-error approach. A model lets us establish that we are proceeding in the correct direction and that our understanding of the user’s environment is correct, because that understanding is reflected in the model. This removes the trial-and-error kind of approach, and it also reduces the risk in the overall development.
A model is also extremely useful for communicating the required structure and behavior of our system. We try to capture that in the model and then put it in a form which can be understood and verified by others. So these are the reasons why we model.
Let us see how we model. We choose an appropriate modeling concept or an appropriate modeling paradigm. This should be such that our solution can be properly expressed. So this choice of the right model is extremely important, and it has considerable influence on shaping the solution we propose for the problem. So we choose a model for the kind of purpose we have at hand. This may be modeling of data, or it may be modeling of the processing defined in a given application.
No single model is sufficient. This is an important point: in most analysis phases, we try to build different types of models which represent different perspectives of the same environment or the same application. It is important to approach a complex system from different points of view, each of which might be best represented by a different modeling technique. So a single model may not be sufficient.
In fact, we have been talking about two independent models already. One is for the modeling of data and one is for the modeling of processing. Even in the object-oriented model that we had mentioned earlier, different perspectives are taken. One is the object model, which is a static kind of model that reflects the different objects present in the user’s environment. And then there is a dynamic model, which defines the interactions among these objects. So we do take different perspectives, and we try to use an appropriate modeling concept or an appropriate modeling paradigm to represent these perspectives.
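To make the two perspectives concrete, here is a minimal sketch in Python. The library setting and the names `Book`, `Member` and `borrow` are assumptions made up for illustration; they are not part of the lecture. The classes stand for the static object model, and the function stands for one interaction that a dynamic model would describe.

```python
from dataclasses import dataclass, field

# Static (object) model: which objects exist and what data they hold.
# Both classes are hypothetical examples.
@dataclass
class Book:
    title: str
    on_loan: bool = False

@dataclass
class Member:
    name: str
    borrowed: list = field(default_factory=list)

# Dynamic model: one interaction between the objects over time.
def borrow(member: Member, book: Book) -> bool:
    """Lend the book to the member if it is available; return True on success."""
    if book.on_loan:
        return False
    book.on_loan = True
    member.borrowed.append(book)
    return True
```

The same two objects appear in both perspectives. The static view says what they are, and the dynamic view says what happens between them.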
The best models are connected to reality. In fact, the purpose of the model is to abstract the important aspects of reality and represent them very clearly. So naturally the models must meet the requirements of the real world situation that we want to analyze. So these are the different issues that we must keep in mind when we define our modeling exercise and ask what model we should choose.
So in this particular module, we are talking about data modeling. We will define the notion of data modeling. We are going to build these models in terms of the important concepts of entity and relationship. In fact, the model that we are going to discuss in detail is the entity relationship model, or the ER model in short. We will look at the diagramming concepts which are available in this modeling technique. We will talk about other related concepts such as keys and weak entities. Then we will also talk about extensions to the ER model. So these are the different topics we’ll cover in this particular module.
Let’s begin by seeing the purpose. The purpose of the data model is to represent the operational data in the real world. These operational data describe the various events, entities and activities which take place in the business environment for which we are proposing the solution. So remember that we are trying to represent the operational data, and there may be a lot of these data, describing different entities, different activities which happen in the business environment, and different types of events which take place. All these data need to be captured in solving that application problem. So the objective of the data model is to represent these operational data.
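As a rough illustration of operational data, the sketch below records one kind of entity and one kind of event in Python. The business setting and every name in it (`Customer`, `OrderPlaced`, `total_ordered`, the sample values) are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical entity: something the business keeps data about.
@dataclass(frozen=True)
class Customer:
    customer_id: int
    name: str

# Hypothetical event: something that happens in the business,
# recorded together with a reference to the entity involved.
@dataclass(frozen=True)
class OrderPlaced:
    order_id: int
    customer_id: int  # refers to Customer.customer_id
    placed_on: date
    amount: float

# Made-up sample data.
customers = [Customer(1, "Asha"), Customer(2, "Ravi")]
orders = [
    OrderPlaced(100, 1, date(2024, 1, 5), 250.0),
    OrderPlaced(101, 1, date(2024, 1, 9), 100.0),
    OrderPlaced(102, 2, date(2024, 1, 9), 75.0),
]

def total_ordered(customer_id: int) -> float:
    """Sum the order amounts recorded for one customer."""
    return sum(o.amount for o in orders if o.customer_id == customer_id)
```

A data model states which such entities and events exist and how they refer to each other. Here the event refers to the entity through `customer_id`.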
The model may be described at various levels. The model may be at the logical level or at the physical level. The physical level naturally addresses not only what data we have but also how that data is stored, retrieved, updated and so on. So this would be the way in which the data would actually be handled.
Very often we first try to understand the data at the logical level. The model may also be at the external level, the conceptual level or the internal level. What we really mean here is that when we say the model is at the external level, it defines the data as seen by a particular user of the application. Naturally his view of the data may be a subset of the overall data content in the application, whereas the conceptual model represents the data in its totality, at a level which represents the important concepts in the application.
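The idea that an external view is a subset of the conceptual model can be sketched as follows. This is only an illustration under assumed names: the student record, the two user roles and the function `external_view` are all hypothetical.

```python
# Conceptual level: every attribute the application keeps about a student.
# The record and its values are made up for illustration.
CONCEPTUAL_STUDENT = {
    "roll_no": 17,
    "name": "Meera",
    "hostel_room": "B-204",
    "fees_due": 1500.0,
    "grade_point": 8.4,
}

# External level: each user sees only the attributes relevant to them.
EXTERNAL_VIEWS = {
    "accounts_office": ("roll_no", "name", "fees_due"),
    "hostel_warden": ("roll_no", "name", "hostel_room"),
}

def external_view(record: dict, user: str) -> dict:
    """Project the conceptual record onto the attributes visible to one user."""
    return {attr: record[attr] for attr in EXTERNAL_VIEWS[user]}
```

For example, `external_view(CONCEPTUAL_STUDENT, "accounts_office")` keeps only the roll number, name and fees due, while the conceptual record still holds all five attributes.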