FRDBMS for imprecise queries based on GEFRED model

When users work with conventional software tools, they have to translate their many-valued logical thinking (approximate reasoning) into two-valued computer logic. Although the Structured Query Language (SQL) is a very powerful tool, it cannot satisfy the need for data selection based on linguistic expressions and degrees of truth. In this paper, we are interested in flexible querying based on fuzzy set theory. Medina et al. have developed a server named fuzzy SQL (FSQL), which supports flexible queries and is based on a theoretical model called GEFRED. To model flexible queries and the concept of fuzzy attributes, an extension of the SQL language, also named FSQL, has been defined. The fuzzy relational database (FRDB) is assumed to have already been defined by the user. In this paper, the work of Medina et al. is extended to implement a new architecture of a fuzzy DBMS based on the GEFRED model. This architecture is based on the concept of weak coupling with the DBMS SQL Server.


Introduction
Databases are a very important component of computer systems. Because of their increasing number and volume, good and accurate access to a database becomes even more important. Organizations work with very large data collections, mainly stored in relational databases. Linguistic expressions are useful for data extraction, analysis, dissemination and decision making. Research on fuzziness in Database Management Systems (DBMS) has produced a number of models aimed at representing imperfect information in databases, or at enabling imprecise queries (often called flexible queries) on conventional database schemas. However, little work has been done from a practical point of view. In this paper, we are interested in the work of Medina et al., who introduced the GEFRED model in 1994 together with its associated language, FSQL. This language introduces new concepts such as fuzzy comparators, fuzzy attributes and fuzzy constants. We propose to implement a new architecture of the Fuzzy Relational DBMS (FRDBMS) based on the GEFRED model. This architecture is based on the weak coupling principle with the RDBMS SQL Server.

General Terms
Database: A classical database is a structured collection of information (records or data) stored in a computer.
Fuzzy Database: A fuzzy database is a database that is able to deal with uncertain or incomplete information using fuzzy logic.
Fuzzy Logic: Fuzzy logic is derived from fuzzy set theory (Zadeh, 1965) and deals with reasoning that is approximate rather than precisely deduced from classical predicate logic. It can be thought of as the application side of fuzzy set theory, dealing with well-thought-out real-world expert values for a complex problem.
FRDB: It is an extension of the relational database. This extension introduces fuzzy predicates in the form of linguistic expressions that, during flexible querying, yield a range of answers (each one with a membership degree), offering the user all intermediate variations between the completely satisfactory answers and the completely unsatisfactory ones.
FRDBMS: It is an extension of the relational DBMS designed to process, store and query imprecise data.
FRDB Models: The two broad approaches are the possibilistic model and the similarity-relation-based model. In their simplest form, these models add a degree, usually in the interval [0, 1], to every tuple. They allow the homogeneity of the data in the DB to be maintained. The main models under both approaches are Prade-Testemale, Umano-Fukami, Buckles-Petry, Zemankova-Kandel and the GEFRED model of Medina et al.

Different Models
This group comprises the models that use possibility theory to represent imprecision. The most important models in this group are the Prade-Testemale model, the Umano-Fukami model, and the GEFRED model. The GEFRED model is an eclectic synthesis of some of the previous models. It consists of a general abstraction that allows for the use of various approaches, regardless of how different they might look.
Prade-Testemale Model: Prade and Testemale published a fuzzy relational database (FRDB) model that allows the integration of what they call incomplete or uncertain data within the framework of possibility theory. Consider an attribute A with domain D. All the available knowledge about the value taken by A for an object x can be represented by a possibility distribution πA(x) over D ∪ {e}, where e is a special element denoting the case in which A does not apply to x.
Umano-Fukami Model: This proposal also uses possibility distributions to model knowledge about information. In this model, if D is the universe of discourse of A(x), πA(x)(d) represents the possibility that A(x) takes the value d ∈ D. The following kinds of knowledge may be modeled: unknown but applicable information, non-applicable information (undefined), and total ignorance (we do not know whether the attribute is applicable or not).
GEFRED Model: The GEneralised model for Fuzzy RElational Databases (GEFRED) was proposed in 1994 by Medina et al. One of the major advantages of this model is that it consists of a general abstraction that allows for the use of various approaches, regardless of how different they might look. In fact, it is based on the generalized fuzzy domain and the generalized fuzzy relation, which include classic domains and classic relations respectively. It constitutes an eclectic synthesis of the various models published so far, with the aim of dealing with the problem of representing and handling fuzzy information in relational DBs.
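The three kinds of knowledge listed above for the Umano-Fukami model are usually written as special possibility distributions over the universe D. The notation below follows the common presentation of that model in the literature; it is an assumption here, since this paper does not state the formulas. Each pair p/d reads "value d has possibility p".

```latex
\begin{align*}
\text{Unknown}   &= \{\, 1/d : d \in D \,\} \\
\text{Undefined} &= \{\, 0/d : d \in D \,\} \\
\text{Null}      &= \{\, 1/\text{Unknown},\ 1/\text{Undefined} \,\}
\end{align*}
```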

Fuzzy Attributes in GEFRED Model
To model fuzzy attributes, we distinguish between two classes: fuzzy sets as fuzzy values and fuzzy degrees as fuzzy values.
Fuzzy Sets as Fuzzy Values: These fuzzy attributes may be classified into four data types. All of them include the values unknown, undefined, and null.
Fuzzy Attributes Type 1 (FTYPE1): these are attributes with "precise data", classic or crisp (traditional). However, they can have linguistic labels defined over them, which allow us to make the query conditions for these attributes more flexible.
Fuzzy Attributes Type 2 (FTYPE2): these attributes admit both crisp and fuzzy data, in the form of possibility distributions over an underlying ordered domain (fuzzy sets).
Fuzzy Attributes Type 3 (FTYPE3): these are attributes over "data of a discrete non-ordered domain with analogy". Labels are defined for these attributes ("blond", "red", "brown", etc.); they are scalars with a similarity (or proximity) relationship defined over them, which indicates to what extent each pair of labels is similar.
Fuzzy Attributes Type 4 (FTYPE4): these attributes are defined in the same way as Type 3 attributes, except that a similarity relationship between the labels is not required.
Fuzzy Degrees as Fuzzy Values: The domain of these degrees is the interval [0, 1], although other values are also permitted, such as a possibility distribution. The meaning of these degrees varies and depends on their use. The most important meanings given to the degrees by some authors are: fulfillment degree, uncertainty degree, possibility degree and importance degree. The ways of using these fuzzy degrees are classified into two families: associated degrees (type 5, type 6, type 7) and non-associated degrees (type 8).
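As an illustration of a Type 3 attribute, the following minimal sketch stores a similarity relationship over hair-colour labels. The class name `SimilarityRelation` and the numeric degrees are hypothetical; the paper does not give concrete values. The sketch also assumes the relation is symmetric and reflexive, which is the usual reading of a similarity (proximity) relationship.

```python
class SimilarityRelation:
    """Symmetric, reflexive similarity relation over a set of scalar labels."""

    def __init__(self, labels, degrees):
        self._labels = set(labels)
        self._degrees = {}
        for (x, y), degree in degrees.items():
            if x not in self._labels or y not in self._labels:
                raise KeyError(f"unknown label in pair ({x!r}, {y!r})")
            if not 0.0 <= degree <= 1.0:
                raise ValueError("similarity degrees must lie in [0, 1]")
            # A frozenset key makes the stored pair order-independent.
            self._degrees[frozenset((x, y))] = degree

    def similarity(self, x, y):
        if x not in self._labels or y not in self._labels:
            raise KeyError("unknown label")
        if x == y:
            return 1.0  # every label is fully similar to itself
        # Pairs that were never declared are treated as not similar at all.
        return self._degrees.get(frozenset((x, y)), 0.0)


# Hypothetical degrees, chosen only for illustration.
HAIR = SimilarityRelation(
    ["blond", "red", "brown"],
    {("blond", "red"): 0.6, ("red", "brown"): 0.5, ("blond", "brown"): 0.3},
)
```

A Type 4 attribute would use the same labels without the `degrees` table, so only identical labels would be comparable.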

The FSQL Language
The FSQL language is a genuine extension of the SQL language for modeling fuzzy queries. Valid statements in SQL are also valid in FSQL.
Linguistic Labels: if an attribute is capable of fuzzy treatment, then linguistic labels can be defined on it. These labels are preceded by the symbol $ to distinguish them easily. Every label has an associated trapezoidal possibility distribution (for fuzzy attributes of type 1 and 2) or a scalar (for fuzzy attributes of type 3 and 4).
Function CDEG: the function CDEG (compatibility degree) takes an attribute as its argument. It computes the fulfillment degree of the query condition for that attribute.
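The relation between a trapezoidal label and a compatibility degree can be sketched as follows. This is a minimal illustration, not the FSQL implementation: the names `Trapezoid`, `cdeg` and `select`, the label `YOUNG` and its four breakpoints are all hypothetical, and the degree of a crisp value is assumed to be its membership in the label's trapezoid. The `thold` parameter plays the role of a minimum fulfillment degree, with 1 as the default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal possibility distribution with a <= b <= c <= d."""
    a: float
    b: float
    c: float
    d: float

    def membership(self, x: float) -> float:
        if self.b <= x <= self.c:        # core: fully compatible
            return 1.0
        if x <= self.a or x >= self.d:   # outside the support
            return 0.0
        if x < self.b:                   # rising edge between a and b
            return (x - self.a) / (self.b - self.a)
        return (self.d - x) / (self.d - self.c)  # falling edge between c and d


def cdeg(value: float, label: Trapezoid) -> float:
    """Compatibility degree of a crisp value with a linguistic label."""
    return label.membership(value)


def select(rows, attr, label, thold=1.0):
    """Keep the rows whose degree for `attr` reaches the threshold."""
    result = []
    for row in rows:
        degree = cdeg(row[attr], label)
        if degree > 0.0 and degree >= thold:
            result.append((row["name"], degree))
    return result


# Hypothetical label and data, chosen only for illustration.
YOUNG = Trapezoid(15.0, 18.0, 25.0, 35.0)
PEOPLE = [
    {"name": "Ana", "age": 20.0},
    {"name": "Luis", "age": 30.0},
    {"name": "Eva", "age": 40.0},
]
```

With these assumed values, `select(PEOPLE, "age", YOUNG, thold=0.5)` keeps Ana with degree 1.0 and Luis with degree 0.5, while the default threshold of 1 keeps only Ana.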
Fulfillment Thresholds: for each simple condition, a fulfillment threshold τ may be established (the default is 1) with the format <condition> THOLD τ, indicating that the condition must be satisfied with a minimum degree τ ∈ [0, 1] to be considered.
Fuzzy Quantifiers: there are two types, absolute and relative. They allow us to use expressions like "most", "almost all", "many", "very few", etc.

New Architecture of the FRDBMS
We propose the weak coupling approach with the DBMS. The concept of weak coupling is shown in Figure 1.

Figure 1 Weak Coupling Concept
The proposed FRDBMS follows the GEFRED model. The data description and manipulation language is therefore FSQL. Since FSQL is an extension of SQL, an FRDBMS can model an RDB (described in SQL) or an FRDB (described in FSQL). The principle of this coupling is the definition of a software layer that transforms the commands written by the user in FSQL into their SQL equivalents. To implement a system that represents and manipulates "imprecise" information, Medina et al. developed the FIRST architecture (a Fuzzy Interface for Relational SysTems). It is built on the RDBMS client-server architecture provided by Oracle. It extends the existing structure and adds new components to handle fuzzy information. The most important component added to this architecture is the FSQL Server, which translates flexible queries written in FSQL into a language that the DBMS understands (SQL). The installation of this architecture is described in Figure 2.
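The idea of a software layer that rewrites FSQL into SQL can be sketched as follows. This is only an illustration under strong assumptions, not the translation performed by the FSQL Server or by the proposed layer: it accepts a single toy statement shape, the label catalogue `LABELS` and the function names are hypothetical, the comparator name FEQ (fuzzy equal) is taken from the FSQL literature rather than from this paper, and Python's built-in `sqlite3` stands in for the target RDBMS. For a Type 1 attribute, a condition with threshold τ on a trapezoidal label (a, b, c, d) reduces to a crisp range test on the τ-cut of the trapezoid.

```python
import re
import sqlite3

# Hypothetical label catalogue: (table, attribute, label) -> trapezoid (a, b, c, d),
# assumed to satisfy a < b <= c < d.
LABELS = {("person", "age", "young"): (15.0, 18.0, 25.0, 35.0)}

# Toy grammar accepted by this sketch (one condition only):
#   SELECT <cols> FROM <table> WHERE <attr> FEQ $<label> THOLD <tau>
FSQL_PATTERN = re.compile(
    r"SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)\s+WHERE\s+"
    r"(?P<attr>\w+)\s+FEQ\s+\$(?P<label>\w+)\s+THOLD\s+(?P<tau>[0-9.]+)\s*;?$",
    re.IGNORECASE,
)


def translate(fsql: str) -> str:
    """Rewrite one toy FSQL statement as plain SQL."""
    match = FSQL_PATTERN.match(fsql.strip())
    if match is None:
        raise ValueError("statement not supported by this sketch")
    table, attr = match["table"].lower(), match["attr"].lower()
    a, b, c, d = LABELS[(table, attr, match["label"].lower())]
    tau = float(match["tau"])
    if not 0.0 < tau <= 1.0:
        raise ValueError("THOLD must lie in (0, 1] in this sketch")
    # Crisp bounds of the tau-cut of the trapezoid.
    low = a + tau * (b - a)
    high = d - tau * (d - c)
    # The membership degree is computed by the crisp DBMS itself.
    cdeg = (
        f"CASE WHEN {attr} BETWEEN {b} AND {c} THEN 1.0 "
        f"WHEN {attr} < {b} THEN ({attr} - {a}) / {b - a} "
        f"ELSE ({d} - {attr}) / {d - c} END"
    )
    return (
        f"SELECT {match['cols']}, {cdeg} AS cdeg "
        f"FROM {table} WHERE {attr} BETWEEN {low} AND {high}"
    )


def run_demo() -> list:
    """Run the translated query against an in-memory stand-in database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT, age REAL)")
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [("Ana", 20.0), ("Luis", 30.0), ("Eva", 40.0)],
    )
    sql = translate("SELECT name FROM person WHERE age FEQ $young THOLD 0.5")
    rows = conn.execute(sql + " ORDER BY name").fetchall()
    conn.close()
    return rows
```

Because the generated statement is ordinary SQL, the underlying DBMS needs no modification, which is the point of weak coupling. A real layer would also need a proper parser, a metadata store for labels, and support for the other attribute types.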

Conclusion and Future Scope
Fuzzy relational databases have been extensively studied at a theoretical level. The majority of these works use the formalism of fuzzy sets to model linguistic terms such as "moderate" or "medium" and to evaluate the predicates that include such terms. Medina et al. have developed a server named fuzzy SQL (FSQL), which supports flexible queries and is based on a theoretical model called GEFRED. This server has been programmed in PL/SQL under the Oracle database management system. To model flexible queries and the concept of fuzzy attributes, an extension of the SQL language, also named FSQL, has been defined; it extends SQL with many fuzzy concepts in order to support flexible queries. The FRDB is assumed to have already been defined by the user. In this proposal, we extend the work of Medina et al. to implement a software layer that converts FSQL queries into SQL queries. The proposed FRDBMS architecture, based on the GEFRED model, relies on the concept of weak coupling with the DBMS. It gives the user a powerful and easy-to-use data mining tool for querying databases with linguistic expressions, which improves the quality of the selection process. As future work, a new architecture supporting the concept of strong coupling with the DBMS can be developed.
As a further perspective, we also mention the automatic mapping of existing relational DBs to FRDBs. This point has been worked out theoretically but is not yet implemented; we think it will make FRDBs easier to use in real applications.