Quick update: the binaries for the latest build of -ontopPro- (the new name for the OBDA plugin for Protege) are now available for download! We are updating all the websites and documentation to describe the new features, and we will make the official announcement as soon as that is done.
Tuesday, April 17, 2012
Friday, April 13, 2012
Version 1.7-alpha is almost out!
We are very excited to announce that we are about to release a new version of the framework. Stay tuned over the next few days for a totally redesigned OBDA plugin for Protege, a super fast and feature-rich Quest, and much more!
Saturday, January 7, 2012
Quest Performance for 1.7
Just a small teaser on the performance improvements that you can expect in the next release. We are so excited about them that we can't wait to tell you.
Query containment: A critical part of query optimization in Quest relies on conjunctive query containment (CQC) to remove redundant queries from the query rewriting. This is done by pairwise checks: query1 vs query2, and so on. In complex cases, Quest might need to perform thousands or even hundreds of thousands of checks for a given query, so the performance of the containment check algorithm is CRITICAL. We have been improving our algorithms and API to minimize this cost, and so far we have achieved a reduction of up to ... 95% of the cost! This means that a complex query that used to take 12 s or 20 s to rewrite due to CQC will now take about 0.5 s or 1 s!
The best part is that what we have implemented so far is just a small fraction of all the optimizations related to query containment that we have in mind!
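To make the containment check concrete, here is a minimal sketch of the textbook homomorphism test for conjunctive queries, plus a pairwise pass that drops redundant queries from a union. It is not Quest's implementation, and the query encoding (tuples of strings, variables prefixed with `?`) and the function names `contained_in` and `remove_redundant` are assumptions made for this illustration only.

```python
from typing import Dict, List, Optional, Tuple

Atom = Tuple[str, Tuple[str, ...]]          # (predicate, arguments)
Query = Tuple[Tuple[str, ...], List[Atom]]  # (head terms, body atoms)


def is_var(term: str) -> bool:
    # Illustrative convention: variables start with "?", anything else is a constant.
    return term.startswith("?")


def extend(mapping: Dict[str, str], source, target) -> Optional[Dict[str, str]]:
    """Extend `mapping` so each source term maps to the matching target term."""
    if len(source) != len(target):
        return None
    new = dict(mapping)
    for s, t in zip(source, target):
        if is_var(s):
            if new.setdefault(s, t) != t:   # variable already bound to something else
                return None
        elif s != t:                        # constants must match exactly
            return None
    return new


def contained_in(q1: Query, q2: Query) -> bool:
    """True if q1 is contained in q2, i.e. a homomorphism from q2 into q1 exists."""
    head1, body1 = q1
    head2, body2 = q2
    start = extend({}, head2, head1)        # heads must line up first
    if start is None:
        return False

    def search(i: int, mapping: Dict[str, str]) -> bool:
        if i == len(body2):
            return True
        pred, args = body2[i]
        for pred1, args1 in body1:          # try every atom of q1 as the image
            if pred1 == pred:
                nxt = extend(mapping, args, args1)
                if nxt is not None and search(i + 1, nxt):
                    return True
        return False

    return search(0, start)


def remove_redundant(queries: List[Query]) -> List[Query]:
    """Drop every query contained in another one (earliest copy of equivalents wins)."""
    kept = []
    for i, q in enumerate(queries):
        redundant = any(
            contained_in(q, other) and (not contained_in(other, q) or j < i)
            for j, other in enumerate(queries) if j != i
        )
        if not redundant:
            kept.append(q)
    return kept


# q1(x) :- Employee(x), worksFor(x, y)     q2(x) :- Employee(x)
q1: Query = (("?x",), [("Employee", ("?x",)), ("worksFor", ("?x", "?y"))])
q2: Query = (("?x",), [("Employee", ("?x",))])
print(remove_redundant([q1, q2]) == [q2])   # q1 is subsumed by q2
```

The pairwise loop in `remove_redundant` is what makes the number of checks grow quadratically with the size of the rewriting, which is why the cost of a single check matters so much.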
SQL Analysis: Up to version 1.6, we handled SQL as a black box. Quest did not really understand any of the SQL queries present in the mappings. The only thing Quest extracted from a query was its "signature", that is, the columns in the SELECT clause. Because of this, when generating the final SQL queries, Quest always had to rely on nesting sub-queries. For example:
SELECT view1.x, view2.y FROM
(SELECT x FROM employee) view1,
(SELECT x,y FROM worksfor) view2
WHERE
view1.x = view2.y
This is, in general, not good for performance. In order to plan this kind of query, DBMSs like Postgres, DB2 and Oracle have to "flatten" it. If this is not done, chances are that the query plan will be poor: indexes might not be used and join orders might be suboptimal. What is worse, if the DBMS doesn't implement query flattening (MySQL, for example), then the nested views have to be materialized before use, with no indexes or anything... VERY BAD PERFORMANCE.
The good news is that, for version 1.7, Quest will include a brand new SQL analyzer, together with updated code that uses the result of this analysis during query rewriting and SQL query generation. The first version of the analyzer doesn't understand 100% of SQL; however, it covers around 80% of the SQL queries we have seen in most use cases. The result is that the SQL queries generated by Quest will be much closer to what a human would write by hand, and the DBMS will be able to optimize them much better.
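As a rough illustration of what "closer to what a human would write" means, the sketch below runs the nested query from the example next to a hand-flattened equivalent and checks that both return the same rows. The flattened query is our own rewrite, not actual Quest output; the table contents are invented, and SQLite is used only because it ships with Python.

```python
import sqlite3

# In-memory database with made-up rows for the two tables used in the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (x TEXT)")
conn.execute("CREATE TABLE worksfor (x TEXT, y TEXT)")
conn.executemany("INSERT INTO employee VALUES (?)", [("ann",), ("bob",)])
conn.executemany("INSERT INTO worksfor VALUES (?, ?)",
                 [("ann", "acme"), ("carl", "bob")])

# Nested form: each mapping query is wrapped as an opaque sub-query.
nested = """
    SELECT view1.x, view2.y FROM
      (SELECT x FROM employee) view1,
      (SELECT x, y FROM worksfor) view2
    WHERE view1.x = view2.y
"""

# Flat form: the sub-queries are unfolded into a plain join over base tables.
flat = """
    SELECT employee.x, worksfor.y
    FROM employee, worksfor
    WHERE employee.x = worksfor.y
"""

nested_rows = sorted(conn.execute(nested).fetchall())
flat_rows = sorted(conn.execute(flat).fetchall())
print(nested_rows == flat_rows)  # both forms return the same answers
```

The answers are identical; the difference is only in how much work the DBMS planner has to do before it can use indexes and pick a join order.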
If you would like to give version 1.7 a test drive before it is released, please contact us.
Friday, November 25, 2011
Quest and SNOMED and other huge ontologies
Today we started working with SNOMED. We are using it as a benchmark for our ontology loading and preprocessing algorithms. Even though SNOMED is strictly outside the OWL 2 QL profile, it seems to be a very good test bed for us, since most of the axioms in the ontology do fall in the OWL 2 QL fragment (with some minor syntax adjustments). Even better, it seems we will also be able to approximate the rest and get complete inferences for ground instances (which is what often matters in the data-intensive applications we have in mind with Quest).
Things to do to fully support SNOMED:
- Upgrade Quest to support the OWLAPI 3.
- Optionally, upgrade Quest to support Protege 4.1 (however, once the previous one is done, this should be straightforward).
- Upgrade our ontology translation mechanism (from OWLAPI to internal representation)
- Benchmark and fix.
- Possibly upgrade our semantic index implementation.
Once the loading performance work is done, we should be able to easily link massive amounts of data to SNOMED concepts and relationships with the techniques we already have. This would be especially useful for applications like semantic search with NLP concept tagging of resources, which is a common use of SNOMED and which generates huge amounts of data assertions. One more step towards getting rid of forward/backward chaining!
Expect good performance in huge ontologies like this very soon!
Wednesday, November 2, 2011
Performance! things to expect for version 1.7
Hi, we just came back from ISWC and it's time to get back to Quest and the OBDA plugin. There are several important things that we have just started implementing and that you can expect in the 1.7 release. We would like to give you a sneak peek at them:
- T-Mapping (virtual mode): As you know, right now the virtual mode in Quest is not efficient. This is because the way in which we generate SQL from the mappings of the OBDA model is very simple: basically a one-to-one implementation of the methods described in [1], which tend to generate too many SQL queries. However, we already have the theory to optimize the SQL generation step [2]. The technique is called T-Mappings for OBDA, and it is basically a query-preserving transformation of the mappings of the system that allows us to simplify both the rewriting process and the SQL generated by the system. Once T-Mappings are implemented in Quest, the system will get a dramatic boost in performance in the virtual OBDA setting, similar to the boost that you get when you use "Semantic Index" instead of "direct" mode. No more exponential SQL queries!
- SQL Analysis (virtual and classic mode): We just finished implementing the SQL API for the OBDALib and we are now integrating it with the SQL generation module of Quest. This will allow us to generate very efficient SQL, with little or no nesting at all.
- Improved query containment detection.
- Bulk loading and external databases (classic mode): Currently, Quest in classic OBDA mode is limited by the amount of RAM you have in the system, for two reasons. First, Quest can only receive as input an OWLAPI OWLOntology object, and not a reference to the location of the ontology. This means that all the data has to be loaded before Quest can receive it (RAM limit 1). To fix this we will allow you to load an ontology and data using URL references and files. Second, once Quest loads the data, it stores it in an in-memory H2 database (RAM limit 2). To fix this we will allow you to instruct Quest to store the ontology in an external database, where you will not have such limits.
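For readers who want a feel for the T-Mapping item in the list above, here is a deliberately simplified sketch of the underlying idea: compile the class hierarchy into the mappings, so that the source queries for a class already include those of all its subclasses and the rewriting step no longer has to expand the hierarchy. The technique in [2] is more general and includes optimizations not shown here; the function name `saturate_mappings`, the class names and the SQL strings are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple


def saturate_mappings(
    mappings: Dict[str, Set[str]],
    subclass_of: List[Tuple[str, str]],
) -> Dict[str, Set[str]]:
    """Give every class the source queries of all its (direct and indirect) subclasses."""
    result = {cls: set(sources) for cls, sources in mappings.items()}
    for sub, sup in subclass_of:            # make sure every class has an entry
        result.setdefault(sub, set())
        result.setdefault(sup, set())

    changed = True
    while changed:                          # repeat until nothing new is propagated
        changed = False
        for sub, sup in subclass_of:
            missing = result[sub] - result[sup]
            if missing:
                result[sup] |= missing
                changed = True
    return result


# Hypothetical ontology: Manager is a subclass of Employee, Employee of Person.
hierarchy = [("Manager", "Employee"), ("Employee", "Person")]
original = {
    "Manager": {"SELECT id FROM managers"},
    "Employee": {"SELECT id FROM staff"},
}
saturated = saturate_mappings(original, hierarchy)
for cls in sorted(saturated):
    print(cls, sorted(saturated[cls]))
```

With the saturated mappings, a query asking for all `Person` instances can be answered directly from the sources attached to `Person`, without first being rewritten into one query per subclass.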
Other improvements:
- A Sesame implementation for Quest
- Support for OWLAPI 3 and Protege 4.1
[1] Linking data to ontologies. Antonella Poggi, Domenico Lembo, Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, and Riccardo Rosati. J. on Data Semantics, X:133-173, 2008.
[2] Dependencies: Making ontology based data access work in practice. Mariano Rodriguez-Muro and Diego Calvanese. In Proc. of the 5th Alberto Mendelzon Int. Workshop on Foundations of Data Management (AMW 2011), volume 749 of CEUR Electronic Workshop Proceedings, http://ceur-ws.org/, 2011.
Tuesday, October 18, 2011
See you at ISWC!
Next week we will have a poster at ISWC related to one of the core optimizations implemented in Quest for classic mode, the Semantic Index. If you would like to meet us, ISWC is the perfect time! See you there!
Version 1.6 is out!
Today we release version 1.6 of the OBDALib! This is a major release that includes lots of new features and optimizations in the OBDALib framework. Some noteworthy features of this release are:
Changes in Quest:
- A new optimization called “Equivalence elimination” that allows Quest to dramatically simplify reasoning in the presence of Class or Property equivalences. Works for virtual or classic OBDA.
- A first refactoring of the "Semantic Index" technique to allow for faster classification time in classic OBDA mode.
- New option for classic OBDA that saves you from having to import data into your ontologies manually. Now you can ask Quest to import the data on its own, from one or more sources and/or from the ontology!
- Added support for the Teiid database virtualization system. Now you can integrate multiple sources using classic or virtual mode, and not only JDBC sources but also XML, XSL, CSV: anything that Teiid supports.
- Many bug fixes and stronger compliance with RDFS/OWL2QL
- New package, OBDALib-Core, that lets you use Quest directly from Java without the need for Protege! (Check out the tutorials.)
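The "Equivalence elimination" item above is not described in detail in this post, so the following is only a guess at the general idea, not Quest's code: group classes that are declared equivalent, pick one representative per group, and rewrite the remaining axioms to mention representatives only. The function names and the example ontology are invented for illustration.

```python
from typing import Dict, List, Tuple


def representatives(equivalences: List[Tuple[str, str]]) -> Dict[str, str]:
    """Map every class to one representative of its equivalence group (union-find)."""
    parent: Dict[str, str] = {}

    def find(name: str) -> str:
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]   # path halving
            name = parent[name]
        return name

    for a, b in equivalences:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            low, high = sorted((root_a, root_b))  # smallest name wins, for determinism
            parent[high] = low
    return {name: find(name) for name in list(parent)}


def simplify(
    axioms: List[Tuple[str, str]],
    equivalences: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Rewrite subclass axioms over representatives, dropping trivial and duplicate ones."""
    rep = representatives(equivalences)
    out: List[Tuple[str, str]] = []
    for sub, sup in axioms:
        pair = (rep.get(sub, sub), rep.get(sup, sup))
        if pair[0] != pair[1] and pair not in out:
            out.append(pair)
    return out


# Hypothetical ontology: Employee, Worker and StaffMember are all equivalent.
equivalent = [("Employee", "Worker"), ("Worker", "StaffMember")]
subclass_axioms = [
    ("Manager", "Worker"),
    ("Manager", "StaffMember"),
    ("Employee", "Person"),
    ("Worker", "Employee"),
]
simplified = simplify(subclass_axioms, equivalent)
print(simplified)
```

After this step the reasoner only has to deal with one name per group, and answers for the other names can be recovered through the representative map.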
Changes in the OBDA Plugin:
- Ability to compute OBDA model statistics (to compute how many data triples/ABox assertions can be generated using an OBDA model)
- New feature to test data source connections
- New feature to test SQL queries in the mappings.
- Many bug fixes to improve stability of the system.