GATE.ac.uk - sale/iswc08/swese/comments.txt

---------------------------------------------

Paper: 9
Title: Enhanced semantic access of software artefacts

-------------------- review 1 --------------------

PAPER: 9
TITLE: Enhanced semantic access of software artefacts

OVERALL RATING: 1 (weak accept)
REVIEWER'S CONFIDENCE: 2 (medium)
----------------------- REVIEW --------------------

sec 1
- enable semantically -> semantically enable
- tasks vs. steps?
- very widely -> avoid superlatives
- with hundreds and -> of
- the intro reads a bit entangled. please restructure. a good rule of thumb is
1 para for context - what are we talking about?
1 para for motivation - what is the problem you tackle?
1 para for contribution - how do you solve it?
1 para for outline
- find easily -> easily find

sec 2
- i found it good that you introduce a concrete case study. however, sec 2 reads rather like a description of your prototype, and, thus, seems misplaced. i would expect here motivation/challenges/requirements and not a feature list of your solution
- there are typically -> is
- generates automatically -> automatically generates

sec 3
- what is ol?
- derive automatically -> automatically derive
- 2nd para reads like related work
- a screenshot of the taxonomy would be nice
- i would be interested in the quality of the acquired ontology: usually learnt ontologies are quite unusable, and it is better to build them manually from scratch right away, based on a foundational ontology and applying ontology design patterns

sec 4
- text and figure should be better aligned. i read many terms in the figure that are not explained in the text and vice versa.

sec 4.2
- of *a* semantically annotated document
- does the ontology only contain attributes (which would be trivial) or also relations?
- what kind of resource is annotated in fig 3? please explain

sec 4.4
- the paper seems rather incremental and focussed on the case study. unclear: what exactly is the contribution?
- is the example "countries in europe" relevant for gate?

sec 5
- last para is future work

sec 6.2
- sotware -> software
- similar to us -> ours

-------------------- review 2 --------------------

PAPER: 9
TITLE: Enhanced semantic access of software artefacts

OVERALL RATING: 1 (weak accept)
REVIEWER'S CONFIDENCE: 4 (expert)
----------------------- REVIEW --------------------

This is a well-written and clear paper that nicely fits into the scope of the workshop. The paper proposes a knowledge engineering method for managing software artifacts that are currently scattered across diverse places in software repositories, yet relevant for the projects at hand. The proposed solution consists of several phases: first, the authors leverage methods for ontology learning from software artifacts; second, they provide a method for semantic annotation of software artifacts using the ontologies obtained in the first step; finally, they leverage a method that translates queries defined in natural language into queries for semantic repositories.

The proposed solution seems fully developed, and it appears to be a complete prototype that addresses a long-standing problem of knowledge management in software engineering. I think that it is worth presenting this paper at the workshop. However, I have some concerns that would be important to address in this paper and in the authors' future work:

- The process of ontology learning is explained very briefly. While I would not normally pinpoint this issue, there are two main reasons that drew my attention to it in this paper. First, the authors claim that what distinguishes their work from related work is the use of ontology learning. As I understood from the paper, the authors used an off-the-shelf system for ontology learning. Thus, my question is: what is the authors' contribution here? I see no clear contribution other than the evaluation of the ontology learning method in this particular context (which could be a very good one, and important for the practical, problem-driven research where I am coming from). Based on the presentation, I got the impression that ontology learning is a silver bullet and does not introduce any issues. That, in fact, triggered my second concern, based on the expensive empirical study we are currently doing with several ontology learning tools: ontology learning tools are quite limited and very challenging for regular software developers to use. Thus, what is the quality of such ontologies, and what are the costs (time to invest) of their development? Such a variable in your system cannot be ignored, and it has to be properly addressed through a very deep critical analysis of the quality of the proposed solution, its limitations, and lessons learned. No one expects from you a story that everything is perfect; if it were, why is all this not sold to a company and widely used as the silver bullet of the semantic web? As a researcher, I can only be very appreciative of such a critical discussion, as that is a real contribution to the researchers trying to advance this area.

- Another thing is related to the annotation used, which again does not seem to be original w.r.t. the related work. Yet, where you may claim originality is exactly in leveraging your learned ontology and evaluating how suitable it is for the tasks you are undertaking. Thus, again, your critical discussion (and empirical analysis) of that aspect could be really valuable. It would also be useful if you positioned your annotation mechanism w.r.t. the classification of semantic annotators given in [1].

- It seems that your natural language-based query system is your main original contribution to the area of the Semantic Web, and here you are nicely applying it to the domain of software engineering. However, information about your own main contribution is reported elsewhere, which means that readers of your paper will first have to read some of the cited papers to fully understand the possible limitations of that component. Moreover, what is the impact of the previous two stages (neither ontology learning nor annotation is perfect), combined with the variability introduced by the third component? I would like to see the limitations of your work there. Consider that those don’t have to be perfect numbers, but you would be the first to report such comprehensive results. Another idea that you may leverage in your NLP-based GUI is to suggest possible relations and other concepts while users are typing their queries, by leveraging existing knowledge defined in the ontology, similar to the ideas of [2]. I am sure that this may nicely increase the usability of your system.

- Finally, your conclusion would benefit much if you provided a critical analysis of how your approach improves this area, where the use of ontologies for software engineering has a long research tradition (well, as long as one can say for this challenging and yet fun area). For example, where do you think your work stands w.r.t. approaches such as LaSSIE? Where are we 10-13 years later? It would also be relevant to compare your work with [3] and [4]. Yes, those projects have different goals, but there are many intersections between your work and theirs. At the end of the day, you are all doing something which the software engineering world calls mining software repositories, and so it would be quite useful for you to explore the achievements of that community, starting from its major event, the MSR conference [5]; a good introduction to this area is given in [6]. And why not try to disseminate your work in those communities as well? It might be fun and useful :-) .

In a nutshell, I like your efforts, and I think that this paper should be presented at the workshop. However, I strongly encourage you to provide a much more critical analysis of the results presented here, as this can only lead to greater progress in this area and increase the significance of your research contribution. I am looking forward to your future papers where you will address those issues.

[As a part of the on-going openness efforts in software engineering, I decided to sign this review: Dragan Gasevic, dgasevic@acm.org]

[1] Victoria S. Uren, Philipp Cimiano, José Iria, Siegfried Handschuh, Maria Vargas-Vera, Enrico Motta, Fabio Ciravegna: Semantic annotation for knowledge management: Requirements and a survey of the state of the art. J. Web Sem. 4(1): 14-28 (2006).
[2] http://mqlx.com/~david/parallax/
[3] René Witte, Yonggang Zhang, Juergen Rilling: Empowering Software Maintainers with Semantic Web Technologies. ESWC 2007: 37-52
[4] Christoph Kiefer, Abraham Bernstein, Jonas Tappolet: Analyzing Software with iSPARQL. Proceedings of the 3rd ESWC International Workshop on Semantic Web Enabled Software Engineering (SWESE). Innsbruck, Austria, June 6, 2007.
[5] http://msr.uwaterloo.ca/msr2009/index.html
[6] Huzefa H. Kagdi, Michael L. Collard, Jonathan I. Maletic: A survey and taxonomy of approaches for mining software repositories in the context of software evolution. Journal of Software Maintenance 19(2): 77-131 (2007)

-------------------- review 3 --------------------

PAPER: 9
TITLE: Enhanced semantic access of software artefacts

OVERALL RATING: 1 (weak accept)
REVIEWER'S CONFIDENCE: 3 (high)
----------------------- REVIEW --------------------

The paper presents a case study on how different sources of software documentation can be integrated using semantic techniques. An ad-hoc architecture is proposed and implemented against a real world prototype.
I am not persuaded by some of the design choices of the presented approach. Most importantly, why does an ontology for software artifacts need to be mined, instead of being carefully designed and integrated into the system? The latter is the approach of Ankolekar et al. (2006), cited in the paper, but also of Kiefer et al. (2007), which was presented at SWESE 2007. While the differences are discussed in the paper, a motivation for why an ontology acquisition step is required is missing. What are the advantages, and what obstacles are tackled this way?
Also, the added-value services offered by the developed system need to be better justified. How do today's non-ontology-based systems operate, and what alternatives are available? Why should we use an ontology-based system for accessing software artefacts? Where have traditional approaches failed, and why?