GENERATING JAVA CLASS SKELETON USING A NATURAL LANGUAGE INTERFACE
{eozcan, seseker, ikaradeniz}@cse.yeditepe.edu.tr
Artificial Intelligence Laboratory (AR+I)
26 Ağustos Yerleşkesi Kayışdağı/İstanbul
Abstract. An intelligent natural language interface based on the Turkish language is designed for creating a Java class skeleton, listing the class and its members. This interface is developed as part of a project named TUJA, a tool for producing Java programs from Turkish sentences. Turkish sentences are converted into instances of schemata that represent classes and their members. Concept hierarchies are utilized for building the classes and their hierarchical representation for Java class skeleton generation. In this paper, the details of the design and the implementation are described and a sample run is provided.
1 Introduction
Programming languages are machine-processable, precise and mostly unambiguous, with predefined syntax and semantics. Still, a novice programmer spends much effort learning syntactic rules while at the same time developing general programming skills. Even an experienced programmer may face the same problems when the programming language is new to them. Natural languages, on the other hand, are more declarative, flexible, powerful and rich, and are useful even for occasional users. In addition, the programmer may not know the natural language in which the resources for learning a new programming language, such as books, are written.
There are visual tools for creating object-oriented designs and, furthermore, for generating Java/C++ skeletal programs, such as Rational Rose (an

Fig. 1. Framework of TUJA.
2 Natural Language Processing using Turkish
NLP consists of five layers: morphology, syntax, semantics, pragmatics and phonetics ([2], [10]). Given our scope and purposes, we have limited our work to morphology and, especially, to the syntax and semantics layers.
Turkish is one of the most widely spoken languages in the
world, distributed over a large geographical region in
The same suffix can be attached to different words in different ways. Sometimes, a vowel or a consonant towards the end of a word may deform when a suffix is attached. For this reason, morphological analysis in Turkish is not straightforward, as shown in Table 1.
Table 1. Some deformation examples in Turkish words due to suffixes.
| Word | (Stem) + Suffixes |
| görünürlerde (in sight) | (gör) + ün + ür + ler + de (görmek - to see) |
| ağaca (towards the tree) | (ağaç) + a (tree) |
| ağlıyor (he/she/it is crying) | (ağla) + yor (ağlamak - to cry) |
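The softening in the second row of Table 1 (ağaç + a gives ağaca) can be sketched in code to show why suffixation is not plain concatenation. The Java fragment below is only an illustration and is not the morphological analyser used in TUJA. The class name `DativeSketch`, the method `dative` and the simplified rules (two-way vowel harmony, softening of a final p, ç, t or k, and a buffer y after vowel-final stems) are assumptions made for this example; stems that do not soften and other exceptions are ignored.

```java
// Illustrative only: a simplified dative-suffix rule, not TUJA's analyser.
class DativeSketch {
    private static final String BACK_VOWELS = "aıou";
    private static final String FRONT_VOWELS = "eiöü";

    private static boolean isVowel(char c) {
        return BACK_VOWELS.indexOf(c) >= 0 || FRONT_VOWELS.indexOf(c) >= 0;
    }

    /** Attaches the dative suffix (-a / -e) to a non-empty, lower-case stem. */
    static String dative(String stem) {
        // Vowel harmony: the suffix vowel follows the last vowel of the stem.
        char suffix = 'e';
        for (int i = stem.length() - 1; i >= 0; i--) {
            char c = stem.charAt(i);
            if (isVowel(c)) {
                suffix = BACK_VOWELS.indexOf(c) >= 0 ? 'a' : 'e';
                break;
            }
        }
        char last = stem.charAt(stem.length() - 1);
        if (isVowel(last)) {
            // Vowel-final stem: insert a buffer consonant.
            return stem + 'y' + suffix;
        }
        // Consonant softening before a vowel-initial suffix (assumed to always apply).
        String body = stem.substring(0, stem.length() - 1);
        switch (last) {
            case 'p': return body + 'b' + suffix;
            case 'ç': return body + 'c' + suffix;
            case 't': return body + 'd' + suffix;
            case 'k': return body + 'ğ' + suffix;
            default:  return stem + suffix;
        }
    }

    public static void main(String[] args) {
        for (String stem : new String[] {"ağaç", "kitap", "çocuk", "ev"}) {
            System.out.println(stem + " -> " + dative(stem));
        }
    }
}
```

An analyser has to run such rules in reverse, recovering the stem ağaç from the surface form ağaca, which is what makes the lookup harder than stripping a fixed suffix.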
There are seven morphological categories in Turkish: nouns, proper nouns, compound nouns, adjectives, verbs, adverbs and conjunctions. Another difficulty in Turkish arises from the syntax. The same words may be arranged in different orders, yielding a group of sentences with the same meaning, as illustrated in Table 2. The common property of these three sentences is a feature of the Turkish language: the verb appears at the end of the sentence.
Table 2. Turkish sentences with different syntax having the same meaning.
| Sentence (I gave the book to the child) |
| Çocuğa kitabı ben verdim |
| Çocuğa ben kitabı verdim |
| Ben çocuğa kitabı verdim |
3 Morphology, Syntax and Semantics
It is assumed that the reader is familiar with object-oriented programming terminology. The morphology component of TUJA is inherited from a previous project, TUSA [6], which is based on PROLOG. Sentences are categorized into four groups: (a) Class Declaration Sentences, (b) Attribute Declaration Sentences, (c) Method Declaration Sentences and (d) Relation Declaration Sentences. All possible syntax types are supported to create an abstract model representing the classes. An augmented transition network (ATN) is developed for the TUJA interface. The HASA relationship is used for composing classes and the ISA relationship is used for building the class hierarchy. TUJA assumes that, in general, a noun in a sentence refers to a class, an interface or an object, and a verb refers to a method.
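As a rough illustration of how such an abstract model could be turned into a class skeleton, the following Java sketch stores one class together with its ISA parent, its HASA members and its methods, and prints the corresponding source text. The names `ClassSchema`, `toJava` and `SkeletonDemo`, the choice of `private` fields and `public void` methods, and the Car/Vehicle/Engine example are all assumptions made for this sketch; the paper does not specify TUJA's internal schema format at this level.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical schema for one class; the field names are assumptions for this sketch.
class ClassSchema {
    final String name;
    final boolean isAbstract;
    String parent;                                        // ISA: superclass name, or null
    final List<String[]> attributes = new ArrayList<>();  // HASA: {type, name} pairs
    final List<String> methods = new ArrayList<>();       // method names taken from verbs

    ClassSchema(String name, boolean isAbstract) {
        this.name = name;
        this.isAbstract = isAbstract;
    }

    /** Renders the schema as a Java class skeleton with empty method bodies. */
    String toJava() {
        StringBuilder sb = new StringBuilder("public ");
        if (isAbstract) {
            sb.append("abstract ");
        }
        sb.append("class ").append(name);
        if (parent != null) {
            sb.append(" extends ").append(parent);
        }
        sb.append(" {\n");
        for (String[] a : attributes) {
            sb.append("    private ").append(a[0]).append(' ').append(a[1]).append(";\n");
        }
        for (String m : methods) {
            sb.append("    public void ").append(m).append("() {\n    }\n");
        }
        return sb.append("}\n").toString();
    }
}

class SkeletonDemo {
    public static void main(String[] args) {
        // "A car is a vehicle, has an engine, and starts." (hypothetical input, already analysed)
        ClassSchema car = new ClassSchema("Car", false);
        car.parent = "Vehicle";
        car.attributes.add(new String[] {"Engine", "engine"});
        car.methods.add("start");
        System.out.print(car.toJava());
    }
}
```

Keeping the schema separate from the rendering step means the same model could be filled in incrementally, one sentence at a time, before any Java text is produced.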
3.1 Class Declaration Sentences
This group of sentences is used to create a new class or name an existing class, as shown in Fig. 2b. Note that declarations of abstract classes and Java interfaces are also supported.

Fig. 2. (a) ATN and (b) some sample sentences for class declaration.
The part of the TUJA ATN that detects class declarations is illustrated in Fig. 2a. The Nominalverb component, either alone or combined with the Modifier component, determines whether a class is abstract or not.
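The idea of a transition network that fills registers while it consumes a sentence can be sketched as follows. This is not the network of Fig. 2a, whose exact states and arcs are not reproduced in this text. The categories, the states and the rule that a Modifier before the Nominalverb marks the class as abstract are one possible reading, assumed here for illustration, and the input tokens are English glosses standing in for words that have already been tagged by morphological analysis.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical word categories; real input would come from morphological analysis.
enum Category { NOUN, MODIFIER, NOMINAL_VERB }

class Token {
    final String word;
    final Category category;

    Token(String word, Category category) {
        this.word = word;
        this.category = category;
    }
}

// Registers filled while the network is traversed.
class ClassDeclaration {
    String className;
    boolean isAbstract;
}

class ClassDeclarationNetwork {
    private enum State { START, HAVE_NOUN, HAVE_MODIFIER, ACCEPT }

    /** Returns the recognized declaration, or null if the token sequence is rejected. */
    static ClassDeclaration parse(List<Token> tokens) {
        State state = State.START;
        ClassDeclaration regs = new ClassDeclaration();
        for (Token t : tokens) {
            switch (state) {
                case START:
                    if (t.category != Category.NOUN) return null;
                    regs.className = t.word;          // the noun names the class
                    state = State.HAVE_NOUN;
                    break;
                case HAVE_NOUN:
                    if (t.category == Category.MODIFIER) {
                        state = State.HAVE_MODIFIER;
                    } else if (t.category == Category.NOMINAL_VERB) {
                        regs.isAbstract = false;      // nominal verb alone
                        state = State.ACCEPT;
                    } else {
                        return null;
                    }
                    break;
                case HAVE_MODIFIER:
                    if (t.category != Category.NOMINAL_VERB) return null;
                    regs.isAbstract = true;           // modifier + nominal verb (assumed rule)
                    state = State.ACCEPT;
                    break;
                default:
                    return null;                      // extra tokens after acceptance
            }
        }
        return state == State.ACCEPT ? regs : null;
    }

    public static void main(String[] args) {
        List<Token> tokens = Arrays.asList(
            new Token("Shape", Category.NOUN),
            new Token("abstract", Category.MODIFIER),
            new Token("is-a-class", Category.NOMINAL_VERB));
        ClassDeclaration d = parse(tokens);
        System.out.println(d.className + " abstract=" + d.isAbstract);
    }
}
```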
3.2 Attribute Declaration Sentences
This group of sentences is used to define the attributes of an existing class or to define a new class with specified attributes, as shown in Fig. 3. The HAS relation represents the inclusion relationship, determining the elements included by an object. In other words,
3.3 Method Declaration Sentences
This group of sentences is used to define the methods of predefined classes or to define a new class with specified member methods as shown in Fig. 4b. Part of the ATN for TUJA determines method declarations as shown in Fig. 4a.