OntoGen: Semi-automatic Ontology Editor (2007)
Fortuna, Blaz, Grobelnik, Marko, Mladenić, Dunja
In this paper we present a semi-automatic ontology editor as implemented in a new version of the OntoGen system. The system integrates machine learning and text mining algorithms into an efficient user...
User profiling for the web (2007)
Grcar, Miha, Mladenić, Dunja, Grobelnik, Marko
This paper addresses a problem of personalized information delivery related to the Web, that is based on user profiling. Different approaches to user profiling have been developed. When the user...
User study of ontology generation tool (2007)
Ilijasic Misic, Ivana, Kovacic, Bozidar, Mohoric, Tamara, Mladenić, Dunja, Fortuna, Blaz, Grobelnik, Marko
We present design and results of a user study undertaken in order to evaluate ontology generation process. We have applied our study to an example tool for semi-automatic ontology generation –...
Evaluation of semi-automatic ontology generation in real-world setting (2007)
Mladenić, Dunja, Grobelnik, Marko
This paper presents several aspects of evaluating semi-automatic ontology generation techniques in real-world setting. We provide description of incorporating the techniques in a solution to...
Automatic evaluation of ontologies (2007)
Brank, Janez, Grobelnik, Marko, Mladenić, Dunja
Machine learning for resolving researcher affiliation (2007)
Sterk, Marjan, Vladusic, Damijel, Milosevic, Eva, Ferlez, Jure, Mladenić, Dunja, Grobelnik, Marko
This paper describes the Institution Finder, an approach to develop a simple web mining procedure to find the internet domain of the institution(s) that a given researcher is affiliated with. The...
Triplet extraction from sentences (2007)
Rusu, Delia, Dali, Lorand, Fortuna, Blaz, Grobelnik, Marko, Mladenić, Dunja
In this paper we present an approach to extracting subject-predicate-object triplets from English sentences. To begin with, four different well known syntactical parsers for English are used for...
Predicting the addition of new concepts in a topic hierarchy (2007)
Brank, Janez, Mladenić, Dunja, Grobelnik, Marko
Ontologies often change through time, a process largely done manually by human editors. We discuss the task of automatically predicting when structural changes will occur in a given ontology. We...
Extracting named entities and relating them over time based on Wikipedia (2007)
Bhole, Abhijit, Fortuna, Blaz, Grobelnik, Marko, Mladenić, Dunja
This paper presents an approach to mining information relating people, places, organizations and events extracted from Wikipedia and linking them on a time scale. The approach consists of two phases:...
From social network to light-weight ontology (2007)
Mladenić, Dunja, Grobelnik, Marko, Fortuna, Blaz
We address the problem of constructing a light-weight ontology from social network data. As an example we use social network of a mid size research institution obtained based on e-mail communication....
Using text mining and link analysis for software (2007)
Grcar, Miha, Grobelnik, Marko, Mladenić, Dunja
Many data mining techniques are these days in use for ontology learning – text mining, Web mining, graph mining, link analysis, relational data mining, and so on. In the current state-of-the-art...
Pascal Workshop, Complex Objects Visualization 2005 - COV2005, Proceedings (2006)
Pisanski, Tomaž, Horvat, Boris, Žerovnik, Janez, Mladenić, Dunja, Grobelnik, Marko, Anžič, Tina, ...
Pascal workshop on Complex Object Visualization COV-2005 brought together a group of researchers from various branches of Mathematics and Computer Science focused around a common theme that arises in...
Hierarchical text categorization using coding matrices (2006)
Brank, Janez, Mladenić, Dunja, Grobelnik, Marko
We discuss the task of ontology population as a machine learning problem with a large hierarchy of classes. Since many machine learning methods are designed primarily for two-class problems, it is...
Semi-automatic data-driven ontology construction system (2006)
Fortuna, Blaz, Grobelnik, Marko, Mladenić, Dunja
In this paper we present a new version of the OntoGen system for semi-automatic data-driven ontology construction. The system is based on a novel ontology learning framework which formalizes and extends...
Extending IST World database with Serbian research publications (2006)
Radovanovic, Milos, Ferlez, Jure, Mladenić, Dunja, Grobelnik, Marko, Ivanovic, Mirjana
This paper describes an effort of using knowledge technologies to gain insights into research activity, by exploiting publicly available information on research publications. The specificity of this...
System for Semi-automatic Ontology construction (2006)
Fortuna, Blaz, Grobelnik, Marko, Mladenić, Dunja
In this paper, we review two techniques for topic discovery in collections of text documents (Latent Semantic Indexing and K-Means clustering) and present how we integrated them into a system for...
Background Knowledge for Ontology Construction (2006)
Fortuna, Blaz, Grobelnik, Marko, Mladenić, Dunja
In this paper we describe a solution for incorporating background knowledge into the OntoGen system for semi-automatic ontology construction. This makes it easier for different users to construct...
Feature Selection for Dimensionality Reduction (2006)
Dimensionality reduction is a commonly used step in machine learning, especially when dealing with a high dimensional space of features. The original feature space is mapped onto a new, reduced...
Genre document classification using flexible length phrases (2006)
Radosevic, Daniel, Dobsa, Jasminka, Mladenić, Dunja, Novak, Miroslav, Stapic, Zlatko
Using DMoz for constructing ontology from data stream (2006)
Grobelnik, Marko, Brank, Janez, Mladenić, Dunja, Novak, Blaz, Fortuna, Blaz
This paper presents an approach for constructing an ontology from a stream of documents. Named entities extracted from the documents are used as instances of the ontology. Entities and co-occurring...
Flexible length phrases in document classification (2006)
Radosevic, Daniel, Dobsa, Jasminka, Mladenić, Dunja
In this paper we investigate the possibility of using phrases of flexible length in classification of textual documents as an extension to the classic bag-of-words document representation where documents are...
Text mining - machine learning on document (2006)
Knowledge discovery for ontology construction (2006)
Grobelnik, Marko, Mladenić, Dunja
We can observe that the focus of modern information systems is moving from ‘data-processing’ towards ‘concept-processing’, meaning that the basic unit of processing is less and less the...
Automated structuring of company profiles (2006)
Ljubic, Peter, Lavrac, Nada, Mladenić, Dunja, Plisson, Joel, Mozetic, Igor
Visualization of text document corpus (2005)
Fortuna, Blaz, Mladenić, Dunja, Grobelnik, Marko
From the automated text processing point of view, natural language is very redundant in the sense that many different words share a common or similar meaning. For a computer this can be hard to...
A survey of ontology evaluation techniques (2005)
Brank, Janez, Grobelnik, Marko, Mladenić, Dunja
An ontology is an explicit formal conceptualization of some domain of interest. Ontologies are increasingly used in various fields such as knowledge management, information extraction, and the...
Semi-automatic construction of topic ontology (2005)
Fortuna, Blaz, Mladenić, Dunja, Grobelnik, Marko
In this paper, we review two techniques for topic discovery in collections of text documents (Latent Semantic Indexing and K-Means clustering) and present how we integrated them into a system for...
Jakulin, Aleks, Mladenić, Dunja
An ontology is a structured semantic model, composed of concepts, relations and instances. Data is a more primitive but concrete assembly of instances described by their attributes. An example of...
User profiling for interest-focused browsing history (2005)
Grcar, Miha, Mladenić, Dunja, Grobelnik, Marko
User profiling is an important part of the Semantic Web as it integrates the user into the concept of Web data with machine-readable semantics. In this paper, user profiling is presented as a way of...
Plisson, Joel, Mladenić, Dunja, Ljubic, Peter, Lavrac, Nada, Grobelnik, Marko
Organizations have to collaborate in order to achieve business goals which require the use of a variety of domain-specific knowledge. Selection of partners with an appropriate expertise is one of the...
Automated structuring of company competencies in virtual organizations (2005)
Ljubic, Peter, Lavrac, Nada, Plisson, Joel, Mladenić, Dunja, Bollhalter, Stefan, Jermol, Mitja
Creation of virtual organizations (VO) consists of several steps. One of the early steps is finding organizations with an appropriate expertise from a larger pool of organizations, referred to as a...
Analysis of demining project proposals (2005)
This paper is an analysis of project proposals for mine clearance of mine-affected areas. Specifically we are studying the relationship between proposals and their evaluation scores influencing their...
Initiative to preserve Slovenian digital heritage (2005)
Mladenić, Dunja, Grobelnik, Marko, Kavcic-Colic, Alenka
This paper describes an initiative for preserving Slovenian digital heritage by setting up a Slovenian national digital archive. We have proposed a methodology for archiving electronic publications based...
Automated Knowledge Discovery in Advanced Knowledge Management (2005)
Grobelnik, Marko, Mladenić, Dunja
Knowledge Management is a discipline with many faces – among very provocative ones is the research area dealing with automatic discovery of the hidden truth within the data describing the world...
Next Generation Knowledge Access (2005)
Davies, John, Duke, Alistair, Kings, Nick, Mladenić, Dunja, Bontcheva, Kalina, Grcar, Miha, ...
Purpose The paper shows how access to knowledge can be enhanced by using a set of innovative approaches and technologies based upon the Semantic Web. Approach Emerging trends in knowledge access are...
kNN Versus SVM in the Collaborative Filtering Framework (2005)
Grcar, Miha, Fortuna, Blaz, Mladenić, Dunja
We present experimental results of confronting the k-Nearest Neighbor (kNN) algorithm with Support Vector Machine (SVM) in the collaborative filtering framework using datasets with different...
Simple classification into large topic ontology of Web documents (2005)
Grobelnik, Marko, Mladenić, Dunja
The paper presents an approach to classifying Web documents into a large topic ontology. The main emphasis is on having a simple approach appropriate for handling a large ontology and providing it with...
Challenges and Creativity in IT Research (2005)
Information Technology (IT) research is a fairly broad area covering different research topics and offering many interesting challenges and opportunities for creative research. However, in many...
Building minority language corpora by learning to generate web search queries (2005)
Ghani, Rayid, Jones, Rosie, Mladenić, Dunja
Using string kernels for classification of Slovenian Web documents (2005)
Fortuna, Blaz, Mladenić, Dunja
In this paper we present an approach for classifying web pages obtained from the Slovenian Internet directory where the web sites covering different topics are organized into a topic ontology. We...
Visualizing very large graphs using clustering neighborhoods (2005)
Mladenić, Dunja, Grobelnik, Marko
This paper presents a method for visualization of large graphs in a two-dimensional space, such as a collection of Web pages. The main contribution here is in the representation change to enable...
Summarization and visualization (2005)
Mladenić, Dunja, Grobelnik, Marko
Both text summarization and visualization aim at providing some sort of general view of the text either giving a text summary in the required natural language or giving some visual representation of...
New media and knowledge management (2005)
Lavrac, Nada, Jermol, Mitja, Urbancic, Tanja, Mladenić, Dunja
New media and knowledge management: part of "New media and e-science" programme and "statistics" programme: fall semester, 2004/2005
Text mining methods have been successfully used on different problems, where text data is involved. Some Text mining approaches are capable of handling text just relying on statistics such as,...
Applying collaborative filtering to real-life corporate data (2005)
Grcar, Miha, Mladenić, Dunja, Grobelnik, Marko
In this paper, we present our experience in applying collaborative filtering to real-life corporate data. The quality of collaborative filtering recommendations is highly dependent on the quality of...
Text classification with active learning (2005)
Novak, Blaz, Mladenić, Dunja, Grobelnik, Marko
In many real-world machine learning tasks, labeled training examples are expensive to obtain, while at the same time there are many unlabeled examples available. One such class of learning...
A lemmatization web service based on machine learning techniques (2005)
Plisson, Joel, Mladenić, Dunja, Lavrac, Nada, Erjavec, Tomaz
Lemmatization is the process of finding the normalized form of words from surface word-forms as they appear in the running text. It is a useful pre-processing step for any number of language...
Lavrac, Nada, Ljubic, Peter, Mladenić, Dunja, Plisson, Joel
Automated extraction and structuring of competencies from unstructured company data : two case studies
Mapping Documents onto Web Page Ontology (2004)
Mladenić, Dunja, Grobelnik, Marko
The paper describes an approach to automatically mapping Web pages onto ontology using document classification based on the Yahoo! ontology of Web pages. Techniques developed for learning on text...
A rule based approach to word lemmatization (2004)
Plisson, Joel, Lavrac, Nada, Mladenić, Dunja
Lemmatization is the process of finding the normalized form of a word. It is the same as looking for a transformation to apply on a word to get its normalized form. The approach presented in this...
Visualization of news articles (2004)
Grobelnik, Marko, Mladenić, Dunja
This paper presents a system for visualization of large amounts of news stories. In the first phase, the news stories are preprocessed for the purpose of named-entity extraction. Next, a graph of...
A Roadmap for Web Mining: From Web to Semantic Web (2004)
Berendt, Bettina, Hotho, Andreas, Mladenić, Dunja, Spiliopoulou, Myra, Stumme, Gerd
The purpose of Web mining is to develop methods and systems for discovering models of objects and processes on the World Wide Web and for web-based systems that show adaptive performance. Web Mining...
Grobelnik, Marko, Mladenić, Dunja
This tutorial gives an overview of the Text Mining problem. After introducing the challenges faced, the various levels of text processing are discussed.