The world is digitized. Data is all around us in a variety of forms. Apart from the most popular format, relational data, other forms such as text, graph, audio, and video are just a few names out of a large vocabulary. At the same time, data is being generated at a rapid pace and its size is growing exponentially. The storage and management of data has long been a vital point of concern. Quite a number of techniques, methods, tools, and procedures have been established and matured for this issue, and the success rate in this regard is quite high. Merely piling up data efficiently is not the goal, however; retrieving valuable information from it is the real need.
A number of research studies have been performed and many more are under way. A remarkable number of novel methods have been developed to tackle this issue from multiple perspectives. The results of these studies have revolutionized the world. The Data and Knowledge Engineering (DKE) lab is yet another platform. It is playing its part and contributing to the research community under the supervision of Professor Young-Koo Lee at Kyung Hee University, South Korea.
DKE targets data management and mining from multiple dimensions. In an extremely friendly but serious environment, intensive research is under way on data in the form of graphs (such as social networks and chemical and biological compounds), audio and video, and big data. Each data format has its own research group to facilitate a rigorous research focus. Each group is divided into sub-groups that tackle the problem from the relevant data mining and machine learning sub-divisions. This rich culture provides a diverse atmosphere in which to focus the research in an effective and result-oriented way. DKE is a multicultural environment. We have people from Bangladesh, Pakistan, Vietnam, Rwanda, Guatemala, China, and the Republic of Korea.
We are looking for brilliant and smart students to be part of our endeavor. If you are interested in doing research on any of the above topics, or on topics under the umbrella of Knowledge Discovery, you are heartily welcome to send your resume and research interests. We believe in teamwork and diversity.
Cross-optimizations for procedures in an in-memory database system
Hybrid transactional and analytical processing (HTAP) systems like SAP HANA make it much simpler to manage both operational loads and analytical queries without ETL pipelines or separate data warehouses. To represent both transactional and analytical business logic in a single database system, stored procedures are often used to express analytical queries using control flow logic and DML statements. Optimizing these complex procedures requires a fair knowledge of imperative programming languages as well as of a declarative query language. Therefore, unified techniques that combine program optimization and query optimization are essential to achieve optimal performance of procedures.
Recently, we have carried out projects including the following.
Next-generation storage is a research area about utilizing recently developed storage devices that offer characteristics such as on-the-fly data filtering, high bandwidth, and low latency. Each characteristic benefits the system in its own way: by reducing I/O traffic, by speeding up sequential I/O, and by speeding up random I/O, respectively. To exploit the characteristics of each type of storage, we analyze the I/O performance of the storage devices and the system's I/O patterns, and use the results to optimize the system's I/O performance. Next-generation storage can be used as primary storage, as a tier in tiered storage, or as a cache to optimize the I/O performance of a system.
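As a rough illustration of the cache and tiered-storage use described above, the following minimal sketch places a small fast tier in front of a large slow tier and counts how many reads each tier serves. Everything here is an assumption made for illustration: the `TieredStore` class, the in-memory dictionaries standing in for real devices, and the LRU eviction policy are hypothetical and are not taken from our projects.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical two-tier store: a small fast tier caches blocks of a slow tier."""

    def __init__(self, slow_tier, fast_capacity):
        self.slow_tier = slow_tier        # dict standing in for a large, slow device
        self.fast_tier = OrderedDict()    # dict standing in for a small, low-latency device
        self.fast_capacity = fast_capacity
        self.fast_hits = 0                # reads served by the fast tier
        self.slow_reads = 0               # reads that had to go to the slow tier

    def read(self, block_id):
        if block_id in self.fast_tier:
            # Hit: mark the block as most recently used.
            self.fast_tier.move_to_end(block_id)
            self.fast_hits += 1
            return self.fast_tier[block_id]
        # Miss: fetch from the slow tier and promote the block.
        data = self.slow_tier[block_id]
        self.slow_reads += 1
        self.fast_tier[block_id] = data
        if len(self.fast_tier) > self.fast_capacity:
            # Evict the least recently used block (assumed policy).
            self.fast_tier.popitem(last=False)
        return data


slow = {i: f"block-{i}" for i in range(100)}
store = TieredStore(slow, fast_capacity=4)

# A skewed access trace: block 1 is hot, the others are touched rarely.
for block_id in [1, 2, 1, 3, 1, 2, 50, 1]:
    store.read(block_id)

print("fast-tier hits:", store.fast_hits)
print("slow-tier reads:", store.slow_reads)
```

Comparing the two counters for a recorded access trace is one simple way to judge whether a workload's I/O pattern would benefit from a faster tier; a real study would measure actual devices rather than dictionaries.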
Recently, we have carried out projects including the following.
Knowledge Graph Management
RDF is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ.
RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link (this is often referred to as a “triple”). This simple model allows structured and semi-structured data to be mixed, exposed, and shared across different applications. The linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, which are represented by the graph nodes.
Our RDF research topics include technology for managing large amounts of RDF data, connection relation detection technology, and semantic search.
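The triple model can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the `http://example.org/` namespace, the resource names, and the `match` helper are hypothetical, and a real system would use an RDF store rather than Python sets.

```python
EX = "http://example.org/"  # hypothetical namespace used only for this example

# Each triple is (subject, predicate, object): a labeled edge from subject to object.
source_a = {
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "bob", EX + "knows", EX + "carol"),
}

# A second source that uses a different vocabulary (a different "schema").
source_b = {
    (EX + "alice", EX + "worksAt", EX + "dke_lab"),
    (EX + "alice", EX + "knows", EX + "bob"),  # duplicate fact, merged away below
}

# Merging two RDF graphs is a set union of their triples, even if the schemas differ.
merged = source_a | source_b


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# All outgoing edges of the node "alice" in the merged graph.
for subject, predicate, obj in match(merged, s=EX + "alice"):
    print(subject, predicate, obj)
```

Because every statement has the same three-part shape, data from sources with different vocabularies can sit in one graph and be queried with the same pattern-matching function.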
Core research issues of the project
Ongoing Projects
1. Summarization and Compression based Fast RDF Query Processing Techniques for Massive RDF Graphs
In this project, we study techniques for summarizing and compressing large RDF graphs for efficient RDF graph storage and query processing.
2. Development of an association data set retrieval technique of Open Data Portal
We are developing technology that searches for similar datasets by using the hierarchical structure of datasets and the metadata expressed in RDF, so that similarity can be determined even when relations are not directly expressed.
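To give a feel for how a hierarchy can reveal similarity between datasets that have no direct relation, here is a minimal sketch. It is not the technique developed in the project: the category tree, the dataset names, and the depth-based score (a Wu-Palmer-style measure swapped in for illustration) are all assumptions. In RDF, the `PARENT` map would correspond to a hierarchical predicate linking a dataset or category to its broader category.

```python
# Hypothetical category hierarchy: child -> parent.
PARENT = {
    "transport": "root",
    "environment": "root",
    "bus_routes": "transport",
    "subway_ridership": "transport",
    "air_quality": "environment",
}


def path_to_root(node):
    """Return the list of nodes from the given node up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def depth(node):
    """Depth of a node, counting the root as depth 1."""
    return len(path_to_root(node))


def similarity(a, b):
    """Score in (0, 1]: higher when the lowest common ancestor is deeper."""
    ancestors_of_a = set(path_to_root(a))
    # The first ancestor of b that is also an ancestor of a is the lowest common one.
    lca = next(n for n in path_to_root(b) if n in ancestors_of_a)
    return 2 * depth(lca) / (depth(a) + depth(b))


def most_similar(query, candidates):
    """Rank candidate datasets by similarity to the query dataset."""
    return sorted(candidates, key=lambda c: similarity(query, c), reverse=True)


print(most_similar("bus_routes", ["air_quality", "subway_ridership"]))
```

In this toy hierarchy the two transport datasets share a deeper ancestor than a transport dataset and an environment dataset, so they score as more similar even though no triple links them directly.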
Achievements of RDF projects
Batjargal Dolgorsuren, Kifayat Ullah Khan, Mostofa Kamal Rasel, Young-Koo Lee, IEEE Access (SCI, Impact Factor: 3.557), 2019
Jawad Khan, Aftab Alam, Jamil Hussain, Young-Koo Lee, Applied Sciences (SCI, Impact Factor: 1.983), 2019
Jawad Khan, Young-Koo Lee, Applied Sciences (SCIE, Impact Factor: 2.217), 2019
CLOUD-BASED IMAGE RETRIEVAL SYSTEM
In recent years, the number of intelligent CCTV cameras installed in public places for surveillance has increased enormously, and as a result a large amount of video data is produced every moment. This has created a growing demand for distributed processing of large-scale video data. In an intelligent video analytics platform, a submitted unstructured video passes through several multidisciplinary algorithms that extract insights and make them searchable and understandable for both humans and machines. Video analytics has applications ranging from surveillance to video content management, and various industrial and scholarly solutions exist in this context. However, most existing solutions rely on a traditional client/server framework to perform face and object recognition and lack support for more complex application scenarios. Furthermore, these frameworks are rarely built to scale using distributed computing. Existing works also provide no support for low-level distributed video processing APIs (Application Programming Interfaces), and they fail to offer a complete service-oriented ecosystem that meets the growing demands of consumers, researchers, and developers.
To overcome these issues, in this paper we propose a distributed video analytics framework for intelligent video surveillance known as SIAT. The proposed framework can process both real-time video streams and batch video analytics. Each real-time stream also corresponds to batch processing data; hence, this work correlates with the symmetry concept. Furthermore, we introduce a distributed video processing library on top of Spark. SIAT exploits state-of-the-art distributed computing technologies to ensure scalability, effectiveness, and fault tolerance.
Recently, we have carried out projects including the following.