About Us


The world is digitized. Data is all around us in a variety of forms. Besides the most popular format, relational data, there are text, graph, audio, and video data, to name just a few. At the same time, data is being generated at a rapid pace and its size is growing exponentially. The storage and management of data have long been a vital concern. Many techniques, methods, tools, and procedures have been established and matured to address it, and their success rate is quite high. However, merely piling up data efficiently is not the goal; retrieving valuable information from it is the pressing need.

Many research studies have been carried out, and many more are under way. A remarkable number of novel methods have been developed to tackle this issue from multiple perspectives, and the results of these studies have revolutionized the world. The Data and Knowledge Engineering (DKE) lab is one such platform. It contributes to the research community under the supervision of Professor Young-Koo Lee at Kyung Hee University, South Korea.

DKE targets data management and mining from multiple dimensions. In a friendly but serious environment, intensive research is under way on data in the form of graphs (such as social networks and chemical and biological compounds), audio and video, and big data. Each data format has its own research group to keep the research focus rigorous. Each group is divided into subgroups that tackle the problem from multiple data mining and machine learning angles. This culture provides a diverse atmosphere in which to focus research in a result-oriented way. DKE is a multicultural environment: we have people from Bangladesh, Pakistan, Vietnam, Rwanda, Guatemala, China, and the Republic of Korea.

We are looking for bright, motivated students to join our endeavor. If you are interested in doing research on any of the above topics, or on other topics under the umbrella of Knowledge Discovery, you are warmly welcome to send us your resume and research interests. We believe in teamwork and diversity.


Research Area

Cross-optimizations for procedures in an in-memory database system

Hybrid transactional and analytical processing (HTAP) systems such as SAP HANA make it much simpler to manage both operational loads and analytical queries without ETL, separate data warehouses, and so on. To represent both transactional and analytical business logic in a single database system, stored procedures are often used to express analytical queries using control flow logic and DML statements. Optimizing these complex procedures requires a fair knowledge of imperative programming languages as well as of a declarative query language. Therefore, unified optimization techniques that combine program optimization and query optimization are essential to achieve optimal performance of procedures.

Recently, we have carried out projects including the following.

  • Consulting on SQL/SQLScript Cross-Optimizations for Complex Business Queries in SAP HANA
  • Optimization Techniques for queries containing UDFs using cross-optimizations in SAP HANA

    Representative Achievement:

  • Kisung Park, Hojin Seo, Mostofa Kamal Rasel, Young-Koo Lee, Chanho Jeong, Sung Yeol Lee, Chungmin Lee, Dong-Hun Lee, “Iterative Query Processing based on Unified Optimization Techniques”, SIGMOD ’19 Proceedings of the 2019 International Conference on Management of Data, 2019.07, pp 54-68.

    Next-generation storage

    Next-generation storage is a research area concerned with using recently developed storage devices that offer characteristics such as on-the-fly data filtering, high bandwidth, and low latency. These characteristics benefit a system by reducing I/O traffic, speeding up sequential I/O, and speeding up random I/O, respectively. To exploit the characteristics of each type of storage, we analyze the I/O performance of the storage devices and the system's I/O patterns, and then optimize the system's I/O performance. Next-generation storage can be used as primary storage, as a tier in tiered storage, or as a cache to optimize a system's I/O performance.

    Recently, we have carried out projects including the following.

  • Optimization of graph DB performance using low latency storage media

    Knowledge Graph Management

    RDF is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ. RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link (this is often referred to as a “triple”). This simple model allows structured and semi-structured data to be mixed, exposed, and shared across different applications. The linking structure forms a directed, labeled graph, where the edges represent the named links between two resources, which are represented by the graph nodes.

    Our RDF research topics include technologies for managing large amounts of RDF data, detecting connection relations, and semantic search.
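
The triple model described above can be sketched in a few lines of Python. This is only an illustration, not the lab's software: the `ex:` names, the predicates, and the helper functions `build_graph` and `match` are hypothetical, and a real system would use full URIs and an RDF store.

```python
from collections import defaultdict

# Hypothetical example data: each triple is (subject, predicate, object).
# The "ex:" prefix stands in for full URIs and is an assumption.
triples = [
    ("ex:DKE_Lab", "ex:locatedAt", "ex:Kyung_Hee_University"),
    ("ex:DKE_Lab", "ex:supervisedBy", "ex:Young-Koo_Lee"),
    ("ex:Kyung_Hee_University", "ex:country", "ex:South_Korea"),
]


def build_graph(triples):
    """Index triples as a directed, labeled graph: subject -> [(predicate, object)]."""
    graph = defaultdict(list)
    for s, p, o in triples:
        # Each triple is one labeled edge from the subject node to the object node.
        graph[s].append((p, o))
    return graph


def match(triples, s=None, p=None, o=None):
    """Return the triples that match a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


graph = build_graph(triples)
for predicate, obj in graph["ex:DKE_Lab"]:
    print("ex:DKE_Lab", predicate, obj)
```

Because every statement has the same three-part shape, merging two datasets amounts to concatenating their triple lists, which is one way to read the claim that RDF eases merging across differing schemas.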

    Core research issues of the project

  • Efficient summarization and compression technology to reduce the disk space needed to store RDF data
  • A query optimization technique for summarized or compressed RDF graphs that makes use of caching
  • Improved query processing performance by exploiting a distributed environment
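
As a rough illustration of how RDF data can take less space, the sketch below uses dictionary encoding, a generic and widely used technique in which each distinct term is stored once and triples hold small integer IDs. It is swapped in purely for illustration and is not the lab's summarization or compression method, which is not described here. The sample data and the function names `dictionary_encode` and `dictionary_decode` are hypothetical.

```python
def dictionary_encode(triples):
    """Replace every term with an integer ID; return (term_to_id, encoded_triples)."""
    term_to_id = {}
    encoded = []
    for triple in triples:
        ids = []
        for term in triple:
            # Assign the next free ID the first time a term is seen.
            if term not in term_to_id:
                term_to_id[term] = len(term_to_id)
            ids.append(term_to_id[term])
        encoded.append(tuple(ids))
    return term_to_id, encoded


def dictionary_decode(term_to_id, encoded):
    """Invert dictionary_encode: map integer IDs back to the original terms."""
    id_to_term = {i: term for term, i in term_to_id.items()}
    return [tuple(id_to_term[i] for i in ids) for ids in encoded]


# Hypothetical sample data; the "ex:" names are assumptions.
sample_triples = [
    ("ex:DKE_Lab", "ex:locatedAt", "ex:Kyung_Hee_University"),
    ("ex:DKE_Lab", "ex:supervisedBy", "ex:Young-Koo_Lee"),
    ("ex:Kyung_Hee_University", "ex:country", "ex:South_Korea"),
]

term_ids, encoded_triples = dictionary_encode(sample_triples)
print(encoded_triples)
```

Repeated terms such as `ex:DKE_Lab` are stored once in the dictionary, so the saving grows with how often long URIs recur in the data.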

    Ongoing Projects

    1. Summarization and Compression based Fast RDF Query Processing Techniques for Massive RDF Graphs

    In this project, we study techniques for summarizing and compressing large RDF graphs for efficient RDF graph storage and query processing.

    2. Development of an association data set retrieval technique of Open Data Portal

    In this project, we are developing technology that finds similar datasets by using the hierarchical structure of datasets and the metadata expressed in RDF, so that similarity can be determined even when relations are not directly expressed.

    Achievements of RDF projects

  • StarZIP: Streaming Graph Compression Technique for Data Archiving
    Batjargal Dolgorsuren, Kifayat Ullah Khan, Mostofa Kamal Rasel, Young-Koo Lee, IEEE Access (SCI, Impact Factor: 3.557), 2019
  • EnSWF: effective features extraction and selection in conjunction with ensemble learning methods for document sentiment classification
    Jawad Khan, Aftab Alam, Jamil Hussain, Young-Koo Lee, Applied Sciences (SCI, Impact Factor: 1.983), 2019
  • LeSSA: A Unified Framework based on Lexicons and Semi-Supervised Learning Approaches for Textual Sentiment Classification
    Jawad Khan, Young-Koo Lee, Applied Sciences (SCIE, Impact Factor: 2.217), 2019

    In recent years, the number of intelligent CCTV cameras installed in public places for surveillance has increased enormously, and as a result a large amount of video data is produced every moment. This has created a growing demand for distributed processing of large-scale video data. In an intelligent video analytics platform, a submitted unstructured video passes through several multidisciplinary algorithms with the aim of extracting insights and making them searchable and understandable for both humans and machines. Video analytics has applications ranging from surveillance to video content management.

    Various industrial and scholarly solutions exist in this context. However, most of them rely on a traditional client/server framework to perform face and object recognition and lack support for more complex application scenarios. Furthermore, these frameworks are rarely scaled out using distributed computing. Existing works also do not provide low-level distributed video processing APIs (Application Programming Interfaces), and they fail to offer a complete service-oriented ecosystem that meets the growing demands of consumers, researchers, and developers.

    To overcome these issues, we propose SIAT, a distributed video analytics framework for intelligent video surveillance. The framework can process both real-time video streams and batch video analytics. Each real-time stream also corresponds to batch processing data; hence, this work correlates with the symmetry concept. Furthermore, we introduce a distributed video processing library on top of Spark. SIAT exploits state-of-the-art distributed computing technologies to ensure scalability, effectiveness, and fault tolerance.

    Recently, we have carried out projects including the following.

  • Development of SIAT-type CCTV cloud platform technology

    Representative Achievements:

  • Feature Fusion of Deep Spatial Features and Handcrafted Spatiotemporal Features for Human Action Recognition, Md Azher Uddin, Young-Koo Lee, Sensors (SCIE, Impact Factor: 2.475), 2019
  • Dynamic Scene Recognition using Spatiotemporal based DLTP on Spark, Md Azher Uddin, Mostafijur Rahman Akhond, Young-Koo Lee, IEEE Access (SCIE, Impact Factor: 3.557), 2018
  • Similarity Estimation for Large-Scale Human Action Video Data on Spark, Weihua Xu, Md Azher Uddin, Batjargal Dolgorsuren, Mostafijur Rahman Akhond, Kifayat Ullah Khan, Md Ibrahim Hossain and Young-Koo Lee, Applied Sciences (SCIE, Impact Factor: 1.679), 2018
  • Human Action Recognition using Adaptive Local Motion Descriptor in Spark, Md Azher Uddin, Joolekha bibi Joolee, Aftab Alam and Young-Koo Lee, IEEE Access (SCIE, Impact Factor: 3.244), 2017
  • ML-HDP: A Hierarchical Bayesian Nonparametric Model for Recognizing Human Actions in Video, Nguyen Anh Tu, Thien Huynh-The, Kifayat-Ullah Khan, and Young-Koo Lee, IEEE Transactions on Circuits and Systems for Video Technology (SCI, IF: 2.254), 2017