GT-BIS is a research and development project funded by the Brazilian National Research and Educational Network (Rede Nacional de Ensino e Pesquisa - RNP). It aims to develop a system that analyzes heterogeneous big data captured in computer networks in order to detect security threats. Modern Artificial Intelligence techniques will be used for data correlation and machine learning, enabling immediate or early detection of attacks that existing systems would not detect, as well as automatic learning from traffic history. As a contribution to RNP, the system is expected to be marketable and to support RNP's internal information security processes. The work is also expected to yield scientific contributions, such as identifying which Artificial Intelligence techniques are best suited to detecting security threats. As of April 2018, the project is in Phase 1.
Collecting big data from network traffic creates a need for more “intelligent” methods of identifying security incidents, mainly because correlating information can reveal new information and because a brute-force analysis would take too long to finish. This project proposes the development of a system for analyzing massive volumes of heterogeneous data captured in computer networks, within the scope of the RNP network infrastructure, in order to enable immediate or early detection of attacks that existing systems would not detect, together with automatic learning from traffic history. The objectives to be achieved are:
As technological innovations, we highlight: (i) a mechanism capable of correlating large amounts of data from several heterogeneous sources in order to detect attacks that could not be detected with data from a single source, (ii) a mechanism capable of anticipating attacks based on data collected from several sensors, and (iii) the prototype of a learning mechanism for attacks. At the end of Phase 1 of the project, the prototype must be able to perform big data analysis that supports the inclusion of several sources, generate visualizations of security incidents, and anticipate those incidents by correlating data from several points of a computer network. This data must come from heterogeneous sources, for instance packet headers and application logs.
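To make the idea of cross-source correlation concrete, the following is a minimal sketch and not the GT-BIS implementation. It assumes each record has already been normalized into a hypothetical `Event` with a source label, a timestamp and an IP address, and it flags an IP only when at least two distinct sources report it within a short time window. The names `Event`, `correlate`, `window` and `min_sources`, as well as the sample records, are illustrative assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A normalized record (hypothetical schema, not taken from the project)."""
    source: str       # e.g. "packet_header" or "app_log"
    timestamp: float  # seconds since some common reference
    ip: str           # address the event is attributed to
    detail: str       # free-text description


def correlate(events, window=60.0, min_sources=2):
    """Return {ip: events} for IPs reported by at least `min_sources`
    distinct sources within any span of `window` seconds."""
    by_ip = defaultdict(list)
    for ev in events:
        by_ip[ev.ip].append(ev)

    incidents = {}
    for ip, evs in by_ip.items():
        evs.sort(key=lambda e: e.timestamp)
        start = 0
        for end in range(len(evs)):
            # Shrink the window from the left until it spans <= `window` seconds.
            while evs[end].timestamp - evs[start].timestamp > window:
                start += 1
            in_window = evs[start:end + 1]
            if len({e.source for e in in_window}) >= min_sources:
                incidents[ip] = in_window
                break
    return incidents


# Made-up sample data using documentation-only IP addresses.
sample = [
    Event("packet_header", 0.0, "192.0.2.5", "SYN to port 22"),
    Event("app_log", 30.0, "192.0.2.5", "failed SSH login"),
    Event("packet_header", 0.0, "192.0.2.9", "SYN to port 80"),
    Event("app_log", 500.0, "192.0.2.9", "HTTP 404"),
    Event("packet_header", 10.0, "192.0.2.7", "SYN to port 443"),
]
incidents = correlate(sample)
```

With this sample, only `192.0.2.5` is flagged: it is the only address seen by both sources within 60 seconds. Neither record on its own would look suspicious, which is the point of combining sources. A real system would work on streams rather than in-memory lists and would use far richer correlation keys than a single IP address.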
Overview of the processing of IT security big data from heterogeneous sources
Group e-mail: gt-bis@listas.rnp.br
Telephone: +55(11) 3091-0749
Address: Rua do Matão, 1010 - CEP 05508-090 - São Paulo - SP
Departamento de Ciência da Computação (DCC)
Instituto de Matemática e Estatística da Universidade de São Paulo (IME-USP)