Results of the Special Call: GPU Testing and Benchmarking of the 24th Open Access Grant Competition
We thank all applicants for computing time under the Special Call: GPU Testing and Benchmarking of the 24th Open Access Grant Competition for submitting their applications.
In this call, applicants requested 87,183 node hours on the GPU-accelerated partition of the Karolina supercomputer for the upcoming allocation period. To ensure that the allocated resources are used effectively, the allocation commission discussed in detail the results of the technical evaluation and the evaluation of the ratio of registered publications per project.
The commission decided on the allocations based on the results of the computational readiness evaluation. Because sufficient computing resources were available, the commission decided to allocate the requested amounts in full.
The commission noted the generally very good technical level of the evaluated projects: out of a maximum of 5 points, projects scored 4 points on average, with a minimum of 2 points. Eight of the evaluated projects achieved the full 5 points.
A total of 87,183 node hours on the GPU-accelerated partition of the Karolina supercomputer were distributed among 18 projects.
IN THE SPECIAL CALL OF THE 24TH OPEN ACCESS GRANT COMPETITION OF IT4INNOVATIONS, THE ALLOCATION COMMISSION DISTRIBUTED THE COMPUTING RESOURCES AS FOLLOWS:
Principal investigator: Vladimír Petrík
Project: Robust object 6D pose estimation for robotics manipulation
Allocation: 6,000 node hours
Abstract: The majority of robots deployed in complex real-world scenarios rely on visual feedback that guides the manipulator's actions and motions. The visual feedback is often preprocessed to extract object-centric information about the scene, such as object poses or sizes. Although machine learning brings significant improvements to the accuracy of known-object pose detection, its robustness is not yet sufficient for continuous feedback in robot control. One of the limitations of the state-of-the-art methods is the requirement of full visibility, which makes predictions unstable when the object is partially occluded. This is a key limitation for robotic manipulation, as occlusions happen naturally during robot-object interactions. The ambition of the project is to improve the robustness of an existing state-of-the-art, award-winning 6D pose estimation system by addressing the problem of partial visibility via large-scale training on appropriately synthesized data. Breakthrough progress on this challenge would allow direct integration of the object pose detector into the robot control pipeline.
Principal investigator: Rudolf Rosa
Project: THEaiTRE GPT2 Recycling
Allocation: 1,500 node hours
Abstract: THEaiTRE is a research project which deals with automatically generating theatre play scripts using AI. The project is a collaboration of computational linguists with theatre experts and commemorates 100 years since the premiere of the first theatre play about robots, R.U.R. by Karel Čapek. The project has already produced the first theatre play script, titled "AI: When a Robot Writes a Play", 90% of which was generated by the THEaiTRobot 1.0 system, based on the OpenAI GPT2 language model. The play premiered in February 2021 through an online broadcast from the theatre and was viewed by thousands of spectators worldwide. We are interested in generating theatre plays in Czech, while the GPT2 language model is only available in English. Therefore, for the first play, we had to use automated translation of the model outputs to obtain a Czech script. For the second play, we would like to instead generate the script directly in Czech, for which we need a variant of the GPT2 model able to generate in Czech. We are thus trying to obtain such a model by “recycling” the model with large Czech data, using a method developed by de Vries and Nissim from the University of Groningen. Generating the scripts directly in Czech should lead to a higher quality of the texts.
Principal investigator: Karel Ondrej
Project: Knowledge Grounding for Open-Ended Language Generation
Allocation: 3,836 node hours
Abstract: The Covid times transferred many everyday tasks such as communication, shopping, and more into the virtual world. This led to the overloading of various support lines, where the customer had to wait tens of minutes for a trivial query to be processed. But what if we didn't need a person to answer a simple contract question or even to switch to another energy provider? In these cases, systems that can answer a question, conduct a dialogue with the user, and summarize long documents would help. In our project, we want to draw connections between information retrieval and related fields, such as dialogue generation, question answering, and abstractive summarization. This includes employing novel approaches for current research tasks such as knowledge-grounded multi-domain summarization and knowledge-grounded dialogue generation.
Principal investigator: Karel Chvalovský
Project: GNNs for Reasoning Tasks
Allocation: 1,843 node hours
Abstract: In automated reasoning and logic, one of the critical decision points is how to represent the input problem. There is a long history of using graph representations because they make it possible to express various relations naturally. However, stronger reasoning systems, which have greater expressive power, require complicated graphs with various types of nodes and edges. Recent advances in graph neural networks have led to architectures that make it possible to experiment with such representations easily. Nevertheless, it is still an open question how to represent the massive graphs that can quickly arise when we simultaneously handle many related problems. A usual bottleneck is insufficient GPU memory. We want to use the Karolina supercomputer to experiment with training various representations using multiple GPUs and nodes.
Principal investigator: Lukáš Soukup
Project: Pokročilé rozpoznávání a vytěžování ručně psaných textů s využitím neuronových sítí (Advanced recognition and extraction of handwritten texts using neural networks)
Allocation: 8,000 node hours
Abstract: The project is focused on hand-written text recognition (HTR). In the research, we will follow the state-of-the-art approaches in HTR based on transformers. We will use an internal large-scale dataset of Czech historical documents. The outcome of the research should help with the digitization of hand-written documents. The registration number of the project is CZ.01.1.02/0.0/0.0/20_321/0024760. The official web page of the project is https://www.inkcapture.com.
Principal investigator: Petr Kouba
Project: Machine Learning for Molecular Dynamics Simulations
Allocation: 4,000 node hours
Abstract: Molecular dynamics (MD) simulations allow analyzing the physical movements of biomolecules. The generated data are sequences of frames (in 100 000s) captured at a predefined time step. Each frame consists of the positions of all the atoms of a protein (from 100s to 10 000s), which are simulated using a molecular mechanics force field. The analysis of such a massive amount of data is often challenging, especially for molecules with conformational heterogeneity, such as the disordered Abeta peptide and APOE, which are relevant for Alzheimer's disease (AD). The Abeta peptide is the hallmark of the disease and adopts diverse conformations. APOE is believed to play an important role in Abeta clearance. Understanding the dynamic properties of both the Abeta peptide and APOE is key to determining the effects of drug candidates for potential AD treatment. The objective of this project is to apply existing machine learning tools (such as the VAMPnets neural network [1]) to analyze the MD simulations of Abeta and APOE systems and understand their dynamics.
Principal investigator: Martin Vastl
Project: Deep learning for symbolic regression
Allocation: 5,000 node hours
Abstract: Symbolic regression (SR) is a technique for finding a model in the form of analytic equations describing given data. Typically, it has been realized using the genetic programming (GP) method, an evolutionary algorithm that evolves solutions through a stochastic process mimicking natural evolution [1-2]. A disadvantage of this approach is that a new model has to be created from scratch through a time-consuming evolutionary process for each data set. Moreover, no knowledge acquired in these independent runs is reused in subsequent runs. In this project, we will investigate another SR approach based on the idea that one can use transformers, designed for sequence-to-sequence learning, to translate a set of data points into a proper analytic model [3-4]. This translation-based model-producing process can be much less computationally expensive than the evolutionary GP-based one.
Principal investigator: Jan Pichl
Project: Alquist: Conversational Artificial Intelligence
Allocation: 2,000 node hours
Abstract: The open-domain dialogue system Alquist aims to conduct a coherent and engaging conversation, which can be considered one of the benchmarks of social intelligence. One of the critical parts of the system is the set of generative models that can handle various user utterances and generate meaningful responses. The generative models are trained on extensive conversational data with the goal of generating the most suitable response given the conversation context. Optionally, additional input information can be provided, e.g., knowledge graphs. Our research goal is to train conditioned generative models to have more control over the response of the model. We plan to use various additional inputs, such as dialogue acts, topics, and knowledge graphs, to create a more focused generative model.
Principal investigator: Jan Hula
Project: SKINNER
Allocation: 1,000 node hours
Abstract: This project is focused on training a large neural network to predict the behavior of insects. Recently, after the success of large language models, it became apparent that neural networks with hundreds of millions or even billions of parameters trained on large datasets can predict the behavior of very complicated functions. Such language models, trained to predict individual words in a sentence, are able to complete a whole paragraph of plausible-looking text from just one sentence. Our goal is to repeat this success in the domain of animal behavior modeling. Instead of predicting individual words, we would be predicting the behavior of a given animal based on its past behavior and the environment. Concretely, we want to predict the behavior of individuals from an order of insects called Orthoptera. These insects exhibit repeating behavioral patterns and at the same time are very active, which makes them a good target for testing our model. We aim to train a large neural network to predict individual steps in this time series, conditioned on the history and on the current state of the environment. The resulting model will be released and could be used by researchers studying animal behavior to, for example, detect repeating behavioral patterns. It should also serve as an inspiration for similar models trained on other species.
Principal investigator: Martin Kišš
Project: Universal text recognition models
Allocation: 2,000 node hours
Abstract: Project PERO develops methods and software tools for automatic handwritten text recognition which are actively used by libraries and archives in the Czech Republic to transcribe their document collections. The automatic transcriptions enable, for example, full-text search of handwritten documents in existing public portals such as digitalniknihovna.cz. The project provides an open-source OCR package for Python, a REST API for large-scale document recognition, and a free-of-charge web application (pero-ocr.fit.vutbr.cz). Large multi-GPU computation nodes in the Karolina supercomputer allow us to train very large neural networks for universal text transcription, covering a large number of languages, handwriting scripts, printed font families and alphabets. Such universal models are able to utilize the similarities between languages and visual character representations and thus achieve higher text transcription accuracy, especially for poor-quality handwritten documents.
Principal investigator: Tomas Soucek
Project: Weakly supervised learning for video understanding
Allocation: 8,000 node hours
Abstract: Building machines that can automatically understand complex visual inputs is one of the central problems in artificial intelligence, with applications in autonomous robotics, automatic manufacturing or healthcare. The problem is difficult due to the large variability of the visual world. The recent successes are, in large part, due to a combination of learnable visual representations based on neural networks, supervised machine learning techniques and large-scale Internet image collections. The next fundamental challenge lies in developing visual representations that do not require full supervision in the form of inputs and target outputs, but are instead learnable from only weak supervision in the form of noisy and only partially annotated data. This project will address this challenge and will develop video representations that learn from only weak annotations. More details are available at http://impact.ciirc.cvut.cz/
Principal investigator: Tomáš Jeníček
Project: Day-night image retrieval
Allocation: 2,000 node hours
Abstract: Image retrieval is an important and active area in computer vision. The task is to query by image in a large indexed collection of images, where the search is based purely on the image content. Applications include content-based browsing and search in large image collections, visual localization, image annotation, data collection for 3D reconstruction, and many others. The current state-of-the-art retrieval methods are based on deep neural networks which are trained on a GPU. In our project, we address image retrieval under significant illumination changes, such as between day and night images, where the appearance changes dramatically. This is currently an active field of research because of its applicability to real-world tasks such as autonomous driving.
Principal investigator: Lukáš Neumann
Project: Architecture Search for Deep Learning
Allocation: 2,000 node hours
Abstract: Thanks to deep learning techniques, the field of Artificial Intelligence has made tremendous progress in the past years, and as a result has been successfully applied to many practical tasks in computer vision, natural language processing (NLP), speech recognition, etc. Whilst training deep networks relies on machine learning algorithms, which given training data optimize network parameters for a specific task, the network architecture (i.e. the network connectivity pattern as well as the operations used by individual network nodes) is almost always hand-crafted by humans, using a trial-and-error approach. This also applies to the most popular network architectures such as ResNet [1] or Transformers [2], including commercially exploited network architectures like GPT-2 [3]. Architecture Search methods aim to overcome this limitation by proposing algorithms which systematically explore possible deep network architectures and automatically find the most promising ones. The main challenge of such algorithms is the fact that the search space of all possible deep network architectures is exponentially large, which makes naïve methods such as exhaustive search impossible to use. An efficient search strategy therefore has to be applied to quickly discard architectures which do not lead to good accuracy and to focus only on the most promising network architectures.
Principal investigator: Oldrich Plchot
Project: Training neural models for speech applications
Allocation: 8,000 node hours
Abstract: In the last several years, the whole field of speech processing has benefitted from the rapid progress of machine learning techniques, which came hand in hand with continuously increasing computational power. The Speech@FIT group produced excellent results utilizing the FIT BUT computing infrastructure, but the recent shift to big data and large neural models has multiplied the computational power needed and negatively impacted the group's ability to compete with peer research institutions or companies. Computational resources provided by the IT4I infrastructure will be used to enable and accelerate the research on big data in key areas of speech processing such as automatic speech recognition (ASR), speaker verification (SV), diarization, language identification (LID), source separation, and keyword spotting. The test benchmarks will include training of state-of-the-art models in several fields of speech processing (mostly SV, diarization, ASR).
Principal investigator: Jonáš Kulhánek
Project: Neural 3D Scene Representation
Allocation: 8,000 node hours
Abstract: In contrast to classical scene representations, which are designed by humans, e.g., in the form of a 3D model, neural scene representations are learned completely from data. Recent work has shown that such representations can be very powerful, enabling highly accurate rendering of complex and intricate 3D geometry and computation of the position and orientation from which a given photo was taken (which has a wide range of applications, including self-driving cars and Augmented Reality). These two tasks (neural rendering and visual localization) are inherently related: given the position and orientation estimated by a localization algorithm, neural rendering can be used to render a virtual view from the predicted viewpoint. If the real and rendered images do not align precisely, this information can be used to refine the viewpoint prediction. Rather than training separate neural scene representations for rendering and localization, this project aims to make use of their inherent connection by training a single representation for both tasks. We expect that jointly training for both tasks will lead to a more powerful representation, which has the added benefit of reducing resource consumption (using a single representation instead of two).
Principal investigator: Luboš Šmídl
Project: Acoustic transformers for speech recognition
Allocation: 8,000 node hours
Abstract: Motivated by the human brain and by how children learn new skills, deep neural networks have become very powerful in solving very hard NLP tasks (such as speech recognition) while being conceptually simpler. The training of such large neural networks is possible only due to the huge amount of unlabeled data available on the Internet together with the stunning computing power of modern GPU clusters. The aim of the project is to use the power of a node with 8 A100 GPUs for training acoustic transformers for speech recognition. Currently, the use of transformer technologies (T5, BERT) in the field of natural language processing is popular and challenging. Acoustic transformers are based on a similar principle: with the help of large computing power and a huge amount of untranscribed data, a representation in the latent space (embedding space) is found. This representation can then be used in the tasks of speech recognition, speaker detection, diarization, keyword detection, query by example, etc.
Principal investigator: Radim Špetlík
Project: Weak Signal Analysis in RGB Images
Allocation: 8,000 node hours
Abstract: Glass-reflection removal, or glass-glare removal, is a problem of significant practical importance with applications ranging from license plate reading [1] to digital cleaning of camera optics [2]. Given a single photo, the task is to remove reflections, or glares, without affecting the background. Much research in recent years has focused on instances of reflection removal constrained by specific qualitative attributes, such as the requirement of full-screen reflection [6,7,8]. Glass-glare removal research has assumed constraints imposed by the development environment [3,1] or requirements of additional specialized hardware [4,5,2]. We address the general problem of reflection, or glare, removal, aiming at reducing the heavy constraints required by previous work, with a focus on the whole spectrum of the problem: we consider reflections of all sizes and strengths and we only require a single RGB image as an input.
Principal investigator: Jan Lehecka
Project: Text Transformers for NLP tasks
Allocation: 8,000 node hours
Abstract: In the last few years, deep neural networks known as Transformers have dominated the research field of Natural Language Processing (NLP). These highly sophisticated models benefit from the combination of a huge amount of unlabeled text data available on the Internet, self-supervised training methods motivated by the learning skills of the human brain, and the ever-increasing computational power of high-end GPU and TPU clusters. The main idea behind text-based Transformers is to let the model read as much text as possible during the pre-training phase in order to learn high-level language representations of individual words while paying sophisticated attention to their context. This is performed by pre-training the model on artificial tasks based on repairing corrupted or perturbed inputs. After that, the model is able to generate contextual embeddings encoding both the syntax and the semantics of the input text. These embeddings can be easily fine-tuned to solve a large variety of NLP tasks, such as text classification, text generation, chatbots, etc. Our research is focused on three types of Transformers: (1) Encoders are suitable for text classification tasks (RoBERTa [1]), (2) Decoders are suitable for generation of text (GPT-2 [2]) and (3) Encoder-Decoder models solve text-to-text problems (T5 [3]). Our models have already scored state-of-the-art results in several Czech NLP tasks, including text classification [4], sentiment analysis [5] or post-processing of ASR output [6].