Paper Reviews (17)

[X:AI] Flamingo Paper Review. Flamingo: a Visual Language Model for Few-Shot Learning 🦩 Original paper: https://arxiv.org/abs/2204.14198 Building models that can be rapidly adapted to novel tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research. We introduce Flamingo, a family of Visual Language Models (VLM) with this abili.. 2024. 8. 26.
[X:AI] DDPM Paper Review. Denoising Diffusion Probabilistic Models. Original paper: https://arxiv.org/abs/2006.11239 We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound (arxiv.org) 1. Abstrac.. 2024. 8. 11.
[X:AI] BYOL Paper Review. Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning. Original paper: https://arxiv.org/abs/2006.07733 We introduce Bootstrap Your Own Latent (BYOL), a new approach to self-supervised image representation learning. BYOL relies on two neural networks, referred to as online and target networks, that interact and learn from.. 2024. 8. 5.
[X:AI] NeRF Paper Review. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. Original paper: https://arxiv.org/abs/2003.08934 We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. Our algorit.. 2024. 7. 28.
[X:AI] DETR Paper Review. End-to-End Object Detection with Transformers. Abstract: The paper treats object detection as a direct set prediction problem. This approach needs no hand-designed prior knowledge (NMS or anchor generation), which simplifies the detection pipeline. The main components of DETR (DEtection TRansformer) are a set-based global loss that forces unique predictions via bipartite matching, and a Transformer encoder-decoder architecture. Accuracy is comparable to Faster R-CNN, and the model generalizes well enough to be used for panoptic segmentation as well.. 2024. 7. 23.
[X:AI] MOFA-Video Paper Review. MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model. Original paper: https://arxiv.org/abs/2405.20222 We present MOFA-Video, an advanced controllable image animation method that generates video from the given image using.. 2024. 7. 20.