A Dive into Vision-Language Models | Pasteblog