The Computer Vision and Pattern Recognition (CVPR) conference was last week in Boston. For the sake of the computer vision folk (at least in my group), I created a summary/highlights document of some paper selections here: http://web.mit.edu/zoya/www/CVPR2015brief.pdf
It takes an hour just to read all the titles of all the sessions - over 120 posters/session, 2 sessions a day, 3 days... and workshops. This field is MONSTROUS in terms of output (and this is only the 20% or so of papers that actually make it to the main conference).
Thus, a selection of papers, rather than all of them, is at least a tiny bit more manageable.
The selections I made are roughly grouped by topic area, although many papers fit in more than one topic, and several might not be optimally grouped - but hey, this is how my brain sees it.
The selection includes posters I went to see, so I can vouch that they are at least vaguely interesting. For some of them I also include a few point-form notes, which are likely to help with navigation even more.
Here's my summary of the whole conference:
I saw a few main lines of work throughout this conference: CNNs applied to computer vision problem X, metric for evaluating CNNs applied to computer vision problem X, new dataset for problem X (many times larger than previous, to allow for application of CNNs to problem X), new way of labeling the data for the new dataset for CNNs.
In summary, CNNs are here to stay. At this conference I think everyone realized how many people are actually working on CNNs... there have been arxiv entries popping up all over, but once you actually find yourself in a room full of CNN-related posters, it really hits you. I think many people also realized how many other groups are working on the exact same problems, thinking about the exact same issues, and planning on the exact same approaches and datasets. It's become quite crowded.
So this year was the CNN hammer applied to just about any vision problem you can think of - setting new baselines and benchmarks left and right. You're working on an old/new problem? Have you tried CNNs? No? The crowd moves on to the next poster that has. Many papers have "deep" or "nets" somewhere in the title, with a cute way of naming models applied to some standard problem (ShapeNets, DeepShape, DeepID, DevNet, DeepContour, DeepEdge, segDeep, ActivityNet). See a pattern? Are these people using vastly different approaches to solve similar problems? Who knows.
So what is the field going to do next year? Solve the same problem with the next hottest architecture? R-CNNs? Even deeper? Some new networks with memory and attention modules? More importantly, do results get outdated the moment the papers are submitted because the next best architecture has already been released somewhere on arxiv, waiting for new benchmarking efforts? How do we track whether the numbers we are seeing reported are the latest numbers there are? Are papers really the best format to present this information and communicate progress?
These new trends in computer vision leave us with a lot of very hard questions to think about. It's becoming increasingly hard to predict where the field will be in a year, let alone a few years from now.
I think there are two emerging trends right now: more industry influence (all the big names seem to be moving to Google and Facebook), and more neuroscience influence (can the networks tell us more about the brain, and what can we learn about the brain to build better networks?). These two forces are increasingly shaping the field. Thus, closely watching what these two forces have at their disposal might offer glimpses into where we might be going with all of this...