papers_0420 - Motion Magnification

Reviews are sorted by question, with individual responses separated by dashes.

1) Briefly describe the paper and its contribution to computer graphics and interactive techniques. Please give your assessment of the scope and magnitude of the paper's contribution.

------------ Reviewer 1 -----------------------
This paper describes a technique for exaggerating small motions in video. It is the first paper to seriously address this problem.

------------ Reviewer 2 -----------------------
The paper describes a technique for magnifying motion in a video scene. The technique involves registering and segmenting the images automatically, with the segmentation based on similarity in position, color, and motion. Holes are filled in behind the different segments. The user then specifies a segment whose motion should be magnified, and the system automatically produces the resulting video with exaggerated motion.

------------ Reviewer 3 -----------------------
This paper presents a method to magnify motion in videos. The entire process is fairly automatic, though computationally intensive. This is a good application paper, combining many published techniques to produce the results. There is a fair amount of attention to detail, though some artifacts are still obvious. This is a good start for an interesting application, but the results are not of high enough quality.

------------ Reviewer 4 -----------------------
This paper shows how to magnify small motions in a video sequence by automatically computing accurate layered motion estimates and then re-synthesizing the sequence with one layer's motion amplified and the other layers copied and filled in, if necessary.
The layered motion algorithm is the most sophisticated one presented to date in the computer vision literature, and relies on a series of steps that include feature point tracking with adaptive regions of support, motion clustering based on common temporal correlation, and a GrabCut-like image segmentation that uses motion and color cues (as well as spatial continuity). The resulting demo of the amplified swing set is quite compelling, although some artifacts do remain.

------------ Reviewer 5 -----------------------
This paper presents a new technique for segmenting a video into coherent motion layers and "magnifying" the motion of a specific layer. It's a new technique and fits very well into SIGGRAPH.

2) Is the exposition clear? How could it be improved?

------------ Reviewer 1 -----------------------
The exposition is mostly clear, though there are a lot of pieces to this puzzle and the technical discussion feels like a laundry list. It's not clear why all the pieces were chosen as they were.

------------ Reviewer 2 -----------------------
Yes, the paper is very nicely written.

------------ Reviewer 3 -----------------------
It is very clear. Details are provided.

------------ Reviewer 4 -----------------------
Yes, the paper is clearly written, although there are a lot of details to plow through.

------------ Reviewer 5 -----------------------
Clear to me, but I think too many unnecessary formulas are used for the motion estimation, EM, etc.; some could be moved into the appendix. I understand them, because I know that literature inside and out and know where they come from, but for a less familiar reader a different exposition might be better.

3) Are the references adequate? List any references that are needed.

------------ Reviewer 1 -----------------------
I would reference "Cartoon-style Rendering of Motion from Video" by Collomosse, since they show exaggerations of motions in video. Nothing on this scale, though.
------------ Reviewer 2 -----------------------
Yes.

------------ Reviewer 3 -----------------------
References are good enough.

------------ Reviewer 4 -----------------------
Yes, but please give page numbers for ALL of your references.

------------ Reviewer 5 -----------------------
Adequate.

4) Could the work be reproduced by one or more skilled graduate students? Are all important algorithmic or system details discussed adequately? Are the limitations and drawbacks of the work clear?

------------ Reviewer 1 -----------------------
It seems mostly reproducible. The artifacts are partially discussed by the authors. The authors mention histogram equalization but don't say how they do it; maybe just add a reference here?

------------ Reviewer 2 -----------------------
Yes, yes, and yes.

------------ Reviewer 3 -----------------------
It could be reproduced by a skilled graduate student. Details are discussed enough. There is not enough presentation of the limitations and artifacts of the algorithm. Also, only three examples are shown.

------------ Reviewer 4 -----------------------
With difficulty. The authors describe a very complex multi-stage motion estimation algorithm with lots of parameters and potential variants (e.g., which spectral clustering method?). However, this should not be a cause of rejection, as other (less reliable) motion algorithms can also be used for the same effect.

Some limitations of the algorithm are discussed, although not in great detail. For example, in the swing video, a portion of the yellow awning pulls away from the rest of the awning, and this is not discussed.

------------ Reviewer 5 -----------------------
Yes, I think so.

7) Explain your rating by discussing the strengths and weaknesses of the submission. Include suggestions for improvement and publication alternatives, if appropriate.
Be thorough -- your explanation will be of highest importance for any committee discussion of the paper and will be used by the authors to improve their work. Be fair -- the authors spent a lot of effort to prepare their submission, and your evaluation will be forwarded to them during the rebuttal period.

------------ Reviewer 1 -----------------------
This is a very cool idea and paper, but I have some reservations that caused me to give it a middling score. I feel bad about that; this project has so much potential!

For one, the problem is not well motivated enough in the introduction. At first, I asked myself, "Why would anyone want to do this?" Eventually, I realized that there are many reasons, but the authors need to do a better job of explaining them. Exaggeration is a common technique in visual communication; you could, for example, reference Lasseter's classic paper on exaggeration techniques for computer animation. You could also talk about education; this would be a great learning aid in teaching kids about physics. Finally, you could talk about the history of photography communicating scientific phenomena. Photographers like Muybridge and Edgerton captured phenomena too quick to see with the naked eye, like a bullet through an apple or a galloping horse. You are communicating motion too SMALL to see with the naked eye, like the minute adjustments of a person balancing on their hands. Very cool! You should put this work in that context to better motivate it.

My second big problem is the prevalence of artifacts in the results. The swing set result especially is poor; the dynamic areas seem to swim over the static areas, and several beams and the yellow awning split and have very visible seams. This is a hard problem and I don't expect perfection. But I wish the authors had added more interaction to the process, rather than just accepting poor results. SIGGRAPH doesn't mind interaction!
For example, the segmentation seems to be the culprit behind many of the artifacts. Interactive segmentation techniques are now very powerful and quick. Given that your technique takes 10 hours to compute anyway, I don't think any reviewer would mark you down for letting the user spend a couple of minutes authoring/improving the segmentation. I doubt your average consumer, who might expect complete automation, would WANT to use motion magnification. A professional creating motion exaggeration for scientific visualization or education, on the other hand, would be willing to trade a few minutes of work for an improvement in results.

Criticisms aside, I really like the bookcase and handstand examples. I really get a sense of the motions that occur here, and these are great examples of why one would magnify motion. If this paper doesn't get accepted, I hope you spend time adding interaction, improving results, and improving your introduction to better motivate the problem. Then submit to SIGGRAPH 2006!

------------ Reviewer 2 -----------------------
This is very nice work: an appealing, novel problem, along with a collection of fairly sophisticated techniques which, taken together, provide an elegant solution that seems to work pretty well. The results are impressive, though not flawless. On the whole, I think the paper should most probably be accepted. Here are the things I would try to improve for the final version:

* It's a little disappointing that there are only three examples; I would like to see at least one more. Also, it would be nice to see the complete result in each case: for the bookshelf example, only a split-screen comparison is shown -- why not show the full result, followed by the comparison? For the handstand, it would be nice to see a longer example, perhaps starting earlier in the handstand and ending later.

* There are still quite a few objectionable segmentation artifacts, which are also mentioned in the paper.
For SIGGRAPH-quality results, it would be really nice to try to eliminate these somehow, even if you have to resort to some user interaction to achieve it -- in this case, you could show a before- and after-touching-up pair of videos. (Of course, improving the fully automatic segmentation would be even better, if possible.)

* I think the paper could do a better job of explicitly stating, for each step of the algorithm, when it is adapting an off-the-shelf technique and when it is introducing a novel approach. Right now, there is no easy way for the reader to tell unless he/she is deeply versed in the computer vision and machine learning techniques being used. I don't think this argues too strongly against acceptance, though, because even if all of the techniques are off-the-shelf, their assembly for the purposes of this new problem is still an interesting contribution. But it would be nice to know exactly what is new and what isn't.

------------ Reviewer 3 -----------------------
This paper is mainly about small motions. Why focus on small motion alone? The application would be more interesting and useful if all the motion layers could be captured. I suppose the main focus on small motion is because the motion segmentation technique cannot deal with larger-scale motion. Feature tracking will fail completely for the swinging person in sequence 1. A uniform framework for all types of motion would make this paper much stronger.

There are quite a few artifacts in the segmentation stage. This is particularly obvious in the swing sequence. The swing set is broken into many different parts. The top horizontal bar is segmented into two parts, one glued to the stationary background layer. There is much to be improved in the segmentation algorithm.

The hole filling also has serious artifacts. In the bookshelf sequence, the bottom of the upper shelf is "filled in" with the wrong color. The shadowing effects are not respected. This unpleasant artifact is fairly obvious.
Overall, the paper addresses an interesting application. However, the algorithm and domain are not yet mature enough for publication at SIGGRAPH.

------------ Reviewer 4 -----------------------
This is a very nice and ambitious piece of work that significantly advances the state of the art in layered motion estimation (especially for small motions) and has a neat application to visual effects, which is a traditional application area at SIGGRAPH. The idea of amplifying motions has not been suggested before, and yet I can think of other applications (e.g., in visual inspection and analysis) where it may be useful. It would be great to have other applications or examples with even more "wow" factor than the swing set, and I'm hoping the authors will come up with some before August.

The biggest question to decide is probably: is this appropriate for SIGGRAPH? Given the large number of recent papers about video and image processing for editing and visual effects applications, I believe the answer is yes. The quality of the work is definitely high enough, and I think the interest is there for a large segment of the SIGGRAPH community, especially since we seem to have pioneered, and now embraced, the whole field of computational photography and video.

------------ Reviewer 5 -----------------------
I believe the meat of the paper is the new, improved layered estimation algorithm. Motion magnification is just one application of this new representation. I am very impressed by the layered results, given that they are fully automatic. Compared to previous systems, they achieve a lot of improvements. I also like the idea of "clustering by coherent motion" using [26]. Most previous layered approaches (all of them, to the best of my knowledge) do not use this clustering; they just apply either affine or projective clustering on two consecutive frames. Incorporating [26] into the layered framework might also have a lot of uses for other applications.
About all the EM steps and likelihood estimation: if you are familiar with the equations from previous papers, then it's no problem to understand them. If you are not, maybe the authors should either skip them or give a more "gentle" introduction.

As I said, the layered results are very good, given that they are automatic. BUT they are not as good as other segmentations in SIGGRAPH papers that use semi-automatic techniques, and that has some influence on the motion magnification. The motion magnification overall looks good too, but there are still some artifacts to see. Other papers that are based on segmentation don't show those artifacts (as I said, they use semi-automatic techniques with some hand labels). The limited rendering quality due to those small errors is my main concern with the paper.

8) List here any questions that you want answered by the author(s) during the rebuttal period.
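[Editor's note] For readers unfamiliar with the technique under review, the core operation the reviewers call "motion magnification" -- amplifying a selected layer's motion relative to a reference frame -- can be sketched as below. This is a minimal illustrative sketch assuming a simple translational trajectory per layer; the function name and model are assumptions for exposition only, not the authors' actual layered pipeline (which involves feature tracking, motion clustering, segmentation, and hole filling).

```python
import numpy as np

def magnify_motion(trajectory, alpha=4.0):
    """Amplify a tracked layer's motion about its position in the
    reference (first) frame.

    trajectory : (N, 2) array of the layer's (x, y) position over
                 N frames.
    alpha      : magnification factor applied to each frame's
                 displacement from the reference position.
    """
    trajectory = np.asarray(trajectory, dtype=float)
    reference = trajectory[0]  # position in the reference frame
    # Scale each displacement from the reference by alpha.
    return reference + alpha * (trajectory - reference)

# A layer drifting 1 px/frame in x, magnified 4x:
orig = [(10.0, 5.0), (11.0, 5.0), (12.0, 5.0)]
mag = magnify_motion(orig, alpha=4.0)
# mag[2] is (18.0, 5.0): the 2 px displacement becomes 8 px.
```

The re-rendered video would then composite each layer at its magnified position, filling any disoccluded background -- the stage several reviewers flag as the source of visible artifacts.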