Today's headache: digital cameras
If you value your sanity, don't accept external files in your own programs. Parsing someone else's output is just going to give you headaches.
Okay, it's not realistic, but I can dream, can't I?
Digital cameras that produce AVI files, particularly Motion JPEG encoded ones, have been a bit of a problem for me because they sometimes produce files that are marginal or non-compliant. One common problem is that the video stream contains JPEG frames with custom Huffman tables (DHT markers), which Microsoft's original MJPEG spec says you're not supposed to include. Instead, the Huffman tables are left out of the stream and the decoder uses a fixed set, for speed and simplicity. VirtualDub's internal MJPEG decoder was written with this in mind, so it won't decode streams that have custom Huffman tables, and if you don't have an MJPEG codec installed you'll get decode errors. I haven't gotten around to rewriting the decoder so that it can handle custom tables, since it was sort of meant to be a fallback to begin with.
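For the curious, here's roughly how you could check whether a frame carries its own Huffman tables. This is only a sketch in Python, not VirtualDub's decoder: the function name `has_custom_huffman_tables` is made up for illustration, and it assumes you've already pulled a single JPEG frame out of the AVI as a byte string. It walks the marker segments from the start of the frame and reports whether a DHT marker (0xFFC4) shows up before the scan data begins.

```python
import struct

def has_custom_huffman_tables(frame: bytes) -> bool:
    """Return True if a DHT (0xFFC4) segment appears before the scan data.

    Hypothetical helper for illustration; 'frame' is assumed to be one
    complete JPEG frame extracted from the video stream.
    """
    if frame[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG frame: missing SOI marker")
    pos = 2
    while pos + 4 <= len(frame):
        if frame[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = frame[pos + 1]
        if marker == 0xFF:
            # Fill byte before the real marker; skip it.
            pos += 1
            continue
        if marker == 0xC4:
            # DHT: the frame defines its own Huffman tables.
            return True
        if marker in (0xDA, 0xD9):
            # SOS or EOI: header is over and no DHT was seen.
            return False
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            # Standalone markers carry no length field.
            pos += 2
            continue
        # Every other segment has a big-endian length that includes itself.
        (length,) = struct.unpack_from(">H", frame, pos + 2)
        if length < 2:
            raise ValueError(f"bad segment length at offset {pos}")
        pos += 2 + length
    return False
```

A decoder that only supports the fixed tables could run a check like this up front and fail with a clear message instead of a generic decode error.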
Anyway, I received a sample file today from another digital camera that has a new problem, producing broken AVI files. This time it isn't the video stream, but the RIFF structure itself: the data after the first video frame (00dc) chunk is just garbage. If you open it in VirtualDub or a standard video player, it plays fine, because the outermost part of the RIFF structure is fine and that's enough to get to the index. Most AVI parsers use the index whenever they can, so they read the file without trouble, since the index points directly at the frames. Anything that tries walking the RIFF structure, though, will barf because it's invalid within the LIST/movi chunk that holds the frames. Dumping the file, it looks like whoever wrote the camera firmware decided to save five minutes by just seeking to the next sector boundary instead of actually writing a proper JUNK padding chunk. Sigh.
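To make the failure concrete, here's a minimal sketch of a strict RIFF walker, again in Python and again not VirtualDub's actual parser. The name `walk_riff` is hypothetical, it assumes the whole file is in memory, and the "FourCC must be printable ASCII" check is my own heuristic for spotting garbage rather than anything the format mandates. Each chunk is a four-byte ID, a little-endian 32-bit size, and the payload padded to an even length; RIFF and LIST chunks nest more chunks after a four-byte type.

```python
import struct

def walk_riff(data, start=0, end=None, depth=0):
    """Yield (depth, fourcc, list_type, offset, size) for each chunk.

    Raises ValueError as soon as the structure stops making sense.
    Hypothetical helper for illustration only.
    """
    if end is None:
        end = len(data)
    pos = start
    while pos < end:
        if pos + 8 > end:
            raise ValueError(f"truncated chunk header at offset {pos}")
        fourcc, size = struct.unpack_from("<4sI", data, pos)
        # Heuristic (an assumption, not a spec rule): real chunk IDs
        # are printable ASCII, so anything else is treated as garbage.
        if not all(0x20 <= b <= 0x7E for b in fourcc):
            raise ValueError(f"garbage chunk ID {fourcc!r} at offset {pos}")
        body = pos + 8
        if body + size > end:
            raise ValueError(f"chunk {fourcc!r} at offset {pos} overruns its parent")
        if fourcc in (b"RIFF", b"LIST"):
            if size < 4:
                raise ValueError(f"list chunk at offset {pos} has no type")
            list_type = data[body:body + 4]
            yield (depth, fourcc, list_type, pos, size)
            # Recurse into the nested chunks that follow the list type.
            yield from walk_riff(data, body + 4, body + size, depth + 1)
        else:
            yield (depth, fourcc, None, pos, size)
        # Chunks are padded to an even number of bytes.
        pos = body + size + (size & 1)
```

On a file like the one described, a walker along these lines would get through the first 00dc chunk inside LIST/movi and then raise on whatever bytes happen to sit where the JUNK chunk should have been, while an index-based reader would never look there at all.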
AVI, like many formats, suffers from decay due to the "works well enough" effect. In fact, just about any format in popular use will have this problem when the programs that read it don't do strict validation and the people who use those programs don't care about conformance. It's like having to deal with XML files where the angle brackets don't match because the person who wrote them figured that everyone uses regexes to parse XML anyway.