A few days ago I had to deflate some compressed content using Rust. I quickly found the flate2 crate, it is well maintained and has a very high usage. I decided it would be a great fit for my little project, a CLI tool to download, decompress, and store in a local SQLite3 some content from AWS S3.
I wrote my small CLI tool, and it seemed to work perfectly on the first run. It was all nice until some time later. I have noticed that some lines of my compressed file were missing. The code was running fine and didn’t provide any warnings. After a few hours of debugging, I was sure there was a bug in my code somewhere. I went on and wrote a bunch more test to try to isolate in a reproducible way, where the problem was present, without any luck. All tests passing and no indication of problems anywhere.
At some point, I decided to use a sample from the original compressed files, and that is how I managed to reproduce the bug. The sample code in the README of the flate2 crate was what I used in my code:
use std::io::prelude::*;
use flate2::read::GzDecoder;
fn main() {
let mut d = GzDecoder::new("...".as_bytes());
let mut s = String::new();
d.read_to_string(&mut s).unwrap();
println!("{}", s);
}
Unfortunately, the GzDecoder
cannot successfully decompress my files. I went then to the official library repository and found this issue which describes exactly my problem. I had to use the MultiGzDecoder
instead, it all worked as expected after this change and I could decompress successfully my files. The example in the README of flate2 should probably be this:
use std::io::prelude::*;
use flate2::read::GzDecoder;
fn main() {
let mut d = MultiGzDecoder::new("...".as_bytes());
let mut s = String::new();
d.read_to_string(&mut s).unwrap();
println!("{}", s);
}
Finally, I still don’t know why the MultiGzDecoder
works when the GzDecoder
don’t. If you know, I will update this blog with your answer.