Java Compression Techniques (6): BZip2 with Commons Compress
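Related links:
Java Compression Techniques (1): ZLib
Java Compression Techniques (2): ZIP Compression with the JDK
Java Compression Techniques (3): ZIP Decompression with the JDK
Java Compression Techniques (4): GZIP with the JDK
Java Compression Techniques (5): GZIP and How Browsers Handle It
Java Compression Techniques (6): BZip2 with Commons Compress
Java Compression Techniques (7): TAR with Commons Compress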
I won't dig into how BZip2 and GZip are related. All I want to point out is that both algorithms have corresponding commands on Linux.
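GZip
Compress
gzip <file> produces the compressed file <file>.gz and deletes <file>
Decompress
gzip -d <file>.gz produces the decompressed file <file> and deletes <file>.gz
BZip2 works the same way, with almost no difference.
BZip2
Compress
bzip2 <file> produces the compressed file <file>.bz2 and deletes <file>
Decompress
bzip2 -d <file>.bz2 produces the decompressed file <file> and deletes <file>.bz2
Apart from the command name, the two are almost identical!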
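Now for the implementation. GZIP ships with the JDK, but BZip2 never got that treatment. Apache, however, was not about to let algorithms that thrive on Linux vanish from the Java world: Commons Compress provides an implementation. It also covers the well-known tar, cpio, and zip formats, of which zip has the richest support.
I'll keep following the same pattern as in the earlier articles.
The BZip2CompressorOutputStream class is used for compression
The BZip2CompressorInputStream class is used for decompression
Compression first. The method you need on BZip2CompressorOutputStream is the write method that takes a buffer, an offset, and a length. A simple call looks like this: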
Java code
    /**
     * Compresses data from one stream into another.
     *
     * @param is source of the uncompressed data
     * @param os destination for the BZip2-compressed data
     * @throws Exception if reading or writing fails
     */
    public static void compress(InputStream is, OutputStream os)
            throws Exception {
        BZip2CompressorOutputStream gos = new BZip2CompressorOutputStream(os);
        int count;
        byte[] data = new byte[BUFFER];
        while ((count = is.read(data, 0, BUFFER)) != -1) {
            gos.write(data, 0, count);
        }
        // finish() completes the compressed stream before it is closed
        gos.finish();
        gos.flush();
        gos.close();
    }
How does this differ from the GZip implementation? Apart from swapping out GZIPOutputStream, not at all.
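Decompression needs even less explanation: BZip2CompressorInputStream provides a read method that takes a buffer, an offset, and a length. A simple call looks like this: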
Java code
    /**
     * Decompresses data from one stream into another.
     *
     * @param is source of the BZip2-compressed data
     * @param os destination for the decompressed data
     * @throws Exception if reading or writing fails
     */
    public static void decompress(InputStream is, OutputStream os)
            throws Exception {
        BZip2CompressorInputStream gis = new BZip2CompressorInputStream(is);
        int count;
        byte[] data = new byte[BUFFER];
        while ((count = gis.read(data, 0, BUFFER)) != -1) {
            os.write(data, 0, count);
        }
        gis.close();
    }
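For comparison, here is a minimal sketch of the same two methods written against the JDK's GZIP streams, to show that only the stream classes change. The class name GzipStreamSketch and its main method are made up for illustration, and unlike the methods above this sketch leaves the caller's streams open.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the article's code.
final class GzipStreamSketch {
    static final int BUFFER = 1024;

    // Same loop as the BZip2 version; only the wrapping stream differs.
    static void compress(InputStream is, OutputStream os) throws IOException {
        GZIPOutputStream gos = new GZIPOutputStream(os);
        byte[] data = new byte[BUFFER];
        int count;
        while ((count = is.read(data, 0, BUFFER)) != -1) {
            gos.write(data, 0, count);
        }
        gos.finish(); // writes the remaining compressed data and the GZIP trailer
        gos.flush();
    }

    static void decompress(InputStream is, OutputStream os) throws IOException {
        GZIPInputStream gis = new GZIPInputStream(is);
        byte[] data = new byte[BUFFER];
        int count;
        while ((count = gis.read(data, 0, BUFFER)) != -1) {
            os.write(data, 0, count);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] input = "zlex@zlex.org,snowolf@zlex.org".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream packed = new ByteArrayOutputStream();
        compress(new ByteArrayInputStream(input), packed);
        ByteArrayOutputStream unpacked = new ByteArrayOutputStream();
        decompress(new ByteArrayInputStream(packed.toByteArray()), unpacked);
        // Prints the round-tripped text
        System.out.println(new String(unpacked.toByteArray(), StandardCharsets.UTF_8));
    }
}
```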
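Nothing difficult here!
That's how it goes in IT: if you put your mind to it and learn by analogy, everything starts to fit together!
Here is a complete implementation: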
Java code
    /**
     * 2010-4-15
     */
    package org.zlex.commons.compress.compress;

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.InputStream;
    import java.io.OutputStream;

    import org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream;
    import org.apache.commons.compress.compressors.bzip2.BZip2CompressorOutputStream;

    /**
     * BZip2 utilities
     *
     * @author <a href="mailto:zlex.dongliang@gmail.com">梁栋</a>
     * @since 1.0
     */
    public abstract class BZip2Utils {

        public static final int BUFFER = 1024;
        public static final String EXT = ".bz2";

        /**
         * Compresses a byte array.
         *
         * @param data the uncompressed bytes
         * @return the BZip2-compressed bytes
         * @throws Exception if compression fails
         */
        public static byte[] compress(byte[] data) throws Exception {
            ByteArrayInputStream bais = new ByteArrayInputStream(data);
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            // Compress
            compress(bais, baos);
            // Closing byte-array streams has no effect, so no cleanup is needed
            return baos.toByteArray();
        }

        /**
         * Compresses a file and deletes the original.
         *
         * @param file the file to compress
         * @throws Exception if compression fails
         */
        public static void compress(File file) throws Exception {
            compress(file, true);
        }

        /**
         * Compresses a file into a sibling file with the .bz2 extension.
         *
         * @param file   the file to compress
         * @param delete whether to delete the original file
         * @throws Exception if compression fails
         */
        public static void compress(File file, boolean delete) throws Exception {
            // try-with-resources closes both streams even if compression fails
            try (FileInputStream fis = new FileInputStream(file);
                    FileOutputStream fos = new FileOutputStream(file.getPath() + EXT)) {
                compress(fis, fos);
            }
            if (delete) {
                file.delete();
            }
        }

        /**
         * Compresses data from one stream into another.
         *
         * @param is source of the uncompressed data
         * @param os destination for the BZip2-compressed data
         * @throws Exception if reading or writing fails
         */
        public static void compress(InputStream is, OutputStream os)
                throws Exception {
            BZip2CompressorOutputStream gos = new BZip2CompressorOutputStream(os);
            int count;
            byte[] data = new byte[BUFFER];
            while ((count = is.read(data, 0, BUFFER)) != -1) {
                gos.write(data, 0, count);
            }
            // finish() completes the compressed stream before it is closed
            gos.finish();
            gos.flush();
            gos.close();
        }

        /**
         * Compresses the file at the given path and deletes the original.
         *
         * @param path path of the file to compress
         * @throws Exception if compression fails
         */
        public static void compress(String path) throws Exception {
            compress(path, true);
        }

        /**
         * Compresses the file at the given path.
         *
         * @param path   path of the file to compress
         * @param delete whether to delete the original file
         * @throws Exception if compression fails
         */
        public static void compress(String path, boolean delete) throws Exception {
            File file = new File(path);
            compress(file, delete);
        }

        /**
         * Decompresses a byte array.
         *
         * @param data the BZip2-compressed bytes
         * @return the decompressed bytes
         * @throws Exception if decompression fails
         */
        public static byte[] decompress(byte[] data) throws Exception {
            ByteArrayInputStream bais = new ByteArrayInputStream(data);
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            // Decompress
            decompress(bais, baos);
            return baos.toByteArray();
        }

        /**
         * Decompresses a file and deletes the compressed original.
         *
         * @param file the .bz2 file to decompress
         * @throws Exception if decompression fails
         */
        public static void decompress(File file) throws Exception {
            decompress(file, true);
        }

        /**
         * Decompresses a .bz2 file into a sibling file without the extension.
         *
         * @param file   the .bz2 file to decompress
         * @param delete whether to delete the compressed file
         * @throws Exception if decompression fails
         */
        public static void decompress(File file, boolean delete) throws Exception {
            String path = file.getPath();
            // Strip only a trailing ".bz2". String.replace would also change
            // matches elsewhere in the path, and a path without the extension
            // would make the output overwrite the input.
            if (!path.endsWith(EXT)) {
                throw new IllegalArgumentException("Expected a " + EXT + " file: " + path);
            }
            String target = path.substring(0, path.length() - EXT.length());
            try (FileInputStream fis = new FileInputStream(file);
                    FileOutputStream fos = new FileOutputStream(target)) {
                decompress(fis, fos);
            }
            if (delete) {
                file.delete();
            }
        }

        /**
         * Decompresses data from one stream into another.
         *
         * @param is source of the BZip2-compressed data
         * @param os destination for the decompressed data
         * @throws Exception if reading or writing fails
         */
        public static void decompress(InputStream is, OutputStream os)
                throws Exception {
            BZip2CompressorInputStream gis = new BZip2CompressorInputStream(is);
            int count;
            byte[] data = new byte[BUFFER];
            while ((count = gis.read(data, 0, BUFFER)) != -1) {
                os.write(data, 0, count);
            }
            gis.close();
        }

        /**
         * Decompresses the file at the given path and deletes the compressed original.
         *
         * @param path path of the .bz2 file to decompress
         * @throws Exception if decompression fails
         */
        public static void decompress(String path) throws Exception {
            decompress(path, true);
        }

        /**
         * Decompresses the file at the given path.
         *
         * @param path   path of the .bz2 file to decompress
         * @param delete whether to delete the compressed file
         * @throws Exception if decompression fails
         */
        public static void decompress(String path, boolean delete) throws Exception {
            File file = new File(path);
            decompress(file, delete);
        }
    }
And here is a matching test case:
Java code
    /**
     * 2010-4-13
     */
    package org.zlex.commons.compress.compress;

    import static org.junit.Assert.assertEquals;

    import java.io.DataInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;

    import org.junit.Test;

    /**
     * BZip2
     *
     * @author <a href="mailto:zlex.dongliang@gmail.com">梁栋</a>
     * @since 1.0
     */
    public class BZip2UtilsTest {

        private String inputStr = "zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org";

        @Test
        public final void testDataCompress() throws Exception {
            byte[] input = inputStr.getBytes();
            System.err.println("Original:\t" + inputStr);
            System.err.println("Length:\t" + input.length);
            byte[] data = BZip2Utils.compress(input);
            System.err.println("Compressed:\t");
            System.err.println("Length:\t" + data.length);
            byte[] output = BZip2Utils.decompress(data);
            String outputStr = new String(output);
            System.err.println("Decompressed:\t" + outputStr);
            System.err.println("Length:\t" + output.length);
            assertEquals(inputStr, outputStr);
        }

        @Test
        public final void testFileCompress() throws Exception {
            // A temp file replaces the hard-coded d:/f.txt so the test runs on any OS
            File file = File.createTempFile("bzip2-test", ".txt");
            file.deleteOnExit();
            String path = file.getPath();
            try (FileOutputStream fos = new FileOutputStream(file)) {
                fos.write(inputStr.getBytes());
            }
            // Compress to <path>.bz2 (deleting the original), then restore it
            BZip2Utils.compress(path);
            BZip2Utils.decompress(path + BZip2Utils.EXT);
            byte[] data = new byte[(int) file.length()];
            try (DataInputStream dis = new DataInputStream(new FileInputStream(file))) {
                dis.readFully(data);
            }
            String outputStr = new String(data);
            assertEquals(inputStr, outputStr);
        }
    }
Although the two algorithms barely differ in code, getting BZip2 to show a visible compression effect took me some effort!
The console output is as follows:
Quote
Original: zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org
Length: 529
Compressed:
Length: 76
Decompressed: zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org,zlex@zlex.org,snowolf@zlex.org,zlex.snowolf@zlex.org
Length: 529
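529 bytes --> 76 bytes!
GZIP shows a compression effect even on fairly short content, whereas BZip2 only shows one once the content is long enough. One plausible explanation is fixed per-stream overhead (headers, checksums, coding tables), which dominates when the input is short. Does this mean BZip2 is better suited to compressing large data?!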
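The effect of input length on the result can be checked by simply measuring compressed sizes. The sketch below does this for GZIP only, because GZIP is available in the JDK without extra dependencies; the class name CompressedSizeSketch and the repeated sample string are assumptions made for illustration. Swapping in BZip2CompressorOutputStream (with Commons Compress on the classpath) should allow the same measurement for BZip2.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical measuring helper, not part of the article's code.
final class CompressedSizeSketch {

    // Returns how many bytes GZIP produces for the given input.
    static int gzipSize(byte[] input) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gos = new GZIPOutputStream(baos)) {
            gos.write(input, 0, input.length);
        }
        return baos.size();
    }

    // Builds an input by repeating a unit string the given number of times.
    static byte[] repeat(String unit, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(unit);
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Short inputs are dominated by the fixed header and trailer;
        // longer repetitive inputs shrink considerably.
        for (int times : new int[] {1, 10, 100}) {
            byte[] input = repeat("zlex@zlex.org,", times);
            System.out.println(input.length + " bytes -> " + gzipSize(input) + " bytes");
        }
    }
}
```

Running it with inputs of increasing length shows where compression starts to pay off for a given algorithm.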
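Commons Compress supports not only BZip2 but also GZip. Its GZip implementation is essentially the same as the JDK's: judging from the source code, it is just a thin wrapper.
One thing worth mentioning, though: Commons Compress provides a factory class, CompressorStreamFactory, for its compressors (GZip and BZip2). It makes it easy to create GZip and BZip2 input and output streams; the keys are "gz" and "bzip2" respectively.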
GZip
Java code
    // GzipCompressorInputStream
    CompressorInputStream gzipIn = new CompressorStreamFactory()
            .createCompressorInputStream("gz", is);
    // GzipCompressorOutputStream
    CompressorOutputStream gzipOut = new CompressorStreamFactory()
            .createCompressorOutputStream("gz", os);
BZip2
Java code
    // BZip2CompressorInputStream
    CompressorInputStream bzip2In = new CompressorStreamFactory()
            .createCompressorInputStream("bzip2", is);
    // BZip2CompressorOutputStream
    CompressorOutputStream bzip2Out = new CompressorStreamFactory()
            .createCompressorOutputStream("bzip2", os);
GZip and BZip2 follow essentially the same steps, so if you need a unified code path, you can build it as shown above!
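To illustrate what such a unified code path might look like, here is a rough sketch of the "pick the stream by name" idea using JDK streams only, so that it runs without extra dependencies. Everything in it is an assumption for illustration: the class NamedStreams, its methods, and the "deflate" key are hypothetical and are not part of Commons Compress; only the "gz" key mirrors the one used above. With Commons Compress, the two switch statements would be replaced by calls to CompressorStreamFactory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical JDK-only stand-in for a name-based stream factory.
final class NamedStreams {

    // Wraps os in the compressing stream registered under the given name.
    static OutputStream output(String name, OutputStream os) throws IOException {
        switch (name) {
            case "gz":
                return new GZIPOutputStream(os);
            case "deflate":
                return new DeflaterOutputStream(os);
            default:
                throw new IllegalArgumentException("Unknown algorithm: " + name);
        }
    }

    // Wraps is in the decompressing stream registered under the given name.
    static InputStream input(String name, InputStream is) throws IOException {
        switch (name) {
            case "gz":
                return new GZIPInputStream(is);
            case "deflate":
                return new InflaterInputStream(is);
            default:
                throw new IllegalArgumentException("Unknown algorithm: " + name);
        }
    }

    // One compress method serves every registered algorithm.
    static byte[] compress(String name, byte[] data) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (OutputStream out = output(name, baos)) {
            out.write(data, 0, data.length);
        }
        return baos.toByteArray();
    }

    // One decompress method serves every registered algorithm.
    static byte[] decompress(String name, byte[] data) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (InputStream in = input(name, new ByteArrayInputStream(data))) {
            byte[] buffer = new byte[1024];
            int count;
            while ((count = in.read(buffer, 0, buffer.length)) != -1) {
                baos.write(buffer, 0, count);
            }
        }
        return baos.toByteArray();
    }
}
```

The calling code then depends only on the algorithm name, for example NamedStreams.decompress("gz", NamedStreams.compress("gz", bytes)).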