Is NIO still that fast for small files?

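  Everyone knows NIO is fast at reading large files. Writing small files is a different story. (This example grew out of the problem of finding the most frequent number in a 10 GB file using only 1 GB of memory. Since NIO has an edge on large files, I assumed small-file reads and writes would be fast too.)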

  Here is an example that uses NIO to write a String to a file:

package nio;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class NIOWriter {

    private FileChannel fileChannel;
    private ByteBuffer buf;

    @SuppressWarnings("resource")
    public NIOWriter(File file, int capacity) {
        try {
            TimeMonitor.start();
            fileChannel = new FileOutputStream(file).getChannel();
            TimeMonitor.end("fileChannel = new FileOutputStream(file).getChannel()");
            buf = ByteBuffer.allocate(capacity);

        } catch (Exception e) {
            e.printStackTrace();
        }

    }

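    /**
     * Writes in a loop rather than recursively: with a large string and a
     * small buffer, recursion could trigger a StackOverflowError.
     *
     * @param str the string to write
     * @throws IOException if writing to the channel fails
     */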
    public void write(String str) throws IOException {
        TimeMonitor.start();
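        byte[] bytes = str.getBytes();
        // Loop on the byte count, not str.length(): the two differ for
        // multi-byte characters, which would cut the output short.
        int startPosition = 0;
        do {
            startPosition = write0(bytes, startPosition);
        } while (startPosition < bytes.length);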
        TimeMonitor.end("write(String str)");
    }

    public int write0(byte[] bytes, int position) throws IOException {
        if (position >= bytes.length) {
            return position;
        }
        while (buf.hasRemaining()) {
            if (position < bytes.length) {
                buf.put(bytes[position]);
                position++;
            } else {
                break;
            }
        }
        buf.flip();
        fileChannel.write(buf);

        buf.clear();
        return position;
    }
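    /**
     * Closes the channel. write0 has already handed the buffered bytes to
     * the channel, so nothing is left to flush here.
     */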
    public void close() {
        try {
            fileChannel.close();
        } catch (IOException e) {

            e.printStackTrace();
        }
    }    
}
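
  As a point of comparison, here is a minimal sketch of the same idea with two changes: bytes are copied into the buffer in bulk instead of one at a time, and the buffer is drained in a loop because FileChannel.write may write fewer bytes than requested. The class name BulkNioWriter and the fixed UTF-8 charset are assumptions for illustration and are not part of the code that was benchmarked.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical variant of NIOWriter, for illustration only.
final class BulkNioWriter implements AutoCloseable {

    private final FileChannel channel;
    private final ByteBuffer buf;

    BulkNioWriter(Path path, int capacity) throws IOException {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING);
        buf = ByteBuffer.allocate(capacity);
    }

    void write(String str) throws IOException {
        // Assumption: UTF-8 instead of the platform default charset.
        byte[] bytes = str.getBytes(StandardCharsets.UTF_8);
        int position = 0;
        while (position < bytes.length) {
            // Copy as many bytes as fit into the buffer in one bulk put.
            int chunk = Math.min(buf.remaining(), bytes.length - position);
            buf.put(bytes, position, chunk);
            position += chunk;

            // Switch to read mode and drain the buffer completely.
            buf.flip();
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
            buf.clear();
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

  Because it implements AutoCloseable, the writer can be used in a try-with-resources block. This sketch was not benchmarked here, so it says nothing about whether it changes the results below.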

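  Next, I used JMH to compare this against other ways of writing a small file. In theory the NIO version buffers every write, so it should be fast, but in practice it is not. The comparison below covers FileWriter, a buffered stream, a plain unbuffered stream, and NIO: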

package nio;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class WriterTest {

    static int writeCount = 1;
    static String inputStr = "very woman is a" + " treasure but way too often we forget"
            + " how precious they are. We get lost in daily "
            + "chores and stinky diapers, in work deadlines and dirty dishes, in daily errands and occasional breakdowns.";

    public static void main(String[] args) throws Exception {
        Options opt = new OptionsBuilder().include(WriterTest.class.getSimpleName()).forks(1).build();
        new Runner(opt).run();

    }

    @Benchmark
    @BenchmarkMode(Mode.SingleShotTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public void TestFileWriter() throws IOException {
        File file = new File("/Users/xujianxing/Desktop/fileWithWriter.txt");
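        // Note: this writes "JAVA TEST", which is shorter than the inputStr used by
        // the other benchmarks, so the comparison is not strictly like-for-like.
        FileWriter fileWriter = new FileWriter(file);
        for (int i = 0; i < writeCount; i++) {
            fileWriter.write("JAVA TEST");
        }
        // Close the writer so the data is flushed and the file handle is released.
        fileWriter.close();
    }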

    @Benchmark
    @BenchmarkMode(Mode.SingleShotTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public void TestBuffer() throws IOException {
        File file = new File("/Users/xujianxing/Desktop/buffer.txt");
        BufferedOutputStream buffer = new BufferedOutputStream(new FileOutputStream(file));
        for (int i = 0; i < writeCount; i++) {
            buffer.write(inputStr.getBytes());
        }
        buffer.flush();
        buffer.close();
    }

    @Benchmark
    @BenchmarkMode(Mode.SingleShotTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public void TestNormal() throws IOException {
        File file = new File("/Users/xujianxing/Desktop/normal.txt");
        FileOutputStream out = new FileOutputStream(file);
        for (int i = 0; i < writeCount; i++) {
            out.write(inputStr.getBytes());

        }
        out.close();

    }

    @Benchmark
    @BenchmarkMode(Mode.SingleShotTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public void TestNIO() throws IOException {
        File file = new File("/Users/xujianxing/Desktop/nio.txt");
        NIOWriter nioWriter = new NIOWriter(file, 2048);
        for (int i = 0; i < writeCount; i++) {
            nioWriter.write(inputStr);

        }
       nioWriter.close();

    }

}

  The results, however, were a real surprise. Take a look:

Benchmark                  Mode  Cnt  Score   Error  Units
WriterTest.TestBuffer        ss       0.265          ms/op
WriterTest.TestFileWriter    ss       0.232          ms/op
WriterTest.TestNIO           ss       8.479          ms/op
WriterTest.TestNormal        ss       0.217          ms/op

  NIO came out worst, taking more than 8 ms! And the more writes there are, the wider the gap gets.

  So why is that?

  I wrote a simple helper class to record the time spent:

package nio;

public class TimeMonitor {
    private static long start = 0;

    private static long end = 0;

    public static void start() {

        start = System.currentTimeMillis();
        end = 0;

    }

    public static void end(String  tag) {
        end = System.currentTimeMillis();
        System.out.println("time  coast:"+tag+"---------->" + (end - start));
        end = 0;
        start = 0;
    }

}
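
  Because System.currentTimeMillis() only resolves whole milliseconds, any step shorter than that prints as 0. Here is a minimal sketch of a finer-grained timer built on System.nanoTime(). The class name NanoTimer and its methods are made up for illustration, and it is not the helper that produced the output in this post.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
final class NanoTimer {

    // Captured at construction; System.nanoTime() is meant for measuring elapsed time.
    private final long startNanos = System.nanoTime();

    // Nanoseconds elapsed since this timer was created.
    long elapsedNanos() {
        return System.nanoTime() - startNanos;
    }

    // Formats the elapsed time in microseconds, in the same style as TimeMonitor.
    String report(String tag) {
        long micros = TimeUnit.NANOSECONDS.toMicros(elapsedNanos());
        return "time cost:" + tag + "---------->" + micros + " us";
    }
}
```

  Each timer holds its own start time, so nested or overlapping measurements do not overwrite each other the way the static fields in TimeMonitor would.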

  I then printed timings at the places I suspected were slow (obtaining the channel, writing to the channel, and so on). The output:

time  coast:fileChannel = new FileOutputStream(file).getChannel()---------->5
time  coast:write0---------->0
time  coast:fileChannel.write(buf)---------->2

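  The output confirms it: most of the time goes into creating the channel and writing to it. But these steps add up to about 7 ms, so where did the remaining millisecond or so go?

  I changed the benchmark method to time the whole thing: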

    @Benchmark
    @BenchmarkMode(Mode.SingleShotTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public void TestNIO() throws IOException {
        TimeMonitor.start();
        File file = new File("/Users/xujianxing/Desktop/nio.txt");
        NIOWriter nioWriter = new NIOWriter(file, 2048);
//        TimeMonitor.end("NIOWriter nioWriter = new NIOWriter(file, 2048);");
        for (int i = 0; i < writeCount; i++) {
//            TimeMonitor.start();
            nioWriter.write(inputStr);
//            TimeMonitor.end("nioWriter.write(inputStr)");
        }
//        TimeMonitor.start();
        nioWriter.close();
        TimeMonitor.end("    nioWriter.close()");
    }

}

  The output (it can differ from machine to machine and from run to run):

 time  coast:    nioWriter.close()---------->6

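  So the extra millisecond or so is most likely overhead from JMH itself. Oddly, the other, conventional write methods do not show the same effect, and I do not yet know why. Either way, for a small file like this one, skipping NIO is clearly the right call.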
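
  One possible explanation, which is an assumption and not something measured in this post: Mode.SingleShotTime times a single cold invocation, so any one-time cost of first touching the channel classes lands inside the measurement. A minimal sketch for checking this is to time the same channel write several times within one JVM and compare the first round with the later ones. The class ColdVsWarm and its methods are made up for illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical experiment, for illustration only.
final class ColdVsWarm {

    // Opens a channel, writes the payload once, closes it,
    // and returns the elapsed time in nanoseconds.
    static long timedWrite(File file, byte[] payload) throws IOException {
        long start = System.nanoTime();
        try (FileOutputStream out = new FileOutputStream(file);
             FileChannel channel = out.getChannel()) {
            ByteBuffer buf = ByteBuffer.wrap(payload);
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
        }
        return System.nanoTime() - start;
    }

    // Repeats the timed write; index 0 is the cold round.
    static long[] run(File file, byte[] payload, int rounds) throws IOException {
        long[] nanos = new long[rounds];
        for (int i = 0; i < rounds; i++) {
            nanos[i] = timedWrite(file, payload);
        }
        return nanos;
    }
}
```

  Calling ColdVsWarm.run(file, inputStr.getBytes(), 10) and printing the array would show whether the first round is much slower than the rest. If it is, the 8 ms figure mostly reflects one-time startup cost and not the steady-state cost of an NIO write.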