I. Downloading a File from a URL
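Downloading a file from a URL comes down to two main steps:
- Open the URL connection and obtain its input stream
- Download the file
The approaches covered here are:
- Java IO
- Java NIO
- HttpClient
- Apache Commons IO
- Resumable download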
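As a rough illustration of the two steps, here is a minimal sketch that uses only JDK classes. The class name UrlDownloadSketch and the method download are hypothetical names chosen for this example, and Files.copy stands in for the hand-written copy loops shown in the sections below.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original article.
final class UrlDownloadSketch {

    /** Downloads the resource at source into target and returns the number of bytes written. */
    static long download(URL source, Path target) throws IOException {
        // Step 1: open the URL connection and obtain its input stream.
        URLConnection connection = source.openConnection();
        try (InputStream in = connection.getInputStream()) {
            // Step 2: stream the bytes into the target file, overwriting any existing file.
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Because the helper accepts any URL, it can be tried offline with a file: URL before pointing it at an HTTP server.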
(1) Java IO
The most basic API is plain Java IO.
The code is as follows:
@Test
public void test() {
    String FILE_NAME = "";
    String FILE_URL = "";
    // try-with-resources closes both streams, even if the copy fails
    try (BufferedInputStream in = new BufferedInputStream(new URL(FILE_URL).openStream());
         FileOutputStream fileOutputStream = new FileOutputStream(FILE_NAME)
    ) {
        byte[] dataBuffer = new byte[1024];
        int bytesRead;
        // read() returns -1 at end of stream
        while ((bytesRead = in.read(dataBuffer, 0, 1024)) != -1) {
            fileOutputStream.write(dataBuffer, 0, bytesRead);
        }
    } catch (IOException e) {
        // handle exception
    }
}
Downloading 99.1 MB took 90564 ms, roughly 90 s.
Note that since we already read through our own buffer, byte[] dataBuffer = new byte[1024], the BufferedInputStream wrapper is not strictly necessary.
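To make that remark concrete, here is a sketch of the same loop reading straight from the raw stream into a caller-sized buffer, with no BufferedInputStream. The class PlainBufferCopy and its copy method are hypothetical names for this example; the buffer size is a parameter so that different sizes can be compared.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the original article.
final class PlainBufferCopy {

    /** Copies everything from in to out through a byte[] of the given size; returns bytes copied. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int bytesRead;
        // Each read() fills at most bufferSize bytes; -1 signals end of stream.
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
            total += bytesRead;
        }
        return total;
    }
}
```

It could be called as copy(new URL(FILE_URL).openStream(), new FileOutputStream(FILE_NAME), 8192); the caller remains responsible for closing both streams, for example with try-with-resources.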
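(2) Java NIO
Java NIO opens two Channels and transfers data directly between them, instead of first buffering it in application memory.
- It is potentially more efficient than reading the stream through a buffer in a loop
- Data can go straight from the file system cache into the target file, without keeping an extra copy in application memory
- On Linux and UNIX systems, the transfer can use zero-copy to reduce switches between kernel mode and user mode
The code is as follows: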
@Test
public void test() throws IOException {
    String FILE_NAME = "";
    String FILE_URL = "";
    // try-with-resources closes both the source channel and the output stream
    try (ReadableByteChannel readableByteChannel =
             Channels.newChannel(new URL(FILE_URL).openStream());
         FileOutputStream fileOutputStream = new FileOutputStream(FILE_NAME)) {
        fileOutputStream.getChannel()
            .transferFrom(readableByteChannel, 0, Long.MAX_VALUE);
    }
}
Downloading 99.1 MB took 84359 ms, roughly 84 s.
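transferFrom is allowed to move fewer bytes than requested, so a more defensive variant calls it in a loop and advances the file position each time. The sketch below assumes a blocking source stream, where a return value of 0 means the stream is exhausted. The class ChannelTransferSketch, the method transfer, and the 8 MB chunk size are choices made for this example, not something taken from the original code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the original article.
final class ChannelTransferSketch {

    // Assumed chunk size per transferFrom call; any positive value works.
    private static final long CHUNK = 8L * 1024 * 1024;

    /** Transfers the whole source stream into target and returns the number of bytes written. */
    static long transfer(InputStream source, Path target) throws IOException {
        try (ReadableByteChannel in = Channels.newChannel(source);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long transferred;
            // Keep transferring until a call moves no bytes (end of a blocking stream).
            while ((transferred = out.transferFrom(in, position, CHUNK)) > 0) {
                position += transferred;
            }
            return position;
        }
    }
}
```

For a download, the source would be new URL(FILE_URL).openStream(); closing the channel also closes the underlying stream.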
(3) HttpClient
AsyncHttpClient performs asynchronous HTTP requests and is built on Netty.
The code is as follows:
@Test
public void test() throws Exception {
    String FILE_NAME = "";
    String FILE_URL = "";
    long now = System.currentTimeMillis();
    AsyncHttpClient client = Dsl.asyncHttpClient();
    FileOutputStream stream = new FileOutputStream(FILE_NAME);
    client.prepareGet(FILE_URL).execute(new AsyncCompletionHandler<FileOutputStream>() {
        @Override
        public State onBodyPartReceived(HttpResponseBodyPart bodyPart)
                throws Exception {
            // Write this body part straight to the target file through its FileChannel
            stream.getChannel().write(bodyPart.getBodyByteBuffer());
            return State.CONTINUE;
        }
        @Override
        public FileOutputStream onCompleted(Response response) {
            System.out.println(System.currentTimeMillis() - now);
            return stream;
        }
    }).get(); // wait for the asynchronous download before the test method returns
    stream.close();
    client.close();
}
(4) Apache Commons IO
These methods come from the Apache Commons IO library, and this is the approach used most often in day-to-day work.
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-io</artifactId>
<version>1.3.2</version>
</dependency>
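The most commonly used method is IOUtils.copy(input, output).
When using it, remember to close the streams:
- It pairs well with Java 7 try-with-resources, which closes the resources automatically
- It works the same way as approach (1): internally it also copies through a buffer, 4096 bytes by default in older Commons IO releases
FileUtils.copyURLToFile() also goes through IOUtils.copy(input, output) internally.
The code is as follows: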
@Test
public void test() throws IOException {
    FileUtils.copyURLToFile(
        new URL(FILE_URL),
        new File(FILE_NAME));
}
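An overload of copyURLToFile() also accepts a connection timeout and a read timeout (it is available since Commons IO 2.0, so the 1.3.2 dependency above would need to be upgraded to use it). Internally it sets them on the connection: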
URLConnection connection = source.openConnection();
connection.setConnectTimeout(connectionTimeout);
connection.setReadTimeout(readTimeout);
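The same timeout idea can be applied without Commons IO. The following sketch uses only the JDK; the class TimeoutDownloadSketch and its method names are hypothetical, and the timeout values are whatever the caller passes in, in milliseconds.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original article or of Commons IO.
final class TimeoutDownloadSketch {

    /** Opens a connection to source with the given connect and read timeouts (milliseconds). */
    static URLConnection openWithTimeouts(URL source, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        URLConnection connection = source.openConnection();
        // Timeouts must be set before the connection is actually established.
        connection.setConnectTimeout(connectTimeoutMs);
        connection.setReadTimeout(readTimeoutMs);
        return connection;
    }

    /** Downloads source into target with timeouts applied; returns the number of bytes written. */
    static long download(URL source, Path target, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        URLConnection connection = openWithTimeouts(source, connectTimeoutMs, readTimeoutMs);
        try (InputStream in = connection.getInputStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

When a timeout expires on an HTTP connection, the JDK raises java.net.SocketTimeoutException, which is an IOException, so callers that already handle IOException need no extra catch block.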
(5) Resumable download
Network connections can fail, so being able to resume a download from where it stopped is very useful.
@Test
public void test() throws IOException, URISyntaxException {
    File outputFile = new File(FILE_NAME);
    URLConnection downloadFileConnection =
        addFileResumeFunctionality(FILE_URL, outputFile);
    long size = transferDataAndGetBytesDownloaded(downloadFileConnection, outputFile);
    System.out.println("size : " + size);
}
private URLConnection addFileResumeFunctionality(String downloadUrl, File outputFile)
        throws IOException, URISyntaxException {
    long existingFileSize = 0L;
    URLConnection downloadFileConnection = new URI(downloadUrl).toURL()
        .openConnection();
    if (outputFile.exists() && downloadFileConnection instanceof HttpURLConnection) {
        HttpURLConnection httpFileConnection = (HttpURLConnection) downloadFileConnection;
        // Use a separate connection for the HEAD request: reading the length connects it,
        // and request properties cannot be set on a connection that is already connected
        HttpURLConnection tmpFileConn = (HttpURLConnection) new URI(downloadUrl).toURL()
            .openConnection();
        tmpFileConn.setRequestMethod("HEAD");
        long fileLength = tmpFileConn.getContentLengthLong();
        tmpFileConn.disconnect();
        existingFileSize = outputFile.length();
        if (existingFileSize < fileLength) {
            // Request only the bytes that are not on disk yet
            httpFileConnection.setRequestProperty("Range", "bytes=" + existingFileSize + "-" + fileLength);
        } else {
            throw new IOException("File Download already completed.");
        }
    }
    return downloadFileConnection;
}
private long transferDataAndGetBytesDownloaded(URLConnection downloadFileConnection, File outputFile)
throws IOException {
long bytesDownloaded = 0;
try (InputStream is = downloadFileConnection.getInputStream();
OutputStream os = new FileOutputStream(outputFile, true)
) {
byte[] buffer = new byte[1024];
int bytesCount;
while ((bytesCount = is.read(buffer)) > 0) {
os.write(buffer, 0, bytesCount);
bytesDownloaded += bytesCount;
}
}
return bytesDownloaded;
}
Output:
# First download
size : 99139960
# Second download
java.io.IOException: File Download already completed.
at com.donaldy.file.DownloadFileTest.addFileResumeFunctionality(DownloadFileTest.java:116)
at com.donaldy.file.DownloadFileTest.tesResumableDownload(DownloadFileTest.java:94)
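The resume decision itself is easy to isolate and check without a network. The sketch below is a hypothetical helper (RangeHeaderSketch and rangeFor are names invented for this example) that returns the Range header value for a partially downloaded file, or nothing when the local file is already complete. It uses the open-ended form bytes=N-, which asks for everything from offset N to the end of the resource.

```java
import java.util.Optional;

// Hypothetical helper, not part of the original article.
final class RangeHeaderSketch {

    /**
     * Returns the Range header value needed to resume a download,
     * or Optional.empty() when nothing is left to fetch.
     */
    static Optional<String> rangeFor(long existingBytes, long totalBytes) {
        if (existingBytes < 0 || totalBytes < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        if (existingBytes >= totalBytes) {
            // The local file already holds every byte the server reports.
            return Optional.empty();
        }
        // Open-ended byte range: from the first missing byte to the end of the resource.
        return Optional.of("bytes=" + existingBytes + "-");
    }
}
```

For example, rangeFor(1024, 99139960) yields "bytes=1024-". When resuming over HTTP it is also worth checking that the server answers 206 Partial Content; a plain 200 means the Range header was ignored, and appending that response to the existing file would corrupt it.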