Using Character Sets in Java Development

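In real-world code you often need to specify a character set, such as UTF-8 or GBK.

For example:

1. Methods in the String class that return a byte array: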

public byte[] getBytes(String charsetName) throws UnsupportedEncodingException {
    // ...
}
public byte[] getBytes(Charset charset) {
    // ...
}

2. Methods on HttpServletRequest and HttpServletResponse that set the character encoding:

// HttpServletRequest
public void setCharacterEncoding(String env) throws java.io.UnsupportedEncodingException;
// HttpServletResponse
public void setCharacterEncoding(String charset);
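
To make the difference between the two getBytes overloads concrete, here is a minimal sketch. The class name GetBytesDemo and its helper methods are made up for illustration. It assumes nothing beyond the JDK: the String-name overload forces the caller to deal with a checked UnsupportedEncodingException, while the Charset overload does not.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK or of the original article.
class GetBytesDemo {

    // String-name overload: the checked exception has to be caught or declared.
    static byte[] encodeByName(String text, String charsetName) {
        try {
            return text.getBytes(charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    // Charset overload: no checked exception is declared.
    static byte[] encodeByCharset(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters, written as escapes
        byte[] byName = encodeByName(text, "UTF-8");
        byte[] byCharset = encodeByCharset(text, Charset.forName("UTF-8"));
        System.out.println(Arrays.equals(byName, byCharset)); // true
        System.out.println(byName.length);                    // 6 (3 bytes per character)
    }
}
```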

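The problem:

Many people simply hard-code the charset name as a string literal in these calls. As long as the name is spelled correctly this does not cause a runtime error, but it is hard to maintain. Using constants is the better choice.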
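
One concrete maintenance risk with string literals is that a misspelled name still compiles and only fails when the code runs. The sketch below illustrates this. The class HardCodedCharsetDemo and the method isUsableName are hypothetical names chosen for the example.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical demo class showing that a typo in a charset literal is a runtime problem.
class HardCodedCharsetDemo {

    // Returns true if String.getBytes accepts the given charset name.
    static boolean isUsableName(String charsetName) {
        try {
            "abc".getBytes(charsetName);
            return true;
        } catch (UnsupportedEncodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUsableName("UTF-8")); // true
        System.out.println(isUsableName("UFT-8")); // false: the typo is only caught at run time
    }
}
```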

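In fact, the JDK and some third-party libraries already predefine many commonly used character sets. We can use them directly instead of defining our own constants in the project:

The StandardCharsets class that ships with the JDK:

Note: this class was only introduced in JDK 1.7.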

public final class StandardCharsets {
    
    public static final Charset US_ASCII = Charset.forName("US-ASCII");
    
    public static final Charset ISO_8859_1 = Charset.forName("ISO-8859-1");

    public static final Charset UTF_8 = Charset.forName("UTF-8");

    public static final Charset UTF_16BE = Charset.forName("UTF-16BE");

    public static final Charset UTF_16LE = Charset.forName("UTF-16LE");

    public static final Charset UTF_16 = Charset.forName("UTF-16");
}
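
As a rough usage sketch, the constants above can be passed straight to the Charset-based overloads of String. The class StandardCharsetsDemo and its methods are illustrative names only.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for using the JDK's predefined Charset constants.
class StandardCharsetsDemo {

    // Encodes to UTF-8 bytes and decodes back, with no charset literal in sight.
    static String roundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Canonical name of the constant, for APIs that only accept a String.
    static String utf8Name() {
        return StandardCharsets.UTF_8.name();
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo";
        System.out.println(roundTrip(text).equals(text)); // true
        System.out.println(utf8Name());                   // UTF-8
    }
}
```

For methods that take a charset name as a String, such as setCharacterEncoding on HttpServletResponse, StandardCharsets.UTF_8.name() supplies the name without a hand-typed literal.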

The CharsetNames class in the Apache commons-compress library:

public class CharsetNames {

    public static final String ISO_8859_1 = "ISO-8859-1";
    
    public static final String US_ASCII = "US-ASCII";
    
    public static final String UTF_16 = "UTF-16";
    
    public static final String UTF_16BE = "UTF-16BE";
    
    public static final String UTF_16LE = "UTF-16LE";
    
    public static final String UTF_8 = "UTF-8";
}
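
Neither listing above contains a constant for GBK, so a project that needs it still has to define one. A minimal sketch follows, assuming the running JVM ships the GBK charset (it is not one of the charsets every Java platform is required to support). ProjectCharsets is a hypothetical class name.

```java
import java.nio.charset.Charset;

// Hypothetical project-level holder for charsets that the predefined classes do not cover.
class ProjectCharsets {

    static final String GBK_NAME = "GBK";

    // Resolved on demand; Charset.forName throws UnsupportedCharsetException
    // if this JVM does not provide GBK.
    static Charset gbk() {
        return Charset.forName(GBK_NAME);
    }

    public static void main(String[] args) {
        if (Charset.isSupported(GBK_NAME)) {
            byte[] bytes = "\u4e2d".getBytes(gbk());
            System.out.println(bytes.length); // 2: GBK encodes this character in two bytes
        } else {
            System.out.println("GBK is not available on this JVM");
        }
    }
}
```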